.. SPDX-License-Identifier: GPL-2.0+
.. Copyright (C) 2020 Sean Anderson <seanga2@gmail.com>

Maix Bit
========

Several of the Sipeed Maix series of boards contain the Kendryte K210 processor,
a 64-bit RISC-V CPU. This processor contains several peripherals to accelerate
neural network processing and other "AI" tasks. These include a "KPU" neural
network processor, an audio processor supporting beamforming reception, and a
digital video port supporting capture and output at VGA resolution. Other
peripherals include 8 MiB of SRAM (accessible with and without caching);
remappable pins, including 40 GPIOs; AES, FFT, and SHA256 accelerators; a DMA
controller; and I2C, I2S, and SPI controllers. Maix peripherals vary, but
include SPI flash; on-board USB-serial bridges; ports for cameras, displays, and
SD cards; and ESP32 chips. Currently, only the Sipeed Maix Bit V2.0 (bitm) is
supported, but the boards are fairly similar.

Documentation for Maix boards is available from
`Sipeed's website <http://dl.sipeed.com/MAIX/HDK/>`_.
Documentation for the Kendryte K210 is available from
`Kendryte's website <https://kendryte.com/downloads/>`_. However, hardware
details are rather lacking, so most technical details have been taken from the
`standalone SDK <https://github.com/kendryte/kendryte-standalone-sdk>`_.

Build and boot steps
--------------------

To build U-Boot, run

.. code-block:: none

    make sipeed_maix_bitm_defconfig
    make CROSS_COMPILE=<your cross compile prefix>

To flash U-Boot to a Maix Bit, run

.. code-block:: none

    kflash -tp /dev/<your tty here> -B bit_mic u-boot-dtb.bin

Boot output should look like the following:

.. code-block:: none

    U-Boot 2020.04-rc2-00087-g2221cc09c1-dirty (Feb 28 2020 - 13:53:09 -0500)

    DRAM:  8 MiB
    In:    serial@38000000
    Out:   serial@38000000
    Err:   serial@38000000
    =>

Loading Images
^^^^^^^^^^^^^^

To load a kernel, transfer it over serial with the ``loady`` command. The
optional second argument selects a faster baud rate for the duration of the
transfer.

.. code-block:: none

    => loady 80000000 1500000
    ## Switch baudrate to 1500000 bps and press ENTER ...

    *** baud: 1500000

    *** baud: 1500000 ***
    ## Ready for binary (ymodem) download to 0x80000000 at 1500000 bps...
    C
    *** file: loader.bin
    $ sz -vv loader.bin
    Sending: loader.bin
    Bytes Sent:2478208 BPS:72937
    Sending:
    Ymodem sectors/kbytes sent: 0/ 0k
    Transfer complete

    *** exit status: 0 ***
    ## Total Size = 0x0025d052 = 2478162 Bytes
    ## Switch baudrate to 115200 bps and press ESC ...

    *** baud: 115200

    *** baud: 115200 ***
    =>

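The higher baud rate matters for images of this size. As a rough sketch, the
best-case transfer time can be estimated from the image size and the baud rate.
The helper name ``min_transfer_seconds`` is hypothetical, and the estimate
assumes 8N1 framing (10 bits on the wire per byte) and ignores YMODEM protocol
overhead, so real transfers will be slower.

```python
def min_transfer_seconds(size_bytes: int, baud: int, bits_per_byte: int = 10) -> float:
    """Best-case serial transfer time.

    ASSUMPTION: 8N1 framing (start bit + 8 data bits + stop bit) and no
    protocol overhead; neither is stated in this document.
    """
    return size_bytes * bits_per_byte / baud


IMAGE_SIZE = 2478162  # "Total Size" reported by loady in the log above

for baud in (115200, 1500000):
    seconds = min_transfer_seconds(IMAGE_SIZE, baud)
    print(f"{baud:>8} baud: at least {seconds:.1f} s")
```

Under these assumptions the image needs at least about 215 s at 115200 baud,
but only about 17 s at 1500000 baud. The log above reports an effective 72937
bytes per second, which works out to roughly 34 s for this transfer.
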
Running Programs
^^^^^^^^^^^^^^^^

Binaries
""""""""

To run a bare binary, use the ``go`` command:

.. code-block:: none

    => loady
    ## Ready for binary (ymodem) download to 0x80000000 at 115200 bps...
    C
    *** file: ./examples/standalone/hello_world.bin
    $ sz -vv ./examples/standalone/hello_world.bin
    Sending: hello_world.bin
    Bytes Sent: 4864 BPS:649
    Sending:
    Ymodem sectors/kbytes sent: 0/ 0k
    Transfer complete

    *** exit status: 0 ***
    (CAN) packets, 5 retries
    ## Total Size = 0x000012f8 = 4856 Bytes
    => go 80000000
    ## Starting application at 0x80000000 ...
    Example expects ABI version 9
    Actual U-Boot ABI version 9
    Hello World
    argc = 1
    argv[0] = "80000000"
    argv[1] = "<NULL>"
    Hit any key to exit ...

Legacy Images
"""""""""""""

To run legacy images, use the ``bootm`` command:

.. code-block:: none

    $ tools/mkimage -A riscv -O u-boot -T standalone -C none -a 80000000 -e 80000000 -d examples/standalone/hello_world.bin hello_world.img
    Image Name:
    Created:      Thu Mar  5 12:04:10 2020
    Image Type:   RISC-V U-Boot Standalone Program (uncompressed)
    Data Size:    4856 Bytes = 4.74 KiB = 0.00 MiB
    Load Address: 80000000
    Entry Point:  80000000

    $ picocom -b 115200 /dev/ttyUSB0
    => loady
    ## Ready for binary (ymodem) download to 0x80000000 at 115200 bps...
    C
    *** file: hello_world.img
    $ sz -vv hello_world.img
    Sending: hello_world.img
    Bytes Sent: 4992 BPS:665
    Sending:
    Ymodem sectors/kbytes sent: 0/ 0k
    Transfer complete

    *** exit status: 0 ***
    (CAN) packets, 3 retries
    ## Total Size = 0x00001338 = 4920 Bytes
    => bootm
    ## Booting kernel from Legacy Image at 80000000 ...
       Image Name:
       Image Type:   RISC-V U-Boot Standalone Program (uncompressed)
       Data Size:    4856 Bytes = 4.7 KiB
       Load Address: 80000000
       Entry Point:  80000000
       Verifying Checksum ... OK
       Loading Standalone Program
    Example expects ABI version 9
    Actual U-Boot ABI version 9
    Hello World
    argc = 0
    argv[0] = "<NULL>"
    Hit any key to exit ...

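The image transferred above is 64 bytes larger than the bare binary (4920
versus 4856 bytes). The difference is the legacy image header that ``mkimage``
prepends and that ``bootm`` checks before starting the program. The following
sketch decodes and verifies such a header. The field layout is written from
memory of U-Boot's legacy image format and should be treated as an assumption
to check against ``include/image.h``; ``parse_legacy_header`` is a hypothetical
helper, not part of any U-Boot tool.

```python
import struct
import zlib

# ASSUMED layout of the 64-byte legacy header, all fields big-endian:
# magic, header CRC, timestamp, data size, load address, entry point,
# data CRC, one byte each for OS, architecture, image type and
# compression, then a 32-byte NUL-padded image name.
HEADER = struct.Struct(">7I4B32s")
MAGIC = 0x27051956


def parse_legacy_header(image: bytes) -> dict:
    """Decode a legacy image header and verify both checksums."""
    (magic, hcrc, _time, size, load, entry, dcrc,
     _os, _arch, _type, _comp, name) = HEADER.unpack_from(image)
    if magic != MAGIC:
        raise ValueError("not a legacy image")
    # The header CRC is computed with the CRC field itself set to zero.
    zeroed = image[:4] + b"\0\0\0\0" + image[8:HEADER.size]
    if zlib.crc32(zeroed) != hcrc:
        raise ValueError("bad header checksum")
    data = image[HEADER.size:HEADER.size + size]
    if len(data) != size or zlib.crc32(data) != dcrc:
        raise ValueError("bad data checksum")
    return {"size": size, "load": load, "entry": entry,
            "name": name.rstrip(b"\0").decode()}
```

It could be used as ``parse_legacy_header(open("hello_world.img", "rb").read())``,
which for the image above should report a data size of 4856 and a load address
and entry point of ``0x80000000``.
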
Over- and Under-clocking
------------------------

To change the clock speed of the K210, you will need to enable
``CONFIG_CLK_K210_SET_RATE`` and edit the board's device tree. To do this, add a
section to ``arch/riscv/dts/k210-maix-bit.dts`` like the following:

.. code-block:: none

    &sysclk {
            assigned-clocks = <&sysclk K210_CLK_PLL0>;
            assigned-clock-rates = <800000000>;
    };

There are three PLLs on the K210: PLL0 is the parent of most of the components,
including the CPU and RAM. PLL1 is the parent of the neural network coprocessor.
PLL2 is the parent of the sound processing devices. Note that child clocks of
PLL0 and PLL2 run at *half* the speed of the PLLs. For example, if PLL0 is
running at 800 MHz, then the CPU will run at 400 MHz. This is the example given
above. The CPU can be overclocked to around 600 MHz, and underclocked to 26 MHz.

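Because of the halving, the value to put in ``assigned-clock-rates`` is twice
the desired CPU clock. A minimal sketch of that calculation follows; the helper
name ``pll0_rate_for_cpu`` is hypothetical, and the limits are simply the
approximate figures quoted above rather than hard hardware bounds.

```python
CPU_MIN_HZ = 26_000_000    # lowest CPU clock mentioned above
CPU_MAX_HZ = 600_000_000   # approximate overclocking limit mentioned above


def pll0_rate_for_cpu(cpu_hz: int) -> int:
    """Return the PLL0 rate for a CPU clock; PLL0's children run at half speed."""
    if not CPU_MIN_HZ <= cpu_hz <= CPU_MAX_HZ:
        raise ValueError(f"{cpu_hz} Hz is outside the range described above")
    return 2 * cpu_hz


# Reproduces the device tree property from the example above.
print(f"assigned-clock-rates = <{pll0_rate_for_cpu(400_000_000)}>;")
```
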
It is possible to set PLL2's parent to PLL0 or PLL1 instead of IN0. The PLLs are
more accurate when converting between similar frequencies. This makes it easier
to get an accurate frequency for I2S. As an example, consider sampling an I2S
device at 44.1 kHz. On this device, the I2S serial clock runs at 64 times the
sample rate. Therefore, we would like to run PLL2 at an even multiple of
2.8224 MHz. If PLL2's parent is IN0, we could use a frequency of 390 MHz (the
same as the CPU's default speed). Dividing by 138 yields a serial clock of about
2.8261 MHz. This results in a sample rate of 44.158 kHz, which is about 58 Hz or
0.13% too fast. If, instead, we set PLL2's parent to PLL1 running at 390 MHz,
and request a rate of 2.8224 * 136 = 383.8464 MHz, the achieved rate is
383.90625 MHz. Dividing by 136 yields a serial clock of about 2.8228 MHz. This
results in a sample rate of 44.107 kHz, which is just 7 Hz or 0.02% too fast.
This configuration is shown in the following example:

.. code-block:: none

    &sysclk {
            assigned-clocks = <&sysclk K210_CLK_PLL1>, <&sysclk K210_CLK_PLL2>;
            assigned-clock-parents = <0>, <&sysclk K210_CLK_PLL1>;
            assigned-clock-rates = <390000000>, <383846400>;
    };

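The arithmetic in the I2S example can be checked with a few lines of Python.
This is only a sketch of the calculation described above: the PLL2 rates and
dividers are the ones quoted in the text, and ``sample_rate_error`` is a
hypothetical helper name.

```python
SAMPLE_RATE_HZ = 44_100
BCLK_PER_SAMPLE = 64  # serial clock cycles per sample, as stated above


def sample_rate_error(pll2_hz: float, divider: int) -> tuple:
    """Return (achieved sample rate in Hz, relative error) for a PLL2 rate and divider."""
    achieved = pll2_hz / divider / BCLK_PER_SAMPLE
    return achieved, (achieved - SAMPLE_RATE_HZ) / SAMPLE_RATE_HZ


for label, pll2_hz, divider in (
    ("PLL2 at 390 MHz from IN0    ", 390_000_000, 138),
    ("PLL2 at 383.90625 MHz (PLL1)", 383_906_250, 136),
):
    rate, err = sample_rate_error(pll2_hz, divider)
    print(f"{label}: {rate:.1f} Hz ({err:+.3%})")
```
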
There are a couple of quirks to the PLLs. First, the achievable frequency ratios
are densest just above and below 1.0, but there is a small gap immediately
around 1.0. To be explicit, if the input frequency is 100 MHz, it would be
impossible to have an output of 99 or 101 MHz. In addition, there is a maximum
frequency for the internal VCO, so higher input/output frequencies will be less
accurate than lower ones.

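The gap around 1.0 can be illustrated by enumerating the ratios a PLL of this
kind can produce. The sketch below assumes the output rate is
``input * NF / (NR * OD)`` with NF in 1..64 and NR and OD in 1..16. This
formula and these ranges are assumptions that do not come from this document,
and the sketch ignores the VCO limit entirely, so the real clock driver may
reject some of the ratios listed here.

```python
from fractions import Fraction

# ASSUMPTION: parameter ranges of the PLL; not stated in this document.
NF_RANGE = range(1, 65)  # feedback multiplier
NR_RANGE = range(1, 17)  # reference divider
OD_RANGE = range(1, 17)  # output divider


def achievable_ratios() -> list:
    """All distinct output/input ratios NF / (NR * OD), sorted ascending."""
    ratios = {Fraction(nf, nr * od)
              for nf in NF_RANGE for nr in NR_RANGE for od in OD_RANGE}
    return sorted(ratios)


ratios = achievable_ratios()
below = max(r for r in ratios if r < 1)
above = min(r for r in ratios if r > 1)
print(f"nearest ratios to 1.0: {below} and {above}")
```

Under these assumptions the nearest ratios to 1.0 are 64/65 and 64/63, or
roughly 0.985 and 1.016, so neither 0.99 nor 1.01 can be reached. This is
consistent with the 99 MHz and 101 MHz example above.
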
Technical Details
-----------------

Boot Sequence
^^^^^^^^^^^^^

1. ``RESET`` pin is deasserted.
2. Both harts begin executing at ``0x00001000``.
3. Both harts jump to firmware at ``0x88000000``.
4. One hart is chosen as a boot hart.
5. Firmware reads the value of pin ``IO_16`` (ISP).

   * If the pin is low, enter ISP mode. This mode allows loading data to RAM,
     writing it to flash, and booting from specific addresses.
   * If the pin is high, continue boot.

6. Firmware reads the next stage from flash (SPI3) to address ``0x80000000``.

   * If byte 0 is 1, the next stage is decrypted using the built-in AES
     accelerator and the one-time programmable, 128-bit AES key.
   * Bytes 1 to 4 hold the length of the next stage.
   * The SHA-256 sum of the next stage is automatically calculated, and verified
     against the 32 bytes following the next stage.

7. The boot hart sends an IPI to the other hart telling it to jump to the next
   stage.
8. The boot hart jumps to ``0x80000000``.

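The flash layout read in step 6 can be sketched as follows. Two details are
assumptions that this document does not state: that the length in bytes 1 to 4
is little-endian, and that the digest covers the 5-byte header as well as the
stage itself. Both function names are hypothetical, and AES-encrypted stages
are deliberately not handled.

```python
import hashlib
import struct

# Byte 0: AES flag. Bytes 1-4: stage length (ASSUMED little-endian).
HEADER = struct.Struct("<BI")
DIGEST_LEN = 32  # SHA-256 digest following the stage


def build_flash_image(stage: bytes) -> bytes:
    """Wrap an unencrypted next stage in the header and trailing digest."""
    body = HEADER.pack(0, len(stage)) + stage
    # ASSUMPTION: the digest covers the header as well as the stage.
    return body + hashlib.sha256(body).digest()


def extract_stage(image: bytes) -> bytes:
    """Verify the trailing SHA-256 and return the next stage."""
    flag, length = HEADER.unpack_from(image)
    if flag == 1:
        raise NotImplementedError("AES-encrypted stages are not handled here")
    end = HEADER.size + length
    body, digest = image[:end], image[end:end + DIGEST_LEN]
    if len(body) != end or hashlib.sha256(body).digest() != digest:
        raise ValueError("SHA-256 mismatch")
    return body[HEADER.size:]
```

A round trip such as ``extract_stage(build_flash_image(data)) == data`` should
hold for any ``data``, while flipping a single byte of the image should raise
``ValueError``.
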
Memory Map
^^^^^^^^^^

========== ========= ===========
Address    Size      Description
========== ========= ===========
0x00000000 0x1000    debug
0x00001000 0x1000    rom
0x02000000 0xC000    clint
0x0C000000 0x4000000 plic
0x38000000 0x1000    uarths
0x38001000 0x1000    gpiohs
0x40000000 0x400000  sram0 (non-cached)
0x40400000 0x200000  sram1 (non-cached)
0x40600000 0x200000  airam (non-cached)
0x40800000 0xC00000  kpu
0x42000000 0x400000  fft
0x50000000 0x1000    dmac
0x50200000 0x200000  apb0
0x50200000 0x80      gpio
0x50210000 0x100     uart0
0x50220000 0x100     uart1
0x50230000 0x100     uart2
0x50240000 0x100     spi slave
0x50250000 0x200     i2s0
0x50250200 0x200     apu
0x50260000 0x200     i2s1
0x50270000 0x200     i2s2
0x50280000 0x100     i2c0
0x50290000 0x100     i2c1
0x502A0000 0x100     i2c2
0x502B0000 0x100     fpioa
0x502C0000 0x100     sha256
0x502D0000 0x100     timer0
0x502E0000 0x100     timer1
0x502F0000 0x100     timer2
0x50400000 0x200000  apb1
0x50400000 0x100     wdt0
0x50410000 0x100     wdt1
0x50420000 0x100     otp control
0x50430000 0x100     dvp
0x50440000 0x100     sysctl
0x50450000 0x100     aes
0x50460000 0x100     rtc
0x52000000 0x4000000 apb2
0x52000000 0x100     spi0
0x53000000 0x100     spi1
0x54000000 0x200     spi3
0x80000000 0x400000  sram0 (cached)
0x80400000 0x200000  sram1 (cached)
0x80600000 0x200000  airam (cached)
0x88000000 0x20000   otp
0x88000000 0xC200    firmware
0x8801C000 0x1000    riscv priv spec 1.9 config
0x8801D000 0x2000    flattened device tree (contains only addresses and
                     interrupts)
0x8801F000 0x1000    credits
========== ========= ===========

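The table lists the same three RAM regions twice, once cached and once
non-cached. Assuming the two windows alias the same memory at equal offsets, as
the matching sizes suggest, an address in one window can be translated to the
other as sketched below. The helper names are hypothetical.

```python
CACHED_BASE = 0x80000000
UNCACHED_BASE = 0x40000000
RAM_SIZE = 0x400000 + 0x200000 + 0x200000  # sram0 + sram1 + airam = 8 MiB


def _translate(addr: int, src_base: int, dst_base: int) -> int:
    offset = addr - src_base
    if not 0 <= offset < RAM_SIZE:
        raise ValueError(f"{addr:#x} is outside the source RAM window")
    return dst_base + offset


def to_uncached(addr: int) -> int:
    """Map a cached RAM address to its non-cached alias (ASSUMED equal offsets)."""
    return _translate(addr, CACHED_BASE, UNCACHED_BASE)


def to_cached(addr: int) -> int:
    """Map a non-cached RAM address to its cached alias (ASSUMED equal offsets)."""
    return _translate(addr, UNCACHED_BASE, CACHED_BASE)


print(hex(to_uncached(0x80400000)))  # start of sram1 in the non-cached window
```
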