Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 1 | .. SPDX-License-Identifier: GPL-2.0+ |
| 2 | .. Copyright (c) 2013 The Chromium OS Authors. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 3 | |
| 4 | Tracing in U-Boot |
| 5 | ================= |
| 6 | |
Heinrich Schuchardt | 8963f19 | 2021-11-16 18:38:47 +0100 | [diff] [blame] | 7 | U-Boot supports a simple tracing feature which allows a record of execution |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 8 | to be collected and sent to a host machine for analysis. At present the |
| 9 | main use for this is to profile boot time. |
| 10 | |
| 11 | |
| 12 | Overview |
| 13 | -------- |
| 14 | |
| 15 | The trace feature uses GCC's instrument-functions feature to trace all |
| 16 | function entry/exit points. These are then recorded in a memory buffer. |
| 17 | The memory buffer can be saved to the host over a network link using |
| 18 | tftpput or by writing to an attached memory device such as MMC. |
| 19 | |
| 20 | On the host, the file is first converted with a tool called 'proftool', |
| 21 | which extracts useful information from it. The resulting trace output |
| 22 | resembles that emitted by Linux's ftrace feature, so can be visually |
| 23 | displayed by pytimechart. |
| 24 | |
| 25 | |
| 26 | Quick-start using Sandbox |
| 27 | ------------------------- |
| 28 | |
| 29 | Sandbox is a build of U-Boot that can run under Linux so it is a convenient |
| 30 | way of trying out tracing before you use it on your actual board. To do |
| 31 | this, follow these steps: |
| 32 | |
Simon Glass | 84b7711 | 2021-11-24 09:26:39 -0700 | [diff] [blame] | 33 | Add the following to config/sandbox_defconfig |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 34 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 35 | .. code-block:: c |
| 36 | |
Simon Glass | 84b7711 | 2021-11-24 09:26:39 -0700 | [diff] [blame] | 37 | CONFIG_TRACE=y |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 38 | |
| 39 | Build sandbox U-Boot with tracing enabled: |
| 40 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 41 | .. code-block:: console |
| 42 | |
| 43 | $ make FTRACE=1 O=sandbox sandbox_config |
| 44 | $ make FTRACE=1 O=sandbox |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 45 | |
| 46 | Run sandbox, wait for a bit of trace information to appear, and then capture |
| 47 | a trace: |
| 48 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 49 | .. code-block:: console |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 50 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 51 | $ ./sandbox/u-boot |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 52 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 53 | U-Boot 2013.04-rc2-00100-ga72fcef (Apr 17 2013 - 19:25:24) |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 54 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 55 | DRAM: 128 MiB |
| 56 | trace: enabled |
| 57 | Using default environment |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 58 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 59 | In: serial |
| 60 | Out: serial |
| 61 | Err: serial |
| 62 | =>trace stats |
| 63 | 671,406 function sites |
| 64 | 69,712 function calls |
| 65 | 0 untracked function calls |
| 66 | 73,373 traced function calls |
| 67 | 16 maximum observed call depth |
| 68 | 15 call depth limit |
| 69 | 66,491 calls not traced due to depth |
| 70 | =>trace stats |
| 71 | 671,406 function sites |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 72 | 1,279,450 function calls |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 73 | 0 untracked function calls |
| 74 | 950,490 traced function calls (333217 dropped due to overflow) |
| 75 | 16 maximum observed call depth |
| 76 | 15 call depth limit |
| 77 | 1,275,767 calls not traced due to depth |
| 78 | =>trace calls 0 e00000 |
| 79 | Call list dumped to 00000000, size 0xae0a40 |
| 80 | =>print |
| 81 | baudrate=115200 |
| 82 | profbase=0 |
| 83 | profoffset=ae0a40 |
| 84 | profsize=e00000 |
| 85 | stderr=serial |
| 86 | stdin=serial |
| 87 | stdout=serial |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 88 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 89 | Environment size: 117/8188 bytes |
| 90 | =>host save host 0 trace 0 ${profoffset} |
| 91 | 11405888 bytes written in 10 ms (1.1 GiB/s) |
| 92 | =>reset |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 93 | |
| 94 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 95 | Then run proftool to convert the trace information to ftrace format |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 96 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 97 | .. code-block:: console |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 98 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 99 | $ ./sandbox/tools/proftool -m sandbox/System.map -p trace dump-ftrace >trace.txt |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 100 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 101 | Finally run pytimechart to display it |
| 102 | |
| 103 | .. code-block:: console |
| 104 | |
| 105 | $ pytimechart trace.txt |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 106 | |
| 107 | Using this tool you can zoom and pan across the trace, with the function |
| 108 | calls on the left and little marks representing the start and end of each |
| 109 | function. |
| 110 | |
| 111 | |
| 112 | CONFIG Options |
| 113 | -------------- |
| 114 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 115 | CONFIG_TRACE |
| 116 | Enables the trace feature in U-Boot. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 117 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 118 | CONFIG_CMD_TRACE |
| 119 | Enables the trace command. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 120 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 121 | CONFIG_TRACE_BUFFER_SIZE |
| 122 | Size of trace buffer to allocate for U-Boot. This buffer is |
| 123 | used after relocation, as a place to put function tracing |
| 124 | information. The address of the buffer is determined by |
| 125 | the relocation code. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 126 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 127 | CONFIG_TRACE_EARLY |
| 128 | Define this to start tracing early, before relocation. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 129 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 130 | CONFIG_TRACE_EARLY_SIZE |
| 131 | Size of 'early' trace buffer. Before U-Boot has relocated |
| 132 | it doesn't have a proper trace buffer. On many boards |
| 133 | you can define an area of memory to use for the trace |
| 134 | buffer until the 'real' trace buffer is available after |
| 135 | relocation. The contents of this buffer are then copied to |
| 136 | the real buffer. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 137 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 138 | CONFIG_TRACE_EARLY_ADDR |
| 139 | Address of early trace buffer |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 140 | |
| 141 | |
| 142 | Building U-Boot with Tracing Enabled |
| 143 | ------------------------------------ |
| 144 | |
| 145 | Pass 'FTRACE=1' to the U-Boot Makefile to actually instrument the code. |
| 146 | This is kept as a separate option so that it is easy to enable/disable |
| 147 | instrumenting from the command line instead of having to change board |
| 148 | config files. |
| 149 | |
| 150 | |
| 151 | Collecting Trace Data |
| 152 | --------------------- |
| 153 | |
| 154 | When you run U-Boot on your board it will collect trace data up to the |
| 155 | limit of the trace buffer size you have specified. Once that is exhausted |
| 156 | no more data will be collected. |
| 157 | |
| 158 | Collecting trace data has an affect on execution time/performance. You |
Heinrich Schuchardt | 8963f19 | 2021-11-16 18:38:47 +0100 | [diff] [blame] | 159 | will notice this particularly with trivial functions - the overhead of |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 160 | recording their execution may even exceed their normal execution time. |
| 161 | In practice this doesn't matter much so long as you are aware of the |
Heinrich Schuchardt | 8963f19 | 2021-11-16 18:38:47 +0100 | [diff] [blame] | 162 | effect. Once you have done your optimizations, turn off tracing before |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 163 | doing end-to-end timing. |
| 164 | |
| 165 | The best time to start tracing is right at the beginning of U-Boot. The |
| 166 | best time to stop tracing is right at the end. In practice it is hard |
| 167 | to achieve these ideals. |
| 168 | |
| 169 | This implementation enables tracing early in board_init_f(). This means |
| 170 | that it captures most of the board init process, missing only the |
| 171 | early architecture-specific init. However, it also misses the entire |
| 172 | SPL stage if there is one. |
| 173 | |
| 174 | U-Boot typically ends with a 'bootm' command which loads and runs an |
| 175 | OS. There is useful trace data in the execution of that bootm |
| 176 | command. Therefore this implementation provides a way to collect trace |
| 177 | data after bootm has finished processing, but just before it jumps to |
| 178 | the OS. In practical terms, U-Boot runs the 'fakegocmd' environment |
| 179 | variable at this point. This variable should have a short script which |
| 180 | collects the trace data and writes it somewhere. |
| 181 | |
Heinrich Schuchardt | 8963f19 | 2021-11-16 18:38:47 +0100 | [diff] [blame] | 182 | Trace data collection relies on a microsecond timer, accessed through |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 183 | timer_get_us(). So the first think you should do is make sure that |
| 184 | this produces sensible results for your board. Suitable sources for |
| 185 | this timer include high resolution timers, PWMs or profile timers if |
| 186 | available. Most modern SOCs have a suitable timer for this. Make sure |
| 187 | that you mark this timer (and anything it calls) with |
| 188 | __attribute__((no_instrument_function)) so that the trace library can |
| 189 | use it without causing an infinite loop. |
| 190 | |
| 191 | |
| 192 | Commands |
| 193 | -------- |
| 194 | |
| 195 | The trace command has variable sub-commands: |
| 196 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 197 | stats |
| 198 | Display tracing statistics |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 199 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 200 | pause |
| 201 | Pause tracing |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 202 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 203 | resume |
| 204 | Resume tracing |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 205 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 206 | funclist [<addr> <size>] |
| 207 | Dump a list of functions into the buffer |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 208 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 209 | calls [<addr> <size>] |
| 210 | Dump function call trace into buffer |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 211 | |
| 212 | If the address and size are not given, these are obtained from environment |
| 213 | variables (see below). In any case the environment variables are updated |
| 214 | after the command runs. |
| 215 | |
| 216 | |
| 217 | Environment Variables |
| 218 | --------------------- |
| 219 | |
| 220 | The following are used: |
| 221 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 222 | profbase |
| 223 | Base address of trace output buffer |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 224 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 225 | profoffset |
| 226 | Offset of first unwritten byte in trace output buffer |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 227 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 228 | profsize |
| 229 | Size of trace output buffer |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 230 | |
| 231 | All of these are set by the 'trace calls' command. |
| 232 | |
| 233 | These variables keep track of the amount of data written to the trace |
| 234 | output buffer by the 'trace' command. The trace commands which write data |
| 235 | to the output buffer can use these to specify the buffer to write to, and |
| 236 | update profoffset each time. This allows successive commands to append data |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 237 | to the same buffer, for example:: |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 238 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 239 | => trace funclist 10000 e00000 |
| 240 | => trace calls |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 241 | |
| 242 | (the latter command appends more data to the buffer). |
| 243 | |
| 244 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 245 | fakegocmd |
| 246 | Specifies commands to run just before booting the OS. This |
| 247 | is a useful time to write the trace data to the host for |
| 248 | processing. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 249 | |
| 250 | |
| 251 | Writing Out Trace Data |
| 252 | ---------------------- |
| 253 | |
| 254 | Once the trace data is in an output buffer in memory there are various ways |
| 255 | to transmit it to the host. Notably you can use tftput to send the data |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 256 | over a network link:: |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 257 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 258 | fakegocmd=trace pause; usb start; set autoload n; bootp; |
| 259 | trace calls 10000000 1000000; |
| 260 | tftpput ${profbase} ${profoffset} 192.168.1.4:/tftpboot/calls |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 261 | |
| 262 | This starts up USB (to talk to an attached USB Ethernet dongle), writes |
| 263 | a trace log to address 10000000 and sends it to a host machine using |
| 264 | TFTP. After this, U-Boot will boot the OS normally, albeit a little |
| 265 | later. |
| 266 | |
| 267 | |
| 268 | Converting Trace Output Data |
| 269 | ---------------------------- |
| 270 | |
| 271 | The trace output data is kept in a binary format which is not documented |
| 272 | here. To convert it into something useful, you can use proftool. |
| 273 | |
| 274 | This tool must be given the U-Boot map file and the trace data received |
| 275 | from running that U-Boot. It produces a text output file. |
| 276 | |
| 277 | Options |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 278 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 279 | -m <map_file> |
| 280 | Specify U-Boot map file |
| 281 | |
| 282 | -p <trace_file> |
Heinrich Schuchardt | 8963f19 | 2021-11-16 18:38:47 +0100 | [diff] [blame] | 283 | Specify profile/trace file |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 284 | |
| 285 | Commands: |
| 286 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 287 | dump-ftrace |
| 288 | Write a text dump of the file in Linux ftrace format to stdout |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 289 | |
| 290 | |
| 291 | Viewing the Trace Data |
| 292 | ---------------------- |
| 293 | |
| 294 | You can use pytimechart for this (sudo apt-get pytimechart might work on |
| 295 | your Debian-style machine, and use your favourite search engine to obtain |
| 296 | documentation). It expects the file to have a .txt extension. The program |
| 297 | has terse user interface but is very convenient for viewing U-Boot |
| 298 | profile information. |
| 299 | |
| 300 | |
| 301 | Workflow Suggestions |
| 302 | -------------------- |
| 303 | |
| 304 | The following suggestions may be helpful if you are trying to reduce boot |
| 305 | time: |
| 306 | |
| 307 | 1. Enable CONFIG_BOOTSTAGE and CONFIG_BOOTSTAGE_REPORT. This should get |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 308 | you are helpful overall snapshot of the boot time. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 309 | |
| 310 | 2. Build U-Boot with tracing and run it. Note the difference in boot time |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 311 | (it is common for tracing to add 10% to the time) |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 312 | |
Heinrich Schuchardt | 8963f19 | 2021-11-16 18:38:47 +0100 | [diff] [blame] | 313 | 3. Collect the trace information as described above. Use this to find where |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 314 | all the time is being spent. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 315 | |
Heinrich Schuchardt | 8963f19 | 2021-11-16 18:38:47 +0100 | [diff] [blame] | 316 | 4. Take a look at that code and see if you can optimize it. Perhaps it is |
| 317 | possible to speed up the initialization of a device, or remove an unused |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 318 | feature. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 319 | |
| 320 | 5. Rebuild, run and collect again. Compare your results. |
| 321 | |
| 322 | 6. Keep going until you run out of steam, or your boot is fast enough. |
| 323 | |
| 324 | |
| 325 | Configuring Trace |
| 326 | ----------------- |
| 327 | |
| 328 | There are a few parameters in the code that you may want to consider. |
| 329 | There is a function call depth limit (set to 15 by default). When the |
| 330 | stack depth goes above this then no tracing information is recorded. |
| 331 | The maximum depth reached is recorded and displayed by the 'trace stats' |
| 332 | command. |
| 333 | |
| 334 | |
| 335 | Future Work |
| 336 | ----------- |
| 337 | |
| 338 | Tracing could be a little tidier in some areas, for example providing |
| 339 | run-time configuration options for trace. |
| 340 | |
| 341 | Some other features that might be useful: |
| 342 | |
| 343 | - Trace filter to select which functions are recorded |
| 344 | - Sample-based profiling using a timer interrupt |
| 345 | - Better control over trace depth |
| 346 | - Compression of trace information |
| 347 | |
| 348 | |
| 349 | Simon Glass <sjg@chromium.org> |
| 350 | April 2013 |