Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 1 | .. SPDX-License-Identifier: GPL-2.0+ |
| 2 | .. Copyright (c) 2013 The Chromium OS Authors. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 3 | |
| 4 | Tracing in U-Boot |
| 5 | ================= |
| 6 | |
Heinrich Schuchardt | 8963f19 | 2021-11-16 18:38:47 +0100 | [diff] [blame] | 7 | U-Boot supports a simple tracing feature which allows a record of execution |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 8 | to be collected and sent to a host machine for analysis. At present the |
| 9 | main use for this is to profile boot time. |
| 10 | |
| 11 | |
| 12 | Overview |
| 13 | -------- |
| 14 | |
| 15 | The trace feature uses GCC's instrument-functions feature to trace all |
| 16 | function entry/exit points. These are then recorded in a memory buffer. |
| 17 | The memory buffer can be saved to the host over a network link using |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 18 | tftpput or by writing to an attached storage device such as MMC. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 19 | |
| 20 | On the host, the file is first converted with a tool called 'proftool', |
| 21 | which extracts useful information from it. The resulting trace output |
| 22 | resembles that emitted by Linux's ftrace feature, so can be visually |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 23 | displayed by kernelshark (see kernelshark_) and used with |
| 24 | 'trace-cmd report' (see trace_cmd_). |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 25 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 26 | It is also possible to produce a flame graph for use with flamegraph.pl |
| 27 | (see flamegraph_pl_). |
| 28 | |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 29 | |
| 30 | Quick-start using Sandbox |
| 31 | ------------------------- |
| 32 | |
| 33 | Sandbox is a build of U-Boot that can run under Linux so it is a convenient |
| 34 | way of trying out tracing before you use it on your actual board. To do |
| 35 | this, follow these steps: |
| 36 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 37 | Add the following to `config/sandbox_defconfig`: |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 38 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 39 | .. code-block:: c |
| 40 | |
Simon Glass | 84b7711 | 2021-11-24 09:26:39 -0700 | [diff] [blame] | 41 | CONFIG_TRACE=y |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 42 | |
| 43 | Build sandbox U-Boot with tracing enabled: |
| 44 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 45 | .. code-block:: console |
| 46 | |
| 47 | $ make FTRACE=1 O=sandbox sandbox_config |
| 48 | $ make FTRACE=1 O=sandbox |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 49 | |
| 50 | Run sandbox, wait for a bit of trace information to appear, and then capture |
| 51 | a trace: |
| 52 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 53 | .. code-block:: console |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 54 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 55 | $ ./sandbox/u-boot |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 56 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 57 | U-Boot 2013.04-rc2-00100-ga72fcef (Apr 17 2013 - 19:25:24) |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 58 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 59 | DRAM: 128 MiB |
| 60 | trace: enabled |
| 61 | Using default environment |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 62 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 63 | In: serial |
| 64 | Out: serial |
| 65 | Err: serial |
| 66 | =>trace stats |
| 67 | 671,406 function sites |
| 68 | 69,712 function calls |
| 69 | 0 untracked function calls |
| 70 | 73,373 traced function calls |
| 71 | 16 maximum observed call depth |
| 72 | 15 call depth limit |
| 73 | 66,491 calls not traced due to depth |
| 74 | =>trace stats |
| 75 | 671,406 function sites |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 76 | 1,279,450 function calls |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 77 | 0 untracked function calls |
| 78 | 950,490 traced function calls (333217 dropped due to overflow) |
| 79 | 16 maximum observed call depth |
| 80 | 15 call depth limit |
| 81 | 1,275,767 calls not traced due to depth |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 82 | =>trace calls 1000000 e00000 |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 83 | Call list dumped to 00000000, size 0xae0a40 |
| 84 | =>print |
| 85 | baudrate=115200 |
| 86 | profbase=0 |
| 87 | profoffset=ae0a40 |
| 88 | profsize=e00000 |
| 89 | stderr=serial |
| 90 | stdin=serial |
| 91 | stdout=serial |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 92 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 93 | Environment size: 117/8188 bytes |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 94 | =>host save hostfs - 1000000 trace ${profoffset} |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 95 | 11405888 bytes written in 10 ms (1.1 GiB/s) |
| 96 | =>reset |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 97 | |
| 98 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 99 | Then run proftool to convert the trace information to ftrace format |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 100 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 101 | .. code-block:: console |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 102 | |
Pavel Skripkin | cdfcfe4 | 2023-04-12 22:05:31 +0300 | [diff] [blame] | 103 | $ ./sandbox/tools/proftool -m sandbox/System.map -t trace dump-ftrace -o trace.dat |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 104 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 105 | Finally run kernelshark to display it (note it only works with `.dat` files!): |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 106 | |
| 107 | .. code-block:: console |
| 108 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 109 | $ kernelshark trace.dat |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 110 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 111 | Using this tool you can view the trace records and see the timestamp for each |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 112 | function. |
| 113 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 114 | .. image:: pics/kernelshark.png |
| 115 | :width: 800 |
| 116 | :alt: Kernelshark showing function-trace records |
| 117 | |
| 118 | |
| 119 | To see the records on the console, use trace-cmd: |
| 120 | |
| 121 | .. code-block:: console |
| 122 | |
| 123 | $ trace-cmd report trace.dat | less |
| 124 | cpus=1 |
| 125 | u-boot-1 [000] 3.116364: function: initf_malloc |
| 126 | u-boot-1 [000] 3.116375: function: initf_malloc |
| 127 | u-boot-1 [000] 3.116386: function: initf_bootstage |
| 128 | u-boot-1 [000] 3.116396: function: bootstage_init |
| 129 | u-boot-1 [000] 3.116408: function: malloc |
| 130 | u-boot-1 [000] 3.116418: function: malloc_simple |
| 131 | u-boot-1 [000] 3.116429: function: alloc_simple |
| 132 | u-boot-1 [000] 3.116441: function: alloc_simple |
| 133 | u-boot-1 [000] 3.116449: function: malloc_simple |
| 134 | u-boot-1 [000] 3.116457: function: malloc |
| 135 | |
| 136 | Note that `pytimechart` is obsolete so cannot be used anymore. |
| 137 | |
| 138 | There is a -f option available to select a function graph: |
| 139 | |
| 140 | .. code-block:: console |
| 141 | |
Puhan Zhou | 10af36c | 2023-08-13 13:16:19 +0800 | [diff] [blame] | 142 | $ ./sandbox/tools/proftool -m sandbox/System.map -t trace -f funcgraph dump-ftrace -o trace.dat |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 143 | |
| 144 | Again, you can use kernelshark or trace-cmd to look at the output. In this case |
| 145 | you will see the time taken by each function shown against its exit record. |
| 146 | |
| 147 | .. image:: pics/kernelshark_fg.png |
| 148 | :width: 800 |
| 149 | :alt: Kernelshark showing function-graph records |
| 150 | |
| 151 | .. code-block:: console |
| 152 | |
| 153 | $ trace-cmd report trace.dat | less |
| 154 | cpus=1 |
| 155 | u-boot-1 [000] 3.116364: funcgraph_entry: 0.011 us | initf_malloc(); |
| 156 | u-boot-1 [000] 3.116386: funcgraph_entry: | initf_bootstage() { |
| 157 | u-boot-1 [000] 3.116396: funcgraph_entry: | bootstage_init() { |
| 158 | u-boot-1 [000] 3.116408: funcgraph_entry: | malloc() { |
| 159 | u-boot-1 [000] 3.116418: funcgraph_entry: | malloc_simple() { |
| 160 | u-boot-1 [000] 3.116429: funcgraph_entry: 0.012 us | alloc_simple(); |
| 161 | u-boot-1 [000] 3.116449: funcgraph_exit: 0.031 us | } |
| 162 | u-boot-1 [000] 3.116457: funcgraph_exit: 0.049 us | } |
| 163 | u-boot-1 [000] 3.116466: funcgraph_entry: 0.063 us | memset(); |
| 164 | u-boot-1 [000] 3.116539: funcgraph_exit: 0.143 us | } |
| 165 | |
Jerome Forissier | 967566b | 2024-12-13 13:45:38 +0100 | [diff] [blame^] | 166 | The `trace wipe` command may be used to clear the trace buffer. It leaves |
| 167 | tracing in its current enable state. This command is convenient when tracing a |
| 168 | single command, for example: |
| 169 | |
| 170 | .. code-block:: console |
| 171 | |
| 172 | => trace pause; trace wipe |
| 173 | => trace resume; dhcp; trace pause |
| 174 | => trace stats |
| 175 | ... |
| 176 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 177 | Flame graph |
| 178 | ----------- |
| 179 | |
| 180 | Some simple flame graph options are available as well, using the dump-flamegraph |
| 181 | command: |
| 182 | |
| 183 | .. code-block:: console |
| 184 | |
Puhan Zhou | 10af36c | 2023-08-13 13:16:19 +0800 | [diff] [blame] | 185 | $ ./sandbox/tools/proftool -m sandbox/System.map -t trace dump-flamegraph -o trace.fg |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 186 | $ flamegraph.pl trace.fg >trace.svg |
| 187 | |
| 188 | You can load the .svg file into a viewer. If you use Chrome (and some other |
| 189 | programs) you can click around and zoom in and out. |
| 190 | |
| 191 | .. image:: pics/flamegraph.png |
| 192 | :width: 800 |
| 193 | :alt: Chrome showing the flamegraph.pl output |
| 194 | |
| 195 | .. image:: pics/flamegraph_zoom.png |
| 196 | :width: 800 |
| 197 | :alt: Chrome showing zooming into the flamegraph.pl output |
| 198 | |
| 199 | |
| 200 | A timing variant is also available, which gives an idea of how much time is |
| 201 | spend in each call stack: |
| 202 | |
| 203 | .. code-block:: console |
| 204 | |
Puhan Zhou | 10af36c | 2023-08-13 13:16:19 +0800 | [diff] [blame] | 205 | $ ./sandbox/tools/proftool -m sandbox/System.map -t trace dump-flamegraph -f timing -o trace.fg |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 206 | $ flamegraph.pl trace.fg >trace.svg |
| 207 | |
| 208 | Note that trace collection does slow down execution so the timings will be |
| 209 | inflated. They should be used to guide optimisation. For accurate boot timings, |
| 210 | use bootstage. |
| 211 | |
| 212 | .. image:: pics/flamegraph_timing.png |
| 213 | :width: 800 |
| 214 | :alt: Chrome showing flamegraph.pl output with timing |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 215 | |
| 216 | CONFIG Options |
| 217 | -------------- |
| 218 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 219 | CONFIG_TRACE |
| 220 | Enables the trace feature in U-Boot. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 221 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 222 | CONFIG_CMD_TRACE |
| 223 | Enables the trace command. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 224 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 225 | CONFIG_TRACE_BUFFER_SIZE |
| 226 | Size of trace buffer to allocate for U-Boot. This buffer is |
| 227 | used after relocation, as a place to put function tracing |
| 228 | information. The address of the buffer is determined by |
| 229 | the relocation code. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 230 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 231 | CONFIG_TRACE_EARLY |
| 232 | Define this to start tracing early, before relocation. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 233 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 234 | CONFIG_TRACE_EARLY_SIZE |
| 235 | Size of 'early' trace buffer. Before U-Boot has relocated |
| 236 | it doesn't have a proper trace buffer. On many boards |
| 237 | you can define an area of memory to use for the trace |
| 238 | buffer until the 'real' trace buffer is available after |
| 239 | relocation. The contents of this buffer are then copied to |
| 240 | the real buffer. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 241 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 242 | CONFIG_TRACE_EARLY_ADDR |
| 243 | Address of early trace buffer |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 244 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 245 | CONFIG_TRACE_CALL_DEPTH_LIMIT |
| 246 | Sets the limit on trace call-depth. For a broad view, 10 is typically |
| 247 | sufficient. Setting this too large creates enormous traces and distorts |
| 248 | the overall timing considerable. |
| 249 | |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 250 | |
| 251 | Building U-Boot with Tracing Enabled |
| 252 | ------------------------------------ |
| 253 | |
| 254 | Pass 'FTRACE=1' to the U-Boot Makefile to actually instrument the code. |
| 255 | This is kept as a separate option so that it is easy to enable/disable |
| 256 | instrumenting from the command line instead of having to change board |
| 257 | config files. |
| 258 | |
| 259 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 260 | Board requirements |
| 261 | ------------------ |
| 262 | |
| 263 | Trace data collection relies on a microsecond timer, accessed through |
| 264 | `timer_get_us()`. So the first thing you should do is make sure that |
| 265 | this produces sensible results for your board. Suitable sources for |
| 266 | this timer include high resolution timers, PWMs or profile timers if |
| 267 | available. Most modern SOCs have a suitable timer for this. |
| 268 | |
| 269 | See `add_ftrace()` for where `timer_get_us()` is called. The `notrace` |
| 270 | attribute must be used on each function called by `timer_get_us()` since |
| 271 | recursive calls to `add_ftrace()` will cause a fault:: |
| 272 | |
| 273 | trace: recursion detected, disabling |
| 274 | |
| 275 | You cannot use driver model to obtain the microsecond timer, since tracing |
| 276 | may be enabled before driver model is set up. Instead, provide a low-level |
| 277 | function which accesses the timer, setting it up if needed. |
| 278 | |
| 279 | |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 280 | Collecting Trace Data |
| 281 | --------------------- |
| 282 | |
| 283 | When you run U-Boot on your board it will collect trace data up to the |
| 284 | limit of the trace buffer size you have specified. Once that is exhausted |
| 285 | no more data will be collected. |
| 286 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 287 | Collecting trace data affects execution time and performance. You |
Heinrich Schuchardt | 8963f19 | 2021-11-16 18:38:47 +0100 | [diff] [blame] | 288 | will notice this particularly with trivial functions - the overhead of |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 289 | recording their execution may even exceed their normal execution time. |
| 290 | In practice this doesn't matter much so long as you are aware of the |
Heinrich Schuchardt | 8963f19 | 2021-11-16 18:38:47 +0100 | [diff] [blame] | 291 | effect. Once you have done your optimizations, turn off tracing before |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 292 | doing end-to-end timing using bootstage. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 293 | |
| 294 | The best time to start tracing is right at the beginning of U-Boot. The |
| 295 | best time to stop tracing is right at the end. In practice it is hard |
| 296 | to achieve these ideals. |
| 297 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 298 | This implementation enables tracing early in `board_init_r()`, or |
| 299 | `board_init_f()` when `TRACE_EARLY` is enabled. This means |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 300 | that it captures most of the board init process, missing only the |
| 301 | early architecture-specific init. However, it also misses the entire |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 302 | SPL stage if there is one. At present tracing is not supported in SPL. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 303 | |
| 304 | U-Boot typically ends with a 'bootm' command which loads and runs an |
| 305 | OS. There is useful trace data in the execution of that bootm |
| 306 | command. Therefore this implementation provides a way to collect trace |
| 307 | data after bootm has finished processing, but just before it jumps to |
| 308 | the OS. In practical terms, U-Boot runs the 'fakegocmd' environment |
| 309 | variable at this point. This variable should have a short script which |
| 310 | collects the trace data and writes it somewhere. |
| 311 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 312 | Controlling the trace |
| 313 | --------------------- |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 314 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 315 | U-Boot provides a command-line interface to the trace system for controlling |
| 316 | tracing and accessing the trace data. See :doc:`../usage/cmd/trace`. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 317 | |
| 318 | |
| 319 | Environment Variables |
| 320 | --------------------- |
| 321 | |
| 322 | The following are used: |
| 323 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 324 | profbase |
| 325 | Base address of trace output buffer |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 326 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 327 | profoffset |
| 328 | Offset of first unwritten byte in trace output buffer |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 329 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 330 | profsize |
| 331 | Size of trace output buffer |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 332 | |
| 333 | All of these are set by the 'trace calls' command. |
| 334 | |
| 335 | These variables keep track of the amount of data written to the trace |
| 336 | output buffer by the 'trace' command. The trace commands which write data |
| 337 | to the output buffer can use these to specify the buffer to write to, and |
| 338 | update profoffset each time. This allows successive commands to append data |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 339 | to the same buffer, for example:: |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 340 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 341 | => trace funclist 10000 e00000 |
| 342 | => trace calls |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 343 | |
| 344 | (the latter command appends more data to the buffer). |
| 345 | |
| 346 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 347 | fakegocmd |
| 348 | Specifies commands to run just before booting the OS. This |
| 349 | is a useful time to write the trace data to the host for |
| 350 | processing. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 351 | |
| 352 | |
| 353 | Writing Out Trace Data |
| 354 | ---------------------- |
| 355 | |
| 356 | Once the trace data is in an output buffer in memory there are various ways |
| 357 | to transmit it to the host. Notably you can use tftput to send the data |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 358 | over a network link:: |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 359 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 360 | fakegocmd=trace pause; usb start; set autoload n; bootp; |
| 361 | trace calls 10000000 1000000; |
| 362 | tftpput ${profbase} ${profoffset} 192.168.1.4:/tftpboot/calls |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 363 | |
| 364 | This starts up USB (to talk to an attached USB Ethernet dongle), writes |
| 365 | a trace log to address 10000000 and sends it to a host machine using |
| 366 | TFTP. After this, U-Boot will boot the OS normally, albeit a little |
| 367 | later. |
| 368 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 369 | For a filesystem you may do something like:: |
| 370 | |
| 371 | trace calls 10000000 1000000; |
| 372 | save mmc 1:1 10000000 /trace ${profoffset} |
| 373 | |
| 374 | The trace buffer format is internal to the trace system. It consists of a |
| 375 | header, a call count for each function site, followed by a list of trace |
| 376 | records, once for each function call. |
| 377 | |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 378 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 379 | Converting Trace Output Data (proftool) |
| 380 | --------------------------------------- |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 381 | |
| 382 | The trace output data is kept in a binary format which is not documented |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 383 | here. See the `trace.h` header file if you are interested. To convert it into |
| 384 | something useful, you can use proftool. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 385 | |
| 386 | This tool must be given the U-Boot map file and the trace data received |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 387 | from running that U-Boot. It produces a binary output file. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 388 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 389 | It is also possible to provide a configuration file to indicate which functions |
| 390 | should be included or dropped during conversion. This file consists of lines |
| 391 | like:: |
| 392 | |
| 393 | include-func <regex> |
| 394 | exclude-func <regex> |
| 395 | |
| 396 | where <regex> is a regular expression matched against function names. It |
| 397 | allows some functions to be dropped from the trace when producing ftrace |
| 398 | records. |
| 399 | |
| 400 | Options: |
| 401 | |
| 402 | -c <config_file> |
| 403 | Specify the optional configuration file, to control which functions are |
| 404 | included in the output. |
| 405 | |
| 406 | -f <format> |
| 407 | Specifies the format to use (see below) |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 408 | |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 409 | -m <map_file> |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 410 | Specify U-Boot map file (`System.map`) |
| 411 | |
| 412 | -o <output file> |
| 413 | Specify the output filename |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 414 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 415 | -t <trace_file> |
| 416 | Specify trace file, the data saved from U-Boot |
| 417 | |
| 418 | -v <0-4> |
| 419 | Specify the verbosity, where 0 is the minimum and 4 is for debugging. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 420 | |
| 421 | Commands: |
| 422 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 423 | dump-ftrace: |
| 424 | Write a binary dump of the file in Linux ftrace format. Two options are |
| 425 | available: |
| 426 | |
| 427 | function |
| 428 | write function-call records (caller/callee) |
| 429 | |
| 430 | funcgraph |
| 431 | write function entry/exit records (graph) |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 432 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 433 | This format can be used with kernelshark_ and trace_cmd_. |
| 434 | |
| 435 | dump-flamegraph |
| 436 | Write a list of stack records useful for producing a flame graph. Two |
| 437 | options are available: |
| 438 | |
| 439 | calls |
| 440 | create a flamegraph of stack frames |
| 441 | |
| 442 | timing |
| 443 | create a flamegraph of microseconds for each stack frame |
| 444 | |
| 445 | This format can be used with flamegraph_pl_. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 446 | |
| 447 | Viewing the Trace Data |
| 448 | ---------------------- |
| 449 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 450 | You can use kernelshark_ for a GUI, but note that version 2.0.x was broken. If |
| 451 | you have that version you could try building it from source. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 452 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 453 | The file must have a .dat extension or it is ignored. The program has terse |
| 454 | user interface but is very convenient for viewing U-Boot profile information. |
| 455 | |
| 456 | Also available is trace_cmd_ which provides a command-line interface. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 457 | |
| 458 | Workflow Suggestions |
| 459 | -------------------- |
| 460 | |
| 461 | The following suggestions may be helpful if you are trying to reduce boot |
| 462 | time: |
| 463 | |
| 464 | 1. Enable CONFIG_BOOTSTAGE and CONFIG_BOOTSTAGE_REPORT. This should get |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 465 | you are helpful overall snapshot of the boot time. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 466 | |
| 467 | 2. Build U-Boot with tracing and run it. Note the difference in boot time |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 468 | (it is common for tracing to add 10% to the time) |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 469 | |
Heinrich Schuchardt | 8963f19 | 2021-11-16 18:38:47 +0100 | [diff] [blame] | 470 | 3. Collect the trace information as described above. Use this to find where |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 471 | all the time is being spent. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 472 | |
Heinrich Schuchardt | 8963f19 | 2021-11-16 18:38:47 +0100 | [diff] [blame] | 473 | 4. Take a look at that code and see if you can optimize it. Perhaps it is |
| 474 | possible to speed up the initialization of a device, or remove an unused |
Heinrich Schuchardt | cc26f74 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 475 | feature. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 476 | |
| 477 | 5. Rebuild, run and collect again. Compare your results. |
| 478 | |
| 479 | 6. Keep going until you run out of steam, or your boot is fast enough. |
| 480 | |
| 481 | |
| 482 | Configuring Trace |
| 483 | ----------------- |
| 484 | |
| 485 | There are a few parameters in the code that you may want to consider. |
| 486 | There is a function call depth limit (set to 15 by default). When the |
| 487 | stack depth goes above this then no tracing information is recorded. |
| 488 | The maximum depth reached is recorded and displayed by the 'trace stats' |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 489 | command. While it might be tempting to set the depth limit quite high, this |
| 490 | can dramatically increase the size of the trace output as well as the execution |
| 491 | time. |
Simon Glass | c44b800 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 492 | |
| 493 | |
| 494 | Future Work |
| 495 | ----------- |
| 496 | |
| 497 | Tracing could be a little tidier in some areas, for example providing |
| 498 | run-time configuration options for trace. |
| 499 | |
| 500 | Some other features that might be useful: |
| 501 | |
| 502 | - Trace filter to select which functions are recorded |
| 503 | - Sample-based profiling using a timer interrupt |
| 504 | - Better control over trace depth |
| 505 | - Compression of trace information |
| 506 | |
| 507 | |
Simon Glass | b9d1997 | 2023-01-15 14:16:01 -0700 | [diff] [blame] | 508 | .. sectionauthor:: Simon Glass <sjg@chromium.org> |
| 509 | .. April 2013 |
| 510 | .. Updated January 2023 |
| 511 | |
| 512 | .. _kernelshark: https://kernelshark.org/ |
| 513 | .. _trace_cmd: https://www.trace-cmd.org/ |
| 514 | .. _flamegraph_pl: https://github.com/brendangregg/FlameGraph/blob/master/flamegraph.pl |