Firmware Design
===============

Trusted Firmware-A (TF-A) implements a subset of the Trusted Board Boot
Requirements (TBBR) Platform Design Document (PDD) for Arm reference
platforms.

The TBB sequence starts when the platform is powered on and runs up
to the stage where it hands off control to firmware running in the normal
world in DRAM. This is the cold boot path.

TF-A also implements the `PSCI`_ as a runtime service. PSCI is the interface
from normal world software to firmware implementing power management use-cases
(for example, secondary CPU boot, hotplug and idle). Normal world software can
access TF-A runtime services via the Arm SMC (Secure Monitor Call) instruction.
The SMC instruction must be used as mandated by the SMC Calling Convention
(`SMCCC`_).

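As an illustration of what the calling convention fixes, the sketch below
builds and decodes a 32-bit SMC function identifier. The bit positions follow
the SMCCC; the macro and helper names are hypothetical and are not TF-A
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field layout of a 32-bit SMC function identifier, as defined by the SMCCC. */
#define SMC_FAST_CALL_BIT  (UINT32_C(1) << 31)  /* 1 = fast call, 0 = yielding call */
#define SMC_64_BIT         (UINT32_C(1) << 30)  /* 1 = SMC64, 0 = SMC32 */
#define SMC_OEN_SHIFT      24                   /* owning entity number, bits [29:24] */
#define SMC_OEN_MASK       UINT32_C(0x3f)
#define SMC_FUNC_NUM_MASK  UINT32_C(0xffff)     /* function number, bits [15:0] */
#define SMC_OEN_STANDARD   UINT32_C(4)          /* standard secure services, e.g. PSCI */

/* Hypothetical helper: build the function ID of a fast call. */
static inline uint32_t smc_fast_call_id(bool is_smc64, uint32_t oen,
                                        uint32_t func_num)
{
    assert(oen <= SMC_OEN_MASK);
    assert(func_num <= SMC_FUNC_NUM_MASK);

    return SMC_FAST_CALL_BIT | (is_smc64 ? SMC_64_BIT : 0u) |
           (oen << SMC_OEN_SHIFT) | func_num;
}

/* Extract the owning entity number from a function ID. */
static inline uint32_t smc_oen(uint32_t fid)
{
    return (fid >> SMC_OEN_SHIFT) & SMC_OEN_MASK;
}

/* Report whether a function ID denotes a fast call. */
static inline bool smc_is_fast_call(uint32_t fid)
{
    return (fid & SMC_FAST_CALL_BIT) != 0u;
}
```

With these definitions, ``smc_fast_call_id(true, SMC_OEN_STANDARD, 3)``
evaluates to ``0xC4000003``, which is the SMC64 function ID of PSCI
``CPU_ON``.
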
TF-A implements a framework for configuring and managing interrupts generated
in either security state. The details of the interrupt management framework
and its design can be found in :ref:`Interrupt Management Framework`.

TF-A also implements a library for setting up and managing the translation
tables. The details of this library can be found in
:ref:`Translation (XLAT) Tables Library`.

TF-A can be built to support either AArch64 or AArch32 execution state.

.. note::
   The descriptions in this chapter are for the Arm TrustZone architecture.
   For changes to the firmware design for the `Arm Confidential Compute
   Architecture (Arm CCA)`_ please refer to the chapter :ref:`Realm Management
   Extension (RME)`.

Cold boot
---------

The cold boot path starts when the platform is physically turned on. If
``COLD_BOOT_SINGLE_CPU=0``, one of the CPUs released from reset is chosen as the
primary CPU, and the remaining CPUs are considered secondary CPUs. The primary
CPU is chosen through platform-specific means. The cold boot path is mainly
executed by the primary CPU, other than essential CPU initialization executed by
all CPUs. The secondary CPUs are kept in a safe platform-specific state until
the primary CPU has performed enough initialization to boot them.

Refer to the :ref:`CPU Reset` section for more information on the effect of the
``COLD_BOOT_SINGLE_CPU`` platform build option.

The cold boot path in this implementation of TF-A depends on the execution
state. For AArch64, it is divided into five steps (in order of execution):

- Boot Loader stage 1 (BL1) *AP Trusted ROM*
- Boot Loader stage 2 (BL2) *Trusted Boot Firmware*
- Boot Loader stage 3-1 (BL31) *EL3 Runtime Software*
- Boot Loader stage 3-2 (BL32) *Secure-EL1 Payload* (optional)
- Boot Loader stage 3-3 (BL33) *Non-trusted Firmware*

For AArch32, it is divided into four steps (in order of execution):

- Boot Loader stage 1 (BL1) *AP Trusted ROM*
- Boot Loader stage 2 (BL2) *Trusted Boot Firmware*
- Boot Loader stage 3-2 (BL32) *EL3 Runtime Software*
- Boot Loader stage 3-3 (BL33) *Non-trusted Firmware*

Arm development platforms (Fixed Virtual Platforms (FVPs) and Juno) implement a
combination of the following types of memory regions. Each bootloader stage uses
one or more of these memory regions.

- Regions accessible from both non-secure and secure states. For example,
  non-trusted SRAM, ROM and DRAM.
- Regions accessible from only the secure state. For example, trusted SRAM and
  ROM. The FVPs also implement the trusted DRAM which is statically
  configured. Additionally, the Base FVPs and Juno development platform
  configure the TrustZone Controller (TZC) to create a region in the DRAM
  which is accessible only from the secure state.

The sections below provide the following details:

- dynamic configuration of Boot Loader stages
- initialization and execution of the first three stages during cold boot
- specification of the EL3 Runtime Software (BL31 for AArch64 and BL32 for
  AArch32) entrypoint requirements for use by alternative Trusted Boot
  Firmware in place of the provided BL1 and BL2

Dynamic Configuration during cold boot
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Each of the Boot Loader stages may be dynamically configured if required by the
platform. The Boot Loader stage may optionally specify a firmware
configuration file and/or hardware configuration file as listed below:

- FW_CONFIG - The firmware configuration file. Holds properties shared across
  all BLx images.
  An example is the "dtb-registry" node, which contains the information about
  the other device tree configurations (load-address, size, image_id).
- HW_CONFIG - The hardware configuration file. Can be shared by all Boot Loader
  stages and also by the Normal World Rich OS.
- TB_FW_CONFIG - Trusted Boot Firmware configuration file. Shared between BL1
  and BL2.
- SOC_FW_CONFIG - SoC Firmware configuration file. Used by BL31.
- TOS_FW_CONFIG - Trusted OS Firmware configuration file. Used by Trusted OS
  (BL32).
- NT_FW_CONFIG - Non Trusted Firmware configuration file. Used by Non-trusted
  firmware (BL33).

The Arm development platforms use the Flattened Device Tree format for the
dynamic configuration files.

Each Boot Loader stage can pass up to 4 arguments via registers to the next
stage. BL2 passes the list of the next images to execute to the *EL3 Runtime
Software* (BL31 for AArch64 and BL32 for AArch32) via ``arg0``. All the other
arguments are platform defined. The Arm development platforms use the following
convention:

- BL1 passes the address of a ``meminfo_t`` structure to BL2 via ``arg1``. This
  structure contains the memory layout available to BL2.
- When dynamic configuration files are present, the firmware configuration for
  the next Boot Loader stage is populated in the first available argument and
  the generic hardware configuration is passed in the next available argument.
  For example,

  - FW_CONFIG is loaded by BL1, then its address is passed in ``arg0`` to BL2.
  - The TB_FW_CONFIG address is retrieved by BL2 from the FW_CONFIG device
    tree.
  - If HW_CONFIG is loaded by BL1, then its address is passed in ``arg2`` to
    BL2. Note, ``arg1`` is already used for ``meminfo_t``.
  - If SOC_FW_CONFIG is loaded by BL2, then its address is passed in ``arg1``
    to BL31. Note, ``arg0`` is used to pass the list of executable images.
  - Similarly, if HW_CONFIG is loaded by BL1 or BL2, then its address is
    passed in ``arg2`` to BL31.
  - For other BL3x images, if the firmware configuration file is loaded by
    BL2, then its address is passed in ``arg0`` and if HW_CONFIG is loaded
    then its address is passed in ``arg1``.
  - If ``SPMC_AT_EL3`` is enabled, the BL32 image base, size and max limit are
    populated in the entry point information, since there is no platform
    function to retrieve these in generic code. ``arg2``, ``arg3`` and
    ``arg4`` are used because the generic code uses ``arg1`` for stashing the
    SP manifest size. The SPMC setup uses these arguments to update the SP
    manifest with the SP's actual base address and size.
  - On the Arm FVP platform, the FW_CONFIG address is passed in ``arg1`` to
    BL31/SP_MIN, and the SOC_FW_CONFIG and HW_CONFIG details are retrieved
    from the FW_CONFIG device tree.

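The register convention above can be summarised in a short sketch. The
``handoff_args`` structure and both helper functions are hypothetical
simplifications written for this illustration; they are not the TF-A entry
point structures. An address of 0 stands for "file not loaded".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the four hand-off registers. */
struct handoff_args {
    uint64_t arg0;
    uint64_t arg1;
    uint64_t arg2;
    uint64_t arg3;
};

/* BL1 -> BL2 on Arm development platforms, following the list above. */
static inline struct handoff_args bl1_args_for_bl2(uint64_t fw_config_addr,
                                                   uint64_t meminfo_addr,
                                                   uint64_t hw_config_addr)
{
    struct handoff_args args = { 0 };

    /* BL1 always describes the memory available to BL2. */
    assert(meminfo_addr != 0u);

    args.arg0 = fw_config_addr;  /* FW_CONFIG loaded by BL1 */
    args.arg1 = meminfo_addr;    /* meminfo_t structure */
    args.arg2 = hw_config_addr;  /* HW_CONFIG, if loaded by BL1 */
    return args;
}

/* BL2 -> BL31: arg0 is reserved for the list of executable images. */
static inline struct handoff_args bl2_args_for_bl31(uint64_t image_list_addr,
                                                    uint64_t soc_fw_config_addr,
                                                    uint64_t hw_config_addr)
{
    struct handoff_args args = { 0 };

    assert(image_list_addr != 0u);

    args.arg0 = image_list_addr;     /* list of images to execute */
    args.arg1 = soc_fw_config_addr;  /* SOC_FW_CONFIG, if loaded by BL2 */
    args.arg2 = hw_config_addr;      /* HW_CONFIG, if loaded by BL1 or BL2 */
    return args;
}
```

The sketch deliberately leaves out the FVP and ``SPMC_AT_EL3`` variations
listed above, which reuse the same registers differently.
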
BL1
~~~

This stage begins execution from the platform's reset vector at EL3. The reset
address is platform dependent but it is usually located in a Trusted ROM area.
The BL1 data section is copied to trusted SRAM at runtime.

On the Arm development platforms, BL1 code starts execution from the reset
vector defined by the constant ``BL1_RO_BASE``. The BL1 data section is copied
to the top of trusted SRAM as defined by the constant ``BL1_RW_BASE``.

The functionality implemented by this stage is as follows.

Determination of boot path
^^^^^^^^^^^^^^^^^^^^^^^^^^

Whenever a CPU is released from reset, BL1 needs to distinguish between a warm
boot and a cold boot. This is done using platform-specific mechanisms (see the
``plat_get_my_entrypoint()`` function in the :ref:`Porting Guide`). In the case
of a warm boot, a CPU is expected to continue execution from a separate
entrypoint. In the case of a cold boot, the secondary CPUs are placed in a safe
platform-specific state (see the ``plat_secondary_cold_boot_setup()`` function in
the :ref:`Porting Guide`) while the primary CPU executes the remaining cold boot
path as described in the following sections.

This step only applies when ``PROGRAMMABLE_RESET_ADDRESS=0``. Refer to the
:ref:`CPU Reset` for more information on the effect of the
``PROGRAMMABLE_RESET_ADDRESS`` platform build option.

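The warm/cold decision described in this section can be modelled as follows.
This is a minimal sketch that assumes the contract stated in the Porting Guide
for ``plat_get_my_entrypoint()``: zero means cold boot, any other value is the
warm boot entrypoint. The mailbox variable and every ``sketch_`` name are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum boot_path { BOOT_PATH_COLD, BOOT_PATH_WARM };

/* Hypothetical mailbox; 0 means no warm boot entrypoint has been programmed. */
static uintptr_t sketch_warm_mailbox;

/* Stand-in for plat_get_my_entrypoint(): 0 on cold boot, else the entrypoint. */
static inline uintptr_t sketch_get_my_entrypoint(void)
{
    return sketch_warm_mailbox;
}

/* Classify the reset and, for a warm boot, report where to resume. */
static inline enum boot_path sketch_classify_boot(uintptr_t *warm_entrypoint)
{
    uintptr_t ep = sketch_get_my_entrypoint();

    assert(warm_entrypoint != NULL);

    if (ep == 0u) {
        return BOOT_PATH_COLD;
    }

    *warm_entrypoint = ep;
    return BOOT_PATH_WARM;
}
```

In the real firmware this check runs in early assembly code with the MMU and
caches disabled; the C form is only meant to show the decision itself.
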
Architectural initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

BL1 performs minimal architectural initialization as follows.

- Exception vectors

  BL1 sets up simple exception vectors for both synchronous and asynchronous
  exceptions. The default behavior upon receiving an exception is to populate
  a status code in the general purpose register ``X0/R0`` and call the
  ``plat_report_exception()`` function (see the :ref:`Porting Guide`). The
  status code is one of:

  For AArch64:

  ::

      0x0 : Synchronous exception from Current EL with SP_EL0
      0x1 : IRQ exception from Current EL with SP_EL0
      0x2 : FIQ exception from Current EL with SP_EL0
      0x3 : System Error exception from Current EL with SP_EL0
      0x4 : Synchronous exception from Current EL with SP_ELx
      0x5 : IRQ exception from Current EL with SP_ELx
      0x6 : FIQ exception from Current EL with SP_ELx
      0x7 : System Error exception from Current EL with SP_ELx
      0x8 : Synchronous exception from Lower EL using aarch64
      0x9 : IRQ exception from Lower EL using aarch64
      0xa : FIQ exception from Lower EL using aarch64
      0xb : System Error exception from Lower EL using aarch64
      0xc : Synchronous exception from Lower EL using aarch32
      0xd : IRQ exception from Lower EL using aarch32
      0xe : FIQ exception from Lower EL using aarch32
      0xf : System Error exception from Lower EL using aarch32

  For AArch32:

  ::

      0x10 : User mode
      0x11 : FIQ mode
      0x12 : IRQ mode
      0x13 : SVC mode
      0x16 : Monitor mode
      0x17 : Abort mode
      0x1a : Hypervisor mode
      0x1b : Undefined mode
      0x1f : System mode

  The ``plat_report_exception()`` implementation on the Arm FVP port programs
  the Versatile Express System LED register in the following format to
  indicate the occurrence of an unexpected exception:

  ::

      SYS_LED[0]   - Security state (Secure=0/Non-Secure=1)
      SYS_LED[2:1] - Exception Level (EL3=0x3, EL2=0x2, EL1=0x1, EL0=0x0)
                     For AArch32 it is always 0x0
      SYS_LED[7:3] - Exception Class (Sync/Async & origin). This is the value
                     of the status code

  A write to the LED register reflects in the System LEDs (S6LED0..7) in the
  CLCD window of the FVP.

  BL1 does not expect to receive any exceptions other than the SMC exception.
  For the latter, BL1 installs a simple stub. The stub expects to receive a
  limited set of SMC types (determined by their function IDs in the general
  purpose register ``X0/R0``):

  - ``BL1_SMC_RUN_IMAGE``: This SMC is raised by BL2 to make BL1 pass control
    to EL3 Runtime Software.
  - All SMCs listed in section "BL1 SMC Interface" in the :ref:`Firmware Update (FWU)`
    Design Guide are supported for AArch64 only. These SMCs are currently
    not supported when BL1 is built for AArch32.

  Any other SMC leads to an assertion failure.

- CPU initialization

  BL1 calls the ``reset_handler()`` function which in turn calls the CPU
  specific reset handler function (see the section: "CPU specific operations
  framework").

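To make the System LED layout reported by the FVP ``plat_report_exception()``
implementation concrete, the following sketch packs and unpacks the three
fields exactly as listed in the ``SYS_LED`` description above. The helper
names are hypothetical and the code does not touch any hardware register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SYS_LED_SS_SHIFT  0u  /* bit [0]: security state */
#define SYS_LED_EL_SHIFT  1u  /* bits [2:1]: exception level */
#define SYS_LED_EC_SHIFT  3u  /* bits [7:3]: exception class (status code) */

/* Pack the fields into the value that would be written to SYS_LED. */
static inline uint8_t sys_led_encode(bool non_secure, unsigned int el,
                                     unsigned int status_code)
{
    assert(el <= 3u);
    assert(status_code <= 0x1fu);

    return (uint8_t)(((non_secure ? 1u : 0u) << SYS_LED_SS_SHIFT) |
                     (el << SYS_LED_EL_SHIFT) |
                     (status_code << SYS_LED_EC_SHIFT));
}

/* Recover the exception level field from a SYS_LED value. */
static inline unsigned int sys_led_el(uint8_t led)
{
    return (led >> SYS_LED_EL_SHIFT) & 0x3u;
}

/* Recover the status code from a SYS_LED value. */
static inline unsigned int sys_led_status(uint8_t led)
{
    return (led >> SYS_LED_EC_SHIFT) & 0x1fu;
}
```

For example, a synchronous exception taken in secure EL3 with ``SP_ELx``
(status code ``0x4``) encodes as ``0x26``.
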
Platform initialization
^^^^^^^^^^^^^^^^^^^^^^^

On Arm platforms, BL1 performs the following platform initializations:

- Enable the Trusted Watchdog.
- Initialize the console.
- Configure the Interconnect to enable hardware coherency.
- Enable the MMU and map the memory it needs to access.
- Configure any required platform storage to load the next bootloader image
  (BL2).
- If the BL1 dynamic configuration file, ``TB_FW_CONFIG``, is available, then
  load it to the platform defined address and make it available to BL2 via
  ``arg0``.
- Configure the system timer and program ``CNTFRQ_EL0`` for use by the NS-BL1U
  and NS-BL2U firmware update images.

Firmware Update detection and execution
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After performing platform setup, BL1 common code calls
``bl1_plat_get_next_image_id()`` to determine if :ref:`Firmware Update (FWU)` is
required or to proceed with the normal boot process. If the platform code
returns ``BL2_IMAGE_ID`` then the normal boot sequence is executed as described
in the next section, else BL1 assumes that :ref:`Firmware Update (FWU)` is
required and execution passes to the first image in the
:ref:`Firmware Update (FWU)` process. In either case, BL1 retrieves a descriptor
of the next image by calling ``bl1_plat_get_image_desc()``. The image descriptor
contains an ``entry_point_info_t`` structure, which BL1 uses to initialize the
execution state of the next image.

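The selection logic described above can be sketched as below. Everything
prefixed with ``SKETCH_`` or ``sketch_`` is hypothetical: the image ID values,
the reduced descriptor (the real one carries an ``entry_point_info_t``) and
the entrypoint addresses are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BL2_IMAGE_ID      1u  /* assumed value of BL2_IMAGE_ID */
#define SKETCH_NS_BL1U_IMAGE_ID  2u  /* assumed ID of the first FWU image */

enum bl1_next_step { BL1_NORMAL_BOOT, BL1_FIRMWARE_UPDATE };

/* Reduced image descriptor holding only an entrypoint. */
struct sketch_image_desc {
    unsigned int image_id;
    uintptr_t entrypoint;
};

static const struct sketch_image_desc sketch_descs[] = {
    { SKETCH_BL2_IMAGE_ID,     (uintptr_t)0x04001000u },
    { SKETCH_NS_BL1U_IMAGE_ID, (uintptr_t)0x0beb8000u },
};

/* Any ID other than BL2 means that BL1 enters the firmware update flow. */
static inline enum bl1_next_step sketch_next_step(unsigned int image_id)
{
    return (image_id == SKETCH_BL2_IMAGE_ID) ? BL1_NORMAL_BOOT
                                             : BL1_FIRMWARE_UPDATE;
}

/* Stand-in for bl1_plat_get_image_desc(): NULL if the ID is unknown. */
static inline const struct sketch_image_desc *
sketch_get_image_desc(unsigned int image_id)
{
    for (size_t i = 0; i < sizeof(sketch_descs) / sizeof(sketch_descs[0]); i++) {
        if (sketch_descs[i].image_id == image_id) {
            return &sketch_descs[i];
        }
    }
    return NULL;
}

/* In either flow, the descriptor supplies the state for the next image. */
static inline uintptr_t sketch_next_entrypoint(unsigned int image_id)
{
    const struct sketch_image_desc *desc = sketch_get_image_desc(image_id);

    assert(desc != NULL);
    return desc->entrypoint;
}
```
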
BL2 image load and execution
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the normal boot flow, BL1 execution continues as follows:

#. BL1 prints the following string from the primary CPU to indicate successful
   execution of the BL1 stage:

   ::

       "Booting Trusted Firmware"

#. BL1 loads a BL2 raw binary image from platform storage, at a
   platform-specific base address. Prior to the load, BL1 invokes
   ``bl1_plat_handle_pre_image_load()`` which allows the platform to update or
   use the image information. If the BL2 image file is not present or if
   there is not enough free trusted SRAM the following error message is
   printed:

   ::

       "Failed to load BL2 firmware."

#. BL1 invokes ``bl1_plat_handle_post_image_load()`` which again is intended
   for platforms to take further action after image load. This function must
   populate the necessary arguments for BL2, which may also include the memory
   layout. Further description of the memory layout can be found later
   in this document.

#. BL1 passes control to the BL2 image at Secure EL1 (for AArch64) or at
   Secure SVC mode (for AArch32), starting from its load address.

BL2
~~~

BL1 loads and passes control to BL2 at Secure-EL1 (for AArch64) or at Secure
SVC mode (for AArch32). BL2 is linked against and loaded at a platform-specific
base address (more information can be found later in this document).
The functionality implemented by BL2 is as follows.

Architectural initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For AArch64, BL2 performs the minimal architectural initialization required
for subsequent stages of TF-A and normal world software. EL1 and EL0 are given
access to Floating Point and Advanced SIMD registers by setting the
``CPACR.FPEN`` bits.

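The effect of setting the ``FPEN`` bits can be shown on a plain 64-bit value.
In ``CPACR_EL1`` the field occupies bits [21:20], and the value ``0b11``
disables trapping of Floating Point and Advanced SIMD accesses at both EL0 and
EL1. The sketch below only computes the register value; the macro and function
names are hypothetical, and writing the system register itself is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPACR_FPEN_SHIFT    20
#define CPACR_FPEN_MASK     (UINT64_C(3) << CPACR_FPEN_SHIFT)
#define CPACR_FPEN_NO_TRAP  UINT64_C(3)  /* 0b11: no trapping at EL0 or EL1 */

static_assert(CPACR_FPEN_MASK == UINT64_C(0x300000),
              "FPEN occupies bits [21:20]");

/* Return a CPACR value with FP/SIMD access enabled for EL0 and EL1. */
static inline uint64_t cpacr_enable_fp(uint64_t cpacr)
{
    return (cpacr & ~CPACR_FPEN_MASK) |
           (CPACR_FPEN_NO_TRAP << CPACR_FPEN_SHIFT);
}

/* Report whether FP/SIMD accesses from EL0 and EL1 are left untrapped. */
static inline bool cpacr_fp_untrapped(uint64_t cpacr)
{
    return (cpacr & CPACR_FPEN_MASK) == CPACR_FPEN_MASK;
}
```
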
For AArch32, the minimal architectural initialization required for subsequent
stages of TF-A and normal world software is taken care of in BL1 as both BL1
and BL2 execute at PL1.

Platform initialization
^^^^^^^^^^^^^^^^^^^^^^^

On Arm platforms, BL2 performs the following platform initializations:

- Initialize the console.
- Configure any required platform storage to allow loading further bootloader
  images.
- Enable the MMU and map the memory it needs to access.
- Perform platform security setup to allow access to controlled components.
- Reserve some memory for passing information to the next bootloader image
  (EL3 Runtime Software) and populate it.
- Define the extents of memory available for loading each subsequent
  bootloader image.
- If BL1 has passed the TB_FW_CONFIG dynamic configuration file in ``arg0``,
  then parse it.

Image loading in BL2
^^^^^^^^^^^^^^^^^^^^

BL2 generic code loads the images based on the list of loadable images
provided by the platform. BL2 passes the list of executable images
provided by the platform to the next handover BL image.

The list of loadable images provided by the platform may also contain
dynamic configuration files. The files are loaded and can be parsed as
needed in the ``bl2_plat_handle_post_image_load()`` function. These
configuration files can be passed to next Boot Loader stages as arguments
by updating the corresponding entrypoint information in this function.

SCP_BL2 (System Control Processor Firmware) image load
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some systems have a separate System Control Processor (SCP) for power, clock,
reset and system control. BL2 loads the optional SCP_BL2 image from platform
storage into a platform-specific region of secure memory. The subsequent
handling of SCP_BL2 is platform specific. For example, on the Juno Arm
development platform port the image is transferred into SCP's internal memory
using the Boot Over MHU (BOM) protocol after being loaded in the trusted SRAM
memory. The SCP executes SCP_BL2 and signals to the Application Processor (AP)
for BL2 execution to continue.

EL3 Runtime Software image load
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

BL2 loads the EL3 Runtime Software image from platform storage into a
platform-specific address in trusted SRAM. If there is not enough memory to
load the image, or the image is missing, an assertion failure results.

AArch64 BL32 (Secure-EL1 Payload) image load
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

BL2 loads the optional BL32 image from platform storage into a
platform-specific region of secure memory. The image executes in the secure
world. BL2 relies on BL31 to pass control to the BL32 image, if present. Hence,
BL2 populates a platform-specific area of memory with the
entrypoint/load-address of the BL32 image. The value of the Saved Program
Status Register (``SPSR``) for entry into BL32 is not determined by BL2; it is
initialized by the Secure-EL1 Payload Dispatcher (see later) within BL31, which
is responsible for managing interaction with BL32. This information is passed
to BL31.

BL33 (Non-trusted Firmware) image load
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

BL2 loads the BL33 image (e.g. UEFI or other test or boot software) from
platform storage into non-secure memory as defined by the platform.

BL2 relies on EL3 Runtime Software to pass control to BL33 once secure state
initialization is complete. Hence, BL2 populates a platform-specific area of
memory with the entrypoint and Saved Program Status Register (``SPSR``) of the
normal world software image. The entrypoint is the load address of the BL33
image. The ``SPSR`` is determined as specified in Section 5.13 of the
`PSCI`_. This information is passed to the EL3 Runtime Software.

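As an illustration of what such an ``SPSR`` value looks like for an AArch64
BL33, the sketch below composes one from the target exception level and stack
pointer selection, with all of the D, A, I and F exceptions masked. Entering
BL33 with every exception masked is an assumption made for this example, not a
statement of what a given platform does, and the helper name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SPSR_DAIF_MASKED  (UINT32_C(0xf) << 6)  /* D, A, I and F bits [9:6] */
#define SPSR_EL_SHIFT     2                     /* M[3:2]: target exception level */
#define SPSR_SP_ELX       UINT32_C(1)           /* M[0]: 1 = SP_ELx, 0 = SP_EL0 */

/* Compose an AArch64 SPSR; M[4] stays 0 to select the AArch64 state. */
static inline uint32_t sketch_spsr_64(unsigned int el, bool use_sp_elx)
{
    assert(el >= 1u && el <= 3u);

    return SPSR_DAIF_MASKED | ((uint32_t)el << SPSR_EL_SHIFT) |
           (use_sp_elx ? SPSR_SP_ELX : UINT32_C(0));
}
```

Under these assumptions, entry at EL2 using ``SP_EL2`` gives ``0x3c9`` and
entry at EL1 using ``SP_EL1`` gives ``0x3c5``.
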
AArch64 BL31 (EL3 Runtime Software) execution
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

BL2 execution continues as follows:

#. BL2 passes control back to BL1 by raising an SMC, providing BL1 with the
   BL31 entrypoint. The exception is handled by the SMC exception handler
   installed by BL1.

#. BL1 turns off the MMU and flushes the caches. It clears the
   ``SCTLR_EL3.M/I/C`` bits, flushes the data cache to the point of coherency
   and invalidates the TLBs.

#. BL1 passes control to BL31 at the specified entrypoint at EL3.

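The register update in the second step above amounts to clearing three bits.
The sketch below shows this on a plain value: ``M`` is bit 0, ``C`` is bit 2
and ``I`` is bit 12 of ``SCTLR_EL3``. The function name is hypothetical, and
the cache flush and TLB invalidation that accompany the write are not
modelled.

```c
#include <assert.h>
#include <stdint.h>

#define SCTLR_M_BIT  (UINT64_C(1) << 0)   /* MMU enable */
#define SCTLR_C_BIT  (UINT64_C(1) << 2)   /* data cache enable */
#define SCTLR_I_BIT  (UINT64_C(1) << 12)  /* instruction cache enable */

/* Return the SCTLR value with the MMU and both caches disabled. */
static inline uint64_t sctlr_disable_mmu_and_caches(uint64_t sctlr)
{
    uint64_t result = sctlr & ~(SCTLR_M_BIT | SCTLR_C_BIT | SCTLR_I_BIT);

    /* All other control bits must be left untouched. */
    assert((result & (SCTLR_M_BIT | SCTLR_C_BIT | SCTLR_I_BIT)) == 0u);
    return result;
}
```
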
Running BL2 at EL3 execution level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some platforms have a non-TF-A Boot ROM that expects the next boot stage
to execute at EL3. On these platforms, TF-A BL1 is a waste of memory
as its only purpose is to ensure TF-A BL2 is entered at S-EL1. To avoid
this waste, a special mode enables BL2 to execute at EL3, which allows
a non-TF-A Boot ROM to load and jump directly to BL2. This mode is selected
when the build flag ``RESET_TO_BL2`` is enabled.
The main differences in this mode are:

#. BL2 includes the reset code and the mailbox mechanism to differentiate
   cold boot and warm boot. It runs at EL3 and performs the architectural
   initialization required for EL3.

#. BL2 does not receive the meminfo information from BL1 anymore. This
   information can be passed by the Boot ROM or be internal to the
   BL2 image.

#. Since BL2 executes at EL3, BL2 jumps directly to the next image,
   instead of invoking the RUN_IMAGE SMC call.


We assume 3 different types of Boot ROM support on the platform:

#. The Boot ROM always jumps to the same address, for both cold
   and warm boot. In this case, we will need to keep a resident part
   of BL2 whose memory cannot be reclaimed by any other image. The
   linker script defines the symbols ``__TEXT_RESIDENT_START__`` and
   ``__TEXT_RESIDENT_END__`` that allow the platform to configure
   the memory map correctly.
#. The platform has some mechanism to indicate the jump address to the
   Boot ROM. Platform code can then program the jump address with
   ``psci_warmboot_entrypoint`` during cold boot.
#. The platform has some mechanism to program the reset address using
   the ``PROGRAMMABLE_RESET_ADDRESS`` feature. Platform code can then
   program the reset address with ``psci_warmboot_entrypoint`` during
   cold boot, bypassing the Boot ROM for warm boot.

In the last 2 cases, no part of BL2 needs to remain resident at
runtime. In the first 2 cases, we expect the Boot ROM to be able to
differentiate between warm and cold boot, to avoid loading BL2 again
during warm boot.

| 470 | This functionality can be tested with FVP loading the image directly |
| 471 | in memory and changing the address where the system jumps at reset. |
| 472 | For example: |
| 473 | |
Dimitris Papastamos | 2583649 | 2018-06-11 11:07:58 +0100 | [diff] [blame] | 474 | -C cluster0.cpu0.RVBAR=0x4022000 |
| 475 | --data cluster0.cpu0=bl2.bin@0x4022000 |
Roberto Vargas | b158427 | 2017-11-20 13:36:10 +0000 | [diff] [blame] | 476 | |
| 477 | With this configuration, FVP behaves like a platform of the first case, |
| 478 | where the Boot ROM always jumps to the same address. For simplification, |
| 479 | BL32 is loaded in DRAM in this case, to avoid other images reclaiming |
| 480 | BL2 memory. |
| 481 | |
| 482 | |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 483 | AArch64 BL31 |
| 484 | ~~~~~~~~~~~~ |
| 485 | |
| 486 | The image for this stage is loaded by BL2 and BL1 passes control to BL31 at |
| 487 | EL3. BL31 executes solely in trusted SRAM. BL31 is linked against and |
| 488 | loaded at a platform-specific base address (more information can be found later |
| 489 | in this document). The functionality implemented by BL31 is as follows. |
| 490 | |
| 491 | Architectural initialization |
| 492 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 493 | |
| 494 | Currently, BL31 performs a similar architectural initialization to BL1 as |
| 495 | far as system register settings are concerned. Since BL1 code resides in ROM, |
| 496 | architectural initialization in BL31 allows override of any previous |
| 497 | initialization done by BL1. |
| 498 | |
| 499 | BL31 initializes the per-CPU data framework, which provides a cache of |
| 500 | frequently accessed per-CPU data optimised for fast, concurrent manipulation |
| 501 | on different CPUs. This buffer includes pointers to per-CPU contexts, crash |
| 502 | buffer, CPU reset and power down operations, PSCI data, platform data and so on. |
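The per-CPU data idea above can be sketched as a statically allocated array with one slot per core. This is only an illustration: the type ``sketch_cpu_data_t``, its fields, ``SKETCH_CORE_COUNT`` and ``sketch_get_cpu_data()`` are hypothetical names chosen for this example and do not reproduce the TF-A layout or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CORE_COUNT 4U /* hypothetical platform core count */

/* Hypothetical per-CPU record; the real layout is defined by TF-A. */
typedef struct sketch_cpu_data {
    void *cpu_context[2];  /* one context pointer per security state */
    uintptr_t cpu_ops_ptr; /* CPU reset and power down operations */
    uint64_t crash_buf[8]; /* scratch space for crash reporting */
} sketch_cpu_data_t;

/* One slot per core: each CPU only touches its own slot. */
static sketch_cpu_data_t sketch_percpu_data[SKETCH_CORE_COUNT];

/* Return the slot for a core index, or NULL if the index is out of range. */
static sketch_cpu_data_t *sketch_get_cpu_data(unsigned int core_idx)
{
    if (core_idx >= SKETCH_CORE_COUNT) {
        return NULL;
    }
    return &sketch_percpu_data[core_idx];
}
```

Because every core indexes its own slot, concurrent accesses from different CPUs do not need a lock in this sketch.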
| 503 | |
| 504 | It then replaces the exception vectors populated by BL1 with its own. BL31 |
| 505 | exception vectors implement more elaborate support for handling SMCs since this |
| 506 | is the only mechanism to access the runtime services implemented by BL31 (PSCI |
| 507 | for example). BL31 checks each SMC for validity as specified by the |
Sandrine Bailleux | d9202df | 2020-04-17 14:06:52 +0200 | [diff] [blame] | 508 | `SMC Calling Convention`_ before passing control to the required SMC |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 509 | handler routine. |
| 510 | |
| 511 | BL31 programs the ``CNTFRQ_EL0`` register with the clock frequency of the system |
| 512 | counter, which is provided by the platform. |
| 513 | |
| 514 | Platform initialization |
| 515 | ^^^^^^^^^^^^^^^^^^^^^^^ |
| 516 | |
| 517 | BL31 performs detailed platform initialization, which enables normal world |
| 518 | software to function correctly. |
| 519 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 520 | On Arm platforms, this consists of the following: |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 521 | |
| 522 | - Initialize the console. |
| 523 | - Configure the Interconnect to enable hardware coherency. |
| 524 | - Enable the MMU and map the memory it needs to access. |
| 525 | - Initialize the generic interrupt controller. |
| 526 | - Initialize the power controller device. |
| 527 | - Detect the system topology. |
| 528 | |
| 529 | Runtime services initialization |
| 530 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 531 | |
| 532 | BL31 is responsible for initializing the runtime services. One of them is PSCI. |
| 533 | |
| 534 | As part of the PSCI initialization, BL31 detects the system topology. It also |
| 535 | initializes the data structures that implement the state machine used to track |
| 536 | the state of power domain nodes. The state can be one of ``OFF``, ``RUN`` or |
| 537 | ``RETENTION``. All secondary CPUs are initially in the ``OFF`` state. The cluster |
| 538 | that the primary CPU belongs to is ``ON``; any other cluster is ``OFF``. BL31 also |
| 539 | initializes the locks that protect these data structures. Because BL31 accesses |
| 540 | the state of a CPU or cluster immediately after reset, before the data cache is |
| 541 | enabled in the warm boot path, it cannot use 'exclusive' based spinlocks there; |
| 542 | BL31 therefore uses locks based on Lamport's Bakery algorithm instead. |
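A minimal sketch of Lamport's Bakery algorithm follows, to show why it needs no atomic read-modify-write operation: each CPU writes only its own ``choosing`` and ``ticket`` entries and reads those of the others. All names here (``sketch_bakery_lock_t``, ``sketch_bakery_acquire()``, ``sketch_bakery_release()``, ``SKETCH_NUM_CPUS``) are hypothetical. The sketch uses C11 atomics only to obtain ordered loads and stores in portable C; it does not model the barriers or cache maintenance that firmware running with caches disabled would need, and it is not the TF-A implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NUM_CPUS 4U /* hypothetical core count */

typedef struct sketch_bakery_lock {
    atomic_bool choosing[SKETCH_NUM_CPUS]; /* CPU is picking a ticket */
    atomic_uint ticket[SKETCH_NUM_CPUS];   /* 0 means "not contending" */
} sketch_bakery_lock_t;

static void sketch_bakery_acquire(sketch_bakery_lock_t *lock, unsigned int me)
{
    unsigned int mine = 0U;

    assert(me < SKETCH_NUM_CPUS);

    /* Pick a ticket one higher than any ticket currently visible. */
    atomic_store(&lock->choosing[me], true);
    for (unsigned int i = 0U; i < SKETCH_NUM_CPUS; i++) {
        unsigned int t = atomic_load(&lock->ticket[i]);
        if (t > mine) {
            mine = t;
        }
    }
    mine += 1U;
    atomic_store(&lock->ticket[me], mine);
    atomic_store(&lock->choosing[me], false);

    /* Wait for every contender holding a smaller (ticket, index) pair. */
    for (unsigned int i = 0U; i < SKETCH_NUM_CPUS; i++) {
        if (i == me) {
            continue;
        }
        while (atomic_load(&lock->choosing[i])) {
            /* spin until CPU i has finished choosing */
        }
        for (;;) {
            unsigned int t = atomic_load(&lock->ticket[i]);
            if ((t == 0U) || (t > mine) || ((t == mine) && (i > me))) {
                break;
            }
        }
    }
}

static void sketch_bakery_release(sketch_bakery_lock_t *lock, unsigned int me)
{
    /* Dropping the ticket lets waiting CPUs proceed. */
    atomic_store(&lock->ticket[me], 0U);
}
```

Ties between equal tickets are broken by CPU index, so exactly one contender proceeds at a time.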
| 543 | |
| 544 | The runtime service framework and its initialization are described in more |
| 545 | detail in the "EL3 runtime services framework" section below. |
| 546 | |
| 547 | Details about the status of the PSCI implementation are provided in the |
| 548 | "Power State Coordination Interface" section below. |
| 549 | |
| 550 | AArch64 BL32 (Secure-EL1 Payload) image initialization |
| 551 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 552 | |
| 553 | If a BL32 image is present then there must be a matching Secure-EL1 Payload |
| 554 | Dispatcher (SPD) service (see later for details). During initialization |
| 555 | that service must register a function to carry out initialization of BL32 |
| 556 | once the runtime services are fully initialized. BL31 invokes such a |
| 557 | registered function to initialize BL32 before running BL33. This initialization |
| 558 | is not necessary for AArch32 SPs. |
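The register-then-invoke pattern described above can be sketched with a single function pointer. The names ``sketch_register_bl32_init()``, ``sketch_run_bl32_init()`` and ``sketch_demo_spd_init()`` are hypothetical, and the return convention (non-zero meaning success) is an assumption of this sketch rather than a statement about the TF-A API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*sketch_bl32_init_fn)(void);

static sketch_bl32_init_fn sketch_bl32_init; /* NULL until an SPD registers */
static unsigned int sketch_demo_calls;

/* Called by the (hypothetical) SPD while runtime services are initialized. */
static void sketch_register_bl32_init(sketch_bl32_init_fn fn)
{
    sketch_bl32_init = fn;
}

/*
 * Called once all runtime services are initialized and before BL33 runs.
 * Returns 1 if a registered function ran and reported success, else 0.
 */
static int sketch_run_bl32_init(void)
{
    if (sketch_bl32_init == NULL) {
        return 0; /* no BL32 image, nothing to initialize */
    }
    return (sketch_bl32_init() != 0) ? 1 : 0;
}

/* Example SPD-side initializer used only for this sketch. */
static int32_t sketch_demo_spd_init(void)
{
    sketch_demo_calls++;
    return 1; /* assumed convention: non-zero means success */
}
```

Deferring the call through a registered pointer keeps the common code independent of which SPD, if any, is built into the image.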
| 559 | |
| 560 | Details on BL32 initialization and the SPD's role are described in the |
Paul Beesley | d2fcc4e | 2019-05-29 13:59:40 +0100 | [diff] [blame] | 561 | :ref:`firmware_design_sel1_spd` section below. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 562 | |
| 563 | BL33 (Non-trusted Firmware) execution |
| 564 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 565 | |
| 566 | EL3 Runtime Software initializes the EL2 or EL1 processor context for normal- |
| 567 | world cold boot, ensuring that no secure state information finds its way into |
| 568 | the non-secure execution state. EL3 Runtime Software uses the entrypoint |
| 569 | information provided by BL2 to jump to the Non-trusted firmware image (BL33) |
| 570 | at the highest available Exception Level (EL2 if available, otherwise EL1). |
| 571 | |
| 572 | Using alternative Trusted Boot Firmware in place of BL1 & BL2 (AArch64 only) |
| 573 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 574 | |
| 575 | Some platforms have existing implementations of Trusted Boot Firmware that |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 576 | would like to use TF-A BL31 for the EL3 Runtime Software. To enable this |
| 577 | firmware architecture it is important to provide a fully documented and stable |
| 578 | interface between the Trusted Boot Firmware and BL31. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 579 | |
| 580 | Future changes to the BL31 interface will be done in a backwards compatible |
| 581 | way, and this enables these firmware components to be independently enhanced/ |
| 582 | updated to develop and exploit new functionality. |
| 583 | |
| 584 | Required CPU state when calling ``bl31_entrypoint()`` during cold boot |
| 585 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 586 | |
| 587 | This function must only be called by the primary CPU. |
| 588 | |
| 589 | On entry to this function the calling primary CPU must be executing in AArch64 |
| 590 | EL3, little-endian data access, and all interrupt sources masked: |
| 591 | |
| 592 | :: |
| 593 | |
| 594 | PSTATE.EL = 3 |
| 595 | PSTATE.RW = 1 |
| 596 | PSTATE.DAIF = 0xf |
| 597 | SCTLR_EL3.EE = 0 |
| 598 | |
| 599 | X0 and X1 can be used to pass information from the Trusted Boot Firmware to the |
| 600 | platform code in BL31: |
| 601 | |
| 602 | :: |
| 603 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 604 | X0 : Reserved for common TF-A information |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 605 | X1 : Platform specific information |
| 606 | |
| 607 | BL31 zero-init sections (e.g. ``.bss``) should not contain valid data on entry, |
| 608 | because these will be zero filled prior to invoking platform setup code. |
| 609 | |
| 610 | Use of the X0 and X1 parameters |
| 611 | ''''''''''''''''''''''''''''''' |
| 612 | |
| 613 | The parameters are platform specific and passed from ``bl31_entrypoint()`` to |
| 614 | ``bl31_early_platform_setup()``. The values of these parameters are never |
| 615 | directly used by the common BL31 code. |
| 616 | |
| 617 | The convention is that ``X0`` conveys information regarding the BL31, BL32 and |
| 618 | BL33 images from the Trusted Boot firmware and ``X1`` can be used for other |
| 619 | platform specific purposes. This convention allows platforms which use TF-A's |
| 620 | BL1 and BL2 images to transfer additional platform specific information from |
| 621 | Secure Boot without conflicting with future evolution of TF-A using ``X0`` to |
| 622 | pass a ``bl31_params`` structure. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 623 | |
| 624 | BL31 common and SPD initialization code depends on image and entrypoint |
| 625 | information about BL33 and BL32, which is provided via BL31 platform APIs. |
| 626 | This information is required until the start of execution of BL33. This |
| 627 | information can be provided in a platform defined manner, e.g. compiled into |
| 628 | the platform code in BL31, or provided in a platform defined memory location |
| 629 | by the Trusted Boot firmware, or passed from the Trusted Boot Firmware via the |
| 630 | Cold boot Initialization parameters. This data may need to be cleaned out of |
| 631 | the CPU caches if it is provided by an earlier boot stage and then accessed by |
| 632 | BL31 platform code before the caches are enabled. |
| 633 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 634 | TF-A's BL2 implementation passes a ``bl31_params`` structure in |
| 635 | ``X0`` and the Arm development platforms interpret this in the BL31 platform |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 636 | code. |
| 637 | |
| 638 | MMU, Data caches & Coherency |
| 639 | '''''''''''''''''''''''''''' |
| 640 | |
| 641 | BL31 does not depend on the enabled state of the MMU, data caches or |
| 642 | interconnect coherency on entry to ``bl31_entrypoint()``. If these are disabled |
| 643 | on entry, these should be enabled during ``bl31_plat_arch_setup()``. |
| 644 | |
| 645 | Data structures used in the BL31 cold boot interface |
| 646 | '''''''''''''''''''''''''''''''''''''''''''''''''''' |
| 647 | |
Harrison Mutai | 5b0366b | 2024-01-30 14:21:12 +0000 | [diff] [blame] | 648 | In the cold boot flow, ``entry_point_info`` is used to represent the execution |
| 649 | state of an image; that is, the state of general purpose registers, PC, and |
| 650 | SPSR. |
| 651 | |
| 652 | There are two variants of this structure, for AArch64: |
| 653 | |
| 654 | .. code:: c |
| 655 | |
| 656 | typedef struct entry_point_info { |
| 657 | param_header_t h; |
| 658 | uintptr_t pc; |
| 659 | uint32_t spsr; |
| 660 | |
| 661 | aapcs64_params_t args; |
| 662 | } entry_point_info_t; |
| 663 | |
| 664 | and, AArch32: |
| 665 | |
| 666 | .. code:: c |
| 667 | |
| 668 | typedef struct entry_point_info { |
| 669 | param_header_t h; |
| 670 | uintptr_t pc; |
| 671 | uint32_t spsr; |
| 672 | |
| 673 | uintptr_t lr_svc; |
| 674 | aapcs32_params_t args; |
| 675 | } entry_point_info_t; |
| 676 | |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 677 | These structures are designed to support compatibility and independent |
| 678 | evolution of the structures and the firmware images. For example, a version of |
| 679 | BL31 that can interpret the BL3x image information from different versions of |
| 680 | BL2, a platform that uses an extended ``entry_point_info`` structure to convey |
| 681 | additional register information to BL31, or an ELF image loader that can convey |
| 682 | more details about the firmware images. |
| 683 | |
| 684 | To support these scenarios the structures are versioned and sized, which enables |
| 685 | BL31 to detect which information is present and respond appropriately. The |
| 686 | ``param_header`` is defined to capture this information: |
| 687 | |
| 688 | .. code:: c |
| 689 | |
| 690 | typedef struct param_header { |
| 691 | uint8_t type; /* type of the structure */ |
| 692 | uint8_t version; /* version of this structure */ |
| 693 | uint16_t size; /* size of this structure in bytes */ |
Harrison Mutai | 5b0366b | 2024-01-30 14:21:12 +0000 | [diff] [blame] | 694 | uint32_t attr; /* attributes */ |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 695 | } param_header_t; |
| 696 | |
| 697 | In ``entry_point_info``, bits 0 and 5 of the ``attr`` field are used to encode the |
| 698 | security state; in other words, whether the image is to be executed in the Secure, |
| 699 | Non-secure, or Realm state. |
| 700 | |
| 701 | Other structures using this format are ``image_info`` and ``bl31_params``. The |
| 702 | code that allocates and populates these structures must set the header fields |
| 703 | appropriately; the ``SET_PARAM_HEAD()`` macro is defined to simplify this |
| 704 | action. |
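The versioned and sized header can be illustrated with a small sketch. ``param_header_t`` is copied from the definition above; everything prefixed with ``sketch_`` or ``SKETCH_`` (including the stand-in macro ``SKETCH_SET_PARAM_HEAD()``, the type code and the consumer-side check) is hypothetical and is not the TF-A definition of ``SET_PARAM_HEAD()``.

```c
#include <assert.h>
#include <stdint.h>

typedef struct param_header {
    uint8_t type;     /* type of the structure */
    uint8_t version;  /* version of this structure */
    uint16_t size;    /* size of this structure in bytes */
    uint32_t attr;    /* attributes */
} param_header_t;

/* Hypothetical structure carrying the header as its first member. */
typedef struct sketch_image_desc {
    param_header_t h;
    uintptr_t base;
    uint32_t len;
} sketch_image_desc_t;

#define SKETCH_PARAM_IMAGE 0x7FU /* hypothetical type code */
#define SKETCH_VERSION_1   0x01U

/* Stand-in for SET_PARAM_HEAD(): fills type, version, size and attributes. */
#define SKETCH_SET_PARAM_HEAD(p, t, v, a)         \
    do {                                          \
        (p)->h.type = (uint8_t)(t);               \
        (p)->h.version = (uint8_t)(v);            \
        (p)->h.size = (uint16_t)sizeof(*(p));     \
        (p)->h.attr = (uint32_t)(a);              \
    } while (0)

/* Consumer-side check: accept only headers this code knows how to read. */
static int sketch_header_ok(const param_header_t *h, uint8_t type,
                            uint8_t min_version, uint16_t min_size)
{
    return (h->type == type) && (h->version >= min_version) &&
           (h->size >= min_size);
}
```

A consumer that checks ``version`` and ``size`` before reading optional fields can accept structures produced by a newer or older producer, which is the compatibility property the header is meant to provide.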
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 705 | |
| 706 | Required CPU state for BL31 Warm boot initialization |
| 707 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 708 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 709 | When requesting a CPU power-on, or suspending a running CPU, TF-A provides |
| 710 | the platform power management code with a Warm boot initialization |
| 711 | entry-point, to be invoked by the CPU immediately after the reset handler. |
| 712 | On entry to the Warm boot initialization function the calling CPU must be in |
| 713 | AArch64 EL3, little-endian data access and all interrupt sources masked: |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 714 | |
| 715 | :: |
| 716 | |
| 717 | PSTATE.EL = 3 |
| 718 | PSTATE.RW = 1 |
| 719 | PSTATE.DAIF = 0xf |
| 720 | SCTLR_EL3.EE = 0 |
| 721 | |
| 722 | The PSCI implementation will initialize the processor state and ensure that the |
| 723 | platform power management code is then invoked as required to initialize all |
| 724 | necessary system, cluster and CPU resources. |
| 725 | |
| 726 | AArch32 EL3 Runtime Software entrypoint interface |
| 727 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 728 | |
| 729 | To enable this firmware architecture it is important to provide a fully |
| 730 | documented and stable interface between the Trusted Boot Firmware and the |
| 731 | AArch32 EL3 Runtime Software. |
| 732 | |
| 733 | Future changes to the entrypoint interface will be done in a backwards |
| 734 | compatible way, and this enables these firmware components to be independently |
| 735 | enhanced/updated to develop and exploit new functionality. |
| 736 | |
| 737 | Required CPU state when entering during cold boot |
| 738 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 739 | |
| 740 | This function must only be called by the primary CPU. |
| 741 | |
| 742 | On entry to this function the calling primary CPU must be executing in AArch32 |
| 743 | EL3, little-endian data access, and all interrupt sources masked: |
| 744 | |
| 745 | :: |
| 746 | |
| 747 | PSTATE.AIF = 0x7 |
| 748 | SCTLR.EE = 0 |
| 749 | |
| 750 | R0 and R1 are used to pass information from the Trusted Boot Firmware to the |
| 751 | platform code in AArch32 EL3 Runtime Software: |
| 752 | |
| 753 | :: |
| 754 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 755 | R0 : Reserved for common TF-A information |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 756 | R1 : Platform specific information |
| 757 | |
| 758 | Use of the R0 and R1 parameters |
| 759 | ''''''''''''''''''''''''''''''' |
| 760 | |
| 761 | The parameters are platform specific and the convention is that ``R0`` conveys |
| 762 | information regarding the BL3x images from the Trusted Boot firmware and ``R1`` |
| 763 | can be used for other platform specific purposes. This convention allows |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 764 | platforms which use TF-A's BL1 and BL2 images to transfer additional platform |
| 765 | specific information from Secure Boot without conflicting with future |
| 766 | evolution of TF-A using ``R0`` to pass a ``bl_params`` structure. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 767 | |
| 768 | The AArch32 EL3 Runtime Software is responsible for entry into BL33. The BL33 |
| 769 | entrypoint information can be obtained in a platform defined manner, e.g. compiled into |
| 770 | the AArch32 EL3 Runtime Software, or provided in a platform defined memory |
| 771 | location by the Trusted Boot firmware, or passed from the Trusted Boot Firmware |
| 772 | via the Cold boot Initialization parameters. This data may need to be cleaned |
| 773 | out of the CPU caches if it is provided by an earlier boot stage and then |
| 774 | accessed by AArch32 EL3 Runtime Software before the caches are enabled. |
| 775 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 776 | When using AArch32 EL3 Runtime Software, the Arm development platforms pass a |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 777 | ``bl_params`` structure in ``R0`` from BL2 to be interpreted by AArch32 EL3 Runtime |
| 778 | Software platform code. |
| 779 | |
| 780 | MMU, Data caches & Coherency |
| 781 | '''''''''''''''''''''''''''' |
| 782 | |
| 783 | AArch32 EL3 Runtime Software must not depend on the enabled state of the MMU, |
| 784 | data caches or interconnect coherency in its entrypoint. They must be explicitly |
| 785 | enabled if required. |
| 786 | |
| 787 | Data structures used in cold boot interface |
| 788 | ''''''''''''''''''''''''''''''''''''''''''' |
| 789 | |
| 790 | The AArch32 EL3 Runtime Software cold boot interface uses ``bl_params`` instead |
| 791 | of ``bl31_params``. The ``bl_params`` structure is based on the convention |
| 792 | described in the AArch64 BL31 cold boot interface section. |
| 793 | |
| 794 | Required CPU state for warm boot initialization |
| 795 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 796 | |
| 797 | When requesting a CPU power-on, or suspending a running CPU, AArch32 EL3 |
| 798 | Runtime Software must ensure execution of a warm boot initialization entrypoint. |
| 799 | If TF-A BL1 is used and the ``PROGRAMMABLE_RESET_ADDRESS`` build flag is false, |
| 800 | then AArch32 EL3 Runtime Software must ensure that BL1 branches to the warm |
| 801 | boot entrypoint by arranging for the BL1 platform function, |
| 802 | ``plat_get_my_entrypoint()``, to return a non-zero value. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 803 | |
| 804 | In this case, the warm boot entrypoint must be in AArch32 EL3, little-endian |
| 805 | data access and all interrupt sources masked: |
| 806 | |
| 807 | :: |
| 808 | |
| 809 | PSTATE.AIF = 0x7 |
| 810 | SCTLR.EE = 0 |
| 811 | |
| 812 | The warm boot entrypoint may be implemented using the TF-A |
| 813 | ``psci_warmboot_entrypoint()`` function. In that case, the platform must fulfil |
Paul Beesley | f864067 | 2019-04-12 14:19:42 +0100 | [diff] [blame] | 814 | the pre-requisites mentioned in the |
| 815 | :ref:`PSCI Library Integration guide for Armv8-A AArch32 systems`. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 816 | |
| 817 | EL3 runtime services framework |
| 818 | ------------------------------ |
| 819 | |
| 820 | Software executing in the non-secure state and in the secure state at exception |
| 821 | levels lower than EL3 will request runtime services using the Secure Monitor |
| 822 | Call (SMC) instruction. These requests will follow the convention described in |
| 823 | the SMC Calling Convention PDD (`SMCCC`_). The `SMCCC`_ assigns function |
| 824 | identifiers to each SMC request and describes how arguments are passed and |
| 825 | returned. |
| 826 | |
| 827 | The EL3 runtime services framework enables the development of services by |
| 828 | different providers that can be easily integrated into final product firmware. |
| 829 | The following sections describe the framework which facilitates the |
| 830 | registration, initialization and use of runtime services in EL3 Runtime |
| 831 | Software (BL31). |
| 832 | |
| 833 | The design of the runtime services depends heavily on the concepts and |
| 834 | definitions described in the `SMCCC`_, in particular SMC Function IDs, Owning |
| 835 | Entity Numbers (OEN), Fast and Yielding calls, and the SMC32 and SMC64 calling |
| 836 | conventions. Please refer to that document for more detailed explanation of |
| 837 | these terms. |
| 838 | |
| 839 | The following runtime services are expected to be implemented first. They have |
| 840 | not all been instantiated in the current implementation. |
| 841 | |
| 842 | #. Standard service calls |
| 843 | |
| 844 | This service is for management of the entire system. The Power State |
| 845 | Coordination Interface (`PSCI`_) is the first set of standard service calls |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 846 | defined by Arm (see PSCI section later). |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 847 | |
| 848 | #. Secure-EL1 Payload Dispatcher service |
| 849 | |
| 850 | If a system runs a Trusted OS or other Secure-EL1 Payload (SP) then |
| 851 | it also requires a *Secure Monitor* at EL3 to switch the EL1 processor |
| 852 | context between the normal world (EL1/EL2) and trusted world (Secure-EL1). |
| 853 | The Secure Monitor will make these world switches in response to SMCs. The |
| 854 | `SMCCC`_ provides for such SMCs with the Trusted OS Call and Trusted |
| 855 | Application Call OEN ranges. |
| 856 | |
| 857 | The interface between the EL3 Runtime Software and the Secure-EL1 Payload is |
| 858 | not defined by the `SMCCC`_ or any other standard. As a result, each |
| 859 | Secure-EL1 Payload requires a specific Secure Monitor that runs as a runtime |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 860 | service - within TF-A this service is referred to as the Secure-EL1 Payload |
| 861 | Dispatcher (SPD). |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 862 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 863 | TF-A provides a Test Secure-EL1 Payload (TSP) and its associated Dispatcher |
| 864 | (TSPD). Details of SPD design and TSP/TSPD operation are described in the |
Paul Beesley | d2fcc4e | 2019-05-29 13:59:40 +0100 | [diff] [blame] | 865 | :ref:`firmware_design_sel1_spd` section below. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 866 | |
| 867 | #. CPU implementation service |
| 868 | |
| 869 | This service will provide an interface to CPU implementation specific |
| 870 | services for a given platform e.g. access to processor errata workarounds. |
| 871 | This service is currently unimplemented. |
| 872 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 873 | Additional services for Arm Architecture, SiP and OEM calls can be implemented. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 874 | Each implemented service handles a range of SMC function identifiers as |
| 875 | described in the `SMCCC`_. |
| 876 | |
| 877 | Registration |
| 878 | ~~~~~~~~~~~~ |
| 879 | |
| 880 | A runtime service is registered using the ``DECLARE_RT_SVC()`` macro, specifying |
| 881 | the name of the service, the range of OENs covered, the type of service and |
| 882 | initialization and call handler functions. This macro instantiates a ``const struct rt_svc_desc`` for the service with these details (see ``runtime_svc.h``). |
Chris Kay | 33bfc5e | 2023-02-14 11:30:04 +0000 | [diff] [blame] | 883 | This structure is allocated in a special ELF section ``.rt_svc_descs``, enabling |
| 884 | the framework to find all service descriptors included in BL31. |
| 885 | |
| 886 | The specific service for an SMC Function is selected based on the OEN and call |
| 887 | type of the Function ID, and the framework uses that information in the service |
| 888 | descriptor to identify the handler for the SMC Call. |
| 889 | |
| 890 | The service descriptors do not include information to identify the precise set |
| 891 | of SMC function identifiers supported by this service implementation, the |
| 892 | security state from which such calls are valid, or the capability to support |
| 893 | 64-bit and/or 32-bit callers (using SMC32 or SMC64). Responding appropriately |
| 894 | to these aspects of an SMC call is the responsibility of the service |
| 895 | implementation; the framework is focused on integration of services from |
| 896 | different providers and minimizing the time taken by the framework before the |
| 897 | service handler is invoked. |
| 898 | |
| 899 | Details of the parameters, requirements and behavior of the initialization and |
| 900 | call handling functions are provided in the following sections. |
| 901 | |
| 902 | Initialization |
| 903 | ~~~~~~~~~~~~~~ |
| 904 | |
| 905 | ``runtime_svc_init()`` in ``runtime_svc.c`` initializes the runtime services |
| 906 | framework running on the primary CPU during cold boot as part of the BL31 |
| 907 | initialization. This happens prior to initializing a Trusted OS and running |
| 908 | Normal world boot firmware that might in turn use these services. |
| 909 | Initialization involves validating each of the declared runtime service |
| 910 | descriptors, calling the service initialization function and populating the |
| 911 | index used for runtime lookup of the service. |
| 912 | |
| 913 | The BL31 linker script collects all of the declared service descriptors into a |
| 914 | single array and defines symbols that allow the framework to locate and traverse |
| 915 | the array, and determine its size. |
| 916 | |
| 917 | The framework does basic validation of each descriptor to halt firmware |
| 918 | initialization if service declaration errors are detected. The framework does |
| 919 | not check descriptors for the following error conditions, and may behave in an |
| 920 | unpredictable manner under such scenarios: |
| 921 | |
| 922 | #. Overlapping OEN ranges |
| 923 | #. Multiple descriptors for the same range of OENs and ``call_type`` |
| 924 | #. Incorrect range of owning entity numbers for a given ``call_type`` |
| 925 | |
| 926 | Once validated, the service ``init()`` callback is invoked. This function carries |
| 927 | out any essential EL3 initialization before servicing requests. The ``init()`` |
| 928 | function is only invoked on the primary CPU during cold boot. If the service |
| 929 | uses per-CPU data this must either be initialized for all CPUs during this call, |
| 930 | or be done lazily when a CPU first issues an SMC call to that service. If |
| 931 | ``init()`` returns anything other than ``0``, this is treated as an initialization |
| 932 | error and the service is ignored: this does not cause the firmware to halt. |
| 933 | |
| 934 | The OEN and call type fields present in the SMC Function ID cover a total of |
| 935 | 128 distinct services, but in practice a single descriptor can cover a range of |
| 936 | OENs, e.g. SMCs to call a Trusted OS function. To optimize the lookup of a |
| 937 | service handler, the framework uses an array of 128 indices that map every |
| 938 | distinct OEN/call-type combination either to one of the declared services or to |
| 939 | indicate the service is not handled. This ``rt_svc_descs_indices[]`` array is |
| 940 | populated for all of the OENs covered by a service after the service ``init()`` |
| 941 | function has reported success. So a service that fails to initialize will never |
| 942 | have its ``handle()`` function invoked. |
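The initialization steps described above (validate each descriptor, call ``init()``, then populate the 128-entry index for every OEN the service covers) can be sketched as follows. The descriptor type, constants and function names are hypothetical simplifications for this example; in particular, the sketch returns ``-1`` where the text says the firmware halts, and it is not the code in ``runtime_svc.c``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_INDICES 128U /* 2 call types x 64 OENs */
#define SKETCH_OEN_LIMIT   64U
#define SKETCH_TYPE_YIELD  0U
#define SKETCH_TYPE_FAST   1U
#define SKETCH_NO_SERVICE  0xFFU /* marks "not handled" */

typedef struct sketch_rt_svc_desc {
    uint8_t start_oen;
    uint8_t end_oen;
    uint8_t call_type;
    const char *name;
    int32_t (*init)(void);
} sketch_rt_svc_desc_t;

static uint8_t sketch_indices[SKETCH_MAX_INDICES];

/* Example initializers used only by this sketch. */
static int32_t sketch_init_ok(void) { return 0; }
static int32_t sketch_init_fail(void) { return -1; }

static int sketch_desc_valid(const sketch_rt_svc_desc_t *d)
{
    return (d->start_oen <= d->end_oen) &&
           (d->end_oen < SKETCH_OEN_LIMIT) &&
           ((d->call_type == SKETCH_TYPE_FAST) ||
            (d->call_type == SKETCH_TYPE_YIELD));
}

/* Returns the number of services registered, or -1 on a bad descriptor. */
static int sketch_svc_init(const sketch_rt_svc_desc_t *descs, size_t count)
{
    int registered = 0;

    assert(count < SKETCH_NO_SERVICE);
    for (size_t i = 0U; i < SKETCH_MAX_INDICES; i++) {
        sketch_indices[i] = SKETCH_NO_SERVICE;
    }
    for (size_t i = 0U; i < count; i++) {
        const sketch_rt_svc_desc_t *d = &descs[i];

        if (!sketch_desc_valid(d)) {
            return -1; /* declaration error: the text says firmware halts */
        }
        if ((d->init != NULL) && (d->init() != 0)) {
            continue; /* failed service is ignored, index stays unset */
        }
        for (unsigned int oen = d->start_oen; oen <= d->end_oen; oen++) {
            sketch_indices[((unsigned int)d->call_type << 6) | oen] = (uint8_t)i;
        }
        registered++;
    }
    return registered;
}
```

Because the index is filled only after ``init()`` succeeds, a lookup for a failed service yields the "not handled" marker without any extra state.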
| 943 | |
| 944 | The following figure shows how the ``rt_svc_descs_indices[]`` index maps the SMC |
| 945 | Function ID call type and OEN onto a specific service handler in the |
| 946 | ``rt_svc_descs[]`` array. |
| 947 | |
| 948 | |Image 1| |
| 949 | |
Madhukar Pappireddy | 86350ae | 2020-07-29 09:37:25 -0500 | [diff] [blame] | 950 | .. _handling-an-smc: |
| 951 | |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 952 | Handling an SMC |
| 953 | ~~~~~~~~~~~~~~~ |
| 954 | |
| 955 | When the EL3 runtime services framework receives a Secure Monitor Call, the SMC |
| 956 | Function ID is passed in W0 from the lower exception level (as per the |
| 957 | `SMCCC`_). If the calling register width is AArch32, it is invalid to invoke an |
| 958 | SMC Function which indicates the SMC64 calling convention: such calls are |
| 959 | ignored and return the Unknown SMC Function Identifier result code ``0xFFFFFFFF`` |
| 960 | in R0/X0. |
| 961 | |
| 962 | Bit[31] (fast/yielding call) and bits[29:24] (owning entity number) of the SMC |
| 963 | Function ID are combined to index into the ``rt_svc_descs_indices[]`` array. The |
| 964 | resulting value might indicate a service that has no handler; in this case the |
| 965 | framework will also report an Unknown SMC Function ID. Otherwise, the value is |
| 966 | used as a further index into the ``rt_svc_descs[]`` array to locate the required |
| 967 | service and handler. |
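The index computation described above can be written out directly from the bit positions: bit[31] selects fast or yielding, bits[29:24] hold the OEN, and bit[30] (per the SMCCC) marks the SMC64 calling convention. The function names below are hypothetical and the sketch only decodes the Function ID; it does not perform the lookup in ``rt_svc_descs[]``.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Combine call type (bit 31) and OEN (bits 29:24) into a 7-bit index. */
static unsigned int sketch_fid_to_index(uint32_t fid)
{
    unsigned int call_type = (fid >> 31) & 0x1U;
    unsigned int oen = (fid >> 24) & 0x3FU;

    return (call_type << 6) | oen;
}

/* Bit 30 set means the SMC64 calling convention. */
static bool sketch_fid_is_smc64(uint32_t fid)
{
    return ((fid >> 30) & 0x1U) != 0U;
}

/* An AArch32 caller must not issue an SMC64 Function ID. */
static bool sketch_fid_allowed(uint32_t fid, bool caller_is_aarch32)
{
    return !(caller_is_aarch32 && sketch_fid_is_smc64(fid));
}
```

For example, ``0x84000000`` (``PSCI_VERSION``) and ``0xC4000003`` (the SMC64 form of ``CPU_ON``) are both fast calls with OEN 4, so both map to index 68; the calling convention bit does not take part in the index.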
| 968 | |
| 969 | The service's ``handle()`` callback is provided with five of the SMC parameters |
| 970 | directly; the others are saved into memory for retrieval (if needed) by the |
| 971 | handler. The handler is also provided with an opaque ``handle`` for use with the |
| 972 | supporting library for parameter retrieval, setting return values and context |
| 973 | manipulation. The ``flags`` parameter indicates the security state of the caller |
| 974 | and the state of the SVE hint bit per SMCCC v1.3. The framework finally sets |
| 975 | up the execution stack for the handler, and invokes the service's ``handle()`` |
| 976 | function. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 977 | |
Madhukar Pappireddy | 20be077 | 2019-11-09 23:28:08 -0600 | [diff] [blame] | 978 | On return from the handler the result registers are populated in X0-X7 as needed |
| 979 | before restoring the stack and CPU state and returning from the original SMC. |
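A handler in the shape described above might look like the following sketch: five SMC parameters arrive directly (the Function ID and four arguments), together with a cookie, a handle and flags. Here the handle is treated as a pointer to a hypothetical ``sketch_ctx_t`` holding the result registers, purely for illustration; in the framework the handle is opaque and is accessed through the supporting library. ``SKETCH_FID_ADD`` is an invented Function ID in the SiP fast-call range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SMC_UNK 0xFFFFFFFFU /* Unknown SMC Function Identifier */
#define SKETCH_FID_ADD 0x82000001U /* hypothetical SiP fast SMC32 call */

typedef struct sketch_ctx {
    uint64_t x[8]; /* result registers X0-X7 returned to the caller */
} sketch_ctx_t;

static void *sketch_handle(uint32_t smc_fid, uint64_t x1, uint64_t x2,
                           uint64_t x3, uint64_t x4, void *cookie,
                           void *handle, uint64_t flags)
{
    sketch_ctx_t *ctx = handle;

    (void)x3;
    (void)x4;
    (void)cookie;
    (void)flags;
    assert(ctx != NULL);

    switch (smc_fid) {
    case SKETCH_FID_ADD:
        ctx->x[0] = x1 + x2; /* result goes back in X0 */
        break;
    default:
        ctx->x[0] = SKETCH_SMC_UNK; /* service does not know this ID */
        break;
    }
    return handle;
}
```

Returning the handle lets the caller of the handler restore state from the same context that received the results.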
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 980 | |
Jeenu Viswambharan | cbb40d5 | 2017-10-18 14:30:53 +0100 | [diff] [blame] | 981 | Exception Handling Framework |
| 982 | ---------------------------- |
| 983 | |
johpow01 | 7402f07 | 2020-07-28 13:07:25 -0500 | [diff] [blame] | 984 | Please refer to the :ref:`Exception Handling Framework` document. |
Jeenu Viswambharan | cbb40d5 | 2017-10-18 14:30:53 +0100 | [diff] [blame] | 985 | |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 986 | Power State Coordination Interface |
| 987 | ---------------------------------- |
| 988 | |
| 989 | TODO: Provide design walkthrough of PSCI implementation. |
| 990 | |
Roberto Vargas | d963e3e | 2017-09-12 10:28:35 +0100 | [diff] [blame] | 991 | The PSCI v1.1 specification categorizes APIs as optional and mandatory. All the |
| 992 | mandatory APIs in PSCI v1.1, PSCI v1.0 and in PSCI v0.2 draft specification |
Manish V Badarkhe | 9d24e9b | 2023-06-15 09:14:33 +0100 | [diff] [blame] | 993 | `PSCI`_ are implemented. The table lists the PSCI v1.1 APIs and their support |
| 994 | in generic code. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 995 | |
An API implementation might depend on platform code. For example, ``CPU_SUSPEND``
requires the platform to export a part of the implementation. Hence the level
of support of the mandatory APIs depends upon the support exported by the
platform port as well. The Juno and FVP (all variants) platforms export all the
required support.
| 1001 | |
+-----------------------------+-------------+-------------------------------+
| PSCI v1.1 API               | Supported   | Comments                      |
+=============================+=============+===============================+
| ``PSCI_VERSION``            | Yes         | The version returned is 1.1   |
+-----------------------------+-------------+-------------------------------+
| ``CPU_SUSPEND``             | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``CPU_OFF``                 | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``CPU_ON``                  | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``AFFINITY_INFO``           | Yes         |                               |
+-----------------------------+-------------+-------------------------------+
| ``MIGRATE``                 | Yes\*\*     |                               |
+-----------------------------+-------------+-------------------------------+
| ``MIGRATE_INFO_TYPE``       | Yes\*\*     |                               |
+-----------------------------+-------------+-------------------------------+
| ``MIGRATE_INFO_CPU``        | Yes\*\*     |                               |
+-----------------------------+-------------+-------------------------------+
| ``SYSTEM_OFF``              | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``SYSTEM_RESET``            | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``PSCI_FEATURES``           | Yes         |                               |
+-----------------------------+-------------+-------------------------------+
| ``CPU_FREEZE``              | No          |                               |
+-----------------------------+-------------+-------------------------------+
| ``CPU_DEFAULT_SUSPEND``     | No          |                               |
+-----------------------------+-------------+-------------------------------+
| ``NODE_HW_STATE``           | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``SYSTEM_SUSPEND``          | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``PSCI_SET_SUSPEND_MODE``   | No          |                               |
+-----------------------------+-------------+-------------------------------+
| ``PSCI_STAT_RESIDENCY``     | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``PSCI_STAT_COUNT``         | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``SYSTEM_RESET2``           | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``MEM_PROTECT``             | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``MEM_PROTECT_CHECK_RANGE`` | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1047 | |
| 1048 | \*Note : These PSCI APIs require platform power management hooks to be |
| 1049 | registered with the generic PSCI code to be supported. |
| 1050 | |
| 1051 | \*\*Note : These PSCI APIs require appropriate Secure Payload Dispatcher |
| 1052 | hooks to be registered with the generic PSCI code to be supported. |
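The dependency on registered hooks can be pictured with a minimal sketch. The
structure, macro and function names below (``example_pm_ops``,
``EXAMPLE_CAP_*``, ``example_psci_caps()``) are hypothetical and are not the
TF-A API; they only illustrate how generic code could advertise an API as
supported only when the platform has registered the matching hook.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits, one per platform-dependent PSCI API. */
#define EXAMPLE_CAP_CPU_OFF       (1u << 0)
#define EXAMPLE_CAP_SYSTEM_OFF    (1u << 1)
#define EXAMPLE_CAP_SYSTEM_RESET  (1u << 2)

/* Hypothetical table of platform power management hooks. */
struct example_pm_ops {
    void (*cpu_off)(void);
    void (*system_off)(void);
    void (*system_reset)(void);
};

/* Placeholder hook used only to populate the table in examples. */
static void example_noop_hook(void)
{
}

/* Report an API as supported only if its hook has been registered. */
static uint32_t example_psci_caps(const struct example_pm_ops *ops)
{
    uint32_t caps = 0u;

    if (ops == NULL) {
        return caps;
    }
    if (ops->cpu_off != NULL) {
        caps |= EXAMPLE_CAP_CPU_OFF;
    }
    if (ops->system_off != NULL) {
        caps |= EXAMPLE_CAP_SYSTEM_OFF;
    }
    if (ops->system_reset != NULL) {
        caps |= EXAMPLE_CAP_SYSTEM_RESET;
    }
    return caps;
}
```

A platform that registers only ``cpu_off`` would, in this sketch, see only
``EXAMPLE_CAP_CPU_OFF`` reported; the APIs marked "Yes" without an asterisk in
the table do not depend on such hooks.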
| 1053 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 1054 | The PSCI implementation in TF-A is a library which can be integrated with |
| 1055 | AArch64 or AArch32 EL3 Runtime Software for Armv8-A systems. A guide to |
| 1056 | integrating PSCI library with AArch32 EL3 Runtime Software can be found |
Paul Beesley | f864067 | 2019-04-12 14:19:42 +0100 | [diff] [blame] | 1057 | at :ref:`PSCI Library Integration guide for Armv8-A AArch32 systems`. |
| 1058 | |
| 1059 | .. _firmware_design_sel1_spd: |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1060 | |
| 1061 | Secure-EL1 Payloads and Dispatchers |
| 1062 | ----------------------------------- |
| 1063 | |
| 1064 | On a production system that includes a Trusted OS running in Secure-EL1/EL0, |
| 1065 | the Trusted OS is coupled with a companion runtime service in the BL31 |
| 1066 | firmware. This service is responsible for the initialisation of the Trusted |
| 1067 | OS and all communications with it. The Trusted OS is the BL32 stage of the |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 1068 | boot flow in TF-A. The firmware will attempt to locate, load and execute a |
| 1069 | BL32 image. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1070 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 1071 | TF-A uses a more general term for the BL32 software that runs at Secure-EL1 - |
| 1072 | the *Secure-EL1 Payload* - as it is not always a Trusted OS. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1073 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 1074 | TF-A provides a Test Secure-EL1 Payload (TSP) and a Test Secure-EL1 Payload |
| 1075 | Dispatcher (TSPD) service as an example of how a Trusted OS is supported on a |
| 1076 | production system using the Runtime Services Framework. On such a system, the |
| 1077 | Test BL32 image and service are replaced by the Trusted OS and its dispatcher |
| 1078 | service. The TF-A build system expects that the dispatcher will define the |
| 1079 | build flag ``NEED_BL32`` to enable it to include the BL32 in the build either |
| 1080 | as a binary or to compile from source depending on whether the ``BL32`` build |
| 1081 | option is specified or not. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1082 | |
| 1083 | The TSP runs in Secure-EL1. It is designed to demonstrate synchronous |
| 1084 | communication with the normal-world software running in EL1/EL2. Communication |
is initiated by the normal-world software:
| 1086 | |
| 1087 | - either directly through a Fast SMC (as defined in the `SMCCC`_) |
| 1088 | |
| 1089 | - or indirectly through a `PSCI`_ SMC. The `PSCI`_ implementation in turn |
| 1090 | informs the TSPD about the requested power management operation. This allows |
| 1091 | the TSP to prepare for or respond to the power state change |
| 1092 | |
The TSPD service is responsible for:
| 1094 | |
| 1095 | - Initializing the TSP |
| 1096 | |
| 1097 | - Routing requests and responses between the secure and the non-secure |
| 1098 | states during the two types of communications just described |
| 1099 | |
| 1100 | Initializing a BL32 Image |
| 1101 | ~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 1102 | |
| 1103 | The Secure-EL1 Payload Dispatcher (SPD) service is responsible for initializing |
| 1104 | the BL32 image. It needs access to the information passed by BL2 to BL31 to do |
| 1105 | so. This is provided by: |
| 1106 | |
| 1107 | .. code:: c |
| 1108 | |
| 1109 | entry_point_info_t *bl31_plat_get_next_image_ep_info(uint32_t); |
| 1110 | |
| 1111 | which returns a reference to the ``entry_point_info`` structure corresponding to |
| 1112 | the image which will be run in the specified security state. The SPD uses this |
| 1113 | API to get entry point information for the SECURE image, BL32. |
| 1114 | |
| 1115 | In the absence of a BL32 image, BL31 passes control to the normal world |
| 1116 | bootloader image (BL33). When the BL32 image is present, it is typical |
| 1117 | that the SPD wants control to be passed to BL32 first and then later to BL33. |
| 1118 | |
| 1119 | To do this the SPD has to register a BL32 initialization function during |
| 1120 | initialization of the SPD service. The BL32 initialization function has this |
| 1121 | prototype: |
| 1122 | |
| 1123 | .. code:: c |
| 1124 | |
| 1125 | int32_t init(void); |
| 1126 | |
| 1127 | and is registered using the ``bl31_register_bl32_init()`` function. |
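The registration step can be sketched as follows. This is a self-contained
illustration, not the TF-A source: ``bl31_register_bl32_init()`` is replaced by
a local stand-in whose signature is assumed from the ``init`` prototype above,
and every ``example_*`` name, as well as the convention that a non-zero return
value means success, is an assumption made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the BL31 hook; the real function lives in TF-A. */
static int32_t (*example_bl32_init_fn)(void);

static void bl31_register_bl32_init(int32_t (*func)(void))
{
    example_bl32_init_fn = func;
}

/*
 * Hypothetical SPD-provided BL32 initialization function. A real SPD would
 * enter BL32 here and wait for it to return; non-zero is assumed to mean
 * that BL32 initialised successfully.
 */
static int32_t example_spd_bl32_init(void)
{
    return 1;
}

/* Hypothetical SPD setup function, run while the SPD service initialises. */
static int32_t example_spd_setup(void)
{
    bl31_register_bl32_init(example_spd_bl32_init);
    return 0;
}

enum example_next_stage {
    EXAMPLE_BL33_ONLY,
    EXAMPLE_BL32_THEN_BL33
};

/* Sketch of the ordering: run BL32 init if one is registered, then BL33. */
static enum example_next_stage example_next_stage(void)
{
    if (example_bl32_init_fn != NULL && example_bl32_init_fn() != 0) {
        return EXAMPLE_BL32_THEN_BL33;
    }
    return EXAMPLE_BL33_ONLY;
}
```

Without a registered function the sketch goes straight to BL33, which mirrors
the behaviour described above for a system with no BL32 image.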
| 1128 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 1129 | TF-A supports two approaches for the SPD to pass control to BL32 before |
| 1130 | returning through EL3 and running the non-trusted firmware (BL33): |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1131 | |
| 1132 | #. In the BL32 setup function, use ``bl31_set_next_image_type()`` to |
| 1133 | request that the exit from ``bl31_main()`` is to the BL32 entrypoint in |
| 1134 | Secure-EL1. BL31 will exit to BL32 using the asynchronous method by |
| 1135 | calling ``bl31_prepare_next_image_entry()`` and ``el3_exit()``. |
| 1136 | |
| 1137 | When the BL32 has completed initialization at Secure-EL1, it returns to |
| 1138 | BL31 by issuing an SMC, using a Function ID allocated to the SPD. On |
| 1139 | receipt of this SMC, the SPD service handler should switch the CPU context |
| 1140 | from trusted to normal world and use the ``bl31_set_next_image_type()`` and |
| 1141 | ``bl31_prepare_next_image_entry()`` functions to set up the initial return to |
| 1142 | the normal world firmware BL33. On return from the handler the framework |
| 1143 | will exit to EL2 and run BL33. |
| 1144 | |
#. The BL32 setup function registers an initialization function using
   ``bl31_register_bl32_init()`` which provides an SPD-defined mechanism to
   invoke a 'world-switch synchronous call' to Secure-EL1 to run the BL32
   entrypoint.

   .. note::
      The Test SPD service included with TF-A provides one implementation
      of such a mechanism.

   On completion BL32 returns control to BL31 via an SMC, and on receipt the
   SPD service handler invokes the synchronous call return mechanism to return
   to the BL32 initialization function. On return from this function,
   ``bl31_main()`` will set up the return to the normal world firmware BL33 and
   continue the boot process in the normal world.
| 1159 | |
Manish Pandey | 493bdc4 | 2023-07-21 13:08:53 +0100 | [diff] [blame] | 1160 | Exception handling in BL31 |
| 1161 | -------------------------- |
| 1162 | |
When an exception occurs, the PE must execute the handler corresponding to that
exception. The location in memory where the handler is stored is called the
exception vector. In the Arm architecture, exception vectors are stored in a
table, called the exception vector table.

Each EL (except EL0) has its own vector table, and the VBAR_ELn register stores
the base address of that table. Refer to `AArch64 exception vector table`_.
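As an illustration of the table layout, the AArch64 vector table holds 16
entries of 0x80 bytes each: four exception sources (the three covered by the
subsections below, with lower EL split into AArch64 and AArch32), each with
four exception types. The helper and enum names in this sketch are made up for
the example; only the layout itself comes from the architecture.

```c
#include <stdint.h>

/* Source of the exception, in vector table order. */
enum example_vec_source {
    EXAMPLE_CUR_EL_SP0,    /* Current EL with SP_EL0 */
    EXAMPLE_CUR_EL_SPX,    /* Current EL with SP_ELx */
    EXAMPLE_LOWER_EL_A64,  /* Lower EL using AArch64 */
    EXAMPLE_LOWER_EL_A32   /* Lower EL using AArch32 */
};

/* Exception type, in vector table order within each source group. */
enum example_vec_type {
    EXAMPLE_VEC_SYNC,
    EXAMPLE_VEC_IRQ,
    EXAMPLE_VEC_FIQ,
    EXAMPLE_VEC_SERROR
};

/* Each vector entry is 0x80 bytes; each source group holds four entries. */
static uint64_t example_vector_addr(uint64_t vbar,
                                    enum example_vec_source src,
                                    enum example_vec_type type)
{
    return vbar + (((uint64_t)src * 4u) + (uint64_t)type) * 0x80u;
}
```

For example, a synchronous exception taken from a lower EL in AArch64 state is
vectored to offset 0x400 from the table base.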
| 1170 | |
| 1171 | Current EL with SP_EL0 |
| 1172 | ~~~~~~~~~~~~~~~~~~~~~~ |
| 1173 | |
- Sync exception : Not expected except for the BRK instruction, which is a
  debugging tool that a programmer may place at specific points in a program to
  check the state of processor flags at these points in the code.
| 1177 | |
| 1178 | - IRQ/FIQ : Unexpected exception, panic |
| 1179 | |
| 1180 | - SError : "plat_handle_el3_ea", defaults to panic |
| 1181 | |
| 1182 | Current EL with SP_ELx |
| 1183 | ~~~~~~~~~~~~~~~~~~~~~~ |
| 1184 | |
| 1185 | - Sync exception : Unexpected exception, panic |
| 1186 | |
| 1187 | - IRQ/FIQ : Unexpected exception, panic |
| 1188 | |
- SError : "plat_handle_el3_ea", except for the special handling of a lower EL's
  SError exception which gets triggered in EL3 when PSTATE.A is unmasked. This
  special handling is only applicable when the lower EL's EA is routed to EL3
  (FFH_SUPPORT=1).
| 1192 | |
| 1193 | Lower EL Exceptions |
| 1194 | ~~~~~~~~~~~~~~~~~~~ |
| 1195 | |
| 1196 | Applies to all the exceptions in both AArch64/AArch32 mode of lower EL. |
| 1197 | |
Before handling any lower EL exception, we synchronize the errors at EL3 entry to
ensure that any errors pertaining to the lower EL are isolated and identified.
If we continue without identifying these errors early on, they will trigger in
EL3 (as an SError from the current EL) any time after PSTATE.A is unmasked. This
is wrong because the error originated in the lower EL but the exception is taken
in EL3.
| 1203 | |
To solve this problem, synchronize the errors at EL3 entry and check for any
pending errors (async EA). If there is no pending error, continue with the
original exception. If there is a pending error, handle it based on the routing
model of EAs. Refer to
:ref:`Reliability, Availability, and Serviceability (RAS) Extensions` for details
about routing models.
| 1209 | |
| 1210 | - KFH : Reflect it back to lower EL using **reflect_pending_async_ea_to_lower_el()** |
| 1211 | |
- FFH : Handle the synchronized error first using **handle_pending_async_ea()**,
  then continue with the original exception. This is the only scenario where EL3
  is capable of nested exception handling.
| 1215 | |
After synchronizing and handling lower EL SErrors, unmask EA (PSTATE.A) to ensure
that any further EAs caused by EL3 are caught.
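The decision taken at EL3 entry can be summarised in a short sketch. The actual
logic is implemented in assembly in the BL31 exception vectors; the C function
and enum names here are hypothetical and only model the three outcomes
described above.

```c
#include <stdbool.h>

/* Routing model for External Aborts, as selected at build time. */
enum example_ea_routing {
    EXAMPLE_EA_KFH,  /* Kernel First Handling */
    EXAMPLE_EA_FFH   /* Firmware First Handling */
};

/* What EL3 does after synchronizing errors on entry from a lower EL. */
enum example_ea_action {
    EXAMPLE_CONTINUE_ORIGINAL,     /* no pending error */
    EXAMPLE_REFLECT_TO_LOWER_EL,   /* KFH: hand the error back */
    EXAMPLE_HANDLE_THEN_CONTINUE   /* FFH: nested handling in EL3 */
};

static enum example_ea_action
example_lower_el_entry(bool pending_async_ea, enum example_ea_routing routing)
{
    if (!pending_async_ea) {
        return EXAMPLE_CONTINUE_ORIGINAL;
    }
    if (routing == EXAMPLE_EA_KFH) {
        return EXAMPLE_REFLECT_TO_LOWER_EL;
    }
    return EXAMPLE_HANDLE_THEN_CONTINUE;
}
```

In every case PSTATE.A is unmasked only after this decision has been made, so
that an SError taken later can be attributed to EL3 itself.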
| 1218 | |
Jeenu Viswambharan | b60420a | 2017-08-24 15:43:44 +0100 | [diff] [blame] | 1219 | Crash Reporting in BL31 |
| 1220 | ----------------------- |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1221 | |
BL31 implements a scheme for reporting the processor state when an unhandled
exception is encountered. The reporting mechanism attempts to preserve all the
register contents and report them via a dedicated UART (PL011 console). BL31
reports the general purpose, EL3, Secure EL1 and some EL2 state registers.
| 1226 | |
| 1227 | A dedicated per-CPU crash stack is maintained by BL31 and this is retrieved via |
| 1228 | the per-CPU pointer cache. The implementation attempts to minimise the memory |
| 1229 | required for this feature. The file ``crash_reporting.S`` contains the |
| 1230 | implementation for crash reporting. |
| 1231 | |
| 1232 | The sample crash output is shown below. |
| 1233 | |
| 1234 | :: |
| 1235 | |
Alexei Fedorov | 813c9f9 | 2020-03-03 13:31:58 +0000 | [diff] [blame] | 1236 | x0 = 0x000000002a4a0000 |
| 1237 | x1 = 0x0000000000000001 |
| 1238 | x2 = 0x0000000000000002 |
| 1239 | x3 = 0x0000000000000003 |
| 1240 | x4 = 0x0000000000000004 |
| 1241 | x5 = 0x0000000000000005 |
| 1242 | x6 = 0x0000000000000006 |
| 1243 | x7 = 0x0000000000000007 |
| 1244 | x8 = 0x0000000000000008 |
| 1245 | x9 = 0x0000000000000009 |
| 1246 | x10 = 0x0000000000000010 |
| 1247 | x11 = 0x0000000000000011 |
| 1248 | x12 = 0x0000000000000012 |
| 1249 | x13 = 0x0000000000000013 |
| 1250 | x14 = 0x0000000000000014 |
| 1251 | x15 = 0x0000000000000015 |
| 1252 | x16 = 0x0000000000000016 |
| 1253 | x17 = 0x0000000000000017 |
| 1254 | x18 = 0x0000000000000018 |
| 1255 | x19 = 0x0000000000000019 |
| 1256 | x20 = 0x0000000000000020 |
| 1257 | x21 = 0x0000000000000021 |
| 1258 | x22 = 0x0000000000000022 |
| 1259 | x23 = 0x0000000000000023 |
| 1260 | x24 = 0x0000000000000024 |
| 1261 | x25 = 0x0000000000000025 |
| 1262 | x26 = 0x0000000000000026 |
| 1263 | x27 = 0x0000000000000027 |
| 1264 | x28 = 0x0000000000000028 |
| 1265 | x29 = 0x0000000000000029 |
| 1266 | x30 = 0x0000000088000b78 |
| 1267 | scr_el3 = 0x000000000003073d |
| 1268 | sctlr_el3 = 0x00000000b0cd183f |
| 1269 | cptr_el3 = 0x0000000000000000 |
| 1270 | tcr_el3 = 0x000000008080351c |
| 1271 | daif = 0x00000000000002c0 |
| 1272 | mair_el3 = 0x00000000004404ff |
| 1273 | spsr_el3 = 0x0000000060000349 |
| 1274 | elr_el3 = 0x0000000088000114 |
| 1275 | ttbr0_el3 = 0x0000000004018201 |
| 1276 | esr_el3 = 0x00000000be000000 |
| 1277 | far_el3 = 0x0000000000000000 |
| 1278 | spsr_el1 = 0x0000000000000000 |
| 1279 | elr_el1 = 0x0000000000000000 |
| 1280 | spsr_abt = 0x0000000000000000 |
| 1281 | spsr_und = 0x0000000000000000 |
| 1282 | spsr_irq = 0x0000000000000000 |
| 1283 | spsr_fiq = 0x0000000000000000 |
| 1284 | sctlr_el1 = 0x0000000030d00800 |
| 1285 | actlr_el1 = 0x0000000000000000 |
| 1286 | cpacr_el1 = 0x0000000000000000 |
| 1287 | csselr_el1 = 0x0000000000000000 |
| 1288 | sp_el1 = 0x0000000000000000 |
| 1289 | esr_el1 = 0x0000000000000000 |
| 1290 | ttbr0_el1 = 0x0000000000000000 |
| 1291 | ttbr1_el1 = 0x0000000000000000 |
| 1292 | mair_el1 = 0x0000000000000000 |
| 1293 | amair_el1 = 0x0000000000000000 |
| 1294 | tcr_el1 = 0x0000000000000000 |
| 1295 | tpidr_el1 = 0x0000000000000000 |
| 1296 | tpidr_el0 = 0x0000000000000000 |
| 1297 | tpidrro_el0 = 0x0000000000000000 |
| 1298 | par_el1 = 0x0000000000000000 |
| 1299 | mpidr_el1 = 0x0000000080000000 |
| 1300 | afsr0_el1 = 0x0000000000000000 |
| 1301 | afsr1_el1 = 0x0000000000000000 |
| 1302 | contextidr_el1 = 0x0000000000000000 |
| 1303 | vbar_el1 = 0x0000000000000000 |
| 1304 | cntp_ctl_el0 = 0x0000000000000000 |
| 1305 | cntp_cval_el0 = 0x0000000000000000 |
| 1306 | cntv_ctl_el0 = 0x0000000000000000 |
| 1307 | cntv_cval_el0 = 0x0000000000000000 |
| 1308 | cntkctl_el1 = 0x0000000000000000 |
| 1309 | sp_el0 = 0x0000000004014940 |
| 1310 | isr_el1 = 0x0000000000000000 |
| 1311 | dacr32_el2 = 0x0000000000000000 |
| 1312 | ifsr32_el2 = 0x0000000000000000 |
| 1313 | icc_hppir0_el1 = 0x00000000000003ff |
| 1314 | icc_hppir1_el1 = 0x00000000000003ff |
| 1315 | icc_ctlr_el3 = 0x0000000000080400 |
| 1316 | gicd_ispendr regs (Offsets 0x200-0x278) |
| 1317 | Offset Value |
| 1318 | 0x200: 0x0000000000000000 |
| 1319 | 0x208: 0x0000000000000000 |
| 1320 | 0x210: 0x0000000000000000 |
| 1321 | 0x218: 0x0000000000000000 |
| 1322 | 0x220: 0x0000000000000000 |
| 1323 | 0x228: 0x0000000000000000 |
| 1324 | 0x230: 0x0000000000000000 |
| 1325 | 0x238: 0x0000000000000000 |
| 1326 | 0x240: 0x0000000000000000 |
| 1327 | 0x248: 0x0000000000000000 |
| 1328 | 0x250: 0x0000000000000000 |
| 1329 | 0x258: 0x0000000000000000 |
| 1330 | 0x260: 0x0000000000000000 |
| 1331 | 0x268: 0x0000000000000000 |
| 1332 | 0x270: 0x0000000000000000 |
| 1333 | 0x278: 0x0000000000000000 |
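When reading such a dump, the exception class (EC) field in bits [31:26] of
``esr_el3`` is usually the first thing to decode. The sketch below extracts it;
the macro and function names are chosen for this example and are not taken from
the TF-A sources.

```c
#include <stdint.h>

/* EC occupies bits [31:26] of ESR_ELx. */
#define EXAMPLE_ESR_EC_SHIFT   26u
#define EXAMPLE_ESR_EC_MASK    0x3fu

/* Architectural EC value for an SError interrupt. */
#define EXAMPLE_ESR_EC_SERROR  0x2fu

static unsigned int example_esr_ec(uint64_t esr)
{
    return (unsigned int)((esr >> EXAMPLE_ESR_EC_SHIFT) & EXAMPLE_ESR_EC_MASK);
}
```

Applied to the sample above, ``esr_el3 = 0xbe000000`` decodes to EC 0x2f, which
is the encoding for an SError interrupt.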
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1334 | |
| 1335 | Guidelines for Reset Handlers |
| 1336 | ----------------------------- |
| 1337 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 1338 | TF-A implements a framework that allows CPU and platform ports to perform |
| 1339 | actions very early after a CPU is released from reset in both the cold and warm |
| 1340 | boot paths. This is done by calling the ``reset_handler()`` function in both |
| 1341 | the BL1 and BL31 images. It in turn calls the platform and CPU specific reset |
| 1342 | handling functions. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1343 | |
| 1344 | Details for implementing a CPU specific reset handler can be found in |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1345 | :ref:`firmware_design_cpu_specific_reset_handling`. Details for implementing a |
platform specific reset handler can be found in the :ref:`Porting Guide` (see
the ``plat_reset_handler()`` function).
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1348 | |
| 1349 | When adding functionality to a reset handler, keep in mind that if a different |
| 1350 | reset handling behavior is required between the first and the subsequent |
| 1351 | invocations of the reset handling code, this should be detected at runtime. |
| 1352 | In other words, the reset handler should be able to detect whether an action has |
already been performed and act as appropriate. Possible courses of action
include skipping the action the second time, or undoing and redoing it.
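A minimal sketch of the "skip the second time" approach is given below. It is
written in C for readability only: a real reset handler runs before a stack is
available and would be written in assembly. The control register and the
``EXAMPLE_CTRL_FEATURE_EN`` bit are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical enable bit in a hypothetical control register. */
#define EXAMPLE_CTRL_FEATURE_EN  (1u << 0)

/*
 * Enable the feature only if an earlier invocation has not already done so.
 * Returns true when the action was performed by this call.
 */
static bool example_enable_feature_once(volatile uint32_t *ctrl)
{
    if ((*ctrl & EXAMPLE_CTRL_FEATURE_EN) != 0u) {
        return false;  /* already done by an earlier invocation: skip */
    }
    *ctrl |= EXAMPLE_CTRL_FEATURE_EN;
    return true;
}
```

The state is read back from the hardware at runtime instead of being assumed
from the boot stage, which is what lets the same code run in both BL1 and BL31.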
| 1355 | |
Madhukar Pappireddy | 86350ae | 2020-07-29 09:37:25 -0500 | [diff] [blame] | 1356 | .. _configuring-secure-interrupts: |
| 1357 | |
Jeenu Viswambharan | aeb267c | 2017-09-22 08:32:09 +0100 | [diff] [blame] | 1358 | Configuring secure interrupts |
| 1359 | ----------------------------- |
| 1360 | |
| 1361 | The GIC driver is responsible for performing initial configuration of secure |
| 1362 | interrupts on the platform. To this end, the platform is expected to provide the |
| 1363 | GIC driver (either GICv2 or GICv3, as selected by the platform) with the |
| 1364 | interrupt configuration during the driver initialisation. |
| 1365 | |
Secure interrupt configurations are specified in an array of secure interrupt
| 1367 | properties. In this scheme, in both GICv2 and GICv3 driver data structures, the |
| 1368 | ``interrupt_props`` member points to an array of interrupt properties. Each |
Antonio Nino Diaz | 56b68ad | 2019-02-28 13:35:21 +0000 | [diff] [blame] | 1369 | element of the array specifies the interrupt number and its attributes |
| 1370 | (priority, group, configuration). Each element of the array shall be populated |
| 1371 | by the macro ``INTR_PROP_DESC()``. The macro takes the following arguments: |
Jeenu Viswambharan | aeb267c | 2017-09-22 08:32:09 +0100 | [diff] [blame] | 1372 | |
Ming Huang | 1bea7aa | 2023-02-01 14:03:44 +0800 | [diff] [blame] | 1373 | - 13-bit interrupt number, |
Jeenu Viswambharan | aeb267c | 2017-09-22 08:32:09 +0100 | [diff] [blame] | 1374 | |
Antonio Nino Diaz | 29b9f5b | 2018-09-24 17:23:24 +0100 | [diff] [blame] | 1375 | - 8-bit interrupt priority, |
Jeenu Viswambharan | aeb267c | 2017-09-22 08:32:09 +0100 | [diff] [blame] | 1376 | |
Antonio Nino Diaz | 29b9f5b | 2018-09-24 17:23:24 +0100 | [diff] [blame] | 1377 | - Interrupt type (one of ``INTR_TYPE_EL3``, ``INTR_TYPE_S_EL1``, |
| 1378 | ``INTR_TYPE_NS``), |
Jeenu Viswambharan | aeb267c | 2017-09-22 08:32:09 +0100 | [diff] [blame] | 1379 | |
Antonio Nino Diaz | 29b9f5b | 2018-09-24 17:23:24 +0100 | [diff] [blame] | 1380 | - Interrupt configuration (either ``GIC_INTR_CFG_LEVEL`` or |
| 1381 | ``GIC_INTR_CFG_EDGE``). |
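A platform array built with ``INTR_PROP_DESC()`` might look like the sketch
below. To keep the example self-contained, the structure, the macro and the
constants are redefined locally as stand-ins; their field names and numeric
values are assumptions, and the interrupt numbers and priorities are made up.
A platform port would use the definitions from the TF-A headers instead.

```c
#include <stdint.h>

/* Local stand-ins for the TF-A constants (values assumed for this sketch). */
#define INTR_TYPE_S_EL1     0u
#define INTR_TYPE_EL3       1u
#define INTR_TYPE_NS        2u
#define GIC_INTR_CFG_LEVEL  0u
#define GIC_INTR_CFG_EDGE   1u

/* Stand-in for the interrupt property descriptor. */
typedef struct interrupt_prop {
    unsigned int intr_num : 13;  /* 13-bit interrupt number */
    unsigned int intr_pri : 8;   /* 8-bit priority */
    unsigned int intr_grp : 2;   /* interrupt type */
    unsigned int intr_cfg : 2;   /* level or edge */
} interrupt_prop_t;

/* Stand-in for the macro: arguments in the order listed above. */
#define INTR_PROP_DESC(num, pri, grp, cfg) \
    { .intr_num = (num), .intr_pri = (pri), .intr_grp = (grp), .intr_cfg = (cfg) }

/* Hypothetical platform array, referenced by ``interrupt_props``. */
static const interrupt_prop_t example_interrupt_props[] = {
    INTR_PROP_DESC(29u, 0x10u, INTR_TYPE_S_EL1, GIC_INTR_CFG_LEVEL),
    INTR_PROP_DESC(8u, 0x20u, INTR_TYPE_EL3, GIC_INTR_CFG_EDGE),
};
```

Keeping every entry behind the macro means the platform does not depend on the
field layout of the descriptor.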
Jeenu Viswambharan | aeb267c | 2017-09-22 08:32:09 +0100 | [diff] [blame] | 1382 | |
Paul Beesley | f864067 | 2019-04-12 14:19:42 +0100 | [diff] [blame] | 1383 | .. _firmware_design_cpu_ops_fwk: |
| 1384 | |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1385 | CPU specific operations framework |
| 1386 | --------------------------------- |
| 1387 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 1388 | Certain aspects of the Armv8-A architecture are implementation defined, |
| 1389 | that is, certain behaviours are not architecturally defined, but must be |
| 1390 | defined and documented by individual processor implementations. TF-A |
| 1391 | implements a framework which categorises the common implementation defined |
| 1392 | behaviours and allows a processor to export its implementation of that |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1393 | behaviour. The categories are: |
| 1394 | |
| 1395 | #. Processor specific reset sequence. |
| 1396 | |
| 1397 | #. Processor specific power down sequences. |
| 1398 | |
| 1399 | #. Processor specific register dumping as a part of crash reporting. |
| 1400 | |
| 1401 | #. Errata status reporting. |
| 1402 | |
| 1403 | Each of the above categories fulfils a different requirement. |
| 1404 | |
| 1405 | #. allows any processor specific initialization before the caches and MMU |
| 1406 | are turned on, like implementation of errata workarounds, entry into |
| 1407 | the intra-cluster coherency domain etc. |
| 1408 | |
| 1409 | #. allows each processor to implement the power down sequence mandated in |
| 1410 | its Technical Reference Manual (TRM). |
| 1411 | |
| 1412 | #. allows a processor to provide additional information to the developer |
| 1413 | in the event of a crash, for example Cortex-A53 has registers which |
| 1414 | can expose the data cache contents. |
| 1415 | |
| 1416 | #. allows a processor to define a function that inspects and reports the status |
| 1417 | of all errata workarounds on that processor. |
| 1418 | |
| 1419 | Please note that only 2. is mandated by the TRM. |
| 1420 | |
| 1421 | The CPU specific operations framework scales to accommodate a large number of |
| 1422 | different CPUs during power down and reset handling. The platform can specify |
| 1423 | any CPU optimization it wants to enable for each CPU. It can also specify |
| 1424 | the CPU errata workarounds to be applied for each CPU type during reset |
| 1425 | handling by defining CPU errata compile time macros. Details on these macros |
Paul Beesley | f864067 | 2019-04-12 14:19:42 +0100 | [diff] [blame] | 1426 | can be found in the :ref:`Arm CPU Specific Build Macros` document. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1427 | |
| 1428 | The CPU specific operations framework depends on the ``cpu_ops`` structure which |
| 1429 | needs to be exported for each type of CPU in the platform. It is defined in |
``include/lib/cpus/aarch64/cpu_macros.S`` and has the following fields: ``midr``,
| 1431 | ``reset_func()``, ``cpu_pwr_down_ops`` (array of power down functions) and |
| 1432 | ``cpu_reg_dump()``. |
| 1433 | |
| 1434 | The CPU specific files in ``lib/cpus`` export a ``cpu_ops`` data structure with |
| 1435 | suitable handlers for that CPU. For example, ``lib/cpus/aarch64/cortex_a53.S`` |
| 1436 | exports the ``cpu_ops`` for Cortex-A53 CPU. According to the platform |
| 1437 | configuration, these CPU specific files must be included in the build by |
| 1438 | the platform makefile. The generic CPU specific operations framework code exists |
| 1439 | in ``lib/cpus/aarch64/cpu_helpers.S``. |
| 1440 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1441 | CPU PCS |
| 1442 | ~~~~~~~ |
| 1443 | |
All assembly functions in CPU files are asked to follow a modified version of
the Procedure Call Standard (PCS) in their internals. This ensures that
calling these functions from outside the file doesn't unexpectedly corrupt
registers in the very early environment, and makes the internals easier
to understand. Please see the :ref:`firmware_design_cpu_errata_implementation`
for any function specific restrictions.
| 1450 | |
+--------------+---------------------------------+
| register     | use                             |
+==============+=================================+
| x0 - x15     | scratch                         |
+--------------+---------------------------------+
| x16, x17     | do not use (used by the linker) |
+--------------+---------------------------------+
| x18          | do not use (platform register)  |
+--------------+---------------------------------+
| x19 - x28    | callee saved                    |
+--------------+---------------------------------+
| x29, x30     | FP, LR                          |
+--------------+---------------------------------+
| 1464 | |
| 1465 | .. _firmware_design_cpu_specific_reset_handling: |
| 1466 | |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1467 | CPU specific Reset Handling |
| 1468 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 1469 | |
| 1470 | After a reset, the state of the CPU when it calls generic reset handler is: |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1471 | MMU turned off, both instruction and data caches turned off, not part |
| 1472 | of any coherency domain and no stack. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1473 | |
The BL entrypoint code first invokes ``plat_reset_handler()`` to allow
the platform to perform any system initialization required and any system
errata workarounds that need to be applied. The ``get_cpu_ops_ptr()`` function
reads the current CPU midr, finds the matching ``cpu_ops`` entry in the
``cpu_ops`` array and returns it. Note that only the part number and implementer
fields in midr are used to find the matching ``cpu_ops`` entry. The
``reset_func()`` in the returned ``cpu_ops`` is then invoked, which executes the
required reset handling for that CPU and also any errata workarounds enabled by
the platform.
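The matching rule can be illustrated in C, although the real lookup is written
in assembly. The table, structure and function names below are hypothetical.
The mask keeps the MIDR implementer field (bits [31:24]) and part number field
(bits [15:4]) and ignores variant and revision.

```c
#include <stddef.h>
#include <stdint.h>

/* Implementer [31:24] and part number [15:4]; variant/revision are ignored. */
#define EXAMPLE_MIDR_MATCH_MASK  0xff00fff0u

/* Hypothetical, reduced view of a cpu_ops entry. */
struct example_cpu_ops {
    uint32_t midr;
    const char *name;
};

static const struct example_cpu_ops example_cpu_ops_table[] = {
    { 0x410fd030u, "cortex-a53" },  /* Arm (0x41), part 0xd03 */
    { 0x410fd070u, "cortex-a57" },  /* Arm (0x41), part 0xd07 */
};

/* Return the entry matching the given MIDR value, or NULL if none does. */
static const struct example_cpu_ops *example_get_cpu_ops(uint32_t midr)
{
    size_t count = sizeof(example_cpu_ops_table) /
                   sizeof(example_cpu_ops_table[0]);

    for (size_t i = 0; i < count; i++) {
        uint32_t diff = midr ^ example_cpu_ops_table[i].midr;

        if ((diff & EXAMPLE_MIDR_MATCH_MASK) == 0u) {
            return &example_cpu_ops_table[i];
        }
    }
    return NULL;
}
```

Because variant and revision are masked out, a single ``cpu_ops`` entry serves
every revision of a core; revision-specific behaviour such as errata
workarounds is then decided inside the reset function.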
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1482 | |
The ``reset_func()`` should be defined using the ``cpu_reset_func_{start,end}``
macros and its body may only clobber x0 to x14, with x14 being the cpu_rev
parameter.
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1485 | |
| 1486 | CPU specific power down sequence |
| 1487 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 1488 | |
| 1489 | During the BL31 initialization sequence, the pointer to the matching ``cpu_ops`` |
| 1490 | entry is stored in per-CPU data by ``init_cpu_ops()`` so that it can be quickly |
| 1491 | retrieved during power down sequences. |
| 1492 | |
| 1493 | Various CPU drivers register handlers to perform power down at certain power |
| 1494 | levels for that specific CPU. The PSCI service, upon receiving a power down |
| 1495 | request, determines the highest power level at which to execute power down |
| 1496 | sequence for a particular CPU. It uses the ``prepare_cpu_pwr_dwn()`` function to |
pick the right power down handler for the requested level. The function
retrieves the ``cpu_ops`` pointer member of the per-CPU data, and from that,
further retrieves the ``cpu_pwr_down_ops`` array, and indexes into the required
level. If the requested power level is higher than what a CPU driver supports,
the handler registered for the highest level is invoked.
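The selection rule, including the fallback to the highest registered level, can
be sketched as below. The real ``prepare_cpu_pwr_dwn()`` is implemented in
assembly; the types and names here are hypothetical and only model the indexing
behaviour.

```c
#include <stddef.h>

typedef void (*example_pwr_dwn_fn)(void);

/* Placeholder handlers standing in for a CPU driver's power down functions. */
static void example_core_pwr_dwn(void)
{
}

static void example_cluster_pwr_dwn(void)
{
}

/*
 * Pick the handler for the requested power level. If the level is higher
 * than the CPU driver supports, fall back to the highest registered one.
 */
static example_pwr_dwn_fn example_pick_pwr_dwn(const example_pwr_dwn_fn *ops,
                                               unsigned int num_ops,
                                               unsigned int power_level)
{
    unsigned int idx;

    if (ops == NULL || num_ops == 0u) {
        return NULL;
    }
    idx = (power_level < num_ops) ? power_level : (num_ops - 1u);
    return ops[idx];
}
```

With a two-entry array (core, cluster), a request for level 0 selects the core
handler, while requests for level 1 and above select the cluster handler.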
| 1502 | |
| 1503 | At runtime the platform hooks for power down are invoked by the PSCI service to |
| 1504 | perform platform specific operations during a power down sequence, for example |
| 1505 | turning off CCI coherency during a cluster power down. |
| 1506 | |
| 1507 | CPU specific register reporting during crash |
| 1508 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 1509 | |
If crash reporting is enabled in BL31, when a crash occurs, the crash
reporting framework calls ``do_cpu_reg_dump`` which retrieves the matching
``cpu_ops`` using the ``get_cpu_ops_ptr()`` function. The ``cpu_reg_dump()`` in
``cpu_ops`` is invoked, which then returns the CPU specific register values to
be reported and a pointer to the ASCII list of register names in a format
expected by the crash reporting framework.
| 1516 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1517 | .. _firmware_design_cpu_errata_implementation: |
Paul Beesley | f864067 | 2019-04-12 14:19:42 +0100 | [diff] [blame] | 1518 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1519 | CPU errata implementation |
| 1520 | ~~~~~~~~~~~~~~~~~~~~~~~~~ |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1521 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 1522 | Errata workarounds for CPUs supported in TF-A are applied during both cold and |
| 1523 | warm boots, shortly after reset. Individual Errata workarounds are enabled as |
| 1524 | build options. Some errata workarounds have potential run-time implications; |
| 1525 | therefore some are enabled by default, others not. Platform ports shall |
| 1526 | override build options to enable or disable errata as appropriate. The CPU |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1527 | drivers take care of applying errata workarounds that are enabled and applicable |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1528 | to a given CPU. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1529 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1530 | Each erratum has a build flag in ``lib/cpus/cpu-ops.mk`` of the form: |
| 1531 | ``ERRATA_<cpu_num>_<erratum_id>``. It also has a short description in |
| 1532 | :ref:`arm_cpu_macros_errata_workarounds` on when it should apply. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1533 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1534 | Errata framework |
| 1535 | ^^^^^^^^^^^^^^^^ |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1536 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1537 | The errata framework is a convention and a small library to allow errata to be |
| 1538 | automatically discovered. It enables compliant errata to be automatically |
| 1539 | applied and reported at runtime (either by status reporting or the errata ABI). |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1540 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1541 | To write a compliant mitigation for erratum number ``erratum_id`` on a cpu that |
| 1542 | declared itself (with ``declare_cpu_ops``) as ``cpu_name`` one needs 3 things: |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1543 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1544 | #. A CPU revision checker function: ``check_erratum_<cpu_name>_<erratum_id>`` |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1545 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1546 | It should check whether this erratum applies on this revision of this CPU. |
| 1547 | It will be called with the CPU revision as its first parameter (x0) and |
| 1548 | should return one of ``ERRATA_APPLIES`` or ``ERRATA_NOT_APPLIES``. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1549 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1550 | It may only clobber x0 to x4. The rest should be treated as callee-saved. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1551 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1552 | #. A workaround function: ``erratum_<cpu_name>_<erratum_id>_wa`` |
| 1553 | |
| 1554 | It should obtain the cpu revision (with ``cpu_get_rev_var``), call its |
| 1555 | revision checker, and perform the mitigation, should the erratum apply. |
| 1556 | |
| 1557 | It may only clobber x0 to x8. The rest should be treated as callee-saved. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1558 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1559 | #. Register itself to the framework |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1560 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1561 | Do this with |
| 1562 | ``add_erratum_entry <cpu_name>, ERRATUM(<erratum_id>), <errata_flag>`` |
| 1563 | where the ``errata_flag`` is the enable flag in ``cpu-ops.mk`` described |
| 1564 | above. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1565 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1566 | See the next section on how to do this easily. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1567 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1568 | .. note:: |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1569 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1570 | CVEs have the format ``CVE_<year>_<number>``. To fit them in the framework, the |
| 1571 | ``erratum_id`` for the checker and the workaround functions become the |
| 1572 | ``number`` part of its name and the ``ERRATUM(<number>)`` part of the |
| 1573 | registration should instead be ``CVE(<year>, <number>)``. In the extremely |
| 1574 | unlikely scenario where a CVE and an erratum numbers clash, the CVE number |
| 1575 | should be prefixed with a zero. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 1576 | |
Boyan Karatotev | d71b5d7 | 2023-02-07 15:46:50 +0000 | [diff] [blame] | 1577 | Also, their build flag should be ``WORKAROUND_CVE_<year>_<number>``. |
| 1578 | |
| 1579 | .. note:: |
| 1580 | |
| 1581 | AArch32 uses the legacy convention. The checker function has the format |
| 1582 | ``check_errata_<erratum_id>`` and the workaround has the format |
| 1583 | ``errata_<cpu_number>_<erratum_id>_wa`` where ``cpu_number`` is the shortform |
| 1584 | letter and number name of the CPU. |
| 1585 | |
| 1586 | For CVEs the ``erratum_id`` also becomes ``cve_<year>_<number>``. |
| 1587 | |
| 1588 | Errata framework helpers |
| 1589 | ^^^^^^^^^^^^^^^^^^^^^^^^ |
| 1590 | |
| 1591 | Writing these errata involves lots of boilerplate and repetitive code. On |
| 1592 | AArch64 there are helpers to omit most of this. They are located in |
| 1593 | ``include/lib/cpus/aarch64/cpu_macros.S`` and the preferred way to implement |
| 1594 | errata. Please see their comments on how to use them. |
| 1595 | |
| 1596 | The most common type of erratum workaround, one that just sets a "chicken" bit |
| 1597 | in some arbitrary register, would have an implementation for the Cortex-A77, |
| 1598 | erratum #1925769 like:: |
| 1599 | |
| 1600 | workaround_reset_start cortex_a77, ERRATUM(1925769), ERRATA_A77_1925769 |
| 1601 | sysreg_bit_set CORTEX_A77_CPUECTLR_EL1, CORTEX_A77_CPUECTLR_EL1_BIT_8 |
| 1602 | workaround_reset_end cortex_a77, ERRATUM(1925769) |
| 1603 | |
| 1604 | check_erratum_ls cortex_a77, ERRATUM(1925769), CPU_REV(1, 1) |
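
To make the intent of the ``check_erratum_ls`` line concrete, the following C
sketch models a "lower or same" revision check. The real checker is generated
by assembly macros; the numeric values of ``ERRATA_APPLIES`` and
``ERRATA_NOT_APPLIES``, the bit packing of ``CPU_REV`` and the ``_sketch``
function name are assumptions made for illustration.

```c
#include <assert.h>

/* Assumed return values, named after the constants used in the text. */
#define ERRATA_NOT_APPLIES 0
#define ERRATA_APPLIES     1

/* Assumed packing: major revision rN in bits [7:4], minor pM in bits [3:0]. */
#define CPU_REV(r, p) ((((unsigned int)(r)) << 4) | ((unsigned int)(p)))

/* "Lower or same": the erratum applies up to and including rev_limit. */
int check_erratum_ls_sketch(unsigned int cpu_rev, unsigned int rev_limit)
{
    assert(rev_limit <= 0xffU); /* both fields must fit in one byte */
    return (cpu_rev <= rev_limit) ? ERRATA_APPLIES : ERRATA_NOT_APPLIES;
}

/* Hypothetical C equivalent of the checker for Cortex-A77 erratum 1925769. */
int check_erratum_cortex_a77_1925769(unsigned int cpu_rev)
{
    return check_erratum_ls_sketch(cpu_rev, CPU_REV(1, 1));
}
```

Under these assumptions the checker reports that the erratum applies on r0p0,
r1p0 and r1p1, and that it does not apply on r1p2 or any later revision.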

Status reporting
^^^^^^^^^^^^^^^^

In a debug build of TF-A, on a CPU that comes out of reset, both BL1 and the
runtime firmware (BL31 in AArch64, and BL32 in AArch32) will invoke a generic
errata status reporting function. It will read the ``errata_entries`` list of
that CPU and will report whether each known erratum was applied and, if not,
whether it should have been.

Reporting the status of errata workarounds is for informational purposes only; it
has no functional significance.
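
The reporting logic can be pictured as a walk over a per-CPU list of entries.
The sketch below is not the TF-A implementation: the entry layout, the status
names and the second erratum number in the usage note are hypothetical and
exist only to show the three possible outcomes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical entry layout; the real ``errata_entries`` format may differ. */
struct erratum_entry_sketch {
    unsigned int id;                      /* erratum number */
    int enabled;                          /* non-zero if the build flag is set */
    int (*applies)(unsigned int cpu_rev); /* revision checker */
};

enum erratum_status { ERRATUM_NOT_APPLICABLE, ERRATUM_APPLIED, ERRATUM_MISSING };

/* Classify one entry for the given CPU revision. */
enum erratum_status erratum_status(const struct erratum_entry_sketch *e,
                                   unsigned int cpu_rev)
{
    assert(e != NULL && e->applies != NULL);

    if (!e->applies(cpu_rev))
        return ERRATUM_NOT_APPLICABLE;

    /* The erratum applies: it is either mitigated or should have been. */
    return e->enabled ? ERRATUM_APPLIED : ERRATUM_MISSING;
}

/* Print one line per entry and return how many workarounds are missing. */
unsigned int report_errata(const struct erratum_entry_sketch *list, size_t n,
                           unsigned int cpu_rev)
{
    static const char *const names[] = { "not applicable", "applied", "missing" };
    unsigned int missing = 0U;

    for (size_t i = 0; i < n; i++) {
        enum erratum_status s = erratum_status(&list[i], cpu_rev);

        printf("erratum %u: %s\n", list[i].id, names[s]);
        if (s == ERRATUM_MISSING)
            missing++;
    }
    return missing;
}

/* Example checker: applies up to r1p1, assuming an (rN << 4) | pM encoding. */
int applies_up_to_r1p1(unsigned int cpu_rev)
{
    return cpu_rev <= 0x11U;
}
```

A list holding erratum 1925769 with its flag enabled and a made-up erratum
1234567 with its flag disabled, both using ``applies_up_to_r1p1``, would report
the first as applied and the second as missing on an r1p0 CPU.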

Memory layout of BL images
--------------------------

Each bootloader image can be divided into two parts:

- the static contents of the image. These are data actually stored in the
  binary on the disk. In the ELF terminology, they are called ``PROGBITS``
  sections;

- the run-time contents of the image. These are data that don't occupy any
  space in the binary on the disk. The ELF binary just contains some
  metadata indicating where these data will be stored at run-time and the
  corresponding sections need to be allocated and initialized at run-time.
  In the ELF terminology, they are called ``NOBITS`` sections.

All PROGBITS sections are grouped together at the beginning of the image,
followed by all NOBITS sections. This is true for all TF-A images and it is
governed by the linker scripts. This ensures that the raw binary images are
as small as possible. If a NOBITS section were inserted in between PROGBITS
sections then the resulting binary file would contain zero bytes in place of
this NOBITS section, making the image unnecessarily bigger. Smaller images
allow faster loading from the FIP to the main memory.

For BL31, a platform can specify an alternate location for NOBITS sections
(other than immediately following PROGBITS sections) by setting
``SEPARATE_NOBITS_REGION`` to 1 and defining ``BL31_NOBITS_BASE`` and
``BL31_NOBITS_LIMIT``.

Linker scripts and symbols
~~~~~~~~~~~~~~~~~~~~~~~~~~

Each bootloader stage image layout is described by its own linker script. The
linker scripts export some symbols into the program symbol table. Their values
correspond to particular addresses. TF-A code can refer to these symbols to
figure out the image memory layout.

Linker symbols in TF-A use the following naming convention.

- ``__<SECTION>_START__``

  Start address of a given section named ``<SECTION>``.

- ``__<SECTION>_END__``

  End address of a given section named ``<SECTION>``. If there is an alignment
  constraint on the section's end address then ``__<SECTION>_END__`` corresponds
  to the end address of the section's actual contents, rounded up to the right
  boundary. Refer to the value of ``__<SECTION>_UNALIGNED_END__`` to know the
  actual end address of the section's contents.

- ``__<SECTION>_UNALIGNED_END__``

  End address of a given section named ``<SECTION>`` without any padding or
  rounding up due to some alignment constraint.

- ``__<SECTION>_SIZE__``

  Size (in bytes) of a given section named ``<SECTION>``. If there is an
  alignment constraint on the section's end address then ``__<SECTION>_SIZE__``
  corresponds to the size of the section's actual contents, rounded up to the
  right boundary. In other words,
  ``__<SECTION>_SIZE__ = __<SECTION>_END__ - __<SECTION>_START__``. Refer to the
  value of ``__<SECTION>_UNALIGNED_SIZE__`` to know the actual size of the
  section's contents.

- ``__<SECTION>_UNALIGNED_SIZE__``

  Size (in bytes) of a given section named ``<SECTION>`` without any padding or
  rounding up due to some alignment constraint. In other words,
  ``__<SECTION>_UNALIGNED_SIZE__ = __<SECTION>_UNALIGNED_END__ - __<SECTION>_START__``.
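
The relationships between these symbols can be checked with a small worked
example. The C sketch below is illustrative only: the structure and function
names are hypothetical, and in a real build the linker computes these values,
not C code.

```c
#include <assert.h>
#include <stdint.h>

/* One field per symbol of the naming convention above. */
struct section_symbols {
    uintptr_t start;          /* __<SECTION>_START__ */
    uintptr_t unaligned_end;  /* __<SECTION>_UNALIGNED_END__ */
    uintptr_t end;            /* __<SECTION>_END__ */
    uintptr_t size;           /* __<SECTION>_SIZE__ */
    uintptr_t unaligned_size; /* __<SECTION>_UNALIGNED_SIZE__ */
};

/* Round value up to the next multiple of align (a power of two). */
static uintptr_t round_up(uintptr_t value, uintptr_t align)
{
    assert(align != 0U && (align & (align - 1U)) == 0U);
    return (value + align - 1U) & ~(align - 1U);
}

/* Derive all five symbols from a start address, a content size and the
 * alignment constraint on the section's end address. */
struct section_symbols derive_section_symbols(uintptr_t start,
                                              uintptr_t content_size,
                                              uintptr_t end_align)
{
    struct section_symbols s;

    s.start = start;
    s.unaligned_end = start + content_size;
    s.end = round_up(s.unaligned_end, end_align);
    s.size = s.end - s.start;
    s.unaligned_size = s.unaligned_end - s.start;
    return s;
}
```

For a section that starts at ``0x04003000``, holds ``0x1234`` bytes and must end
on a 4 KB boundary, this yields an unaligned end of ``0x04004234``, an end of
``0x04005000``, a size of ``0x2000`` and an unaligned size of ``0x1234``.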

Some of the linker symbols are mandatory as TF-A code relies on them to be
defined. They are listed in the following subsections. Some of them must be
provided for each bootloader stage and some are specific to a given bootloader
stage.

The linker scripts define some extra, optional symbols. They are not actually
used by any code but they help in understanding the bootloader images' memory
layout as they are easy to spot in the link map files.

Common linker symbols
^^^^^^^^^^^^^^^^^^^^^

All BL images share the following requirements:

- The BSS section must be zero-initialised before executing any C code.
- The coherent memory section (if enabled) must be zero-initialised as well.
- The MMU setup code needs to know the extents of the coherent and read-only
  memory regions to set the right memory attributes. When
  ``SEPARATE_CODE_AND_RODATA=1``, it needs to know more specifically how the
  read-only memory region is divided between code and data.

The following linker symbols are defined for this purpose:

- ``__BSS_START__``
- ``__BSS_SIZE__``
- ``__COHERENT_RAM_START__`` Must be aligned on a page-size boundary.
- ``__COHERENT_RAM_END__`` Must be aligned on a page-size boundary.
- ``__COHERENT_RAM_UNALIGNED_SIZE__``
- ``__RO_START__``
- ``__RO_END__``
- ``__TEXT_START__``
- ``__TEXT_END_UNALIGNED__``
- ``__TEXT_END__``
- ``__RODATA_START__``
- ``__RODATA_END_UNALIGNED__``
- ``__RODATA_END__``
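
As an illustration of why the MMU setup code needs these extents, the sketch
below classifies an address by the region it falls in, assuming the
``SEPARATE_CODE_AND_RODATA=1`` case. The structure, the enumeration and the
4 KB page size are assumptions for this example; the actual MMU setup code
maps whole regions and does not classify individual addresses.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 0x1000U /* assumed 4 KB translation granule */

enum region_kind { REGION_CODE, REGION_RODATA, REGION_COHERENT, REGION_OTHER };

/* Values a port would take from the linker symbols listed above. */
struct image_layout_sketch {
    uintptr_t text_start;     /* __TEXT_START__ */
    uintptr_t text_end;       /* __TEXT_END__ */
    uintptr_t rodata_start;   /* __RODATA_START__ */
    uintptr_t rodata_end;     /* __RODATA_END__ */
    uintptr_t coherent_start; /* __COHERENT_RAM_START__ */
    uintptr_t coherent_end;   /* __COHERENT_RAM_END__ */
};

/* True if addr lies in the half-open range [start, end). */
static int in_range(uintptr_t addr, uintptr_t start, uintptr_t end)
{
    return addr >= start && addr < end;
}

/* Report which kind of region an address belongs to. */
enum region_kind classify_address(const struct image_layout_sketch *l,
                                  uintptr_t addr)
{
    /* The coherent region bounds must be page aligned, as stated above. */
    assert((l->coherent_start % PAGE_SIZE_SKETCH) == 0U);
    assert((l->coherent_end % PAGE_SIZE_SKETCH) == 0U);

    if (in_range(addr, l->text_start, l->text_end))
        return REGION_CODE;
    if (in_range(addr, l->rodata_start, l->rodata_end))
        return REGION_RODATA;
    if (in_range(addr, l->coherent_start, l->coherent_end))
        return REGION_COHERENT;
    return REGION_OTHER;
}
```

Each kind of region would then receive its own memory attributes, for example
executable read-only for code and non-executable read-only for data.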

BL1's linker symbols
^^^^^^^^^^^^^^^^^^^^

BL1 being the ROM image, it has additional requirements. BL1 resides in ROM and
it is entirely executed in place but it needs some read-write memory for its
mutable data. Its ``.data`` section (i.e. its allocated read-write data) must be
relocated from ROM to RAM before executing any C code.

The following additional linker symbols are defined for BL1:

- ``__BL1_ROM_END__`` End address of BL1's ROM contents, covering its code
  and ``.data`` section in ROM.
- ``__DATA_ROM_START__`` Start address of the ``.data`` section in ROM. Must be
  aligned on a 16-byte boundary.
- ``__DATA_RAM_START__`` Address in RAM where the ``.data`` section should be
  copied over. Must be aligned on a 16-byte boundary.
- ``__DATA_SIZE__`` Size of the ``.data`` section (in ROM or RAM).
- ``__BL1_RAM_START__`` Start address of BL1 read-write data.
- ``__BL1_RAM_END__`` End address of BL1 read-write data.
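
The ``.data`` relocation amounts to an aligned copy from ROM to RAM. The sketch
below expresses it in C for readability only: as stated above, the real copy
has to happen before any C code runs, and the function name and return
convention here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DATA_ALIGNMENT 16U /* alignment required for both addresses */

/*
 * Copy the .data section from its ROM location to its RAM location.
 * The arguments stand in for __DATA_RAM_START__, __DATA_ROM_START__
 * and __DATA_SIZE__. Returns 0 on success, -1 on a misaligned address.
 */
int relocate_data_section(void *ram_start, const void *rom_start, size_t size)
{
    assert(ram_start != NULL && rom_start != NULL);

    if (((uintptr_t)ram_start % DATA_ALIGNMENT) != 0U ||
        ((uintptr_t)rom_start % DATA_ALIGNMENT) != 0U)
        return -1;

    memcpy(ram_start, rom_start, size);
    return 0;
}
```

The alignment check mirrors the 16-byte constraint on ``__DATA_ROM_START__``
and ``__DATA_RAM_START__``, which allows the copy to proceed in 16-byte units.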

How to choose the right base addresses for each bootloader stage image
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There is currently no support for dynamic image loading in TF-A. This means
that all bootloader images need to be linked against their ultimate runtime
locations and the base addresses of each image must be chosen carefully such
that images don't overlap each other in an undesired way. As the code grows,
the base addresses might need adjustments to cope with the new memory layout.

The memory layout is completely specific to the platform and so there is no
general recipe for choosing the right base addresses for each bootloader image.
However, there are tools to aid in understanding the memory layout. These are
the link map files: ``build/<platform>/<build-type>/bl<x>/bl<x>.map``, with ``<x>``
being the stage bootloader. They provide a detailed view of the memory usage of
each image. Among other useful information, they provide the end address of
each image.

- ``bl1.map`` link map file provides ``__BL1_RAM_END__`` address.
- ``bl2.map`` link map file provides ``__BL2_END__`` address.
- ``bl31.map`` link map file provides ``__BL31_END__`` address.
- ``bl32.map`` link map file provides ``__BL32_END__`` address.

For each bootloader image, the platform code must provide its start address
as well as a limit address that it must not overstep. The latter is used in the
linker scripts to check that the image doesn't grow past that address. If that
happens, the linker will issue a message similar to the following:

::

    aarch64-none-elf-ld: BLx has exceeded its limit.

Additionally, if the platform memory layout implies some image overlaying, as
on FVP, BL31 and TSP need to know the limit address that their PROGBITS
sections must not overstep. The platform code must provide those.

TF-A does not provide any mechanism to verify at boot time that the memory
into which a new image is loaded is free, so it cannot prevent a previously
loaded image from being overwritten. The platform must specify the memory
available in the system for all the relevant BL images to be loaded.

For example, in the case of BL1 loading BL2, ``bl1_plat_sec_mem_layout()`` will
return the region defined by the platform where BL1 intends to load BL2. The
``load_image()`` function performs a bounds check on the image size based on the
base and maximum image size provided by the platform. Platforms must take
this behaviour into account when defining the base/size for each of the images.
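
A bounds check of this kind can be sketched as follows. The structure and
function names are hypothetical and the real ``load_image()`` logic may differ;
the point is only that the check is made against the platform-provided base
and maximum size, not against the images that are already loaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the region a platform reserves for an image. */
struct image_region_sketch {
    uintptr_t base;  /* load address chosen by the platform */
    size_t max_size; /* maximum image size chosen by the platform */
};

/* Return true if an image of image_size bytes fits in the region. */
bool image_fits(const struct image_region_sketch *region, size_t image_size)
{
    assert(region != NULL);

    /* The image must not exceed the size reserved by the platform. */
    if (image_size > region->max_size)
        return false;

    /* The end address must not wrap around the address space. */
    if (image_size > UINTPTR_MAX - region->base)
        return false;

    return true;
}
```

Because nothing in such a check knows about other images, two regions that the
platform defines with overlapping ranges would both pass it.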

Memory layout on Arm development platforms
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following list describes the memory layout on the Arm development platforms:

- A 4 KB page of shared memory is used for communication between Trusted
  Firmware and the platform's power controller. This is located at the base of
  Trusted SRAM. The amount of Trusted SRAM available to load the bootloader
  images is reduced by the size of the shared memory.

  The shared memory is used to store the CPUs' entrypoint mailbox. On Juno,
  this is also used for the MHU payload when passing messages to and from the
  SCP.

- Another 4 KB page is reserved for passing memory layout between BL1 and BL2
  and also the dynamic firmware configurations.

- On FVP, BL1 initially resides in the Trusted ROM at address ``0x0``. On
  Juno, BL1 resides in flash memory at address ``0x0BEC0000``. BL1 read-write
  data are relocated to the top of Trusted SRAM at runtime.

- BL2 is loaded below BL1 read-write data.

- EL3 Runtime Software, BL31 for AArch64 and BL32 for AArch32 (e.g. SP_MIN),
  is loaded at the top of the Trusted SRAM, such that its NOBITS sections will
  overwrite BL1 R/W data and BL2. This implies that BL1 global variables
  remain valid only until execution reaches the EL3 Runtime Software entry
  point during a cold boot.

- On Juno, SCP_BL2 is loaded temporarily into the EL3 Runtime Software memory
  region and transferred to the SCP before being overwritten by EL3 Runtime
  Software.

- BL32 (for AArch64) can be loaded in one of the following locations:

  - Trusted SRAM
  - Trusted DRAM (FVP only)
  - Secure region of DRAM (top 16 MB of DRAM configured by the TrustZone
    controller)

  When BL32 (for AArch64) is loaded into Trusted SRAM, it is loaded below
  BL31.

The location of the BL32 image will result in different memory maps. This is
illustrated for both FVP and Juno in the following diagrams, using the TSP as
an example.

.. note::
   Loading the BL32 image in TZC secured DRAM doesn't change the memory
   layout of the other images in Trusted SRAM.

The CONFIG section in the memory layouts shown below contains:

::

    +--------------------+
    |bl2_mem_params_descs|
    |--------------------|
    |     fw_configs     |
    +--------------------+

``bl2_mem_params_descs`` contains parameters passed from BL2 to the next
BL image during boot.

``fw_configs`` includes soc_fw_config, tos_fw_config, tb_fw_config and fw_config.

**FVP with TSP in Trusted SRAM with firmware configs:**
(These diagrams only cover the AArch64 case)

::

                  DRAM
    0xffffffff +----------+
               | EL3 TZC  |
    0xffe00000 |----------| (secure)
               | AP TZC   |
    0xff000000 +----------+
               :          :
    0x82100000 |----------|
               |HW_CONFIG |
    0x82000000 |----------|  (non-secure)
               |          |
    0x80000000 +----------+

               Trusted DRAM
    0x08000000 +----------+
               |HW_CONFIG |
    0x07f00000 |----------|
               :          :
               |          |
    0x06000000 +----------+

               Trusted SRAM
    0x04040000 +----------+  loaded by BL2  +----------------+
               | BL1 (rw) |  <<<<<<<<<<<<<  |                |
               |----------|  <<<<<<<<<<<<<  |  BL31 NOBITS   |
               |   BL2    |  <<<<<<<<<<<<<  |                |
               |----------|  <<<<<<<<<<<<<  |----------------|
               |          |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
               |          |  <<<<<<<<<<<<<  |----------------|
               |          |  <<<<<<<<<<<<<  |     BL32       |
    0x04003000 +----------+                 +----------------+
               |  CONFIG  |
    0x04001000 +----------+
               |  Shared  |
    0x04000000 +----------+

               Trusted ROM
    0x04000000 +----------+
               | BL1 (ro) |
    0x00000000 +----------+
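
As a worked example of the Trusted SRAM figures in the FVP diagram above, the
following C sketch computes how much of the SRAM remains for bootloader images
once the Shared and CONFIG regions are set aside. The macro and function names
are hypothetical; only the addresses come from the diagram.

```c
#include <assert.h>
#include <stdint.h>

/* Boundaries read from the Trusted SRAM part of the FVP diagram. */
#define TRUSTED_SRAM_BASE  UINT32_C(0x04000000)
#define SHARED_LIMIT       UINT32_C(0x04001000) /* top of the Shared region */
#define CONFIG_LIMIT       UINT32_C(0x04003000) /* top of the CONFIG region */
#define TRUSTED_SRAM_LIMIT UINT32_C(0x04040000)

/* Total size of Trusted SRAM in bytes. */
uint32_t trusted_sram_size(void)
{
    assert(TRUSTED_SRAM_LIMIT > TRUSTED_SRAM_BASE);
    return TRUSTED_SRAM_LIMIT - TRUSTED_SRAM_BASE;
}

/* Bytes left for BL images above the Shared and CONFIG regions. */
uint32_t bl_image_space(void)
{
    assert(CONFIG_LIMIT >= SHARED_LIMIT);
    return TRUSTED_SRAM_LIMIT - CONFIG_LIMIT;
}
```

With these addresses the Trusted SRAM spans ``0x40000`` bytes (256 KB), of which
``0x3D000`` bytes (244 KB) lie above the CONFIG region and are shared by BL1
read-write data, BL2, BL31 and, in this configuration, BL32.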

**FVP with TSP in Trusted DRAM with firmware configs (default option):**

::

                    DRAM
    0xffffffff +--------------+
               |   EL3 TZC    |
    0xffe00000 |--------------| (secure)
               |   AP TZC     |
    0xff000000 +--------------+
               :              :
    0x82100000 |--------------|
               |  HW_CONFIG   |
    0x82000000 |--------------|  (non-secure)
               |              |
    0x80000000 +--------------+

                 Trusted DRAM
    0x08000000 +--------------+
               |  HW_CONFIG   |
    0x07f00000 |--------------|
               :              :
               |    BL32      |
    0x06000000 +--------------+

                 Trusted SRAM
    0x04040000 +--------------+  loaded by BL2  +----------------+
               |   BL1 (rw)   |  <<<<<<<<<<<<<  |                |
               |--------------|  <<<<<<<<<<<<<  |  BL31 NOBITS   |
               |     BL2      |  <<<<<<<<<<<<<  |                |
               |--------------|  <<<<<<<<<<<<<  |----------------|
               |              |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
               |              |                 +----------------+
    0x04003000 +--------------+
               |    CONFIG    |
    0x04001000 +--------------+
               |    Shared    |
    0x04000000 +--------------+

                 Trusted ROM
    0x04000000 +--------------+
               |   BL1 (ro)   |
    0x00000000 +--------------+

**FVP with TSP in TZC-Secured DRAM with firmware configs:**

::

                  DRAM
    0xffffffff +----------+
               | EL3 TZC  |
    0xffe00000 |----------| (secure)
               | AP TZC   |
               |  (BL32)  |
    0xff000000 +----------+
               |          |
    0x82100000 |----------|
               |HW_CONFIG |
    0x82000000 |----------|  (non-secure)
               |          |
    0x80000000 +----------+

               Trusted DRAM
    0x08000000 +----------+
               |HW_CONFIG |
    0x07f00000 |----------|
               :          :
               |          |
    0x06000000 +----------+

               Trusted SRAM
    0x04040000 +----------+  loaded by BL2  +----------------+
               | BL1 (rw) |  <<<<<<<<<<<<<  |                |
               |----------|  <<<<<<<<<<<<<  |  BL31 NOBITS   |
               |   BL2    |  <<<<<<<<<<<<<  |                |
               |----------|  <<<<<<<<<<<<<  |----------------|
               |          |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
               |          |                 +----------------+
    0x04003000 +----------+
               |  CONFIG  |
    0x04001000 +----------+
               |  Shared  |
    0x04000000 +----------+

               Trusted ROM
    0x04000000 +----------+
               | BL1 (ro) |
    0x00000000 +----------+
| 1990 | |
**Juno with BL32 in Trusted SRAM:**

::

                  DRAM
    0xFFFFFFFF +----------+
               |  SCP TZC |
    0xFFE00000 |----------|
               |  EL3 TZC |
    0xFFC00000 |----------|  (secure)
               |  AP TZC  |
    0xFF000000 +----------+
               |          |
               :          :  (non-secure)
               |          |
    0x80000000 +----------+


                  Flash0
    0x0C000000 +----------+
               :          :
    0x0BED0000 |----------|
               | BL1 (ro) |
    0x0BEC0000 |----------|
               :          :
    0x08000000 +----------+                  BL31 is loaded
                                             after SCP_BL2 has
               Trusted SRAM                  been sent to SCP
    0x04040000 +----------+  loaded by BL2  +----------------+
               | BL1 (rw) |  <<<<<<<<<<<<<  |                |
               |----------|  <<<<<<<<<<<<<  |  BL31 NOBITS   |
               |   BL2    |  <<<<<<<<<<<<<  |                |
               |----------|  <<<<<<<<<<<<<  |----------------|
               | SCP_BL2  |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
               |          |  <<<<<<<<<<<<<  |----------------|
               |          |  <<<<<<<<<<<<<  |     BL32       |
               |          |                 +----------------+
               |          |
    0x04001000 +----------+
               |   MHU    |
    0x04000000 +----------+

**Juno with BL32 in TZC-secured DRAM:**

::

                  DRAM
    0xFFFFFFFF +----------+
               |  SCP TZC |
    0xFFE00000 |----------|
               |  EL3 TZC |
    0xFFC00000 |----------|  (secure)
               |  AP TZC  |
               |  (BL32)  |
    0xFF000000 +----------+
               |          |
               :          :  (non-secure)
               |          |
    0x80000000 +----------+

                  Flash0
    0x0C000000 +----------+
               :          :
    0x0BED0000 |----------|
               | BL1 (ro) |
    0x0BEC0000 |----------|
               :          :
    0x08000000 +----------+                  BL31 is loaded
                                             after SCP_BL2 has
               Trusted SRAM                  been sent to SCP
    0x04040000 +----------+  loaded by BL2  +----------------+
               | BL1 (rw) |  <<<<<<<<<<<<<  |                |
               |----------|  <<<<<<<<<<<<<  |  BL31 NOBITS   |
               |   BL2    |  <<<<<<<<<<<<<  |                |
               |----------|  <<<<<<<<<<<<<  |----------------|
               | SCP_BL2  |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
               |          |                 +----------------+
    0x04001000 +----------+
               |   MHU    |
    0x04000000 +----------+

.. _firmware_design_fip:

Firmware Image Package (FIP)
----------------------------

Using a Firmware Image Package (FIP) allows for packing bootloader images (and
potentially other payloads) into a single archive that can be loaded by TF-A
from non-volatile platform storage. A driver to load images from a FIP has
been added to the storage layer and allows a package to be read from supported
platform storage. A tool to create Firmware Image Packages is also provided
and described below.

Firmware Image Package layout
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FIP layout consists of a table of contents (ToC) followed by payload data.
The ToC itself has a header followed by one or more table entries. The ToC is
terminated by an end marker entry. Since the size recorded in the end marker
is 0 bytes, its offset equals the total size of the FIP file. All other ToC
entries describe some payload data that has been appended to the end of the
binary package. With the information provided in the ToC entry the
corresponding payload data can be retrieved.

::

    ------------------
    | ToC Header     |
    |----------------|
    | ToC Entry 0    |
    |----------------|
    | ToC Entry 1    |
    |----------------|
    | ToC End Marker |
    |----------------|
    |                |
    |     Data 0     |
    |                |
    |----------------|
    |                |
    |     Data 1     |
    |                |
    ------------------

The ToC header and entry formats are described in the header file
``include/tools_share/firmware_image_package.h``. This file is used by both the
tool and TF-A.

The ToC header has the following fields:

::

    `name`: The name of the ToC. This is currently used to validate the header.
    `serial_number`: A non-zero number provided by the creation tool
    `flags`: Flags associated with this data.
        Bits 0-31: Reserved
        Bits 32-47: Platform defined
        Bits 48-63: Reserved

A ToC entry has the following fields:

::

    `uuid`: All files are referred to by a pre-defined Universally Unique
        IDentifier (UUID). The UUIDs are defined in
        `include/tools_share/firmware_image_package.h`. The platform translates
        the requested image name into the corresponding UUID when accessing the
        package.
    `offset_address`: The offset address at which the corresponding payload data
        can be found. The offset is calculated from the ToC base address.
    `size`: The size of the corresponding payload data in bytes.
    `flags`: Flags associated with this entry. None are yet defined.

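
As a rough illustration of the layout described above, the following sketch
walks a ToC held in memory and returns the entry that matches a UUID. The names
``toc_header``, ``toc_entry``, ``toc_platform_flags`` and ``fip_find_entry`` are
hypothetical and are not the definitions found in
``include/tools_share/firmware_image_package.h``; the field widths are
assumptions, and the sketch further assumes that the end marker is recognisable
by an all-zero UUID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the ToC header fields listed above. */
struct toc_header {
    uint32_t name;           /* used to validate the header */
    uint32_t serial_number;  /* non-zero, set by the creation tool */
    uint64_t flags;          /* bits 32-47 are platform defined */
};

/* Hypothetical mirror of a ToC entry. */
struct toc_entry {
    uint8_t  uuid[16];
    uint64_t offset_address; /* measured from the ToC base address */
    uint64_t size;           /* payload size in bytes */
    uint64_t flags;
};

/* Extract the platform-defined flag bits (32-47) of the header. */
uint16_t toc_platform_flags(const struct toc_header *hdr)
{
    assert(hdr != NULL);
    return (uint16_t)((hdr->flags >> 32) & 0xffffu);
}

/*
 * Walk the entries that follow the header until the end marker
 * (assumed here to carry an all-zero UUID) and return the entry
 * matching 'uuid', or NULL if no entry matches.
 */
const struct toc_entry *fip_find_entry(const void *fip_base,
                                       const uint8_t uuid[16])
{
    static const uint8_t zero_uuid[16];
    const struct toc_entry *e;

    assert(fip_base != NULL);
    e = (const struct toc_entry *)
        ((const uint8_t *)fip_base + sizeof(struct toc_header));

    for (; memcmp(e->uuid, zero_uuid, 16) != 0; e++) {
        if (memcmp(e->uuid, uuid, 16) == 0)
            return e;
    }
    return NULL;
}
```

Once an entry has been found, the payload would start at ``fip_base`` plus the
entry's ``offset_address`` and extend for ``size`` bytes.
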
Firmware Image Package creation tool
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FIP creation tool can be used to pack specified images into a binary
package that can be loaded by TF-A from platform storage. The tool currently
only supports packing bootloader images. Additional image definitions can be
added to the tool as required.

The tool can be found in ``tools/fiptool``.

Loading from a Firmware Image Package (FIP)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Firmware Image Package (FIP) driver can load images from a binary package on
non-volatile platform storage. For the Arm development platforms, this is
currently NOR FLASH.

Bootloader images are loaded according to the platform policy as specified by
the function ``plat_get_image_source()``. For the Arm development platforms, this
means the platform will attempt to load images from a Firmware Image Package
located at the start of NOR FLASH0.

The Arm development platforms' policy is to only allow loading of a known set of
images. The platform policy can be modified to allow additional images.

Use of coherent memory in TF-A
------------------------------

There might be loss of coherency when physical memory with mismatched
shareability, cacheability and memory attributes is accessed by multiple CPUs
(refer to section B2.9 of `Arm ARM`_ for more details). This possibility occurs
in TF-A during power up/down sequences when coherency, MMU and caches are
turned on/off incrementally.

TF-A defines coherent memory as a region of memory with Device nGnRE attributes
in the translation tables. The translation granule size in TF-A is 4KB. This
is the smallest possible size of the coherent memory region.

By default, all data structures which are susceptible to accesses with
mismatched attributes from various CPUs are allocated in a coherent memory
region (refer to section 2.1 of :ref:`Porting Guide`). The coherent memory
region accesses are Outer Shareable, non-cacheable and they can be accessed with
the Device nGnRE attributes when the MMU is turned on. Hence, at the expense of
at least an extra page of memory, TF-A is able to work around coherency issues
due to mismatched memory attributes.

The alternative to the above approach is to allocate the susceptible data
structures in Normal WriteBack WriteAllocate Inner shareable memory. This
approach requires the data structures to be designed so that it is possible to
work around the issue of mismatched memory attributes by performing software
cache maintenance on them.

Disabling the use of coherent memory in TF-A
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It might be desirable to avoid the cost of allocating coherent memory on
platforms which are memory constrained. TF-A enables inclusion of coherent
memory in firmware images through the build flag ``USE_COHERENT_MEM``.
This flag is enabled by default. It can be disabled to choose the second
approach described above.

The below sections analyze the data structures allocated in the coherent memory
region and the changes required to allocate them in normal memory.

Coherent memory usage in PSCI implementation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``psci_non_cpu_pd_nodes`` data structure stores the platform's power domain
tree information for state management of power domains. By default, this data
structure is allocated in the coherent memory region in TF-A because it can be
accessed by multiple CPUs, either with caches enabled or disabled.

.. code:: c

    typedef struct non_cpu_pwr_domain_node {
        /*
         * Index of the first CPU power domain node level 0 which has this node
         * as its parent.
         */
        unsigned int cpu_start_idx;

        /*
         * Number of CPU power domains which are siblings of the domain indexed
         * by 'cpu_start_idx' i.e. all the domains in the range 'cpu_start_idx
         * -> cpu_start_idx + ncpus' have this node as their parent.
         */
        unsigned int ncpus;

        /*
         * Index of the parent power domain node.
         */
        unsigned int parent_node;

        plat_local_state_t local_state;

        unsigned char level;

        /* For indexing the psci_lock array */
        unsigned char lock_index;
    } non_cpu_pd_node_t;

In order to move this data structure to normal memory, the use of each of its
fields must be analyzed. Fields like ``cpu_start_idx``, ``ncpus``,
``parent_node``, ``level`` and ``lock_index`` are only written once during cold
boot. Hence removing them from coherent memory involves only doing a clean and
invalidate of the cache lines after these fields are written.

The field ``local_state`` can be concurrently accessed by multiple CPUs in
different cache states. A Lamport's Bakery lock ``psci_locks`` is used to ensure
mutual exclusion to this field and a clean and invalidate is needed after it
is written.

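
The write-then-maintain pattern described above can be sketched as follows.
This is a host-side illustration only: ``pd_node_sketch_t`` is a reduced,
hypothetical version of the node structure, and
``sketch_clean_inv_dcache_range()`` is a hypothetical stub that merely counts
calls where real firmware would clean and invalidate the data cache by address
range. The runtime update assumes the caller already holds the lock that
protects ``local_state``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical version of the power domain node. */
typedef struct pd_node_sketch {
    unsigned int parent_node;   /* written once during cold boot */
    unsigned char level;        /* written once during cold boot */
    unsigned char local_state;  /* updated at runtime under a lock */
} pd_node_sketch_t;

/* Counts calls so that the sketch can be exercised on a host. */
unsigned int flush_calls;

/* Hypothetical stand-in for a clean and invalidate by address range. */
void sketch_clean_inv_dcache_range(uintptr_t addr, size_t size)
{
    (void)addr;
    (void)size;
    flush_calls++;
}

/* Cold boot: populate the write-once fields, then flush them once. */
void pd_node_init(pd_node_sketch_t *node, unsigned int parent,
                  unsigned char level)
{
    assert(node != NULL);
    node->parent_node = parent;
    node->level = level;
    node->local_state = 0;
    sketch_clean_inv_dcache_range((uintptr_t)node, sizeof(*node));
}

/*
 * Runtime: the caller is assumed to hold the lock protecting
 * 'local_state'; the field is flushed after every write.
 */
void pd_node_set_local_state(pd_node_sketch_t *node, unsigned char state)
{
    assert(node != NULL);
    node->local_state = state;
    sketch_clean_inv_dcache_range((uintptr_t)&node->local_state,
                                  sizeof(node->local_state));
}
```
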
Bakery lock data
~~~~~~~~~~~~~~~~

The bakery lock data structure ``bakery_lock_t`` is allocated in coherent memory
and is accessed by multiple CPUs with mismatched attributes. ``bakery_lock_t`` is
defined as follows:

.. code:: c

    typedef struct bakery_lock {
        /*
         * The lock_data is a bit-field of 2 members:
         * Bit[0]       : choosing. This field is set when the CPU is
         *                choosing its bakery number.
         * Bits[1 - 15] : number. This is the bakery number allocated.
         */
        volatile uint16_t lock_data[BAKERY_LOCK_MAX_CPUS];
    } bakery_lock_t;

It is a characteristic of Lamport's Bakery algorithm that the volatile per-CPU
fields can be read by all CPUs but only written to by the owning CPU.

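
The bit-field encoding documented in the ``lock_data`` comment can be
illustrated with a few helpers. The function names below are hypothetical and
are not claimed to match the helpers used by the TF-A bakery lock
implementation; they only show how the ``choosing`` flag and the bakery number
share one 16-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the 'choosing' flag (bit 0) and the bakery number (bits 1-15). */
uint16_t bakery_pack(unsigned int choosing, unsigned int number)
{
    assert(choosing <= 1u && number <= 0x7fffu);
    return (uint16_t)((choosing & 0x1u) | (number << 1));
}

/* Non-zero while the owning CPU is choosing its bakery number. */
unsigned int bakery_choosing(uint16_t lock_data)
{
    return lock_data & 0x1u;
}

/* The bakery number stored in bits 1-15. */
unsigned int bakery_number(uint16_t lock_data)
{
    return (lock_data >> 1) & 0x7fffu;
}
```
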
Depending upon the data cache line size, the per-CPU fields of the
``bakery_lock_t`` structure for multiple CPUs may exist on a single cache line.
These per-CPU fields can be read and written during lock contention by multiple
CPUs with mismatched memory attributes. Since these fields are a part of the
lock implementation, they do not have access to any other locking primitive to
safeguard against the resulting coherency issues. As a result, simple software
cache maintenance is not enough to allow these fields to be allocated outside
coherent memory. Consider the following example.

CPU0 updates its per-CPU field with data cache enabled. This write updates a
local cache line which contains a copy of the fields for other CPUs as well. Now
CPU1 updates its per-CPU field of the ``bakery_lock_t`` structure with data cache
disabled. CPU1 then issues a DCIVAC operation to invalidate any stale copies of
its field in any other cache line in the system. This operation will invalidate
the update made by CPU0 as well.

To use bakery locks when ``USE_COHERENT_MEM`` is disabled, the lock data structure
has been redesigned. The changes utilise the characteristic of Lamport's Bakery
algorithm mentioned earlier. The ``bakery_lock`` structure only allocates the
memory for a single CPU. The macro ``DEFINE_BAKERY_LOCK`` allocates all the bakery
locks needed for a CPU into a section ``.bakery_lock``. The linker allocates the
memory for other cores by using the total size allocated for the ``.bakery_lock``
section and multiplying it with (``PLATFORM_CORE_COUNT`` - 1). This enables
software to perform software cache maintenance on the lock data structure
without running into coherency issues associated with mismatched attributes.

The bakery lock data structure ``bakery_info_t`` is defined for use when
``USE_COHERENT_MEM`` is disabled as follows:

.. code:: c

    typedef struct bakery_info {
        /*
         * The lock_data is a bit-field of 2 members:
         * Bit[0]       : choosing. This field is set when the CPU is
         *                choosing its bakery number.
         * Bits[1 - 15] : number. This is the bakery number allocated.
         */
        volatile uint16_t lock_data;
    } bakery_info_t;

The ``bakery_info_t`` represents a single per-CPU field of one lock and
the combination of corresponding ``bakery_info_t`` structures for all CPUs in the
system represents the complete bakery lock. The view in memory for a system
with N bakery locks is:

::

    .bakery_lock section start
    |----------------|
    | `bakery_info_t`| <-- Lock_0 per-CPU field
    |    Lock_0      |     for CPU0
    |----------------|
    | `bakery_info_t`| <-- Lock_1 per-CPU field
    |    Lock_1      |     for CPU0
    |----------------|
    | ....           |
    |----------------|
    | `bakery_info_t`| <-- Lock_N per-CPU field
    |    Lock_N      |     for CPU0
    ------------------
    |    XXXXX       |
    | Padding to     |
    | next Cache WB  | <--- Calculate PERCPU_BAKERY_LOCK_SIZE, allocate
    |  Granule       |       continuous memory for remaining CPUs.
    ------------------
    | `bakery_info_t`| <-- Lock_0 per-CPU field
    |    Lock_0      |     for CPU1
    |----------------|
    | `bakery_info_t`| <-- Lock_1 per-CPU field
    |    Lock_1      |     for CPU1
    |----------------|
    | ....           |
    |----------------|
    | `bakery_info_t`| <-- Lock_N per-CPU field
    |    Lock_N      |     for CPU1
    ------------------
    |    XXXXX       |
    | Padding to     |
    | next Cache WB  |
    |  Granule       |
    ------------------

Consider a system of 2 CPUs with 'N' bakery locks as shown above. For an
operation on Lock_N, the corresponding ``bakery_info_t`` in both CPU0 and CPU1
``.bakery_lock`` section need to be fetched and appropriate cache operations need
to be performed for each access.

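
The address arithmetic implied by this layout can be sketched as below. The
sketch is illustrative only: ``bakery_info_sketch_t``, ``percpu_bakery_size()``
and ``bakery_info_for_cpu()`` are hypothetical names, and the 64-byte cache
writeback granule is an assumed value rather than one taken from any platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_WB_GRANULE 64u /* assumed granule size in bytes */

typedef struct bakery_info_sketch {
    volatile uint16_t lock_data;
} bakery_info_sketch_t;

/* Size of one CPU's block of locks, rounded up to the granule. */
size_t percpu_bakery_size(size_t nlocks)
{
    size_t raw = nlocks * sizeof(bakery_info_sketch_t);

    return (raw + SKETCH_CACHE_WB_GRANULE - 1u) &
           ~(size_t)(SKETCH_CACHE_WB_GRANULE - 1u);
}

/*
 * Given the address of a lock's field in CPU0's block, return the
 * matching field in the block that belongs to CPU 'cpu'.
 */
bakery_info_sketch_t *bakery_info_for_cpu(bakery_info_sketch_t *cpu0_field,
                                          unsigned int cpu,
                                          size_t percpu_size)
{
    assert(cpu0_field != NULL);
    return (bakery_info_sketch_t *)((uintptr_t)cpu0_field +
                                    (uintptr_t)cpu * percpu_size);
}
```

Because each CPU's block starts on its own granule, a CPU that cleans or
invalidates its own fields does not touch the cache lines holding another
CPU's fields.
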
On Arm platforms, bakery locks are used in PSCI (``psci_locks``) and in the power
controller driver (``arm_lock``).

Non Functional Impact of removing coherent memory
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Removal of the coherent memory region leads to the additional software overhead
of performing cache maintenance for the affected data structures. However, since
the memory where the data structures are allocated is cacheable, the overhead is
mostly mitigated by an increase in performance.

There is however a performance impact for bakery locks, due to:

- Additional cache maintenance operations, and
- Multiple cache line reads for each lock operation, since the bakery locks
  for each CPU are distributed across different cache lines.

The implementation has been optimized to minimize this additional overhead.
Measurements indicate that when bakery locks are allocated in Normal memory, the
minimum latency of acquiring a lock is on average 3-4 microseconds, whereas
in Device memory it is 2 microseconds. The measurements were done on the
Juno Arm development platform.

As mentioned earlier, almost a page of memory can be saved by disabling
``USE_COHERENT_MEM``. Each platform needs to consider these trade-offs to decide
whether coherent memory should be used. If a platform disables
``USE_COHERENT_MEM`` and needs to use bakery locks in the porting layer, it can
optionally define the macro ``PLAT_PERCPU_BAKERY_LOCK_SIZE`` (see the
:ref:`Porting Guide`). Refer to the reference platform code for examples.

Isolating code and read-only data on separate memory pages
----------------------------------------------------------

In the Armv8-A VMSA, translation table entries include fields that define the
properties of the target memory region, such as its access permissions. The
smallest unit of memory that can be addressed by a translation table entry is
a memory page. Therefore, if software needs to set different permissions on two
memory regions then it needs to map them using different memory pages.

The default memory layout for each BL image is as follows:

::

    |        ...        |
    +-------------------+
    |  Read-write data  |
    +-------------------+ Page boundary
    |     <Padding>     |
    +-------------------+
    | Exception vectors |
    +-------------------+ 2 KB boundary
    |     <Padding>     |
    +-------------------+
    |  Read-only data   |
    +-------------------+
    |       Code        |
    +-------------------+ BLx_BASE

.. note::
   The 2KB alignment for the exception vectors is an architectural
   requirement.

The read-write data start on a new memory page so that they can be mapped with
read-write permissions, whereas the code and read-only data below are configured
as read-only.

However, the read-only data are not aligned on a page boundary. They are
contiguous to the code. Therefore, the end of the code section and the beginning
of the read-only data section might share a memory page. This forces both to be
mapped with the same memory attributes. As the code needs to be executable, this
means that the read-only data stored on the same memory page as the code are
executable as well. This could potentially be exploited as part of a security
attack.

TF-A provides the build flag ``SEPARATE_CODE_AND_RODATA`` to isolate the code and
read-only data on separate memory pages. This in turn allows independent control
of the access permissions for the code and read-only data. In this case,
platform code gets a finer-grained view of the image layout and can
appropriately map the code region as executable and the read-only data as
execute-never.

This has an impact on memory footprint, as padding bytes need to be introduced
between the code and read-only data to ensure the segregation of the two. To
limit the memory cost, this flag also changes the memory layout such that the
code and exception vectors are now contiguous, like so:

::

    |        ...        |
    +-------------------+
    |  Read-write data  |
    +-------------------+ Page boundary
    |     <Padding>     |
    +-------------------+
    |  Read-only data   |
    +-------------------+ Page boundary
    |     <Padding>     |
    +-------------------+
    | Exception vectors |
    +-------------------+ 2 KB boundary
    |     <Padding>     |
    +-------------------+
    |       Code        |
    +-------------------+ BLx_BASE

With this more condensed memory layout, the separation of read-only data will
add zero or one page to the memory footprint of each BL image. Each platform
should consider the trade-off between memory footprint and security.

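
The footprint difference between the two layouts can be worked through with a
small calculation. The sketch below is illustrative only: the function names
are hypothetical, offsets are measured from a page-aligned ``BLx_BASE``, and
the section sizes used to exercise it are made-up sample values, not
measurements of any BL image.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE   4096u /* 4KB translation granule */
#define SKETCH_VECTOR_ALIGN 2048u /* 2KB exception vector alignment */

/* Round 'v' up to a multiple of 'a', where 'a' is a power of two. */
uintptr_t sketch_align_up(uintptr_t v, uintptr_t a)
{
    assert(a != 0u && (a & (a - 1u)) == 0u);
    return (v + a - 1u) & ~(a - 1u);
}

/* Default layout: code, read-only data, then the aligned vectors. */
uintptr_t ro_end_default(uintptr_t code, uintptr_t rodata, uintptr_t vectors)
{
    uintptr_t end = sketch_align_up(code + rodata, SKETCH_VECTOR_ALIGN);

    return sketch_align_up(end + vectors, SKETCH_PAGE_SIZE);
}

/* Separated layout: code, aligned vectors, then page-aligned rodata. */
uintptr_t ro_end_separated(uintptr_t code, uintptr_t rodata,
                           uintptr_t vectors)
{
    uintptr_t end = sketch_align_up(code, SKETCH_VECTOR_ALIGN) + vectors;

    end = sketch_align_up(end, SKETCH_PAGE_SIZE) + rodata;
    return sketch_align_up(end, SKETCH_PAGE_SIZE);
}
```

With sample sizes of 0x5100 bytes of code, 0x300 bytes of read-only data and
0x800 bytes of vectors, the read-write data would start at offset 0x6000 in
the default layout and at 0x7000 in the separated layout, a cost of one page.
With 0x1000 bytes of read-only data instead, both layouts end at 0x7000 and
the separation costs nothing.
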
This build flag is disabled by default, minimising memory footprint. On Arm
platforms, it is enabled.

Publish and Subscribe Framework
-------------------------------

The Publish and Subscribe Framework allows EL3 components to define and publish
events, to which other EL3 components can subscribe.

The following macros are provided by the framework:

- ``REGISTER_PUBSUB_EVENT(event)``: Defines an event, and takes one argument,
  the event name, which must be a valid C identifier. All calls to the
  ``REGISTER_PUBSUB_EVENT`` macro must be placed in the file
  ``pubsub_events.h``.

- ``PUBLISH_EVENT_ARG(event, arg)``: Publishes a defined event, by iterating
  subscribed handlers and calling them in turn. The handlers will be passed the
  parameter ``arg``. The expected use-case is to broadcast an event.

- ``PUBLISH_EVENT(event)``: Like ``PUBLISH_EVENT_ARG``, except that the value
  ``NULL`` is passed to subscribed handlers.

- ``SUBSCRIBE_TO_EVENT(event, handler)``: Registers the ``handler`` to
  subscribe to ``event``. The handler will be executed whenever the ``event``
  is published.

- ``for_each_subscriber(event, subscriber)``: Iterates through all handlers
  subscribed for ``event``. ``subscriber`` must be a local variable of type
  ``pubsub_cb_t *``, and will point to each subscribed handler in turn during
  iteration. This macro can be used for those patterns that none of the
  ``PUBLISH_EVENT_*()`` macros cover.

Publishing an event that wasn't defined using ``REGISTER_PUBSUB_EVENT`` will
result in a build error. Subscribing to an undefined event however won't.

Subscribed handlers must be of type ``pubsub_cb_t``, with the following function
signature:

.. code:: c

    typedef void* (*pubsub_cb_t)(const void *arg);

There may be an arbitrary number of handlers registered to the same event. The
order in which subscribed handlers are notified when that event is published is
not defined. Subscribed handlers may be executed in any order; handlers should
not assume any relative ordering amongst them.

Publishing an event on a PE will result in subscribed handlers executing on that
PE only; it won't cause handlers to execute on a different PE.

Note that publishing an event on a PE blocks until all the subscribed handlers
finish executing on the PE.

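
The publishing semantics described above (every subscribed handler is called
in turn, on the calling PE, and the publisher resumes only after the last
handler returns) can be modelled on a host with a plain table of callbacks.
This is a sketch only: ``sketch_event``, ``sketch_subscribe()`` and
``sketch_publish()`` are hypothetical names, and the fixed-size table is an
assumption of the sketch, not a description of how the framework's macros
store subscribers.

```c
#include <assert.h>
#include <stddef.h>

typedef void *(*pubsub_cb_t)(const void *arg);

#define SKETCH_MAX_SUBSCRIBERS 4u

/* Hypothetical per-event table of subscribed handlers. */
struct sketch_event {
    pubsub_cb_t subscribers[SKETCH_MAX_SUBSCRIBERS];
    size_t count;
};

/* Register 'handler'; returns 0 on success, -1 if the table is full. */
int sketch_subscribe(struct sketch_event *ev, pubsub_cb_t handler)
{
    assert(ev != NULL && handler != NULL);
    if (ev->count >= SKETCH_MAX_SUBSCRIBERS)
        return -1;
    ev->subscribers[ev->count++] = handler;
    return 0;
}

/*
 * Call every subscribed handler in turn on the calling CPU and return
 * only once all of them have finished.
 */
void sketch_publish(const struct sketch_event *ev, const void *arg)
{
    assert(ev != NULL);
    for (size_t i = 0; i < ev->count; i++)
        (void)ev->subscribers[i](arg);
}

/* Example handlers that accumulate what they were passed. */
int seen_sum;

void *add_handler(const void *arg)
{
    seen_sum += (arg != NULL) ? *(const int *)arg : 0;
    return NULL;
}

void *double_add_handler(const void *arg)
{
    seen_sum += (arg != NULL) ? 2 * *(const int *)arg : 0;
    return NULL;
}
```

The example handlers only add to a shared total, so the result is the same
whichever handler runs first, in keeping with the rule that handlers should
not assume any relative ordering.
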
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2529 | TF-A generic code publishes and subscribes to some events within. Platform |
| 2530 | ports are discouraged from subscribing to them. These events may be withdrawn, |
| 2531 | renamed, or have their semantics altered in the future. Platforms may however |
| 2532 | register, publish, and subscribe to platform-specific events. |
Dimitris Papastamos | a7921b9 | 2017-10-13 15:27:58 +0100 | [diff] [blame] | 2533 | |
Jeenu Viswambharan | e3f2200 | 2017-09-22 08:32:10 +0100 | [diff] [blame] | 2534 | Publish and Subscribe Example |
| 2535 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 2536 | |
| 2537 | A publisher that wants to publish event ``foo`` would: |
| 2538 | |
| 2539 | - Define the event ``foo`` in the ``pubsub_events.h``. |
| 2540 | |
Paul Beesley | 493e349 | 2019-03-13 15:11:04 +0000 | [diff] [blame] | 2541 | .. code:: c |
Jeenu Viswambharan | e3f2200 | 2017-09-22 08:32:10 +0100 | [diff] [blame] | 2542 | |
| 2543 | REGISTER_PUBSUB_EVENT(foo); |
| 2544 | |
| 2545 | - Depending on the nature of event, use one of ``PUBLISH_EVENT_*()`` macros to |
| 2546 | publish the event at the appropriate path and time of execution. |
| 2547 | |
| 2548 | A subscriber that wants to subscribe to event ``foo`` published above would |
| 2549 | implement: |
| 2550 | |
Sandrine Bailleux | f5a9100 | 2019-02-08 10:50:28 +0100 | [diff] [blame] | 2551 | .. code:: c |
Jeenu Viswambharan | e3f2200 | 2017-09-22 08:32:10 +0100 | [diff] [blame] | 2552 | |
Sandrine Bailleux | f5a9100 | 2019-02-08 10:50:28 +0100 | [diff] [blame] | 2553 | void *foo_handler(const void *arg) |
| 2554 | { |
| 2555 | void *result = NULL; |
Jeenu Viswambharan | e3f2200 | 2017-09-22 08:32:10 +0100 | [diff] [blame] | 2556 | |
Sandrine Bailleux | f5a9100 | 2019-02-08 10:50:28 +0100 | [diff] [blame] | 2557 | /* Do handling ... */ |
Jeenu Viswambharan | e3f2200 | 2017-09-22 08:32:10 +0100 | [diff] [blame] | 2558 | |
Sandrine Bailleux | f5a9100 | 2019-02-08 10:50:28 +0100 | [diff] [blame] | 2559 | return result; |
| 2560 | } |
Jeenu Viswambharan | e3f2200 | 2017-09-22 08:32:10 +0100 | [diff] [blame] | 2561 | |
Sandrine Bailleux | f5a9100 | 2019-02-08 10:50:28 +0100 | [diff] [blame] | 2562 | SUBSCRIBE_TO_EVENT(foo, foo_handler); |
Jeenu Viswambharan | e3f2200 | 2017-09-22 08:32:10 +0100 | [diff] [blame] | 2563 | |
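The blocking behaviour described earlier (publishing returns only after every subscribed handler has run on the publishing PE) can be pictured with a small standalone model. This is a sketch for illustration, not TF-A's implementation: ``struct event``, ``subscribe()``, ``publish()`` and ``MAX_HANDLERS`` are hypothetical names, and only the handler signature is taken from the example above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler type, mirroring the foo_handler signature above. */
typedef void *(*event_handler_t)(const void *arg);

#define MAX_HANDLERS 4U

/* Hypothetical event record: a fixed table of subscribed handlers. */
struct event {
    event_handler_t handlers[MAX_HANDLERS];
    size_t count;
};

/* Add a handler to the event's table. */
static void subscribe(struct event *ev, event_handler_t handler)
{
    assert(ev->count < MAX_HANDLERS);
    ev->handlers[ev->count++] = handler;
}

/*
 * Call every subscribed handler on the calling CPU and return only after
 * the last one has finished. Returns the number of handlers invoked.
 */
static size_t publish(const struct event *ev, const void *arg)
{
    size_t i;

    for (i = 0U; i < ev->count; i++) {
        (void)ev->handlers[i](arg);
    }
    return ev->count;
}

/* Counts invocations so the effect of publish() can be observed. */
static int foo_calls;

static void *foo_handler(const void *arg)
{
    (void)arg;
    foo_calls++;
    return NULL;
}
```

In this model the handlers happen to run in subscription order, but handlers should still not rely on any ordering.
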
Daniel Boulby | 468f0d7 | 2018-09-18 11:45:51 +0100 | [diff] [blame] | 2564 | |
| 2565 | Reclaiming the BL31 initialization code |
| 2566 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 2567 | |
| 2568 | A significant amount of the code used for the initialization of BL31 is never |
| 2569 | needed again after boot time. In order to reduce the runtime memory |
| 2570 | footprint, the memory used for this code can be reclaimed after initialization |
| 2571 | has finished and be used for runtime data. |
| 2572 | |
| 2573 | The build option ``RECLAIM_INIT_CODE`` can be set to mark this boot time code |
| 2574 | with a ``.text.init.*`` attribute which can be filtered and placed suitably |
Paul Beesley | 1fbc97b | 2019-01-11 18:26:51 +0000 | [diff] [blame] | 2575 | within the BL image for later reclamation by the platform. The platform can |
| 2576 | specify the filter and the memory region for this init section in BL31 via the |
Daniel Boulby | 468f0d7 | 2018-09-18 11:45:51 +0100 | [diff] [blame] | 2577 | ``plat.ld.S`` linker script. For example, on the FVP, this section is placed |
| 2578 | overlapping the secondary CPU stacks so that after the cold boot is done, this |
| 2579 | memory can be reclaimed for the stacks. The init memory section is initially |
Paul Beesley | 1fbc97b | 2019-01-11 18:26:51 +0000 | [diff] [blame] | 2580 | mapped with ``RO``, ``EXECUTE`` attributes. After BL31 initialization has |
Daniel Boulby | 468f0d7 | 2018-09-18 11:45:51 +0100 | [diff] [blame] | 2581 | completed, the FVP changes the attributes of this section to ``RW``, |
| 2582 | ``EXECUTE_NEVER`` allowing it to be used for runtime data. The memory attributes |
| 2583 | are changed within the ``bl31_plat_runtime_setup`` platform hook. The init |
| 2584 | section can be reclaimed for any data which is accessed after cold |
| 2585 | boot initialization and it is up to the platform to make the decision. |
| 2586 | |
Paul Beesley | f864067 | 2019-04-12 14:19:42 +0100 | [diff] [blame] | 2587 | .. _firmware_design_pmf: |
| 2588 | |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2589 | Performance Measurement Framework |
| 2590 | --------------------------------- |
| 2591 | |
| 2592 | The Performance Measurement Framework (PMF) facilitates collection of |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2593 | timestamps by registered services and provides interfaces to retrieve them |
| 2594 | from within TF-A. A platform can choose to expose appropriate SMCs to |
| 2595 | retrieve these collected timestamps. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2596 | |
| 2597 | By default, the global physical counter is used for the timestamp |
| 2598 | value and is read via ``CNTPCT_EL0``. The framework also allows timestamps |
| 2599 | captured by other CPUs to be retrieved. |
| 2600 | |
| 2601 | Timestamp identifier format |
| 2602 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 2603 | |
| 2604 | A PMF timestamp is uniquely identified across the system via the |
| 2605 | timestamp ID or ``tid``. The ``tid`` is composed as follows: |
| 2606 | |
| 2607 | :: |
| 2608 | |
| 2609 | Bits 0-7: The local timestamp identifier. |
| 2610 | Bits 8-9: Reserved. |
| 2611 | Bits 10-15: The service identifier. |
| 2612 | Bits 16-31: Reserved. |
| 2613 | |
| 2614 | #. The service identifier. Each PMF service is identified by a |
| 2615 | service name and a service identifier. Both the service name and |
| 2616 | identifier are unique within the system as a whole. |
| 2617 | |
| 2618 | #. The local timestamp identifier. This identifier is unique within a given |
| 2619 | service. |
| 2620 | |
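The bit layout above can be expressed as a pair of shift-and-mask helpers. The following is a minimal sketch: the macro and function names are illustrative assumptions, and only the field positions come from the layout described here.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names; the field positions follow the layout above. */
#define TID_LOCAL_SHIFT 0U
#define TID_LOCAL_MASK  0xFFU   /* bits 0-7 */
#define TID_SVC_SHIFT   10U
#define TID_SVC_MASK    0x3FU   /* bits 10-15 */

/* Compose a timestamp ID from a service ID and a local timestamp ID. */
static uint32_t make_tid(uint32_t svc_id, uint32_t local_id)
{
    assert(svc_id <= TID_SVC_MASK);
    assert(local_id <= TID_LOCAL_MASK);
    return (svc_id << TID_SVC_SHIFT) | (local_id << TID_LOCAL_SHIFT);
}

/* Extract the service identifier (bits 10-15). */
static uint32_t tid_service(uint32_t tid)
{
    return (tid >> TID_SVC_SHIFT) & TID_SVC_MASK;
}

/* Extract the local timestamp identifier (bits 0-7). */
static uint32_t tid_local(uint32_t tid)
{
    return (tid >> TID_LOCAL_SHIFT) & TID_LOCAL_MASK;
}
```

For instance, service 1 with local timestamp 2 yields ``0x402`` under this layout; the reserved bits remain zero.
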
| 2621 | Registering a PMF service |
| 2622 | ~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 2623 | |
| 2624 | To register a PMF service, the ``PMF_REGISTER_SERVICE()`` macro from ``pmf.h`` |
| 2625 | is used. The arguments required are the service name, the service ID, |
| 2626 | the total number of local timestamps to be captured and a set of flags. |
| 2627 | |
| 2628 | The ``flags`` field can be specified as a bitwise-OR of the following values: |
| 2629 | |
| 2630 | :: |
| 2631 | |
| 2632 | PMF_STORE_ENABLE: The timestamp is stored in memory for later retrieval. |
| 2633 | PMF_DUMP_ENABLE: The timestamp is dumped on the serial console. |
| 2634 | |
| 2635 | The ``PMF_REGISTER_SERVICE()`` macro reserves memory to store captured |
| 2636 | timestamps in a PMF specific linker section at build time. |
| 2637 | Additionally, it defines necessary functions to capture and |
| 2638 | retrieve a particular timestamp for the given service at runtime. |
| 2639 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2640 | The macro ``PMF_REGISTER_SERVICE()`` only enables capturing PMF timestamps |
| 2641 | from within TF-A. In order to retrieve timestamps from outside of TF-A, the |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2642 | ``PMF_REGISTER_SERVICE_SMC()`` macro must be used instead. This macro |
| 2643 | accepts the same set of arguments as the ``PMF_REGISTER_SERVICE()`` |
| 2644 | macro but additionally supports retrieving timestamps using SMCs. |
| 2645 | |
| 2646 | Capturing a timestamp |
| 2647 | ~~~~~~~~~~~~~~~~~~~~~ |
| 2648 | |
| 2649 | PMF timestamps are stored in a per-service timestamp region. On a |
| 2650 | system with multiple CPUs, each timestamp is captured and stored |
| 2651 | in a per-CPU cache line aligned memory region. |
| 2652 | |
| 2653 | Having registered the service, the ``PMF_CAPTURE_TIMESTAMP()`` macro can be |
| 2654 | used to capture a timestamp at the location where it is used. The macro |
| 2655 | takes the service name, a local timestamp identifier and a flags value as arguments. |
| 2656 | |
| 2657 | The ``flags`` argument can be zero, or ``PMF_CACHE_MAINT``, which |
| 2658 | instructs PMF to do cache maintenance following the capture. Cache |
| 2659 | maintenance is required if any of the service's timestamps are captured |
| 2660 | with data cache disabled. |
| 2661 | |
| 2662 | To capture a timestamp in assembly code, the caller should use the |
| 2663 | ``pmf_calc_timestamp_addr`` macro (defined in ``pmf_asm_macros.S``) to |
| 2664 | calculate the address where the timestamp will be stored. The |
| 2665 | caller should then read the ``CNTPCT_EL0`` register to obtain the timestamp |
| 2666 | and store it at the calculated address for later retrieval. |
| 2667 | |
| 2668 | Retrieving a timestamp |
| 2669 | ~~~~~~~~~~~~~~~~~~~~~~ |
| 2670 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2671 | From within TF-A, timestamps for individual CPUs can be retrieved using the |
| 2672 | ``PMF_GET_TIMESTAMP_BY_MPIDR()`` or ``PMF_GET_TIMESTAMP_BY_INDEX()`` macros. |
| 2673 | These macros accept the CPU's MPIDR value or its ordinal position, |
| 2674 | respectively. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2675 | |
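The per-CPU storage and by-index retrieval can be pictured with a simplified standalone model. This is a sketch under stated assumptions, not the PMF implementation: ``NUM_CPUS``, ``NUM_TS``, ``CACHE_LINE``, ``capture()`` and ``get_by_index()`` are hypothetical, and the counter value is passed in by the caller because ``CNTPCT_EL0`` cannot be read from a host program.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CPUS   4U   /* assumed CPU count */
#define NUM_TS     2U   /* assumed number of local timestamps */
#define CACHE_LINE 64U  /* assumed cache line size in bytes */

/* One cache-line-aligned slot per CPU, so CPUs never share a line. */
struct cpu_slot {
    _Alignas(CACHE_LINE) uint64_t ts[NUM_TS];
};

/* Timestamp region for a single hypothetical service. */
static struct cpu_slot svc_store[NUM_CPUS];

/* Store a counter value for the given CPU and local timestamp ID. */
static void capture(unsigned int cpu, unsigned int local_id, uint64_t now)
{
    assert(cpu < NUM_CPUS);
    assert(local_id < NUM_TS);
    svc_store[cpu].ts[local_id] = now;
}

/* Read back the value captured by any CPU, selected by ordinal position. */
static uint64_t get_by_index(unsigned int cpu, unsigned int local_id)
{
    assert(cpu < NUM_CPUS);
    assert(local_id < NUM_TS);
    return svc_store[cpu].ts[local_id];
}
```

The model leaves out cache maintenance, which is what ``PMF_CACHE_MAINT`` addresses when timestamps are captured with the data cache disabled or read from another CPU.
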
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2676 | From outside TF-A, timestamps for individual CPUs can be retrieved by calling |
| 2677 | into ``pmf_smc_handler()``. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2678 | |
Paul Beesley | 493e349 | 2019-03-13 15:11:04 +0000 | [diff] [blame] | 2679 | :: |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2680 | |
| 2681 | Interface : pmf_smc_handler() |
| 2682 | Argument : unsigned int smc_fid, u_register_t x1, |
| 2683 | u_register_t x2, u_register_t x3, |
| 2684 | u_register_t x4, void *cookie, |
| 2685 | void *handle, u_register_t flags |
| 2686 | Return : uintptr_t |
| 2687 | |
| 2688 | smc_fid: Holds the SMC identifier which is either `PMF_SMC_GET_TIMESTAMP_32` |
| 2689 | when the caller of the SMC is running in AArch32 mode |
| 2690 | or `PMF_SMC_GET_TIMESTAMP_64` when the caller is running in AArch64 mode. |
| 2691 | x1: Timestamp identifier. |
| 2692 | x2: The `mpidr` of the CPU for which the timestamp has to be retrieved. |
| 2693 | This can be the `mpidr` of a different core to the one initiating |
| 2694 | the SMC. In that case, service specific cache maintenance may be |
| 2695 | required to ensure the updated copy of the timestamp is returned. |
| 2696 | x3: A flags value that is either 0 or `PMF_CACHE_MAINT`. If |
| 2697 | `PMF_CACHE_MAINT` is passed, then the PMF code will perform a |
| 2698 | cache invalidate before reading the timestamp. This ensures |
| 2699 | an updated copy is returned. |
| 2700 | |
| 2701 | The remaining arguments, ``x4``, ``cookie``, ``handle`` and ``flags`` are unused |
| 2702 | in this implementation. |
| 2703 | |
| 2704 | PMF code structure |
| 2705 | ~~~~~~~~~~~~~~~~~~ |
| 2706 | |
| 2707 | #. ``pmf_main.c`` consists of core functions that implement service registration, |
| 2708 | initialization, storing, dumping and retrieving timestamps. |
| 2709 | |
| 2710 | #. ``pmf_smc.c`` contains the SMC handling for registered PMF services. |
| 2711 | |
| 2712 | #. ``pmf.h`` contains the public interface to Performance Measurement Framework. |
| 2713 | |
| 2714 | #. ``pmf_asm_macros.S`` consists of macros to facilitate capturing timestamps in |
| 2715 | assembly code. |
| 2716 | |
| 2717 | #. ``pmf_helpers.h`` is an internal header used by ``pmf.h``. |
| 2718 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2719 | Armv8-A Architecture Extensions |
| 2720 | ------------------------------- |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2721 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2722 | TF-A makes use of Armv8-A Architecture Extensions where applicable. This |
| 2723 | section lists the usage of Architecture Extensions, and build flags |
| 2724 | controlling them. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2725 | |
Manish Pandey | acdaac2 | 2023-05-12 14:51:39 +0100 | [diff] [blame] | 2726 | Build options |
| 2727 | ~~~~~~~~~~~~~ |
| 2728 | |
| 2729 | ``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` |
| 2730 | |
| 2731 | These build options serve a dual purpose: |
| 2732 | |
| 2733 | - Determine the architecture extension support in TF-A build: All the mandatory |
| 2734 | architectural features up to ``ARM_ARCH_MAJOR.ARM_ARCH_MINOR`` are included |
| 2735 | and unconditionally enabled by TF-A build system. |
| 2736 | |
Govindraj Raja | 8152565 | 2023-07-18 13:55:33 -0500 | [diff] [blame] | 2737 | - ``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` are passed to the ``march.mk`` build |
| 2738 | utility, which probes the compiler to find out which ``-march`` values it |
| 2739 | supports and selects the most appropriate one to pass to the compiler. If the |
| 2740 | platform provides a ``MARCH_DIRECTIVE``, it is used directly and compiler |
| 2741 | probing is skipped. |
Manish Pandey | acdaac2 | 2023-05-12 14:51:39 +0100 | [diff] [blame] | 2742 | |
| 2743 | The build system requires that the platform provides a valid numeric value based on |
| 2744 | the CPU architecture extension; otherwise it defaults to the base Armv8.0-A architecture. |
| 2745 | Subsequent Arm Architecture versions also support extensions which were introduced |
| 2746 | in previous versions. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2747 | |
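Because later architecture versions include the extensions of earlier ones, a feature check against ``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` amounts to a "version at least" comparison. The helper below is a hypothetical sketch of that comparison; its name and signature are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true when version major.minor is at least req_major.req_minor.
 * A higher major version satisfies the check regardless of the minor one.
 */
static bool arch_at_least(unsigned int major, unsigned int minor,
                          unsigned int req_major, unsigned int req_minor)
{
    return (major > req_major) ||
           ((major == req_major) && (minor >= req_minor));
}

/* Example: a hypothetical Armv8.2 build satisfies an Armv8.1 requirement. */
static void arch_at_least_example(void)
{
    assert(arch_at_least(8U, 2U, 8U, 1U));
    assert(!arch_at_least(8U, 0U, 8U, 1U));
}
```
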
Paul Beesley | d2fcc4e | 2019-05-29 13:59:40 +0100 | [diff] [blame] | 2748 | .. seealso:: :ref:`Build Options` |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2749 | |
| 2750 | For details on the Architecture Extension and available features, please refer |
| 2751 | to the respective Architecture Extension Supplement. |
| 2752 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2753 | Armv8.1-A |
| 2754 | ~~~~~~~~~ |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2755 | |
| 2756 | This Architecture Extension is targeted when ``ARM_ARCH_MAJOR`` > 8, or when |
| 2757 | ``ARM_ARCH_MAJOR`` == 8 and ``ARM_ARCH_MINOR`` >= 1. |
| 2758 | |
Soby Mathew | ad04201 | 2019-09-25 14:03:41 +0100 | [diff] [blame] | 2759 | - By default, a load-/store-exclusive instruction pair is used to implement |
| 2760 | spinlocks. The ``USE_SPINLOCK_CAS`` build option when set to 1 selects the |
| 2761 | spinlock implementation using the ARMv8.1-LSE Compare and Swap instruction. |
| 2762 | Note that this instruction is only available in the AArch64 execution state, so |
| 2763 | the option is only available to AArch64 builds. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2764 | |
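The idea behind a compare-and-swap spinlock can be sketched in portable C11 atomics. This is an illustration only, not TF-A's spinlock code: the type and function names are hypothetical, and whether the compiler emits an LSE CAS instruction or a load-/store-exclusive loop for the compare-exchange depends on the target architecture options it is given.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: 0 means free, 1 means held. */
typedef struct {
    atomic_uint locked;
} cas_spinlock_t;

/* Try to take the lock once; succeeds only if it was observed free. */
static bool cas_spin_trylock(cas_spinlock_t *l)
{
    unsigned int expected = 0U;

    return atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1U,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

/* Spin until the compare-and-swap from 0 to 1 succeeds. */
static void cas_spin_lock(cas_spinlock_t *l)
{
    while (!cas_spin_trylock(l)) {
        /* busy-wait */
    }
}

/* Release the lock; the caller must currently hold it. */
static void cas_spin_unlock(cas_spinlock_t *l)
{
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1U);
    atomic_store_explicit(&l->locked, 0U, memory_order_release);
}
```
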
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2765 | Armv8.2-A |
| 2766 | ~~~~~~~~~ |
Isla Mitchell | c4a1a07 | 2017-08-07 11:20:13 +0100 | [diff] [blame] | 2767 | |
Antonio Nino Diaz | 633703a | 2019-02-19 13:14:06 +0000 | [diff] [blame] | 2768 | - The presence of ARMv8.2-TTCNP is detected at runtime. When it is present, the |
| 2769 | Common not Private (TTBRn_ELx.CnP) bit is enabled to indicate that multiple |
Sandrine Bailleux | fee6e26 | 2018-01-29 14:48:15 +0100 | [diff] [blame] | 2770 | Processing Elements in the same Inner Shareable domain use the same |
| 2771 | translation table entries for a given stage of translation for a particular |
| 2772 | translation regime. |
Isla Mitchell | c4a1a07 | 2017-08-07 11:20:13 +0100 | [diff] [blame] | 2773 | |
Jeenu Viswambharan | cbad661 | 2018-08-15 14:29:29 +0100 | [diff] [blame] | 2774 | Armv8.3-A |
| 2775 | ~~~~~~~~~ |
| 2776 | |
Antonio Nino Diaz | 594811b | 2019-01-31 11:58:00 +0000 | [diff] [blame] | 2777 | - Pointer authentication features of Armv8.3-A are unconditionally enabled in |
| 2778 | the Non-secure world so that lower ELs are allowed to use them without |
| 2779 | causing a trap to EL3. |
| 2780 | |
| 2781 | In order to enable the Secure world to use it, ``CTX_INCLUDE_PAUTH_REGS`` |
| 2782 | must be set to 1. This will add all pointer authentication system registers |
| 2783 | to the context that is saved when doing a world switch. |
Jeenu Viswambharan | cbad661 | 2018-08-15 14:29:29 +0100 | [diff] [blame] | 2784 | |
Alexei Fedorov | 2831d58 | 2019-03-13 11:05:07 +0000 | [diff] [blame] | 2785 | TF-A itself has support for pointer authentication at runtime |
Alexei Fedorov | 90f2e88 | 2019-05-24 12:17:09 +0100 | [diff] [blame] | 2786 | that can be enabled by setting the ``BRANCH_PROTECTION`` option to a non-zero value and |
Antonio Nino Diaz | 25cda67 | 2019-02-19 11:53:51 +0000 | [diff] [blame] | 2787 | ``CTX_INCLUDE_PAUTH_REGS`` to 1. This enables pointer authentication in BL1, |
| 2788 | BL2, BL31, and the TSP if it is used. |
| 2789 | |
Alexei Fedorov | 2831d58 | 2019-03-13 11:05:07 +0000 | [diff] [blame] | 2790 | Note that Pointer Authentication is enabled for Non-secure world irrespective |
| 2791 | of the value of these build flags if the CPU supports it. |
| 2792 | |
Alexei Fedorov | b567e5d | 2019-03-11 16:51:47 +0000 | [diff] [blame] | 2793 | If ``ARM_ARCH_MAJOR == 8`` and ``ARM_ARCH_MINOR >= 3`` the code footprint of |
| 2794 | enabling PAuth is lower because the compiler will use the optimized |
| 2795 | PAuth instructions rather than the backwards-compatible ones. |
| 2796 | |
Alexei Fedorov | 90f2e88 | 2019-05-24 12:17:09 +0100 | [diff] [blame] | 2797 | Armv8.5-A |
| 2798 | ~~~~~~~~~ |
| 2799 | |
| 2800 | - The Branch Target Identification feature is selected by setting the |
Manish Pandey | 34a305e | 2021-10-21 21:53:49 +0100 | [diff] [blame] | 2801 | ``BRANCH_PROTECTION`` option to 1. This option defaults to 0. |
Justin Chadwell | 55c7351 | 2019-07-18 16:16:32 +0100 | [diff] [blame] | 2802 | |
Govindraj Raja | c1be66f | 2024-03-07 14:42:20 -0600 | [diff] [blame] | 2803 | - The Memory Tagging Extension feature has a few variants, but not all of them |
| 2804 | require enablement from EL3 to be used at lower ELs. For example, memory |
| 2805 | tagging only at EL0 (MTE) does not require EL3 configuration, whereas memory |
| 2806 | tagging at EL2/EL1 (MTE2) does require EL3 enablement, which is done by setting |
| 2807 | ``ENABLE_FEAT_MTE2`` to 1. This option defaults to 0. |
Alexei Fedorov | 90f2e88 | 2019-05-24 12:17:09 +0100 | [diff] [blame] | 2808 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2809 | Armv7-A |
| 2810 | ~~~~~~~ |
Etienne Carriere | 1374fcb | 2017-11-08 13:48:40 +0100 | [diff] [blame] | 2811 | |
| 2812 | This Architecture Extension is targeted when ``ARM_ARCH_MAJOR`` == 7. |
| 2813 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2814 | There are several Armv7-A extensions available. Obviously the TrustZone |
| 2815 | extension is mandatory to support the TF-A bootloader and runtime services. |
Etienne Carriere | 1374fcb | 2017-11-08 13:48:40 +0100 | [diff] [blame] | 2816 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2817 | A platform implementing an Armv7-A system can define its target |
Etienne Carriere | 1374fcb | 2017-11-08 13:48:40 +0100 | [diff] [blame] | 2818 | Cortex-A architecture through ``ARM_CORTEX_A<X> = yes`` in its |
Paul Beesley | 1fbc97b | 2019-01-11 18:26:51 +0000 | [diff] [blame] | 2819 | ``platform.mk`` script. For example, ``ARM_CORTEX_A15=yes`` for a |
Etienne Carriere | 1374fcb | 2017-11-08 13:48:40 +0100 | [diff] [blame] | 2820 | Cortex-A15 target. |
| 2821 | |
| 2822 | A platform can also set ``ARM_WITH_NEON=yes`` to enable NEON support. |
Paul Beesley | f2ec714 | 2019-10-04 16:17:46 +0000 | [diff] [blame] | 2823 | Note that using NEON at runtime places constraints on the Non-secure world context. |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2824 | TF-A does not yet provide VFP context management. |
Etienne Carriere | 1374fcb | 2017-11-08 13:48:40 +0100 | [diff] [blame] | 2825 | |
| 2826 | The ``ARM_CORTEX_A<X>`` and ``ARM_WITH_NEON`` directives are used to set |
| 2827 | the toolchain target architecture directive. |
| 2828 | |
| 2829 | Alternatively, a platform may set the toolchain target architecture directive |
Govindraj Raja | cd10c6e | 2023-05-30 16:52:15 -0500 | [diff] [blame] | 2830 | directly by defining ``MARCH_DIRECTIVE``. |
Etienne Carriere | 1374fcb | 2017-11-08 13:48:40 +0100 | [diff] [blame] | 2831 | For example: |
| 2832 | |
Paul Beesley | 493e349 | 2019-03-13 15:11:04 +0000 | [diff] [blame] | 2833 | .. code:: make |
Etienne Carriere | 1374fcb | 2017-11-08 13:48:40 +0100 | [diff] [blame] | 2834 | |
Govindraj Raja | 8152565 | 2023-07-18 13:55:33 -0500 | [diff] [blame] | 2835 | MARCH_DIRECTIVE := -march=armv7-a |
Etienne Carriere | 1374fcb | 2017-11-08 13:48:40 +0100 | [diff] [blame] | 2836 | |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2837 | Code Structure |
| 2838 | -------------- |
| 2839 | |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2840 | TF-A code is logically divided between the three boot loader stages mentioned |
| 2841 | in the previous sections. The code is also divided into the following |
| 2842 | categories (present as directories in the source code): |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2843 | |
| 2844 | - **Platform specific.** Choice of architecture specific code depends upon |
| 2845 | the platform. |
| 2846 | - **Common code.** This is platform and architecture agnostic code. |
| 2847 | - **Library code.** This code comprises functionality commonly used by all |
| 2848 | other code. The PSCI implementation and other EL3 runtime frameworks reside |
| 2849 | as Library components. |
| 2850 | - **Stage specific.** Code specific to a boot stage. |
| 2851 | - **Drivers.** |
| 2852 | - **Services.** EL3 runtime services (e.g. SPD). Specific SPD services |
| 2853 | reside in the ``services/spd`` directory (e.g. ``services/spd/tspd``). |
| 2854 | |
| 2855 | Each boot loader stage uses code from one or more of the above mentioned |
| 2856 | categories. Based upon the above, the code layout looks like this: |
| 2857 | |
| 2858 | :: |
| 2859 | |
| 2860 | Directory Used by BL1? Used by BL2? Used by BL31? |
| 2861 | bl1 Yes No No |
| 2862 | bl2 No Yes No |
| 2863 | bl31 No No Yes |
| 2864 | plat Yes Yes Yes |
| 2865 | drivers Yes No Yes |
| 2866 | common Yes Yes Yes |
| 2867 | lib Yes Yes Yes |
| 2868 | services No No Yes |
| 2869 | |
Sandrine Bailleux | 15530dd | 2019-02-08 15:26:36 +0100 | [diff] [blame] | 2870 | The build system provides a non-configurable build option ``IMAGE_BLx`` for each |
| 2871 | boot loader stage (where x = BL stage). For example, for BL1, ``IMAGE_BL1`` will be |
Dan Handley | 610e7e1 | 2018-03-01 18:44:00 +0000 | [diff] [blame] | 2872 | defined by the build system. This enables TF-A to compile certain code only |
| 2873 | for specific boot loader stages. |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2874 | |
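Stage-specific compilation with ``IMAGE_BLx`` can be sketched with ordinary preprocessor conditionals. In the sketch below the ``#define`` of ``IMAGE_BL31`` stands in for what the build system would supply when building BL31, and ``enum bl_stage``, ``current_stage()`` and ``current_stage_example()`` are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Stand-in for the definition the build system would supply for BL31. */
#ifndef IMAGE_BL31
#define IMAGE_BL31 1
#endif

/* Hypothetical identifiers for the boot loader stages. */
enum bl_stage {
    STAGE_BL1,
    STAGE_BL2,
    STAGE_BL31,
    STAGE_UNKNOWN
};

/* Only the branch matching the image being built is compiled in. */
static enum bl_stage current_stage(void)
{
#if defined(IMAGE_BL1)
    return STAGE_BL1;
#elif defined(IMAGE_BL2)
    return STAGE_BL2;
#elif defined(IMAGE_BL31)
    return STAGE_BL31;
#else
    return STAGE_UNKNOWN;
#endif
}

/* With the stand-in definition above, the BL31 branch is selected. */
static void current_stage_example(void)
{
    assert(current_stage() == STAGE_BL31);
}
```
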
| 2875 | All assembler files have the ``.S`` extension. The linker source files for each |
| 2876 | boot stage have the extension ``.ld.S``. These are processed by GCC to create the |
| 2877 | linker scripts which have the extension ``.ld``. |
| 2878 | |
| 2879 | FDTs provide a description of the hardware platform and are used by the Linux |
| 2880 | kernel at boot time. These can be found in the ``fdts`` directory. |
| 2881 | |
Paul Beesley | f864067 | 2019-04-12 14:19:42 +0100 | [diff] [blame] | 2882 | .. rubric:: References |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2883 | |
Paul Beesley | f864067 | 2019-04-12 14:19:42 +0100 | [diff] [blame] | 2884 | - `Trusted Board Boot Requirements CLIENT (TBBR-CLIENT) Armv8-A (ARM DEN0006D)`_ |
| 2885 | |
Manish V Badarkhe | 9d24e9b | 2023-06-15 09:14:33 +0100 | [diff] [blame] | 2886 | - `PSCI`_ |
Paul Beesley | f864067 | 2019-04-12 14:19:42 +0100 | [diff] [blame] | 2887 | |
Sandrine Bailleux | d9202df | 2020-04-17 14:06:52 +0200 | [diff] [blame] | 2888 | - `SMC Calling Convention`_ |
Paul Beesley | f864067 | 2019-04-12 14:19:42 +0100 | [diff] [blame] | 2889 | |
| 2890 | - :ref:`Interrupt Management Framework` |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2891 | |
| 2892 | -------------- |
| 2893 | |
Govindraj Raja | 24d3a4e | 2023-12-21 13:57:49 -0600 | [diff] [blame] | 2894 | *Copyright (c) 2013-2024, Arm Limited and Contributors. All rights reserved.* |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2895 | |
laurenw-arm | 03e7e61 | 2020-04-16 10:02:17 -0500 | [diff] [blame] | 2896 | .. _SMCCC: https://developer.arm.com/docs/den0028/latest |
Manish V Badarkhe | 9d24e9b | 2023-06-15 09:14:33 +0100 | [diff] [blame] | 2897 | .. _PSCI: https://developer.arm.com/documentation/den0022/latest/ |
Petre-Ionut Tudor | 620a702 | 2019-09-27 15:13:21 +0100 | [diff] [blame] | 2898 | .. _Arm ARM: https://developer.arm.com/docs/ddi0487/latest |
laurenw-arm | 03e7e61 | 2020-04-16 10:02:17 -0500 | [diff] [blame] | 2899 | .. _SMC Calling Convention: https://developer.arm.com/docs/den0028/latest |
Sandrine Bailleux | f238417 | 2024-02-02 11:16:12 +0100 | [diff] [blame] | 2900 | .. _Trusted Board Boot Requirements CLIENT (TBBR-CLIENT) Armv8-A (ARM DEN0006D): https://developer.arm.com/docs/den0006/latest |
Zelalem Aweke | 023b1a4 | 2021-10-21 13:59:45 -0500 | [diff] [blame] | 2901 | .. _Arm Confidential Compute Architecture (Arm CCA): https://www.arm.com/why-arm/architecture/security-features/arm-confidential-compute-architecture |
Manish Pandey | 493bdc4 | 2023-07-21 13:08:53 +0100 | [diff] [blame] | 2902 | .. _AArch64 exception vector table: https://developer.arm.com/documentation/100933/0100/AArch64-exception-vector-table |
Douglas Raillard | d7c21b7 | 2017-06-28 15:23:03 +0100 | [diff] [blame] | 2903 | |
Paul Beesley | 814f8c0 | 2019-03-13 15:49:27 +0000 | [diff] [blame] | 2904 | .. |Image 1| image:: ../resources/diagrams/rt-svc-descs-layout.png |