Granule Protection Tables Library
=================================

This document describes the design of the Granule Protection Tables (GPT)
library used by Trusted Firmware-A (TF-A). This library provides the APIs needed
to initialize the GPTs based on a data structure containing information about
the system's memory layout, configure the system registers to enable granule
protection checks based on these tables, and transition granules between
different PAS (physical address spaces) at runtime.

Arm CCA adds two new security states for a total of four: root, realm, secure,
and non-secure. In addition to new security states, corresponding physical
address spaces have been added to control memory access for each state. The PAS
access allowed to each security state can be seen in the table below.

.. list-table:: Security states and PAS access rights
   :widths: 25 25 25 25 25
   :header-rows: 1

   * -
     - Root state
     - Realm state
     - Secure state
     - Non-secure state
   * - Root PAS
     - yes
     - no
     - no
     - no
   * - Realm PAS
     - yes
     - yes
     - no
     - no
   * - Secure PAS
     - yes
     - no
     - yes
     - no
   * - Non-secure PAS
     - yes
     - yes
     - yes
     - yes

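The access rights in the table can be expressed as a small lookup matrix. The
following is a minimal sketch for illustration only; the enum and function names
are hypothetical and are not part of the GPT library.

```c
#include <stdbool.h>

/* Hypothetical identifiers for illustration; not the TF-A definitions. */
enum ex_state { EX_STATE_ROOT, EX_STATE_REALM, EX_STATE_SECURE,
                EX_STATE_NON_SECURE, EX_STATE_COUNT };
enum ex_pas { EX_PAS_ROOT, EX_PAS_REALM, EX_PAS_SECURE,
              EX_PAS_NON_SECURE, EX_PAS_COUNT };

/* Rows are PAS, columns are security states, mirroring the table. */
static const bool ex_pas_access[EX_PAS_COUNT][EX_STATE_COUNT] = {
    [EX_PAS_ROOT]       = { true, false, false, false },
    [EX_PAS_REALM]      = { true, true,  false, false },
    [EX_PAS_SECURE]     = { true, false, true,  false },
    [EX_PAS_NON_SECURE] = { true, true,  true,  true  },
};

/* Return true if code running in 'state' may access memory in 'pas'. */
bool ex_pas_access_allowed(enum ex_state state, enum ex_pas pas)
{
    return ex_pas_access[pas][state];
}
```

For example, ``ex_pas_access_allowed(EX_STATE_REALM, EX_PAS_SECURE)`` is false,
while every state may access the non-secure PAS.
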
The GPT can function as either a 1 level or 2 level lookup depending on how a
PAS region is configured. The first step is the level 0 table. Each entry in the
level 0 table controls access to a relatively large region in memory (GPT Block
descriptor), and the entire region can belong to a single PAS when a one step
mapping is used. A level 0 entry can also link to a level 1 table (GPT Table
descriptor) with a 2 step mapping. To change the PAS of a region dynamically,
the region must be mapped in a level 1 table.

Level 1 table entries with the same PAS can be combined to form a contiguous
block entry using the GPT Contiguous descriptor. This is explained in more
detail in the following section.

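To make the two step lookup concrete, the sketch below computes which level 0
entry and which level 1 granule slot cover a given physical address. It assumes
an example configuration of 1GB level 0 regions and 4KB granules, and two
granules described per byte of level 1 table. All names are hypothetical and
the code does not reflect the library's internal implementation.

```c
#include <stdint.h>

/* Assumed example configuration: 1GB L0 regions, 4KB granules. */
#define EX_L0_REGION_SHIFT   30U
#define EX_GRANULE_SHIFT     12U
#define EX_GRANULES_PER_BYTE 2U

/* Index of the level 0 entry that covers 'pa'. */
uint64_t ex_l0_index(uint64_t pa)
{
    return pa >> EX_L0_REGION_SHIFT;
}

/* Index of the granule within the level 1 table linked from that entry. */
uint64_t ex_l1_granule_index(uint64_t pa)
{
    uint64_t offset = pa & ((UINT64_C(1) << EX_L0_REGION_SHIFT) - 1U);

    return offset >> EX_GRANULE_SHIFT;
}

/* Byte offset within the level 1 table that describes the granule. */
uint64_t ex_l1_byte_offset(uint64_t pa)
{
    return ex_l1_granule_index(pa) / EX_GRANULES_PER_BYTE;
}
```

With these assumptions, address ``0x40003000`` falls in level 0 entry 1 and is
granule 3 of that region, described by byte 1 of the level 1 table.
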
Design Concepts and Interfaces
------------------------------

This section covers some important concepts and data structures used in the GPT
library.

There are three main parameters that determine how the tables are organized and
function: the PPS (protected physical space) which is the total amount of
protected physical address space in the system, PGS (physical granule size)
which is how large each level 1 granule is, and L0GPTSZ (level 0 GPT size) which
determines how much physical memory is governed by each level 0 entry. A granule
is the smallest unit of memory that can be independently assigned to a PAS.

L0GPTSZ is determined by the hardware and is read from the GPCCR_EL3 register.
PPS and PGS are passed into the APIs at runtime and can be determined in
whatever way is best for a given platform, either through some algorithm or hard
coded in the firmware.

GPT setup is split into two parts: table creation and runtime initialization. In
the table creation step, a data structure containing information about the
desired PAS regions is passed into the library, which validates the mappings,
creates the tables in memory, and enables granule protection checks. It also
allocates memory for fine-grained locks adjacent to the L0 tables. In the
runtime initialization step, the runtime firmware locates the existing tables in
memory using the GPT register configuration and saves important data to a
structure used by the granule transition service, which is covered in more
detail below.

In the reference implementation for FVP models, you can find an example of PAS
region definitions in the file ``plat/arm/board/fvp/include/fvp_pas_def.h``.
Table creation API calls can be found in ``plat/arm/common/arm_common.c`` and
runtime initialization API calls can be seen in
``plat/arm/common/arm_bl31_setup.c``.

During table creation, the GPT library opportunistically fuses contiguous GPT L1
entries that have the same PAS. The maximum size of supported contiguous blocks
is defined by the ``RME_GPT_MAX_BLOCK`` build option.

Defining PAS regions
~~~~~~~~~~~~~~~~~~~~

A ``pas_region_t`` structure is a way to represent a physical address space and
its attributes that can be used by the GPT library to initialize the tables.

This structure is composed of the following:

#. The base physical address
#. The region size
#. The desired attributes of this memory region (mapping type, PAS type)

See the ``pas_region_t`` type in ``include/lib/gpt_rme/gpt_rme.h``.

The programmer should provide the API with an array containing ``pas_region_t``
structures. The library then checks the desired memory access layout for
validity and creates tables to implement it.

``pas_region_t`` is a public type; however, it is recommended that the macros
``GPT_MAP_REGION_BLOCK`` and ``GPT_MAP_REGION_GRANULE`` be used to populate
these structures instead of doing it manually to reduce the risk of future
compatibility issues. These macros take the base physical address, region size,
and PAS type as arguments to generate the ``pas_region_t`` structure. As the
names imply, ``GPT_MAP_REGION_BLOCK`` creates a region using only L0 mapping
while ``GPT_MAP_REGION_GRANULE`` creates a region using L0 and L1 mappings.

Level 0 and Level 1 Tables
~~~~~~~~~~~~~~~~~~~~~~~~~~

The GPT initialization APIs require memory to be passed in for the tables to be
constructed. The ``gpt_init_l0_tables`` API takes a memory address and size for
building the level 0 tables and for allocating the fine-grained bitlock data
structure. The amount of memory needed for the bitlock structure is controlled
by the ``RME_GPT_BITLOCK_BLOCK`` build option, which defines the size of the
memory block covered by each bit of the bitlock.

The ``gpt_init_pas_l1_tables`` API takes an address and size for
building the level 1 tables which are linked from level 0 descriptors. The
tables should have PAS type ``GPT_GPI_ROOT`` and a typical system might place
its level 0 table in SRAM and its level 1 table(s) in DRAM.

Granule Transition Service
~~~~~~~~~~~~~~~~~~~~~~~~~~

The Granule Transition Service allows the ownership of memory mapped with
``GPT_MAP_REGION_GRANULE`` to be changed using SMC calls. Non-secure granules
can be transitioned to either realm or secure space, and realm and secure
granules can be transitioned back to non-secure. This library only allows
Level 1 entries to be transitioned. The library may opportunistically either
shatter contiguous blocks or fuse adjacent GPT entries to form a contiguous
block. Depending on the maximum block size, the fuse operation may propagate to
higher block sizes as allowed by the RME Architecture. A higher maximum block
size may therefore have a higher runtime cost due to the software operations
needed to fuse entries into bigger blocks. This cost may be offset by better
TLB performance due to the higher block size, and platforms need to make the
trade-off decision based on their particular workload.

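The permitted transitions described above can be summarized in a small
predicate. This is an illustrative sketch only; the enum and function names are
hypothetical and do not correspond to the library's SMC handlers.

```c
#include <stdbool.h>

/* Hypothetical PAS identifiers for illustration; not the TF-A GPI values. */
enum ex_gts_pas { EX_GTS_NON_SECURE, EX_GTS_REALM, EX_GTS_SECURE, EX_GTS_ROOT };

/* Return true if a granule may be transitioned from 'from' to 'to'. */
bool ex_transition_allowed(enum ex_gts_pas from, enum ex_gts_pas to)
{
    /* Non-secure granules may be delegated to realm or secure space. */
    if (from == EX_GTS_NON_SECURE) {
        return (to == EX_GTS_REALM) || (to == EX_GTS_SECURE);
    }

    /* Realm and secure granules may only be returned to non-secure. */
    if ((from == EX_GTS_REALM) || (from == EX_GTS_SECURE)) {
        return to == EX_GTS_NON_SECURE;
    }

    /* Any other transition is rejected in this sketch. */
    return false;
}
```

Note that a direct realm to secure transition is rejected; under the rules
above the granule would first have to be returned to non-secure.
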
Locking Scheme
~~~~~~~~~~~~~~

During a granule transition, access to the L1 tables is controlled by a lock to
ensure that no more than one CPU is allowed to make changes at any given time.
The granularity of the lock is defined by the ``RME_GPT_BITLOCK_BLOCK`` build
option, which defines the size of the memory block protected by one bit of the
``bitlock`` structure. Setting this option to 0 chooses a single spinlock for
all GPT L1 table entries.

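As a rough illustration of how a physical address could map to a lock bit, the
sketch below assumes that ``RME_GPT_BITLOCK_BLOCK`` counts 512MB blocks per bit
and that each ``bitlock`` holds 8 bits. The bit ordering within a byte (least
significant bit first) and all names are assumptions made for this example, not
the library's implementation.

```c
#include <stdint.h>

/* Stand-in for the RME_GPT_BITLOCK_BLOCK build option (assumed non-zero). */
#define EX_BITLOCK_BLOCK  UINT64_C(1)
/* One unit of the option corresponds to 512MB of memory. */
#define EX_BLOCK_UNIT     UINT64_C(0x20000000)
/* Number of lock bits held in one bitlock element. */
#define EX_BITS_PER_LOCK  8U

/* Global index of the lock bit that protects 'pa'. */
uint64_t ex_lock_bit(uint64_t pa)
{
    return pa / (EX_BITLOCK_BLOCK * EX_BLOCK_UNIT);
}

/* Index of the bitlock element that holds that bit. */
uint64_t ex_lock_index(uint64_t pa)
{
    return ex_lock_bit(pa) / EX_BITS_PER_LOCK;
}

/* Mask selecting the bit inside its element (LSB-first is an assumption). */
uint8_t ex_lock_mask(uint64_t pa)
{
    return (uint8_t)(1U << (ex_lock_bit(pa) % EX_BITS_PER_LOCK));
}
```

With one 512MB block per bit, address ``0x100000000`` (4GB) maps to bit 8, that
is, the first bit of the second element.
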
Library APIs
------------

The public APIs and types can be found in ``include/lib/gpt_rme/gpt_rme.h`` and this
section is intended to provide additional details and clarifications.

To create the GPTs and enable granule protection checks, the APIs need to be
called in the correct order and at the correct time during the system boot
process.

#. Firmware must enable the MMU.
#. Firmware must call ``gpt_init_l0_tables`` to initialize the level 0 tables to
   a default state, that is, initializing all of the L0 descriptors to allow all
   accesses to all memory. The PPS is provided to this function as an argument.
#. The system performs DDR discovery and initialization. The discovered DDR
   region(s) are then added to the L1 PAS regions to be initialized in the next
   step and used by the GTSI at runtime.
#. Firmware must call ``gpt_init_pas_l1_tables`` with a pointer to an array of
   ``pas_region_t`` structures containing the desired memory access layout. The
   PGS is provided to this function as an argument.
#. Firmware must call ``gpt_enable`` to enable granule protection checks by
   setting the correct register values.
#. In systems that make use of the granule transition service, runtime
   firmware must call ``gpt_runtime_init`` to set up the data structures needed
   by the GTSI to find the tables and transition granules between PAS types.

API Constraints
~~~~~~~~~~~~~~~

The values allowed by the API for PPS and PGS are enumerated types
defined in the file ``include/lib/gpt_rme/gpt_rme.h``.

Allowable values for PPS along with their corresponding size.

* ``GPCCR_PPS_4GB`` (4GB protected space, 0x100000000 bytes)
* ``GPCCR_PPS_64GB`` (64GB protected space, 0x1000000000 bytes)
* ``GPCCR_PPS_1TB`` (1TB protected space, 0x10000000000 bytes)
* ``GPCCR_PPS_4TB`` (4TB protected space, 0x40000000000 bytes)
* ``GPCCR_PPS_16TB`` (16TB protected space, 0x100000000000 bytes)
* ``GPCCR_PPS_256TB`` (256TB protected space, 0x1000000000000 bytes)
* ``GPCCR_PPS_4PB`` (4PB protected space, 0x10000000000000 bytes)

Allowable values for PGS along with their corresponding size.

* ``GPCCR_PGS_4K`` (4KB granules, 0x1000 bytes)
* ``GPCCR_PGS_16K`` (16KB granules, 0x4000 bytes)
* ``GPCCR_PGS_64K`` (64KB granules, 0x10000 bytes)

Allowable values for L0GPTSZ along with the corresponding size.

* ``GPCCR_L0GPTSZ_30BITS`` (1GB regions, 0x40000000 bytes)
* ``GPCCR_L0GPTSZ_34BITS`` (16GB regions, 0x400000000 bytes)
* ``GPCCR_L0GPTSZ_36BITS`` (64GB regions, 0x1000000000 bytes)
* ``GPCCR_L0GPTSZ_39BITS`` (512GB regions, 0x8000000000 bytes)

Note that the value of the PPS, PGS, and L0GPTSZ definitions is an encoded value
corresponding to the size, not the size itself. The decoded hex representations
of the sizes have been provided for convenience.

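Every size listed above is a power of two, so each decoded value can be
reproduced from an address width in bits. The sketch below pairs each constant
name with the width implied by its documented size. The widths are derived from
the sizes in the lists and are not the register encodings; the table and
function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Name of a constant and the address width implied by its decoded size. */
struct ex_size_entry {
    const char *name;
    unsigned int bits;
};

static const struct ex_size_entry ex_sizes[] = {
    { "GPCCR_PPS_4GB", 32U },   { "GPCCR_PPS_64GB", 36U },
    { "GPCCR_PPS_1TB", 40U },   { "GPCCR_PPS_4TB", 42U },
    { "GPCCR_PPS_16TB", 44U },  { "GPCCR_PPS_256TB", 48U },
    { "GPCCR_PPS_4PB", 52U },
    { "GPCCR_PGS_4K", 12U },    { "GPCCR_PGS_16K", 14U },
    { "GPCCR_PGS_64K", 16U },
    { "GPCCR_L0GPTSZ_30BITS", 30U }, { "GPCCR_L0GPTSZ_34BITS", 34U },
    { "GPCCR_L0GPTSZ_36BITS", 36U }, { "GPCCR_L0GPTSZ_39BITS", 39U },
};

/* Return the decoded size in bytes for 'name', or 0 if it is not listed. */
uint64_t ex_decoded_size(const char *name)
{
    for (size_t i = 0; i < sizeof(ex_sizes) / sizeof(ex_sizes[0]); i++) {
        if (strcmp(ex_sizes[i].name, name) == 0) {
            return UINT64_C(1) << ex_sizes[i].bits;
        }
    }

    return 0U;
}
```

For instance, ``ex_decoded_size("GPCCR_PPS_4TB")`` yields ``0x40000000000``,
matching the list above.
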
The L0 table memory has some constraints that must be taken into account.

* The L0 table must be aligned to either the table size or 4096 bytes, whichever
  is greater. L0 table size is the total protected space (PPS) divided by the
  size of each L0 region (L0GPTSZ) multiplied by the size of each L0 descriptor
  (8 bytes). ((PPS / L0GPTSZ) * 8)
* The L0 memory size must be greater than the table size and have enough space
  to allocate an array of ``bitlock`` structures at the end of the L0 table if
  required (``RME_GPT_BITLOCK_BLOCK`` is not 0).
* The L0 memory must fall within a PAS of type ``GPT_GPI_ROOT``.

The L1 memory also has some constraints.

* The L1 tables must be aligned to their size. The size of each L1 table is the
  size of each L0 region (L0GPTSZ) divided by the granule size (PGS) divided by
  the granules controlled in each byte (2). ((L0GPTSZ / PGS) / 2)
* There must be enough L1 memory supplied to build all requested L1 tables.
* The L1 memory must fall within a PAS of type ``GPT_GPI_ROOT``.

If an invalid combination of parameters is supplied, the APIs will print an
error message and return a negative value. The return values of APIs should be
checked to ensure successful configuration.

Sample calculation for L0 memory size and alignment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let PPS=\ ``GPCCR_PPS_4GB`` and L0GPTSZ=\ ``GPCCR_L0GPTSZ_30BITS``

We can find the total L0 table size with ((PPS / L0GPTSZ) * 8)

Substitute values to get this: ((0x100000000 / 0x40000000) * 8)

And solve to get 32 bytes. In this case, 4096 is greater than 32, so the L0
tables must be aligned to 4096 bytes.

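The same calculation can be written as a pair of helper functions. This is a
minimal sketch that takes the decoded sizes in bytes rather than the encoded
register values; the function names are hypothetical.

```c
#include <stdint.h>

#define EX_L0_DESC_SIZE   UINT64_C(8)     /* bytes per L0 descriptor */
#define EX_L0_MIN_ALIGN   UINT64_C(4096)  /* minimum L0 table alignment */

/* L0 table size in bytes: ((PPS / L0GPTSZ) * 8), using decoded sizes. */
uint64_t ex_l0_table_size(uint64_t pps_bytes, uint64_t l0gptsz_bytes)
{
    return (pps_bytes / l0gptsz_bytes) * EX_L0_DESC_SIZE;
}

/* Required alignment: the table size or 4096 bytes, whichever is greater. */
uint64_t ex_l0_table_align(uint64_t pps_bytes, uint64_t l0gptsz_bytes)
{
    uint64_t size = ex_l0_table_size(pps_bytes, l0gptsz_bytes);

    return (size > EX_L0_MIN_ALIGN) ? size : EX_L0_MIN_ALIGN;
}
```

Calling ``ex_l0_table_size(0x100000000, 0x40000000)`` reproduces the 32 byte
result, and ``ex_l0_table_align`` returns 4096 for the same inputs.
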
Sample calculation for bitlock array size
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let PPS=\ ``GPCCR_PPS_256TB`` and ``RME_GPT_BITLOCK_BLOCK``\ =1

The size of the bitlock array in bits is the total protected space (PPS) divided
by the size of the memory block covered by each bit. The size of that memory
block is ``RME_GPT_BITLOCK_BLOCK`` (number of 512MB blocks per bit) times
512MB (0x20000000). The result is then divided by the number of bits in a
``bitlock`` structure (8) to get the size of the bit array in bytes.

In other words, we can find the total size of the ``bitlock`` array
in bytes with PPS / (RME_GPT_BITLOCK_BLOCK * 0x20000000 * 8).

Substitute values to get this: 0x1000000000000 / (1 * 0x20000000 * 8)

And solve to get 0x10000 bytes.

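A small helper can reproduce this arithmetic. The sketch below takes the
decoded PPS in bytes and the value of ``RME_GPT_BITLOCK_BLOCK``; returning 0
when the option is 0 reflects the single spinlock case, where no bit array is
needed. The function name is hypothetical.

```c
#include <stdint.h>

#define EX_BITLOCK_UNIT      UINT64_C(0x20000000)  /* 512MB per block */
#define EX_BITS_PER_BITLOCK  UINT64_C(8)           /* bits per bitlock */

/* Size of the bitlock array in bytes for a decoded PPS and block setting. */
uint64_t ex_bitlock_array_size(uint64_t pps_bytes, uint64_t bitlock_block)
{
    /* A setting of 0 selects a single spinlock, so no array is required. */
    if (bitlock_block == 0U) {
        return 0U;
    }

    return pps_bytes / (bitlock_block * EX_BITLOCK_UNIT * EX_BITS_PER_BITLOCK);
}
```

Doubling the block setting halves the array: the 256TB example needs 0x10000
bytes with a setting of 1 and 0x8000 bytes with a setting of 2.
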
Sample calculation for L1 table size and alignment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let PGS=\ ``GPCCR_PGS_4K`` and L0GPTSZ=\ ``GPCCR_L0GPTSZ_30BITS``

We can find the size of each L1 table with ((L0GPTSZ / PGS) / 2).

Substitute values: ((0x40000000 / 0x1000) / 2)

And solve to get 0x20000 bytes per L1 table. Because L1 tables must be aligned
to their size, each L1 table must also be aligned to 0x20000 bytes.
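
The L1 calculation can be sketched in the same way. The helper below takes
decoded sizes in bytes and assumes two granules are described per byte of table,
as stated in the constraints; the function name is hypothetical.

```c
#include <stdint.h>

#define EX_L1_GRANULES_PER_BYTE  UINT64_C(2)  /* granules described per byte */

/* Size of one L1 table in bytes: ((L0GPTSZ / PGS) / 2), using decoded sizes.
 * The same value is also the required alignment of the table. */
uint64_t ex_l1_table_size(uint64_t l0gptsz_bytes, uint64_t pgs_bytes)
{
    return (l0gptsz_bytes / pgs_bytes) / EX_L1_GRANULES_PER_BYTE;
}
```

With 1GB level 0 regions, 4KB granules give 0x20000 bytes per table, while 64KB
granules reduce this to 0x2000 bytes.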