Granule Protection Tables Library
=================================

This document describes the design of the Granule Protection Tables (GPT)
library used by Trusted Firmware-A (TF-A). This library provides the APIs needed
to initialize the GPTs based on a data structure containing information about
the system's memory layout, configure the system registers to enable granule
protection checks based on these tables, and transition granules between
different PAS (physical address spaces) at runtime.

Arm CCA adds two new security states for a total of four: root, realm, secure,
and non-secure. In addition to new security states, corresponding physical
address spaces have been added to control memory access for each state. The PAS
access allowed to each security state can be seen in the table below.

.. list-table:: Security states and PAS access rights
   :widths: 25 25 25 25 25
   :header-rows: 1

   * -
     - Root state
     - Realm state
     - Secure state
     - Non-secure state
   * - Root PAS
     - yes
     - no
     - no
     - no
   * - Realm PAS
     - yes
     - yes
     - no
     - no
   * - Secure PAS
     - yes
     - no
     - yes
     - no
   * - Non-secure PAS
     - yes
     - yes
     - yes
     - yes

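The access rights in the table can be expressed as a small lookup matrix. The
following is a minimal sketch for illustration only: the enum names and the
``state_can_access`` helper are hypothetical and are not part of the GPT
library, which relies on hardware granule protection checks rather than a
software lookup.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative names only; these are not TF-A identifiers. */
enum sec_state { STATE_ROOT, STATE_REALM, STATE_SECURE, STATE_NS, STATE_COUNT };
enum pas_type { PAS_ROOT, PAS_REALM, PAS_SECURE, PAS_NS, PAS_COUNT };

/* Rows are PAS types, columns are security states, as in the table. */
static const bool pas_access[PAS_COUNT][STATE_COUNT] = {
	[PAS_ROOT]   = { true, false, false, false },
	[PAS_REALM]  = { true, true,  false, false },
	[PAS_SECURE] = { true, false, true,  false },
	[PAS_NS]     = { true, true,  true,  true  },
};

/* Return true if the given security state may access the given PAS. */
static bool state_can_access(enum sec_state state, enum pas_type pas)
{
	assert(state < STATE_COUNT && pas < PAS_COUNT);
	return pas_access[pas][state];
}
```

For example, ``state_can_access(STATE_REALM, PAS_NS)`` is true, while
``state_can_access(STATE_NS, PAS_REALM)`` is false.
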
The GPT can function as either a 1 level or 2 level lookup depending on how a
PAS region is configured. The first step is the level 0 table. Each entry in the
level 0 table controls access to a relatively large region in memory (GPT Block
descriptor), and the entire region can belong to a single PAS when a one step
mapping is used. A level 0 entry can also link to a level 1 table (GPT Table
descriptor) with a 2 step mapping. To change the PAS of a region dynamically,
the region must be mapped in a level 1 table.

Level 1 table entries with the same PAS can be combined to form a contiguous
block entry using the GPT Contiguous descriptor. More details about this are
given in the following section.

Design Concepts and Interfaces
------------------------------

This section covers some important concepts and data structures used in the GPT
library.

There are three main parameters that determine how the tables are organized and
function: the PPS (protected physical space), which is the total amount of
protected physical address space in the system; the PGS (physical granule size),
which is how large each level 1 granule is; and the L0GPTSZ (level 0 GPT size),
which determines how much physical memory is governed by each level 0 entry. A
granule is the smallest unit of memory that can be independently assigned to a
PAS.

L0GPTSZ is determined by the hardware and is read from the GPCCR_EL3 register.
PPS and PGS are passed into the APIs at runtime and can be determined in
whatever way is best for a given platform, either through some algorithm or hard
coded in the firmware.

GPT setup is split into two parts: table creation and runtime initialization. In
the table creation step, a data structure containing information about the
desired PAS regions is passed into the library, which validates the mappings,
creates the tables in memory, and enables granule protection checks. It also
allocates memory for fine-grained locks adjacent to the L0 tables. In the
runtime initialization step, the runtime firmware locates the existing tables in
memory using the GPT register configuration and saves important data to a
structure used by the granule transition service, which is covered in more
detail below.

In the reference implementation for FVP models, you can find an example of PAS
region definitions in the file ``plat/arm/board/fvp/include/fvp_pas_def.h``.
Table creation API calls can be found in ``plat/arm/common/arm_common.c`` and
runtime initialization API calls can be seen in
``plat/arm/common/arm_bl31_setup.c``.

During table creation, the GPT library opportunistically fuses contiguous GPT L1
entries that have the same PAS. The maximum size of supported contiguous blocks
is defined by the ``RME_GPT_MAX_BLOCK`` build option.

Defining PAS regions
~~~~~~~~~~~~~~~~~~~~

A ``pas_region_t`` structure is a way to represent a physical address space and
its attributes that can be used by the GPT library to initialize the tables.

This structure is composed of the following:

#. The base physical address
#. The region size
#. The desired attributes of this memory region (mapping type, PAS type)

See the ``pas_region_t`` type in ``include/lib/gpt_rme/gpt_rme.h``.

The programmer should provide the API with an array containing ``pas_region_t``
structures. The library then checks the desired memory access layout for
validity and creates tables to implement it.

``pas_region_t`` is a public type. However, it is recommended that the macros
``GPT_MAP_REGION_BLOCK`` and ``GPT_MAP_REGION_GRANULE`` be used to populate
these structures instead of doing it manually, to reduce the risk of future
compatibility issues. These macros take the base physical address, region size,
and PAS type as arguments to generate the ``pas_region_t`` structure. As the
names imply, ``GPT_MAP_REGION_BLOCK`` creates a region using only L0 mapping
while ``GPT_MAP_REGION_GRANULE`` creates a region using L0 and L1 mappings.

Level 0 and Level 1 Tables
~~~~~~~~~~~~~~~~~~~~~~~~~~

The GPT initialization APIs require memory to be passed in for the tables to be
constructed. The ``gpt_init_l0_tables`` API takes a memory address and size for
building the level 0 tables and for allocating the fine-grained bitlock data
structure. The amount of memory needed for the bitlock structure is controlled
via the ``RME_GPT_BITLOCK_BLOCK`` config, which defines the block size for each
bit of the bitlock.

The ``gpt_init_pas_l1_tables`` API takes an address and size for
building the level 1 tables which are linked from level 0 descriptors. The
tables should have PAS type ``GPT_GPI_ROOT`` and a typical system might place
its level 0 table in SRAM and its level 1 table(s) in DRAM.

Granule Transition Service
~~~~~~~~~~~~~~~~~~~~~~~~~~

The Granule Transition Service allows the ownership of memory mapped with
``GPT_MAP_REGION_GRANULE`` to be changed using SMC calls. Non-secure
granules can be transitioned to either realm or secure space, and realm and
secure granules can be transitioned back to non-secure. This library only
allows Level 1 entries to be transitioned. The library may opportunistically
either shatter contiguous blocks or fuse adjacent GPT entries to form a
contiguous block. Depending on the maximum block size, the fuse operation may
propagate to higher block sizes as allowed by the RME Architecture. Thus a
higher maximum block size may have a higher runtime cost due to the software
operations needed to fuse entries into bigger blocks. This cost may be offset by
better TLB performance due to the higher block size, and platforms need to make
the trade-off decision based on their particular workload.

Locking Scheme
~~~~~~~~~~~~~~

During a granule transition, access to L1 tables is controlled by a lock to
ensure that no more than one CPU is allowed to make changes at any given time.
The granularity of the lock is defined by the ``RME_GPT_BITLOCK_BLOCK`` build
option, which defines the size of the memory block protected by one bit of the
``bitlock`` structure. Setting this option to 0 chooses a single spinlock for
all GPT L1 table entries.

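To illustrate the idea of one lock bit per memory block, the sketch below maps a
physical address to a byte index and bit mask within a lock array. It assumes
that ``RME_GPT_BITLOCK_BLOCK`` counts 512MB units per bit (as in the bitlock
sizing calculation later in this document) and that each lock byte holds 8 bits,
with bit 0 guarding the lowest block. The ``bitlock_pos`` type and
``bitlock_for_pa`` helper are hypothetical and do not reflect the library's
internal implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed unit for RME_GPT_BITLOCK_BLOCK: 512MB per block. */
#define BITLOCK_UNIT_SIZE	0x20000000ULL

/* Hypothetical result type: which byte and which bit guard an address. */
struct bitlock_pos {
	size_t byte;
	uint8_t mask;
};

/*
 * Map a physical address to its lock bit, assuming one bit protects
 * (bitlock_block * 512MB) of memory starting from address 0.
 */
static struct bitlock_pos bitlock_for_pa(uint64_t pa, unsigned int bitlock_block)
{
	assert(bitlock_block != 0U);

	uint64_t bit = pa / ((uint64_t)bitlock_block * BITLOCK_UNIT_SIZE);
	struct bitlock_pos pos = {
		.byte = (size_t)(bit / 8U),
		.mask = (uint8_t)(1U << (bit % 8U)),
	};

	return pos;
}
```

With a block setting of 1, two addresses that are less than 512MB apart and in
the same 512MB-aligned block share a lock bit, so transitions on them would
serialize under this scheme, while addresses in different blocks would not.
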
Library APIs
------------

The public APIs and types can be found in ``include/lib/gpt_rme/gpt_rme.h`` and this
section is intended to provide additional details and clarifications.

To create the GPTs and enable granule protection checks, the APIs need to be
called in the correct order and at the correct time during the system boot
process.

#. Firmware must enable the MMU.
#. Firmware must call ``gpt_init_l0_tables`` to initialize the level 0 tables to
   a default state, that is, initializing all of the L0 descriptors to allow all
   accesses to all memory. The PPS is provided to this function as an argument.
#. The system performs DDR discovery and initialization. The discovered DDR
   region(s) are then added to the L1 PAS regions to be initialized in the next
   step and used by the GTSI at runtime.
#. Firmware must call ``gpt_init_pas_l1_tables`` with a pointer to an array of
   ``pas_region_t`` structures containing the desired memory access layout. The
   PGS is provided to this function as an argument.
#. Firmware must call ``gpt_enable`` to enable granule protection checks by
   setting the correct register values.
#. In systems that make use of the granule transition service, runtime
   firmware must call ``gpt_runtime_init`` to set up the data structures needed
   by the GTSI to find the tables and transition granules between PAS types.

API Constraints
~~~~~~~~~~~~~~~

The values allowed by the API for PPS and PGS are enumerated types
defined in the file ``include/lib/gpt_rme/gpt_rme.h``.

Allowable values for PPS along with their corresponding size.

* ``GPCCR_PPS_4GB`` (4GB protected space, 0x100000000 bytes)
* ``GPCCR_PPS_64GB`` (64GB protected space, 0x1000000000 bytes)
* ``GPCCR_PPS_1TB`` (1TB protected space, 0x10000000000 bytes)
* ``GPCCR_PPS_4TB`` (4TB protected space, 0x40000000000 bytes)
* ``GPCCR_PPS_16TB`` (16TB protected space, 0x100000000000 bytes)
* ``GPCCR_PPS_256TB`` (256TB protected space, 0x1000000000000 bytes)
* ``GPCCR_PPS_4PB`` (4PB protected space, 0x10000000000000 bytes)

Allowable values for PGS along with their corresponding size.

* ``GPCCR_PGS_4K`` (4KB granules, 0x1000 bytes)
* ``GPCCR_PGS_16K`` (16KB granules, 0x4000 bytes)
* ``GPCCR_PGS_64K`` (64KB granules, 0x10000 bytes)

Allowable values for L0GPTSZ along with the corresponding size.

* ``GPCCR_L0GPTSZ_30BITS`` (1GB regions, 0x40000000 bytes)
* ``GPCCR_L0GPTSZ_34BITS`` (16GB regions, 0x400000000 bytes)
* ``GPCCR_L0GPTSZ_36BITS`` (64GB regions, 0x1000000000 bytes)
* ``GPCCR_L0GPTSZ_39BITS`` (512GB regions, 0x8000000000 bytes)

Note that the value of the PPS, PGS, and L0GPTSZ definitions is an encoded value
corresponding to the size, not the size itself. The decoded hex representations
of the sizes have been provided for convenience.

The L0 table memory has some constraints that must be taken into account.

* The L0 table must be aligned to either the table size or 4096 bytes, whichever
  is greater. L0 table size is the total protected space (PPS) divided by the
  size of each L0 region (L0GPTSZ) multiplied by the size of each L0 descriptor
  (8 bytes). ((PPS / L0GPTSZ) * 8)
* The L0 memory size must be large enough to hold the table and, if required
  (``RME_GPT_BITLOCK_BLOCK`` is not 0), the array of ``bitlock`` structures
  allocated at the end of the L0 table.
* The L0 memory must fall within a PAS of type ``GPT_GPI_ROOT``.

The L1 memory also has some constraints.

* The L1 tables must be aligned to their size. The size of each L1 table is the
  size of each L0 region (L0GPTSZ) divided by the granule size (PGS) divided by
  the granules controlled in each byte (2). ((L0GPTSZ / PGS) / 2)
* There must be enough L1 memory supplied to build all requested L1 tables.
* The L1 memory must fall within a PAS of type ``GPT_GPI_ROOT``.

If an invalid combination of parameters is supplied, the APIs will print an
error message and return a negative value. The return values of APIs should be
checked to ensure successful configuration.

Sample calculation for L0 memory size and alignment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let PPS=GPCCR_PPS_4GB and L0GPTSZ=GPCCR_L0GPTSZ_30BITS

We can find the total L0 table size with ((PPS / L0GPTSZ) * 8)

Substitute values to get this: ((0x100000000 / 0x40000000) * 8)

And solve to get 32 bytes. In this case, 4096 is greater than 32, so the L0
tables must be aligned to 4096 bytes.

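The same arithmetic can be written as a short helper. This is a sketch for
checking platform memory layouts by hand, not library code: the function and
macro names are hypothetical, and the inputs are the decoded sizes in bytes
rather than the encoded ``GPCCR_*`` values.

```c
#include <assert.h>
#include <stdint.h>

#define L0_DESC_SIZE	8ULL	/* Size of each L0 descriptor in bytes. */
#define L0_MIN_ALIGN	4096ULL	/* Minimum L0 table alignment in bytes. */

/* L0 table size in bytes: ((PPS / L0GPTSZ) * 8), using decoded sizes. */
static uint64_t l0_table_size(uint64_t pps_bytes, uint64_t l0gptsz_bytes)
{
	assert(l0gptsz_bytes != 0ULL);
	return (pps_bytes / l0gptsz_bytes) * L0_DESC_SIZE;
}

/* Required alignment: the table size or 4096 bytes, whichever is greater. */
static uint64_t l0_table_align(uint64_t pps_bytes, uint64_t l0gptsz_bytes)
{
	uint64_t size = l0_table_size(pps_bytes, l0gptsz_bytes);

	return (size > L0_MIN_ALIGN) ? size : L0_MIN_ALIGN;
}
```

For the values above, ``l0_table_size(0x100000000, 0x40000000)`` gives 32 and
``l0_table_align`` gives 4096.
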
Sample calculation for bitlock array size
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let PPS=GPCCR_PPS_256TB and RME_GPT_BITLOCK_BLOCK=1

The size of the bitlock array in bits is the total protected space (PPS) divided
by the size of the memory block per bit. The size of the memory block
is ``RME_GPT_BITLOCK_BLOCK`` (number of 512MB blocks per bit) times
512MB (0x20000000). This is then divided by the number of bits in the
``bitlock`` structure (8) to get the size of the bit array in bytes.

In other words, we can find the total size of the ``bitlock`` array
in bytes with PPS / (RME_GPT_BITLOCK_BLOCK * 0x20000000 * 8).

Substitute values to get this: 0x1000000000000 / (1 * 0x20000000 * 8)

And solve to get 0x10000 bytes.

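As a cross-check of the formula above, the calculation can be sketched as a
helper function. The function name is hypothetical and the PPS argument is the
decoded size in bytes. A block setting of 0 selects a single spinlock and needs
no bitlock array, so the sketch rejects it.

```c
#include <assert.h>
#include <stdint.h>

#define BITLOCK_UNIT_SIZE	0x20000000ULL	/* 512MB */
#define BITS_PER_BITLOCK	8ULL		/* Bits in one bitlock structure. */

/* Bitlock array size in bytes: PPS / (RME_GPT_BITLOCK_BLOCK * 512MB * 8). */
static uint64_t bitlock_array_size(uint64_t pps_bytes, unsigned int bitlock_block)
{
	assert(bitlock_block != 0U);
	return pps_bytes /
	       ((uint64_t)bitlock_block * BITLOCK_UNIT_SIZE * BITS_PER_BITLOCK);
}
```

For the values above, ``bitlock_array_size(0x1000000000000, 1)`` gives 0x10000.
Doubling the block setting to 2 halves the result to 0x8000.
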
Sample calculation for L1 table size and alignment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let PGS=GPCCR_PGS_4K and L0GPTSZ=GPCCR_L0GPTSZ_30BITS

We can find the size of each L1 table with ((L0GPTSZ / PGS) / 2).

Substitute values: ((0x40000000 / 0x1000) / 2)

And solve to get 0x20000 bytes per L1 table.
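
The L1 calculation can be sketched in the same way. The function name is
hypothetical and both arguments are decoded sizes in bytes. Because L1 tables
must be aligned to their size, the returned value is also the required
alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Each byte of an L1 table controls two granules. */
#define GRANULES_PER_BYTE	2ULL

/* L1 table size (and alignment) in bytes: ((L0GPTSZ / PGS) / 2). */
static uint64_t l1_table_size(uint64_t l0gptsz_bytes, uint64_t pgs_bytes)
{
	assert(pgs_bytes != 0ULL);
	return (l0gptsz_bytes / pgs_bytes) / GRANULES_PER_BYTE;
}
```

For the values above, ``l1_table_size(0x40000000, 0x1000)`` gives 0x20000. With
64KB granules and the same L0GPTSZ, the result drops to 0x2000.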