Blame - docs/firmware-design.rst - filogic/atf

blob: cc2fe118d78af4ad8e42d15bd0ad34fe1b03add1

Douglas Raillard	d7c21b7	2017-06-28 15:23:03 +0100	1	ARM Trusted Firmware Design
				2	===========================
				3
				4
				5	.. section-numbering::
				6	:suffix: .
				7
				8	.. contents::
				9
				10	The ARM Trusted Firmware implements a subset of the Trusted Board Boot
				11	Requirements (TBBR) Platform Design Document (PDD) [1] for ARM reference
				12	platforms. The TBB sequence starts when the platform is powered on and runs up
				13	to the stage where it hands off control to firmware running in the normal
				14	world in DRAM. This is the cold boot path.
				15
				16	The ARM Trusted Firmware also implements the Power State Coordination Interface
				17	PDD [2] as a runtime service. PSCI is the interface from normal world software
				18	to firmware implementing power management use-cases (for example, secondary CPU
				19	boot, hotplug and idle). Normal world software can access ARM Trusted Firmware
				20	runtime services via the ARM SMC (Secure Monitor Call) instruction. The SMC
				21	instruction must be used as mandated by the SMC Calling Convention [3].
				22
				23	The ARM Trusted Firmware implements a framework for configuring and managing
				24	interrupts generated in either security state. The details of the interrupt
				25	management framework and its design can be found in the ARM Trusted Firmware
				26	Interrupt Management Design guide [4].
				27
				28	The ARM Trusted Firmware can be built to support either AArch64 or AArch32
				29	execution state.
				30
				31	Cold boot
				32	---------
				33
				34	The cold boot path starts when the platform is physically turned on. If
				35	``COLD_BOOT_SINGLE_CPU=0``, one of the CPUs released from reset is chosen as the
				36	primary CPU, and the remaining CPUs are considered secondary CPUs. The primary
				37	CPU is chosen through platform-specific means. The cold boot path is mainly
				38	executed by the primary CPU, other than essential CPU initialization executed by
				39	all CPUs. The secondary CPUs are kept in a safe platform-specific state until
				40	the primary CPU has performed enough initialization to boot them.
				41
				42	Refer to the `Reset Design`_ for more information on the effect of the
				43	``COLD_BOOT_SINGLE_CPU`` platform build option.
				44
				45	The cold boot path in this implementation of the ARM Trusted Firmware
				46	depends on the execution state.
				47	For AArch64, it is divided into five steps (in order of execution):
				48
				49	- Boot Loader stage 1 (BL1) AP Trusted ROM
				50	- Boot Loader stage 2 (BL2) Trusted Boot Firmware
				51	- Boot Loader stage 3-1 (BL31) EL3 Runtime Software
				52	- Boot Loader stage 3-2 (BL32) Secure-EL1 Payload (optional)
				53	- Boot Loader stage 3-3 (BL33) Non-trusted Firmware
				54
				55	For AArch32, it is divided into four steps (in order of execution):
				56
				57	- Boot Loader stage 1 (BL1) AP Trusted ROM
				58	- Boot Loader stage 2 (BL2) Trusted Boot Firmware
				59	- Boot Loader stage 3-2 (BL32) EL3 Runtime Software
				60	- Boot Loader stage 3-3 (BL33) Non-trusted Firmware
				61
				62	ARM development platforms (Fixed Virtual Platforms (FVPs) and Juno) implement a
				63	combination of the following types of memory regions. Each bootloader stage uses
				64	one or more of these memory regions.
				65
				66	- Regions accessible from both non-secure and secure states. For example,
				67	non-trusted SRAM, ROM and DRAM.
				68	- Regions accessible from only the secure state. For example, trusted SRAM and
				69	ROM. The FVPs also implement the trusted DRAM which is statically
				70	configured. Additionally, the Base FVPs and Juno development platform
				71	configure the TrustZone Controller (TZC) to create a region in the DRAM
				72	which is accessible only from the secure state.
				73
				74	The sections below provide the following details:
				75
				76	- initialization and execution of the first three stages during cold boot
				77	- specification of the EL3 Runtime Software (BL31 for AArch64 and BL32 for
				78	AArch32) entrypoint requirements for use by alternative Trusted Boot
				79	Firmware in place of the provided BL1 and BL2
				80
				81	BL1
				82	~~~
				83
				84	This stage begins execution from the platform's reset vector at EL3. The reset
				85	address is platform dependent but it is usually located in a Trusted ROM area.
				86	The BL1 data section is copied to trusted SRAM at runtime.
				87
				88	On the ARM development platforms, BL1 code starts execution from the reset
				89	vector defined by the constant ``BL1_RO_BASE``. The BL1 data section is copied
				90	to the top of trusted SRAM as defined by the constant ``BL1_RW_BASE``.
				91
				92	The functionality implemented by this stage is as follows.
				93
				94	Determination of boot path
				95	^^^^^^^^^^^^^^^^^^^^^^^^^^
				96
				97	Whenever a CPU is released from reset, BL1 needs to distinguish between a warm
				98	boot and a cold boot. This is done using platform-specific mechanisms (see the
				99	``plat_get_my_entrypoint()`` function in the `Porting Guide`_). In the case of a
				100	warm boot, a CPU is expected to continue execution from a separate
				101	entrypoint. In the case of a cold boot, the secondary CPUs are placed in a safe
				102	platform-specific state (see the ``plat_secondary_cold_boot_setup()`` function in
				103	the `Porting Guide`_) while the primary CPU executes the remaining cold boot path
				104	as described in the following sections.
				105
				106	This step only applies when ``PROGRAMMABLE_RESET_ADDRESS=0``. Refer to the
				107	`Reset Design`_ for more information on the effect of the
				108	``PROGRAMMABLE_RESET_ADDRESS`` platform build option.
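
The decision described above can be sketched in C as follows. This is a
simplified illustration rather than the BL1 source, where the check runs in
early boot code. It assumes the Porting Guide convention that
``plat_get_my_entrypoint()`` returns zero for a cold boot and a non-zero warm
boot entrypoint otherwise. The ``boot_path_t`` type, the ``select_boot_path()``
helper, the function-pointer parameter and the two ``fake_*`` hooks are
hypothetical names introduced for this example.

.. code:: c

    #include <stdint.h>

    /* Hypothetical result type for this sketch. */
    typedef enum { BOOT_PATH_COLD, BOOT_PATH_WARM } boot_path_t;

    /* Assumed to mirror plat_get_my_entrypoint(): 0 means cold boot. */
    typedef uintptr_t (*get_entrypoint_fn)(void);

    static boot_path_t select_boot_path(get_entrypoint_fn get_ep,
                                        uintptr_t *warm_ep)
    {
        uintptr_t ep = get_ep();

        if (ep == 0U)
            return BOOT_PATH_COLD;   /* continue with the cold boot path */

        *warm_ep = ep;               /* warm boot: resume at this address */
        return BOOT_PATH_WARM;
    }

    /* Example platform hooks, used only to exercise the sketch. */
    static uintptr_t fake_cold_boot(void) { return 0U; }
    static uintptr_t fake_warm_boot(void) { return (uintptr_t)0x04001000U; }

On a cold boot the caller would go on to separate primary and secondary CPUs;
on a warm boot it would branch to the address stored in ``warm_ep``.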
				109
				110	Architectural initialization
				111	^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				112
				113	BL1 performs minimal architectural initialization as follows.
				114
				115	- Exception vectors
				116
				117	BL1 sets up simple exception vectors for both synchronous and asynchronous
				118	exceptions. The default behavior upon receiving an exception is to populate
				119	a status code in the general purpose register ``X0/R0`` and call the
				120	``plat_report_exception()`` function (see the `Porting Guide`_). The status
				121	code is one of:
				122
				123	For AArch64:
				124
				125	::
				126
				127	0x0 : Synchronous exception from Current EL with SP_EL0
				128	0x1 : IRQ exception from Current EL with SP_EL0
				129	0x2 : FIQ exception from Current EL with SP_EL0
				130	0x3 : System Error exception from Current EL with SP_EL0
				131	0x4 : Synchronous exception from Current EL with SP_ELx
				132	0x5 : IRQ exception from Current EL with SP_ELx
				133	0x6 : FIQ exception from Current EL with SP_ELx
				134	0x7 : System Error exception from Current EL with SP_ELx
				135	0x8 : Synchronous exception from Lower EL using aarch64
				136	0x9 : IRQ exception from Lower EL using aarch64
				137	0xa : FIQ exception from Lower EL using aarch64
				138	0xb : System Error exception from Lower EL using aarch64
				139	0xc : Synchronous exception from Lower EL using aarch32
				140	0xd : IRQ exception from Lower EL using aarch32
				141	0xe : FIQ exception from Lower EL using aarch32
				142	0xf : System Error exception from Lower EL using aarch32
				143
				144	For AArch32:
				145
				146	::
				147
				148	0x10 : User mode
				149	0x11 : FIQ mode
				150	0x12 : IRQ mode
				151	0x13 : SVC mode
				152	0x16 : Monitor mode
				153	0x17 : Abort mode
				154	0x1a : Hypervisor mode
				155	0x1b : Undefined mode
				156	0x1f : System mode
				157
				158	The ``plat_report_exception()`` implementation on the ARM FVP port programs
				159	the Versatile Express System LED register in the following format to
				160	indicate the occurrence of an unexpected exception:
				161
				162	::
				163
				164	SYS_LED[0] - Security state (Secure=0/Non-Secure=1)
				165	SYS_LED[2:1] - Exception Level (EL3=0x3, EL2=0x2, EL1=0x1, EL0=0x0)
				166	For AArch32 it is always 0x0
				167	SYS_LED[7:3] - Exception Class (Sync/Async & origin). This is the value
				168	of the status code
				169
				170	A write to the LED register reflects in the System LEDs (S6LED0..7) in the
				171	CLCD window of the FVP.
				172
				173	BL1 does not expect to receive any exceptions other than the SMC exception.
				174	For the latter, BL1 installs a simple stub. The stub expects to receive a
				175	limited set of SMC types (determined by their function IDs in the general
				176	purpose register ``X0/R0``):
				177
				178	- ``BL1_SMC_RUN_IMAGE``: This SMC is raised by BL2 to make BL1 pass control
				179	to EL3 Runtime Software.
				180	- All SMCs listed in section "BL1 SMC Interface" in the `Firmware Update`_
				181	Design Guide are supported for AArch64 only. These SMCs are currently
				182	not supported when BL1 is built for AArch32.
				183
				184	Any other SMC leads to an assertion failure.
				185
				186	- CPU initialization
				187
				188	BL1 calls the ``reset_handler()`` function which in turn calls the CPU
				189	specific reset handler function (see the section: "CPU specific operations
				190	framework").
				191
				192	- Control register setup (for AArch64)
				193
				194	- ``SCTLR_EL3``. Instruction cache is enabled by setting the ``SCTLR_EL3.I``
				195	bit. Alignment and stack alignment checking are enabled by setting the
				196	``SCTLR_EL3.A`` and ``SCTLR_EL3.SA`` bits. Exception endianness is set to
				197	little-endian by clearing the ``SCTLR_EL3.EE`` bit.
				198
				199	- ``SCR_EL3``. The register width of the next lower exception level is set
				200	to AArch64 by setting the ``SCR_EL3.RW`` bit. The ``SCR_EL3.EA`` bit is set to trap
				201	both External Aborts and SError Interrupts in EL3. The ``SCR_EL3.SIF`` bit is
				202	also set to disable instruction fetches from Non-secure memory when in
				203	secure state.
				204
				205	- ``CPTR_EL3``. Accesses to the ``CPACR_EL1`` register from EL1 or EL2, or the
				206	``CPTR_EL2`` register from EL2 are configured to not trap to EL3 by
				207	clearing the ``CPTR_EL3.TCPAC`` bit. Access to the trace functionality is
				208	configured not to trap to EL3 by clearing the ``CPTR_EL3.TTA`` bit.
				209	Instructions that access the registers associated with Floating Point
				210	and Advanced SIMD execution are configured to not trap to EL3 by
				211	clearing the ``CPTR_EL3.TFP`` bit.
				212
				213	- ``DAIF``. The SError interrupt is enabled by clearing the SError interrupt
				214	mask bit.
				215
				216	- ``MDCR_EL3``. The trap controls, ``MDCR_EL3.TDOSA``, ``MDCR_EL3.TDA`` and
				217	``MDCR_EL3.TPM``, are set so that accesses to the registers they control
				218	do not trap to EL3. AArch64 Secure self-hosted debug is disabled by
				219	setting the ``MDCR_EL3.SDD`` bit. Also ``MDCR_EL3.SPD32`` is set to
				220	disable AArch32 Secure self-hosted privileged debug from S-EL1.
				221
				222	- Control register setup (for AArch32)
				223
				224	- ``SCTLR``. Instruction cache is enabled by setting the ``SCTLR.I`` bit.
				225	Alignment checking is enabled by setting the ``SCTLR.A`` bit.
				226	Exception endianness is set to little-endian by clearing the
				227	``SCTLR.EE`` bit.
				228
				229	- ``SCR``. The ``SCR.SIF`` bit is set to disable instruction fetches from
				230	Non-secure memory when in secure state.
				231
				232	- ``CPACR``. Allow execution of Advanced SIMD instructions at PL0 and PL1,
				233	by clearing the ``CPACR.ASEDIS`` bit. Access to the trace functionality
				234	is configured not to trap to undefined mode by clearing the
				235	``CPACR.TRCDIS`` bit.
				236
				237	- ``NSACR``. Enable non-secure access to Advanced SIMD functionality and
				238	system register access to implemented trace registers.
				239
				240	- ``FPEXC``. Enable access to the Advanced SIMD and floating-point
				241	functionality from all Exception levels.
				242
				243	- ``CPSR.A``. The Asynchronous data abort interrupt is enabled by clearing
				244	the Asynchronous data abort interrupt mask bit.
				245
				246	- ``SDCR``. The ``SDCR.SPD`` field is set to disable AArch32 Secure
				247	self-hosted privileged debug.
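
The ``SYS_LED`` format described under "Exception vectors" above can be
illustrated with a small packing helper. This is a sketch under stated
assumptions, not the FVP port's ``plat_report_exception()`` code: the function
name ``sys_led_encode()`` and its parameters are hypothetical, and only the
bit layout is taken from the description above.

.. code:: c

    #include <stdint.h>

    /*
     * Pack the fields described for the FVP SYS_LED register:
     *   bit  0   : security state (Secure = 0, Non-secure = 1)
     *   bits 2:1 : exception level (always 0 for AArch32)
     *   bits 7:3 : exception class, i.e. the status code
     */
    static uint8_t sys_led_encode(unsigned int non_secure, unsigned int el,
                                  unsigned int status_code)
    {
        return (uint8_t)((non_secure & 0x1U) |
                         ((el & 0x3U) << 1) |
                         ((status_code & 0x1fU) << 3));
    }

For example, a synchronous exception from the current EL with SP_ELx (status
code ``0x4``) taken in secure EL3 packs to ``0x26``.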
				248
				249	Platform initialization
				250	^^^^^^^^^^^^^^^^^^^^^^^
				251
				252	On ARM platforms, BL1 performs the following platform initializations:
				253
				254	- Enable the Trusted Watchdog.
				255	- Initialize the console.
				256	- Configure the Interconnect to enable hardware coherency.
				257	- Enable the MMU and map the memory it needs to access.
				258	- Configure any required platform storage to load the next bootloader image
				259	(BL2).
				260
				261	Firmware Update detection and execution
				262	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				263
				264	After performing platform setup, BL1 common code calls
				265	``bl1_plat_get_next_image_id()`` to determine if `Firmware Update`_ is required or
				266	to proceed with the normal boot process. If the platform code returns
				267	``BL2_IMAGE_ID`` then the normal boot sequence is executed as described in the
				268	next section, else BL1 assumes that `Firmware Update`_ is required and execution
				269	passes to the first image in the `Firmware Update`_ process. In either case, BL1
				270	retrieves a descriptor of the next image by calling ``bl1_plat_get_image_desc()``.
				271	The image descriptor contains an ``entry_point_info_t`` structure, which BL1
				272	uses to initialize the execution state of the next image.
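
The selection logic can be sketched as below. The types are reduced
stand-ins, not the firmware's ``entry_point_info_t`` or image descriptor
definitions, and every ``sketch_``/``SKETCH_`` name, image ID value and
address is made up for illustration. ``sketch_get_image_desc()`` plays the
role of ``bl1_plat_get_image_desc()``, and the ``next_id`` argument stands
for the value returned by ``bl1_plat_get_next_image_id()``.

.. code:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    #define SKETCH_BL2_IMAGE_ID  1U   /* placeholder for BL2_IMAGE_ID */
    #define SKETCH_FWU_IMAGE_ID  2U   /* placeholder: first FWU image */

    typedef struct {
        uintptr_t pc;     /* entry address of the next image */
        uint32_t spsr;    /* processor state to enter it with */
    } sketch_ep_info_t;

    typedef struct {
        unsigned int image_id;
        sketch_ep_info_t ep_info;
    } sketch_image_desc_t;

    static const sketch_image_desc_t sketch_descs[] = {
        { SKETCH_BL2_IMAGE_ID, { 0x04020000U, 0x3c5U } },
        { SKETCH_FWU_IMAGE_ID, { 0x04030000U, 0x3c5U } },
    };

    /* Look up the descriptor for an image ID; NULL if it is unknown. */
    static const sketch_image_desc_t *sketch_get_image_desc(unsigned int id)
    {
        for (size_t i = 0; i < sizeof(sketch_descs) / sizeof(sketch_descs[0]); i++) {
            if (sketch_descs[i].image_id == id)
                return &sketch_descs[i];
        }
        return NULL;
    }

    /*
     * Returns 1 for the normal boot flow and 0 when Firmware Update is
     * assumed. In both cases *ep receives the next image's entry info.
     */
    static int sketch_bl1_next_image(unsigned int next_id,
                                     const sketch_ep_info_t **ep)
    {
        const sketch_image_desc_t *desc = sketch_get_image_desc(next_id);

        assert(desc != NULL);
        *ep = &desc->ep_info;
        return next_id == SKETCH_BL2_IMAGE_ID;
    }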
				273
				274	BL2 image load and execution
				275	^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				276
				277	In the normal boot flow, BL1 execution continues as follows:
				278
				279	#. BL1 prints the following string from the primary CPU to indicate successful
				280	execution of the BL1 stage:
				281
				282	::
				283
				284	"Booting Trusted Firmware"
				285
				286	#. BL1 determines the amount of free trusted SRAM memory available by
				287	calculating the extent of its own data section, which also resides in
				288	trusted SRAM. BL1 loads a BL2 raw binary image from platform storage, at a
				289	platform-specific base address. If the BL2 image file is not present or if
				290	there is not enough free trusted SRAM the following error message is
				291	printed:
				292
				293	::
				294
				295	"Failed to load BL2 firmware."
				296
				297	BL1 calculates the amount of Trusted SRAM that can be used by the BL2
				298	image. The exact load location of the image is provided as a base address
				299	in the platform header. Further description of the memory layout can be
				300	found later in this document.
				301
				302	#. BL1 passes control to the BL2 image at Secure EL1 (for AArch64) or at
				303	Secure SVC mode (for AArch32), starting from its load address.
				304
				305	#. BL1 also passes information about the amount of trusted SRAM used and
				306	available for use. This information is populated at a platform-specific
				307	memory address.
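
The free-memory check in the second step can be sketched as a simple range
test. The ``sketch_meminfo_t`` structure and ``image_fits()`` helper are
hypothetical and much simpler than the firmware's own memory layout
bookkeeping. The sketch also assumes that the free region does not wrap
around the top of the address space.

.. code:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical description of the free trusted SRAM region. */
    typedef struct {
        uintptr_t free_base;
        size_t free_size;
    } sketch_meminfo_t;

    /* Returns 1 if [base, base + size) lies inside the free region. */
    static int image_fits(const sketch_meminfo_t *mem, uintptr_t base,
                          size_t size)
    {
        uintptr_t free_end = mem->free_base + mem->free_size;

        if (size == 0U || base < mem->free_base || base >= free_end)
            return 0;

        /* base < free_end here, so the subtraction cannot underflow. */
        return size <= (size_t)(free_end - base);
    }

A loader built on this check would report the "Failed to load BL2 firmware."
condition whenever ``image_fits()`` returns 0.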
				308
				309	BL2
				310	~~~
				311
				312	BL1 loads and passes control to BL2 at Secure-EL1 (for AArch64) or at Secure
				313	SVC mode (for AArch32). BL2 is linked against and loaded at a platform-specific
				314	base address (more information can be found later in this document).
				315	The functionality implemented by BL2 is as follows.
				316
				317	Architectural initialization
				318	^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				319
				320	For AArch64, BL2 performs the minimal architectural initialization required
				321	for subsequent stages of the ARM Trusted Firmware and normal world software.
				322	EL1 and EL0 are given access to Floating Point and Advanced SIMD registers
				323	by setting the ``CPACR.FPEN`` bits.
				324
				325	For AArch32, the minimal architectural initialization required for subsequent
				326	stages of the ARM Trusted Firmware and normal world software is taken care of
				327	in BL1 as both BL1 and BL2 execute at PL1.
				328
				329	Platform initialization
				330	^^^^^^^^^^^^^^^^^^^^^^^
				331
				332	On ARM platforms, BL2 performs the following platform initializations:
				333
				334	- Initialize the console.
				335	- Configure any required platform storage to allow loading further bootloader
				336	images.
				337	- Enable the MMU and map the memory it needs to access.
				338	- Perform platform security setup to allow access to controlled components.
				339	- Reserve some memory for passing information to the next bootloader image
				340	EL3 Runtime Software and populate it.
				341	- Define the extents of memory available for loading each subsequent
				342	bootloader image.
				343
				344	Image loading in BL2
				345	^^^^^^^^^^^^^^^^^^^^
				346
				347	The image loading scheme in BL2 depends on the ``LOAD_IMAGE_V2`` build option. If
				348	the flag is disabled, the BLxx images are loaded by calling the respective
				349	load\_blxx() function from BL2 generic code. If the flag is enabled, the BL2
				350	generic code loads the images based on the list of loadable images provided
				351	by the platform. BL2 passes the list of executable images provided by the
				352	platform to the next handover BL image. By default, this flag is disabled for
				353	AArch64 and the AArch32 build is supported only if this flag is enabled.
				354
				355	SCP\_BL2 (System Control Processor Firmware) image load
				356	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				357
				358	Some systems have a separate System Control Processor (SCP) for power, clock,
				359	reset and system control. BL2 loads the optional SCP\_BL2 image from platform
				360	storage into a platform-specific region of secure memory. The subsequent
				361	handling of SCP\_BL2 is platform specific. For example, on the Juno ARM
				362	development platform port the image is transferred into SCP's internal memory
				363	using the Boot Over MHU (BOM) protocol after being loaded in the trusted SRAM
				364	memory. The SCP executes SCP\_BL2 and signals to the Application Processor (AP)
				365	for BL2 execution to continue.
				366
				367	EL3 Runtime Software image load
				368	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				369
				370	BL2 loads the EL3 Runtime Software image from platform storage into a
				371	platform-specific address in trusted SRAM. If there is not enough memory to load
				372	the image, or the image is missing, an assertion failure results. If ``LOAD_IMAGE_V2``
				373	is disabled and the image loads successfully, BL2 updates the amount of trusted
				374	SRAM used and available for use by EL3 Runtime Software. This information is
				375	populated at a platform-specific memory address.
				376
				377	AArch64 BL32 (Secure-EL1 Payload) image load
				378	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				379
				380	BL2 loads the optional BL32 image from platform storage into a platform-
				381	specific region of secure memory. The image executes in the secure world. BL2
				382	relies on BL31 to pass control to the BL32 image, if present. Hence, BL2
				383	populates a platform-specific area of memory with the entrypoint/load-address
				384	of the BL32 image. The value of the Saved Program Status Register (``SPSR``)
				385	for entry into BL32 is not determined by BL2; it is initialized by the
				386	Secure-EL1 Payload Dispatcher (see later) within BL31, which is responsible for
				387	managing interaction with BL32. The entrypoint information is passed to BL31.
				388
				389	BL33 (Non-trusted Firmware) image load
				390	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				391
				392	BL2 loads the BL33 image (e.g. UEFI or other test or boot software) from
				393	platform storage into non-secure memory as defined by the platform.
				394
				395	BL2 relies on EL3 Runtime Software to pass control to BL33 once secure state
				396	initialization is complete. Hence, BL2 populates a platform-specific area of
				397	memory with the entrypoint and Saved Program Status Register (``SPSR``) of the
				398	normal world software image. The entrypoint is the load address of the BL33
				399	image. The ``SPSR`` is determined as specified in Section 5.13 of the
				400	`PSCI PDD`_. This information is passed to the EL3 Runtime Software.
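
As an illustration of the ``SPSR`` value that accompanies the BL33 entrypoint,
the sketch below builds an AArch64 ``SPSR`` from its architectural fields. It
assumes BL33 is entered in AArch64 at EL2 when EL2 is implemented and at EL1
otherwise, using ``SP_ELx`` with all exceptions masked. The `PSCI PDD`_
remains the authoritative source for the required state. All ``SKETCH_`` and
function names are hypothetical.

.. code:: c

    #include <stdint.h>

    #define SKETCH_MODE_EL1  0x1U
    #define SKETCH_MODE_EL2  0x2U
    #define SKETCH_SP_ELX    0x1U   /* use SP_ELx rather than SP_EL0 */
    #define SKETCH_DAIF_ALL  0xfU   /* mask Debug, SError, IRQ and FIQ */

    /* AArch64 SPSR layout: M[3:2] = EL, M[0] = SP select, bits 9:6 = DAIF. */
    static uint32_t spsr_aarch64(uint32_t el, uint32_t sp, uint32_t daif)
    {
        return (daif << 6) | (el << 2) | sp;
    }

    /* Pick the highest available normal world exception level. */
    static uint32_t bl33_spsr(int el2_implemented)
    {
        uint32_t el = el2_implemented ? SKETCH_MODE_EL2 : SKETCH_MODE_EL1;

        return spsr_aarch64(el, SKETCH_SP_ELX, SKETCH_DAIF_ALL);
    }

With these assumptions the value is ``0x3c9`` for entry at EL2 and ``0x3c5``
for entry at EL1.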
				401
				402	AArch64 BL31 (EL3 Runtime Software) execution
				403	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				404
				405	BL2 execution continues as follows:
				406
				407	#. BL2 passes control back to BL1 by raising an SMC, providing BL1 with the
				408	BL31 entrypoint. The exception is handled by the SMC exception handler
				409	installed by BL1.
				410
				411	#. BL1 turns off the MMU and flushes the caches. It clears the
				412	``SCTLR_EL3.M/I/C`` bits, flushes the data cache to the point of coherency
				413	and invalidates the TLBs.
				414
				415	#. BL1 passes control to BL31 at the specified entrypoint at EL3.
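
The register update in the second step amounts to clearing three bits. The
helper below is a hypothetical sketch that operates on a plain integer so it
can be run anywhere. In firmware the value would be read from and written
back to ``SCTLR_EL3`` with system register accesses, followed by the cache
flush and TLB invalidation, none of which are shown here.

.. code:: c

    #include <stdint.h>

    #define SKETCH_SCTLR_M_BIT  (UINT64_C(1) << 0)    /* MMU enable */
    #define SKETCH_SCTLR_C_BIT  (UINT64_C(1) << 2)    /* data cache enable */
    #define SKETCH_SCTLR_I_BIT  (UINT64_C(1) << 12)   /* instruction cache enable */

    /* Return the SCTLR value with the M, C and I bits cleared. */
    static uint64_t sctlr_disable_mmu_caches(uint64_t sctlr)
    {
        return sctlr & ~(SKETCH_SCTLR_M_BIT | SKETCH_SCTLR_C_BIT |
                         SKETCH_SCTLR_I_BIT);
    }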
				416
				417	AArch64 BL31
				418	~~~~~~~~~~~~
				419
				420	The image for this stage is loaded by BL2 and BL1 passes control to BL31 at
				421	EL3. BL31 executes solely in trusted SRAM. BL31 is linked against and
				422	loaded at a platform-specific base address (more information can be found later
				423	in this document). The functionality implemented by BL31 is as follows.
				424
				425	Architectural initialization
				426	^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				427
				428	Currently, BL31 performs a similar architectural initialization to BL1 as
				429	far as system register settings are concerned. Since BL1 code resides in ROM,
				430	architectural initialization in BL31 allows override of any previous
				431	initialization done by BL1.
				432
				433	BL31 initializes the per-CPU data framework, which provides a cache of
				434	frequently accessed per-CPU data optimised for fast, concurrent manipulation
				435	on different CPUs. This buffer includes pointers to per-CPU contexts, crash
				436	buffer, CPU reset and power down operations, PSCI data, platform data and so on.
				437
				438	It then replaces the exception vectors populated by BL1 with its own. BL31
				439	exception vectors implement more elaborate support for handling SMCs since this
				440	is the only mechanism to access the runtime services implemented by BL31 (PSCI
				441	for example). BL31 checks each SMC for validity as specified by the
				442	`SMC calling convention PDD`_ before passing control to the required SMC
				443	handler routine.
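
To make the validity check more concrete, the sketch below decodes the main
fields of a 32-bit SMC function identifier as laid out by the SMC Calling
Convention: call type, calling convention, owning entity number and function
number. It is an illustration only and not the BL31 handler. The structure
and function names are hypothetical.

.. code:: c

    #include <stdint.h>

    typedef struct {
        unsigned int fast;    /* bit 31: 1 = fast call, 0 = yielding call */
        unsigned int smc64;   /* bit 30: 1 = SMC64, 0 = SMC32 */
        unsigned int oen;     /* bits 29:24: owning entity number */
        unsigned int func;    /* bits 15:0: function number */
    } smc_fid_fields_t;

    static smc_fid_fields_t smc_fid_decode(uint32_t fid)
    {
        smc_fid_fields_t f;

        f.fast  = (fid >> 31) & 0x1U;
        f.smc64 = (fid >> 30) & 0x1U;
        f.oen   = (fid >> 24) & 0x3fU;
        f.func  = fid & 0xffffU;
        return f;
    }

For instance, ``0xC4000003`` (the SMC64 ``PSCI_CPU_ON`` identifier) decodes to
a fast SMC64 call owned by entity 4, the standard service range, with
function number 3.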
				444
				445	BL31 programs the ``CNTFRQ_EL0`` register with the clock frequency of the system
				446	counter, which is provided by the platform.
				447
				448	Platform initialization
				449	^^^^^^^^^^^^^^^^^^^^^^^
				450
				451	BL31 performs detailed platform initialization, which enables normal world
				452	software to function correctly.
				453
				454	On ARM platforms, this consists of the following:
				455
				456	- Initialize the console.
				457	- Configure the Interconnect to enable hardware coherency.
				458	- Enable the MMU and map the memory it needs to access.
				459	- Initialize the generic interrupt controller.
				460	- Initialize the power controller device.
				461	- Detect the system topology.
				462
				463	Runtime services initialization
				464	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				465
				466	BL31 is responsible for initializing the runtime services. One of them is PSCI.
				467
				468	As part of the PSCI initializations, BL31 detects the system topology. It also
				469	initializes the data structures that implement the state machine used to track
				470	the state of power domain nodes. The state can be one of ``OFF``, ``RUN`` or
				471	``RETENTION``. All secondary CPUs are initially in the ``OFF`` state. The cluster
				472	that the primary CPU belongs to is ``ON``; any other cluster is ``OFF``. It also
				473	initializes the locks that protect them. BL31 accesses the state of a CPU or
				474	cluster immediately after reset and before the data cache is enabled in the
				475	warm boot path. It is not currently possible to use 'exclusive' based spinlocks,
				476	therefore BL31 uses locks based on Lamport's Bakery algorithm instead.
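
For readers unfamiliar with Lamport's Bakery algorithm, a minimal host-side
sketch follows. It is not the firmware's bakery lock implementation: it uses
C11 atomic loads and stores for ordering, assumes a fixed hypothetical CPU
count, and omits the cache maintenance a real pre-cache environment needs.
All names are invented for this example.

.. code:: c

    #include <stdatomic.h>
    #include <stdbool.h>

    #define SKETCH_NUM_CPUS 4U   /* hypothetical CPU count */

    typedef struct {
        atomic_bool choosing[SKETCH_NUM_CPUS];  /* CPU is picking a ticket */
        atomic_uint number[SKETCH_NUM_CPUS];    /* ticket; 0 = not waiting */
    } sketch_bakery_lock_t;

    static void sketch_bakery_get(sketch_bakery_lock_t *l, unsigned int me)
    {
        unsigned int max = 0U;
        unsigned int mine;

        /* Take a ticket one higher than any ticket currently held. */
        atomic_store(&l->choosing[me], true);
        for (unsigned int i = 0U; i < SKETCH_NUM_CPUS; i++) {
            unsigned int n = atomic_load(&l->number[i]);
            if (n > max)
                max = n;
        }
        mine = max + 1U;
        atomic_store(&l->number[me], mine);
        atomic_store(&l->choosing[me], false);

        /* Wait for every CPU holding a smaller (ticket, index) pair. */
        for (unsigned int i = 0U; i < SKETCH_NUM_CPUS; i++) {
            if (i == me)
                continue;
            while (atomic_load(&l->choosing[i]))
                ;   /* CPU i is still choosing its ticket */
            for (;;) {
                unsigned int n = atomic_load(&l->number[i]);
                if (n == 0U || n > mine || (n == mine && i > me))
                    break;
            }
        }
    }

    static void sketch_bakery_release(sketch_bakery_lock_t *l, unsigned int me)
    {
        atomic_store(&l->number[me], 0U);   /* give up the ticket */
    }

The algorithm needs only plain loads and stores to shared memory, which is
why it remains usable where exclusive-access based spinlocks are not.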
				477
				478	The runtime service framework and its initialization are described in more
				479	detail in the "EL3 runtime services framework" section below.
				480
				481	Details about the status of the PSCI implementation are provided in the
				482	"Power State Coordination Interface" section below.
				483
				484	AArch64 BL32 (Secure-EL1 Payload) image initialization
				485	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				486
				487	If a BL32 image is present then there must be a matching Secure-EL1 Payload
				488	Dispatcher (SPD) service (see later for details). During initialization
				489	that service must register a function to carry out initialization of BL32
				490	once the runtime services are fully initialized. BL31 invokes such a
				491	registered function to initialize BL32 before running BL33. This initialization
				492	is not necessary for AArch32 SPs.
				493
				494	Details on BL32 initialization and the SPD's role are described in the
				495	"Secure-EL1 Payloads and Dispatchers" section below.
				496
				497	BL33 (Non-trusted Firmware) execution
				498	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				499
				500	EL3 Runtime Software initializes the EL2 or EL1 processor context for normal-
				501	world cold boot, ensuring that no secure state information finds its way into
				502	the non-secure execution state. EL3 Runtime Software uses the entrypoint
				503	information provided by BL2 to jump to the Non-trusted firmware image (BL33)
				504	at the highest available Exception Level (EL2 if available, otherwise EL1).
				505
				506	Using alternative Trusted Boot Firmware in place of BL1 & BL2 (AArch64 only)
				507	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				508
				509	Some platforms have existing implementations of Trusted Boot Firmware that
				510	would like to use ARM Trusted Firmware BL31 for the EL3 Runtime Software. To
				511	enable this firmware architecture it is important to provide a fully documented
				512	and stable interface between the Trusted Boot Firmware and BL31.
				513
				514	Future changes to the BL31 interface will be done in a backwards compatible
				515	way, and this enables these firmware components to be independently enhanced/
				516	updated to develop and exploit new functionality.
				517
				518	Required CPU state when calling ``bl31_entrypoint()`` during cold boot
				519	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				520
				521	This function must only be called by the primary CPU.
				522
				523	On entry to this function the calling primary CPU must be executing in AArch64
				524	EL3, little-endian data access, and all interrupt sources masked:
				525
				526	::
				527
				528	PSTATE.EL = 3
				529	PSTATE.RW = 1
				530	PSTATE.DAIF = 0xf
				531	SCTLR_EL3.EE = 0
				532
				533	X0 and X1 can be used to pass information from the Trusted Boot Firmware to the
				534	platform code in BL31:
				535
				536	::
				537
				538	X0 : Reserved for common Trusted Firmware information
				539	X1 : Platform specific information
				540
				541	BL31 zero-init sections (e.g. ``.bss``) should not contain valid data on entry;
				542	these will be zero filled prior to invoking platform setup code.
				543
				544	Use of the X0 and X1 parameters
				545	'''''''''''''''''''''''''''''''
				546
				547	The parameters are platform specific and passed from ``bl31_entrypoint()`` to
				548	``bl31_early_platform_setup()``. The value of these parameters is never directly
				549	used by the common BL31 code.
				550
				551	The convention is that ``X0`` conveys information regarding the BL31, BL32 and
				552	BL33 images from the Trusted Boot firmware and ``X1`` can be used for other
				553	platform-specific purposes. This convention allows platforms which use ARM
				554	Trusted Firmware's BL1 and BL2 images to transfer additional platform specific
				555	information from Secure Boot without conflicting with future evolution of the
				556	Trusted Firmware using ``X0`` to pass a ``bl31_params`` structure.
				557
				558	BL31 common and SPD initialization code depends on image and entrypoint
				559	information about BL33 and BL32, which is provided via BL31 platform APIs.
				560	This information is required until the start of execution of BL33. This
				561	information can be provided in a platform defined manner, e.g. compiled into
				562	the platform code in BL31, or provided in a platform defined memory location
				563	by the Trusted Boot firmware, or passed from the Trusted Boot Firmware via the
				564	Cold boot Initialization parameters. This data may need to be cleaned out of
				565	the CPU caches if it is provided by an earlier boot stage and then accessed by
				566	BL31 platform code before the caches are enabled.
				567
				568	ARM Trusted Firmware's BL2 implementation passes a ``bl31_params`` structure in
				569	``X0`` and the ARM development platforms interpret this in the BL31 platform
				570	code.
				571
				572	MMU, Data caches & Coherency
				573	''''''''''''''''''''''''''''
				574
				575	BL31 does not depend on the enabled state of the MMU, data caches or
				576	interconnect coherency on entry to ``bl31_entrypoint()``. If these are disabled
				577	on entry, these should be enabled during ``bl31_plat_arch_setup()``.
				578
				579	Data structures used in the BL31 cold boot interface
				580	''''''''''''''''''''''''''''''''''''''''''''''''''''
				581
				582	These structures are designed to support compatibility and independent
				583	evolution of the structures and the firmware images. For example, a version of
				584	BL31 that can interpret the BL3x image information from different versions of
				585	BL2, a platform that uses an extended entry\_point\_info structure to convey
				586	additional register information to BL31, or an ELF image loader that can convey
				587	more details about the firmware images.
				588
				589	To support these scenarios the structures are versioned and sized, which enables
				590	BL31 to detect which information is present and respond appropriately. The
				591	``param_header`` is defined to capture this information:
				592
				593	.. code:: c
				594
				595	typedef struct param_header {
				596	    uint8_t type;       /* type of the structure */
				597	    uint8_t version;    /* version of this structure */
				598	    uint16_t size;      /* size of this structure in bytes */
				599	    uint32_t attr;      /* attributes: unused bits SBZ */
				600	} param_header_t;
				601
				602	The structures using this format are ``entry_point_info``, ``image_info`` and
				603	``bl31_params``. The code that allocates and populates these structures must set
				604	the header fields appropriately, and the ``SET_PARAM_HEAD()`` macro is defined
				605	to simplify this action.
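
The sketch below shows how a producer might fill in the header and how a
consumer might check it before trusting the rest of a structure. It is written
in the spirit of ``SET_PARAM_HEAD()`` but is not the firmware's macro: the
``EXAMPLE_`` macro, its argument order, the ``h`` member name, the
``example_params_t`` structure and the type and version values are all
assumptions made for this example.

.. code:: c

    #include <stddef.h>
    #include <stdint.h>

    typedef struct param_header {
        uint8_t type;       /* type of the structure */
        uint8_t version;    /* version of this structure */
        uint16_t size;      /* size of this structure in bytes */
        uint32_t attr;      /* attributes: unused bits SBZ */
    } param_header_t;

    /* Hypothetical parameter structure that embeds the header as 'h'. */
    typedef struct example_params {
        param_header_t h;
        uint64_t payload;
    } example_params_t;

    #define EXAMPLE_PARAM_TYPE  0x7fU   /* made-up type code */
    #define EXAMPLE_VERSION_1   0x01U   /* made-up version */

    /* Fill in every header field; the size comes from the pointed-to type. */
    #define EXAMPLE_SET_PARAM_HEAD(_p, _type, _ver, _attr) do { \
            (_p)->h.type = (uint8_t)(_type);                  \
            (_p)->h.version = (uint8_t)(_ver);                \
            (_p)->h.size = (uint16_t)sizeof(*(_p));           \
            (_p)->h.attr = (uint32_t)(_attr);                 \
        } while (0)

    /* Consumer side: accept only a known type, version and minimum size. */
    static int header_is_compatible(const param_header_t *h, uint8_t type,
                                    uint8_t max_version, size_t min_size)
    {
        return h->type == type &&
               h->version >= 1U && h->version <= max_version &&
               h->size >= min_size;
    }

Because the size is recorded by the producer, a consumer built against an
older, shorter definition can still accept a newer, longer structure and
ignore the fields it does not know about.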
				606
				607	Required CPU state for BL31 Warm boot initialization
				608	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				609
				610	When requesting a CPU power-on, or suspending a running CPU, ARM Trusted
				611	Firmware provides the platform power management code with a Warm boot
				612	initialization entry-point, to be invoked by the CPU immediately after the
				613	reset handler. On entry to the Warm boot initialization function the calling
				614	CPU must be in AArch64 EL3, little-endian data access and all interrupt sources
				615	masked:
				616
				617	::
				618
				619	PSTATE.EL = 3
				620	PSTATE.RW = 1
				621	PSTATE.DAIF = 0xf
				622	SCTLR_EL3.EE = 0
				623
				624	The PSCI implementation will initialize the processor state and ensure that the
				625	platform power management code is then invoked as required to initialize all
				626	necessary system, cluster and CPU resources.
				627
				628	AArch32 EL3 Runtime Software entrypoint interface
				629	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				630
				631	To enable this firmware architecture it is important to provide a fully
				632	documented and stable interface between the Trusted Boot Firmware and the
				633	AArch32 EL3 Runtime Software.
				634
				635	Future changes to the entrypoint interface will be done in a backwards
				636	compatible way, and this enables these firmware components to be independently
				637	enhanced/updated to develop and exploit new functionality.
				638
				639	Required CPU state when entering during cold boot
				640	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				641
				642	This function must only be called by the primary CPU.
				643
				644	On entry to this function the calling primary CPU must be executing in AArch32
				645	EL3, little-endian data access, and all interrupt sources masked:
				646
				647	::
				648
    PSTATE.AIF = 0x7
    SCTLR.EE = 0
				651
				652	R0 and R1 are used to pass information from the Trusted Boot Firmware to the
				653	platform code in AArch32 EL3 Runtime Software:
				654
				655	::
				656
    R0 : Reserved for common Trusted Firmware information
    R1 : Platform specific information
				659
				660	Use of the R0 and R1 parameters
				661	'''''''''''''''''''''''''''''''
				662
				663	The parameters are platform specific and the convention is that ``R0`` conveys
				664	information regarding the BL3x images from the Trusted Boot firmware and ``R1``
can be used for other platform specific purposes. This convention allows
				666	platforms which use ARM Trusted Firmware's BL1 and BL2 images to transfer
				667	additional platform specific information from Secure Boot without conflicting
				668	with future evolution of the Trusted Firmware using ``R0`` to pass a ``bl_params``
				669	structure.
				670
				671	The AArch32 EL3 Runtime Software is responsible for entry into BL33. This
				672	information can be obtained in a platform defined manner, e.g. compiled into
				673	the AArch32 EL3 Runtime Software, or provided in a platform defined memory
				674	location by the Trusted Boot firmware, or passed from the Trusted Boot Firmware
				675	via the Cold boot Initialization parameters. This data may need to be cleaned
				676	out of the CPU caches if it is provided by an earlier boot stage and then
				677	accessed by AArch32 EL3 Runtime Software before the caches are enabled.
				678
				679	When using AArch32 EL3 Runtime Software, the ARM development platforms pass a
				680	``bl_params`` structure in ``R0`` from BL2 to be interpreted by AArch32 EL3 Runtime
				681	Software platform code.
				682
				683	MMU, Data caches & Coherency
				684	''''''''''''''''''''''''''''
				685
				686	AArch32 EL3 Runtime Software must not depend on the enabled state of the MMU,
				687	data caches or interconnect coherency in its entrypoint. They must be explicitly
				688	enabled if required.
				689
				690	Data structures used in cold boot interface
				691	'''''''''''''''''''''''''''''''''''''''''''
				692
The AArch32 EL3 Runtime Software cold boot interface uses ``bl_params`` instead
of ``bl31_params``. The ``bl_params`` structure is based on the convention
described in the AArch64 BL31 cold boot interface section.
				696
				697	Required CPU state for warm boot initialization
				698	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				699
When requesting a CPU power-on, or suspending a running CPU, AArch32 EL3
Runtime Software must ensure execution of a warm boot initialization entrypoint.
If ARM Trusted Firmware BL1 is used and the ``PROGRAMMABLE_RESET_ADDRESS`` build
flag is false, then AArch32 EL3 Runtime Software must ensure that BL1 branches
to the warm boot entrypoint by arranging for the BL1 platform function,
``plat_get_my_entrypoint()``, to return a non-zero value.
				706
				707	In this case, the warm boot entrypoint must be in AArch32 EL3, little-endian
				708	data access and all interrupt sources masked:
				709
				710	::
				711
    PSTATE.AIF = 0x7
    SCTLR.EE = 0
				714
				715	The warm boot entrypoint may be implemented by using the ARM Trusted Firmware
				716	``psci_warmboot_entrypoint()`` function. In that case, the platform must fulfil
				717	the pre-requisites mentioned in the `PSCI Library integration guide`_.
				718
				719	EL3 runtime services framework
				720	------------------------------
				721
				722	Software executing in the non-secure state and in the secure state at exception
				723	levels lower than EL3 will request runtime services using the Secure Monitor
				724	Call (SMC) instruction. These requests will follow the convention described in
				725	the SMC Calling Convention PDD (`SMCCC`_). The `SMCCC`_ assigns function
				726	identifiers to each SMC request and describes how arguments are passed and
				727	returned.
				728
				729	The EL3 runtime services framework enables the development of services by
				730	different providers that can be easily integrated into final product firmware.
				731	The following sections describe the framework which facilitates the
				732	registration, initialization and use of runtime services in EL3 Runtime
				733	Software (BL31).
				734
The design of the runtime services depends heavily on the concepts and
definitions described in the `SMCCC`_, in particular SMC Function IDs, Owning
Entity Numbers (OEN), Fast and Yielding calls, and the SMC32 and SMC64 calling
conventions. Please refer to that document for a more detailed explanation of
these terms.
				740
				741	The following runtime services are expected to be implemented first. They have
				742	not all been instantiated in the current implementation.
				743
				744	#. Standard service calls
				745
				746	This service is for management of the entire system. The Power State
				747	Coordination Interface (`PSCI`_) is the first set of standard service calls
				748	defined by ARM (see PSCI section later).
				749
				750	#. Secure-EL1 Payload Dispatcher service
				751
				752	If a system runs a Trusted OS or other Secure-EL1 Payload (SP) then
				753	it also requires a Secure Monitor at EL3 to switch the EL1 processor
				754	context between the normal world (EL1/EL2) and trusted world (Secure-EL1).
				755	The Secure Monitor will make these world switches in response to SMCs. The
				756	`SMCCC`_ provides for such SMCs with the Trusted OS Call and Trusted
				757	Application Call OEN ranges.
				758
				759	The interface between the EL3 Runtime Software and the Secure-EL1 Payload is
				760	not defined by the `SMCCC`_ or any other standard. As a result, each
				761	Secure-EL1 Payload requires a specific Secure Monitor that runs as a runtime
				762	service - within ARM Trusted Firmware this service is referred to as the
				763	Secure-EL1 Payload Dispatcher (SPD).
				764
				765	ARM Trusted Firmware provides a Test Secure-EL1 Payload (TSP) and its
				766	associated Dispatcher (TSPD). Details of SPD design and TSP/TSPD operation
				767	are described in the "Secure-EL1 Payloads and Dispatchers" section below.
				768
				769	#. CPU implementation service
				770
				771	This service will provide an interface to CPU implementation specific
				772	services for a given platform e.g. access to processor errata workarounds.
				773	This service is currently unimplemented.
				774
				775	Additional services for ARM Architecture, SiP and OEM calls can be implemented.
				776	Each implemented service handles a range of SMC function identifiers as
				777	described in the `SMCCC`_.
				778
				779	Registration
				780	~~~~~~~~~~~~
				781
A runtime service is registered using the ``DECLARE_RT_SVC()`` macro, specifying
the name of the service, the range of OENs covered, the type of service and the
initialization and call handler functions. This macro instantiates a
``const struct rt_svc_desc`` for the service with these details (see
``runtime_svc.h``). This structure is allocated in a special ELF section
``rt_svc_descs``, enabling the framework to find all service descriptors
included in BL31.
				787
The specific service for an SMC Function is selected based on the OEN and call
type of the Function ID, and the framework uses that information in the service
descriptor to identify the handler for the SMC Call.
				791
The service descriptors do not include information to identify the precise set
of SMC function identifiers supported by this service implementation, the
security state from which such calls are valid, or the capability to support
64-bit and/or 32-bit callers (using SMC32 or SMC64). Responding appropriately
to these aspects of an SMC call is the responsibility of the service
implementation. The framework is focused on integrating services from
different providers and minimizing the time taken by the framework before the
service handler is invoked.
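
Because the descriptor only carries an OEN range and a call type, a handler
has to filter function identifiers and callers itself. The following is a
minimal sketch of that division of responsibility, using a simplified,
hypothetical descriptor and handler signature (``example_svc_desc_t``,
``example_sip_handle()``), a made-up function identifier
``EXAMPLE_FID_ECHO`` in the SiP range (OEN 2), and an assumed ``flags`` bit for
the caller's security state. It does not reproduce the definitions in
``runtime_svc.h``.

.. code:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    #define EXAMPLE_SMC_UNK          0xFFFFFFFFu  /* Unknown SMC Function ID */
    #define EXAMPLE_FLAG_NON_SECURE  0x1u         /* assumed flags encoding */
    #define EXAMPLE_FID_ECHO         0x82000001u  /* hypothetical fast SMC32 */

    typedef uint64_t (*example_handle_t)(uint32_t smc_fid, uint64_t x1,
                                         uint64_t flags);

    /* Simplified descriptor: OEN range, call type and the two callbacks. */
    typedef struct example_svc_desc {
        uint8_t start_oen;
        uint8_t end_oen;
        uint8_t call_type;          /* 1 = fast, 0 = yielding */
        const char *name;
        int32_t (*init)(void);
        example_handle_t handle;
    } example_svc_desc_t;

    static int32_t example_sip_init(void)
    {
        return 0;                   /* 0 reports successful initialization */
    }

    /* The handler, not the framework, rejects callers and IDs it dislikes. */
    static uint64_t example_sip_handle(uint32_t smc_fid, uint64_t x1,
                                       uint64_t flags)
    {
        if ((flags & EXAMPLE_FLAG_NON_SECURE) == 0u)
            return EXAMPLE_SMC_UNK; /* this example serves normal world only */

        switch (smc_fid) {
        case EXAMPLE_FID_ECHO:
            return x1;
        default:
            return EXAMPLE_SMC_UNK;
        }
    }

    static const example_svc_desc_t example_sip_svc = {
        .start_oen = 2u,
        .end_oen = 2u,
        .call_type = 1u,
        .name = "example_sip",
        .init = example_sip_init,
        .handle = example_sip_handle,
    };

    /* Stand-in for the framework invoking the selected handler. */
    static uint64_t example_call(const example_svc_desc_t *desc,
                                 uint32_t smc_fid, uint64_t x1, uint64_t flags)
    {
        assert(desc != NULL && desc->handle != NULL);
        return desc->handle(smc_fid, x1, flags);
    }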
				800
				801	Details of the parameters, requirements and behavior of the initialization and
				802	call handling functions are provided in the following sections.
				803
				804	Initialization
				805	~~~~~~~~~~~~~~
				806
				807	``runtime_svc_init()`` in ``runtime_svc.c`` initializes the runtime services
				808	framework running on the primary CPU during cold boot as part of the BL31
				809	initialization. This happens prior to initializing a Trusted OS and running
				810	Normal world boot firmware that might in turn use these services.
				811	Initialization involves validating each of the declared runtime service
				812	descriptors, calling the service initialization function and populating the
				813	index used for runtime lookup of the service.
				814
				815	The BL31 linker script collects all of the declared service descriptors into a
				816	single array and defines symbols that allow the framework to locate and traverse
				817	the array, and determine its size.
				818
				819	The framework does basic validation of each descriptor to halt firmware
				820	initialization if service declaration errors are detected. The framework does
				821	not check descriptors for the following error conditions, and may behave in an
				822	unpredictable manner under such scenarios:
				823
				824	#. Overlapping OEN ranges
				825	#. Multiple descriptors for the same range of OENs and ``call_type``
				826	#. Incorrect range of owning entity numbers for a given ``call_type``
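
The text above does not enumerate the individual checks, so the following is
only a sketch of what "basic validation" of one descriptor could look like,
together with an overlap test that an integrator might run in a debug build to
catch the first two unchecked conditions. The descriptor layout, the
``EXAMPLE_*`` constants and both function names are assumptions made for this
example.

.. code:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define EXAMPLE_TYPE_YIELD  0u
    #define EXAMPLE_TYPE_FAST   1u
    #define EXAMPLE_MAX_OEN     63u     /* OEN is a 6-bit field */

    typedef struct example_svc_desc {
        uint8_t start_oen;
        uint8_t end_oen;
        uint8_t call_type;
        int32_t (*init)(void);
    } example_svc_desc_t;

    /* Stand-in init callback so that descriptors can be built below. */
    static int32_t example_init_ok(void)
    {
        return 0;
    }

    /* Per-descriptor sanity checks (assumed set of checks). */
    static bool example_desc_is_valid(const example_svc_desc_t *d)
    {
        if (d == NULL || d->init == NULL)
            return false;
        if (d->start_oen > d->end_oen || d->end_oen > EXAMPLE_MAX_OEN)
            return false;
        return d->call_type == EXAMPLE_TYPE_YIELD ||
               d->call_type == EXAMPLE_TYPE_FAST;
    }

    /* True if two descriptors of the same call type share at least one OEN. */
    static bool example_descs_overlap(const example_svc_desc_t *a,
                                      const example_svc_desc_t *b)
    {
        assert(a != NULL && b != NULL);
        return a->call_type == b->call_type &&
               a->start_oen <= b->end_oen && b->start_oen <= a->end_oen;
    }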
				827
				828	Once validated, the service ``init()`` callback is invoked. This function carries
				829	out any essential EL3 initialization before servicing requests. The ``init()``
				830	function is only invoked on the primary CPU during cold boot. If the service
				831	uses per-CPU data this must either be initialized for all CPUs during this call,
				832	or be done lazily when a CPU first issues an SMC call to that service. If
				833	``init()`` returns anything other than ``0``, this is treated as an initialization
				834	error and the service is ignored: this does not cause the firmware to halt.
				835
				836	The OEN and call type fields present in the SMC Function ID cover a total of
				837	128 distinct services, but in practice a single descriptor can cover a range of
				838	OENs, e.g. SMCs to call a Trusted OS function. To optimize the lookup of a
				839	service handler, the framework uses an array of 128 indices that map every
				840	distinct OEN/call-type combination either to one of the declared services or to
				841	indicate the service is not handled. This ``rt_svc_descs_indices[]`` array is
				842	populated for all of the OENs covered by a service after the service ``init()``
function has reported success. Consequently, a service that fails to initialize
will never have its ``handle()`` function invoked.
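
The population of the index can be sketched as follows. The example assumes
that the call type occupies the top bit of a 7-bit index and the OEN the lower
six bits, and it uses a hypothetical "not handled" marker ``EXAMPLE_NO_SVC``;
the descriptor type, array and function names are likewise illustrative rather
than taken from ``runtime_svc.c``.

.. code:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define EXAMPLE_MAX_SVCS  128u
    #define EXAMPLE_NO_SVC    0xFFu     /* hypothetical "not handled" marker */

    typedef struct example_svc_desc {
        uint8_t start_oen;
        uint8_t end_oen;
        uint8_t call_type;              /* 1 = fast, 0 = yielding */
        int32_t (*init)(void);
    } example_svc_desc_t;

    static uint8_t example_indices[EXAMPLE_MAX_SVCS];

    /* Stand-in init callbacks for the usage below. */
    static int32_t example_init_ok(void)   { return 0; }
    static int32_t example_init_fail(void) { return -1; }

    /* Combine call type (1 bit) and OEN (6 bits) into an index 0..127. */
    static unsigned int example_unique_oen(unsigned int oen,
                                           unsigned int call_type)
    {
        return ((call_type & 0x1u) << 6) | (oen & 0x3Fu);
    }

    /* Map every OEN of each successfully initialized service to its slot. */
    static void example_build_index(const example_svc_desc_t *descs,
                                    size_t count)
    {
        assert(descs != NULL && count < EXAMPLE_NO_SVC);
        memset(example_indices, EXAMPLE_NO_SVC, sizeof(example_indices));

        for (size_t i = 0; i < count; i++) {
            if (descs[i].init() != 0)
                continue;               /* failed init: leave unmapped */

            for (unsigned int oen = descs[i].start_oen;
                 oen <= descs[i].end_oen; oen++) {
                unsigned int idx = example_unique_oen(oen, descs[i].call_type);

                example_indices[idx] = (uint8_t)i;
            }
        }
    }

With this layout a service whose ``init()`` fails keeps the "not handled"
marker in all of its slots, so a later lookup cannot reach its handler.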
				845
				846	The following figure shows how the ``rt_svc_descs_indices[]`` index maps the SMC
				847	Function ID call type and OEN onto a specific service handler in the
				848	``rt_svc_descs[]`` array.
				849
|Image 1|
				851
				852	Handling an SMC
				853	~~~~~~~~~~~~~~~
				854
				855	When the EL3 runtime services framework receives a Secure Monitor Call, the SMC
				856	Function ID is passed in W0 from the lower exception level (as per the
				857	`SMCCC`_). If the calling register width is AArch32, it is invalid to invoke an
				858	SMC Function which indicates the SMC64 calling convention: such calls are
				859	ignored and return the Unknown SMC Function Identifier result code ``0xFFFFFFFF``
				860	in R0/X0.
				861
				862	Bit[31] (fast/yielding call) and bits[29:24] (owning entity number) of the SMC
				863	Function ID are combined to index into the ``rt_svc_descs_indices[]`` array. The
resulting value might indicate a service that has no handler; in this case the
				865	framework will also report an Unknown SMC Function ID. Otherwise, the value is
				866	used as a further index into the ``rt_svc_descs[]`` array to locate the required
				867	service and handler.
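
A sketch of this lookup in C is shown below. It extracts bit[31] and
bits[29:24] as described, and also applies the rule that an AArch32 caller may
not use the SMC64 convention, which the `SMCCC`_ encodes in bit[30] of the
Function ID. The function names and the ``EXAMPLE_NO_SVC`` marker are
hypothetical, and the index array is passed in as a parameter to keep the
example self-contained.

.. code:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define EXAMPLE_NO_SVC  0xFFu   /* hypothetical "not handled" marker */

    /* Bit[30] of the Function ID selects the SMC64 calling convention. */
    static bool example_is_smc64(uint32_t smc_fid)
    {
        return ((smc_fid >> 30) & 0x1u) != 0u;
    }

    /* Combine bit[31] (call type) and bits[29:24] (OEN) into 0..127. */
    static unsigned int example_index_from_fid(uint32_t smc_fid)
    {
        unsigned int call_type = (smc_fid >> 31) & 0x1u;
        unsigned int oen = (smc_fid >> 24) & 0x3Fu;

        return (call_type << 6) | oen;
    }

    /*
     * Return the position of the service descriptor, or EXAMPLE_NO_SVC when
     * the caller should receive the Unknown SMC Function Identifier result.
     */
    static uint8_t example_lookup(const uint8_t indices[128], uint32_t smc_fid,
                                  bool caller_is_aarch32)
    {
        assert(indices != NULL);

        if (caller_is_aarch32 && example_is_smc64(smc_fid))
            return EXAMPLE_NO_SVC;

        return indices[example_index_from_fid(smc_fid)];
    }

For instance, the Function IDs ``0x84000000`` and ``0xC4000001`` both carry
OEN 4 with the fast call bit set, so both select index 68; only the second is
rejected for an AArch32 caller.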
				868
The service's ``handle()`` callback is provided with five of the SMC parameters
directly; the others are saved into memory for retrieval (if needed) by the
handler. The handler is also provided with an opaque ``handle`` for use with the
supporting library for parameter retrieval, setting return values and context
manipulation, and with ``flags`` indicating the security state of the caller. The
framework finally sets up the execution stack for the handler, and invokes the
service's ``handle()`` function.
				876
				877	On return from the handler the result registers are populated in X0-X3 before
				878	restoring the stack and CPU state and returning from the original SMC.
				879
				880	Power State Coordination Interface
				881	----------------------------------
				882
				883	TODO: Provide design walkthrough of PSCI implementation.
				884
				885	The PSCI v1.0 specification categorizes APIs as optional and mandatory. All the
				886	mandatory APIs in PSCI v1.0 and all the APIs in PSCI v0.2 draft specification
				887	`Power State Coordination Interface PDD`_ are implemented. The table lists
				888	the PSCI v1.0 APIs and their support in generic code.
				889
An API implementation might have a dependency on platform code, e.g.
``CPU_SUSPEND`` requires the platform to export a part of the implementation.
Hence the level of support of the mandatory APIs depends upon the support
exported by the platform port as well. The Juno and FVP (all variants)
platforms export all the required support.
				895
+-----------------------------+-------------+-------------------------------+
| PSCI v1.0 API               | Supported   | Comments                      |
+=============================+=============+===============================+
| ``PSCI_VERSION``            | Yes         | The version returned is 1.0   |
+-----------------------------+-------------+-------------------------------+
| ``CPU_SUSPEND``             | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``CPU_OFF``                 | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``CPU_ON``                  | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``AFFINITY_INFO``           | Yes         |                               |
+-----------------------------+-------------+-------------------------------+
| ``MIGRATE``                 | Yes\*\*     |                               |
+-----------------------------+-------------+-------------------------------+
| ``MIGRATE_INFO_TYPE``       | Yes\*\*     |                               |
+-----------------------------+-------------+-------------------------------+
| ``MIGRATE_INFO_CPU``        | Yes\*\*     |                               |
+-----------------------------+-------------+-------------------------------+
| ``SYSTEM_OFF``              | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``SYSTEM_RESET``            | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``PSCI_FEATURES``           | Yes         |                               |
+-----------------------------+-------------+-------------------------------+
| ``CPU_FREEZE``              | No          |                               |
+-----------------------------+-------------+-------------------------------+
| ``CPU_DEFAULT_SUSPEND``     | No          |                               |
+-----------------------------+-------------+-------------------------------+
| ``NODE_HW_STATE``           | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``SYSTEM_SUSPEND``          | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``PSCI_SET_SUSPEND_MODE``   | No          |                               |
+-----------------------------+-------------+-------------------------------+
| ``PSCI_STAT_RESIDENCY``     | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+
| ``PSCI_STAT_COUNT``         | Yes\*       |                               |
+-----------------------------+-------------+-------------------------------+

\*Note : These PSCI APIs require platform power management hooks to be
registered with the generic PSCI code to be supported.

\*\*Note : These PSCI APIs require appropriate Secure Payload Dispatcher
hooks to be registered with the generic PSCI code to be supported.
				941
The PSCI implementation in ARM Trusted Firmware is a library which can be
integrated with AArch64 or AArch32 EL3 Runtime Software for ARMv8-A systems.
A guide to integrating the PSCI library with AArch32 EL3 Runtime Software
can be found `here`_.
				946
				947	Secure-EL1 Payloads and Dispatchers
				948	-----------------------------------
				949
				950	On a production system that includes a Trusted OS running in Secure-EL1/EL0,
				951	the Trusted OS is coupled with a companion runtime service in the BL31
				952	firmware. This service is responsible for the initialisation of the Trusted
				953	OS and all communications with it. The Trusted OS is the BL32 stage of the
				954	boot flow in ARM Trusted Firmware. The firmware will attempt to locate, load
				955	and execute a BL32 image.
				956
				957	ARM Trusted Firmware uses a more general term for the BL32 software that runs
				958	at Secure-EL1 - the Secure-EL1 Payload - as it is not always a Trusted OS.
				959
				960	The ARM Trusted Firmware provides a Test Secure-EL1 Payload (TSP) and a Test
				961	Secure-EL1 Payload Dispatcher (TSPD) service as an example of how a Trusted OS
				962	is supported on a production system using the Runtime Services Framework. On
				963	such a system, the Test BL32 image and service are replaced by the Trusted OS
				964	and its dispatcher service. The ARM Trusted Firmware build system expects that
				965	the dispatcher will define the build flag ``NEED_BL32`` to enable it to include
				966	the BL32 in the build either as a binary or to compile from source depending
				967	on whether the ``BL32`` build option is specified or not.
				968
				969	The TSP runs in Secure-EL1. It is designed to demonstrate synchronous
				970	communication with the normal-world software running in EL1/EL2. Communication
				971	is initiated by the normal-world software
				972
				973	- either directly through a Fast SMC (as defined in the `SMCCC`_)
				974
				975	- or indirectly through a `PSCI`_ SMC. The `PSCI`_ implementation in turn
				976	informs the TSPD about the requested power management operation. This allows
				977	the TSP to prepare for or respond to the power state change
				978
The TSPD service is responsible for:
				980
				981	- Initializing the TSP
				982
				983	- Routing requests and responses between the secure and the non-secure
				984	states during the two types of communications just described
				985
				986	Initializing a BL32 Image
				987	~~~~~~~~~~~~~~~~~~~~~~~~~
				988
				989	The Secure-EL1 Payload Dispatcher (SPD) service is responsible for initializing
				990	the BL32 image. It needs access to the information passed by BL2 to BL31 to do
				991	so. This is provided by:
				992
				993	.. code:: c
				994
    entry_point_info_t *bl31_plat_get_next_image_ep_info(uint32_t);
				996
				997	which returns a reference to the ``entry_point_info`` structure corresponding to
				998	the image which will be run in the specified security state. The SPD uses this
				999	API to get entry point information for the SECURE image, BL32.
				1000
				1001	In the absence of a BL32 image, BL31 passes control to the normal world
				1002	bootloader image (BL33). When the BL32 image is present, it is typical
				1003	that the SPD wants control to be passed to BL32 first and then later to BL33.
				1004
				1005	To do this the SPD has to register a BL32 initialization function during
				1006	initialization of the SPD service. The BL32 initialization function has this
				1007	prototype:
				1008
				1009	.. code:: c
				1010
    int32_t init(void);
				1012
				1013	and is registered using the ``bl31_register_bl32_init()`` function.
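
The registration pattern can be modelled with a single function pointer, as in
the sketch below. All names here (``example_register_bl32_init()``,
``example_run_bl32_init()``, ``example_spd_bl32_init()``) are hypothetical
stand-ins rather than the BL31 implementation, and the meaning of the value
returned by the initialization function is deliberately not interpreted.

.. code:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    typedef int32_t (*example_bl32_init_t)(void);

    /* Remains NULL when no SPD has registered an initialization function. */
    static example_bl32_init_t example_bl32_init;

    /* Called by the SPD while its own service is being initialized. */
    static void example_register_bl32_init(example_bl32_init_t func)
    {
        assert(func != NULL);
        example_bl32_init = func;
    }

    /*
     * Called once the runtime services are initialized. Returns false when
     * there is nothing to run, in which case boot continues towards BL33.
     */
    static bool example_run_bl32_init(int32_t *result)
    {
        assert(result != NULL);

        if (example_bl32_init == NULL)
            return false;

        *result = example_bl32_init();
        return true;
    }

    /* Stand-in for an SPD function that enters BL32 and waits for it. */
    static int32_t example_spd_bl32_init(void)
    {
        return 1;
    }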
				1014
				1015	Trusted Firmware supports two approaches for the SPD to pass control to BL32
				1016	before returning through EL3 and running the non-trusted firmware (BL33):
				1017
				1018	#. In the BL32 setup function, use ``bl31_set_next_image_type()`` to
				1019	request that the exit from ``bl31_main()`` is to the BL32 entrypoint in
				1020	Secure-EL1. BL31 will exit to BL32 using the asynchronous method by
				1021	calling ``bl31_prepare_next_image_entry()`` and ``el3_exit()``.
				1022
				1023	When the BL32 has completed initialization at Secure-EL1, it returns to
				1024	BL31 by issuing an SMC, using a Function ID allocated to the SPD. On
				1025	receipt of this SMC, the SPD service handler should switch the CPU context
				1026	from trusted to normal world and use the ``bl31_set_next_image_type()`` and
				1027	``bl31_prepare_next_image_entry()`` functions to set up the initial return to
				1028	the normal world firmware BL33. On return from the handler the framework
				1029	will exit to EL2 and run BL33.
				1030
				1031	#. The BL32 setup function registers an initialization function using
				1032	``bl31_register_bl32_init()`` which provides a SPD-defined mechanism to
				1033	invoke a 'world-switch synchronous call' to Secure-EL1 to run the BL32
				1034	entrypoint.
				1035	NOTE: The Test SPD service included with the Trusted Firmware provides one
				1036	implementation of such a mechanism.
				1037
				1038	On completion BL32 returns control to BL31 via a SMC, and on receipt the
				1039	SPD service handler invokes the synchronous call return mechanism to return
				1040	to the BL32 initialization function. On return from this function,
				1041	``bl31_main()`` will set up the return to the normal world firmware BL33 and
				1042	continue the boot process in the normal world.
				1043
Crash Reporting in BL31
-----------------------
				1046
BL31 implements a scheme for reporting the processor state when an unhandled
exception is encountered. The reporting mechanism attempts to preserve all the
register contents and report them via a dedicated UART (PL011 console). BL31
reports the general purpose, EL3, Secure EL1 and some EL2 state registers.
				1051
				1052	A dedicated per-CPU crash stack is maintained by BL31 and this is retrieved via
				1053	the per-CPU pointer cache. The implementation attempts to minimise the memory
				1054	required for this feature. The file ``crash_reporting.S`` contains the
				1055	implementation for crash reporting.
				1056
A sample crash output is shown below.
				1058
				1059	::
				1060
				1061	x0 :0x000000004F00007C
				1062	x1 :0x0000000007FFFFFF
				1063	x2 :0x0000000004014D50
				1064	x3 :0x0000000000000000
				1065	x4 :0x0000000088007998
				1066	x5 :0x00000000001343AC
				1067	x6 :0x0000000000000016
				1068	x7 :0x00000000000B8A38
				1069	x8 :0x00000000001343AC
				1070	x9 :0x00000000000101A8
				1071	x10 :0x0000000000000002
				1072	x11 :0x000000000000011C
				1073	x12 :0x00000000FEFDC644
				1074	x13 :0x00000000FED93FFC
				1075	x14 :0x0000000000247950
				1076	x15 :0x00000000000007A2
				1077	x16 :0x00000000000007A4
				1078	x17 :0x0000000000247950
				1079	x18 :0x0000000000000000
				1080	x19 :0x00000000FFFFFFFF
				1081	x20 :0x0000000004014D50
				1082	x21 :0x000000000400A38C
				1083	x22 :0x0000000000247950
				1084	x23 :0x0000000000000010
				1085	x24 :0x0000000000000024
				1086	x25 :0x00000000FEFDC868
				1087	x26 :0x00000000FEFDC86A
				1088	x27 :0x00000000019EDEDC
				1089	x28 :0x000000000A7CFDAA
				1090	x29 :0x0000000004010780
				1091	x30 :0x000000000400F004
				1092	scr_el3 :0x0000000000000D3D
				1093	sctlr_el3 :0x0000000000C8181F
				1094	cptr_el3 :0x0000000000000000
				1095	tcr_el3 :0x0000000080803520
				1096	daif :0x00000000000003C0
				1097	mair_el3 :0x00000000000004FF
				1098	spsr_el3 :0x00000000800003CC
				1099	elr_el3 :0x000000000400C0CC
				1100	ttbr0_el3 :0x00000000040172A0
				1101	esr_el3 :0x0000000096000210
				1102	sp_el3 :0x0000000004014D50
				1103	far_el3 :0x000000004F00007C
				1104	spsr_el1 :0x0000000000000000
				1105	elr_el1 :0x0000000000000000
				1106	spsr_abt :0x0000000000000000
				1107	spsr_und :0x0000000000000000
				1108	spsr_irq :0x0000000000000000
				1109	spsr_fiq :0x0000000000000000
				1110	sctlr_el1 :0x0000000030C81807
				1111	actlr_el1 :0x0000000000000000
				1112	cpacr_el1 :0x0000000000300000
				1113	csselr_el1 :0x0000000000000002
				1114	sp_el1 :0x0000000004028800
				1115	esr_el1 :0x0000000000000000
				1116	ttbr0_el1 :0x000000000402C200
				1117	ttbr1_el1 :0x0000000000000000
				1118	mair_el1 :0x00000000000004FF
				1119	amair_el1 :0x0000000000000000
				1120	tcr_el1 :0x0000000000003520
				1121	tpidr_el1 :0x0000000000000000
				1122	tpidr_el0 :0x0000000000000000
				1123	tpidrro_el0 :0x0000000000000000
				1124	dacr32_el2 :0x0000000000000000
				1125	ifsr32_el2 :0x0000000000000000
				1126	par_el1 :0x0000000000000000
				1127	far_el1 :0x0000000000000000
				1128	afsr0_el1 :0x0000000000000000
				1129	afsr1_el1 :0x0000000000000000
				1130	contextidr_el1 :0x0000000000000000
				1131	vbar_el1 :0x0000000004027000
				1132	cntp_ctl_el0 :0x0000000000000000
				1133	cntp_cval_el0 :0x0000000000000000
				1134	cntv_ctl_el0 :0x0000000000000000
				1135	cntv_cval_el0 :0x0000000000000000
				1136	cntkctl_el1 :0x0000000000000000
				1137	fpexc32_el2 :0x0000000004000700
				1138	sp_el0 :0x0000000004010780
				1139
				1140	Guidelines for Reset Handlers
				1141	-----------------------------
				1142
				1143	Trusted Firmware implements a framework that allows CPU and platform ports to
				1144	perform actions very early after a CPU is released from reset in both the cold
				1145	and warm boot paths. This is done by calling the ``reset_handler()`` function in
				1146	both the BL1 and BL31 images. It in turn calls the platform and CPU specific
				1147	reset handling functions.
				1148
				1149	Details for implementing a CPU specific reset handler can be found in
				1150	Section 8. Details for implementing a platform specific reset handler can be
				1151	found in the `Porting Guide`_ (see the ``plat_reset_handler()`` function).
				1152
When adding functionality to a reset handler, keep in mind that if a different
reset handling behavior is required between the first and the subsequent
invocations of the reset handling code, this should be detected at runtime.
In other words, the reset handler should be able to detect whether an action has
already been performed and act as appropriate. Possible courses of action
include skipping the action the second time, or undoing and redoing it.
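
The "skip the action the second time" option can be illustrated with a small
sketch. It is written in C for readability, whereas code that runs this early
after reset is typically written in assembly, and it assumes a hypothetical
control register with a hypothetical enable bit
``EXAMPLE_CTRL_FEATURE_EN``. The point is only that the hardware state itself
is inspected at runtime instead of relying on a boot-stage assumption.

.. code:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    #define EXAMPLE_CTRL_FEATURE_EN  (1u << 0)   /* hypothetical enable bit */

    /*
     * Enable a feature unless an earlier invocation of the reset code has
     * already done so. Returns 1 if the action was performed, 0 if skipped.
     */
    static int example_reset_action(volatile uint32_t *ctrl_reg)
    {
        assert(ctrl_reg != NULL);

        if ((*ctrl_reg & EXAMPLE_CTRL_FEATURE_EN) != 0u)
            return 0;               /* already done: skip */

        *ctrl_reg |= EXAMPLE_CTRL_FEATURE_EN;
        return 1;
    }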
				1159
				1160	CPU specific operations framework
				1161	---------------------------------
				1162
				1163	Certain aspects of the ARMv8 architecture are implementation defined,
				1164	that is, certain behaviours are not architecturally defined, but must be defined
				1165	and documented by individual processor implementations. The ARM Trusted
				1166	Firmware implements a framework which categorises the common implementation
				1167	defined behaviours and allows a processor to export its implementation of that
				1168	behaviour. The categories are:
				1169
				1170	#. Processor specific reset sequence.
				1171
				1172	#. Processor specific power down sequences.
				1173
				1174	#. Processor specific register dumping as a part of crash reporting.
				1175
				1176	#. Errata status reporting.
				1177
				1178	Each of the above categories fulfils a different requirement.
				1179
				1180	#. allows any processor specific initialization before the caches and MMU
				1181	are turned on, like implementation of errata workarounds, entry into
				1182	the intra-cluster coherency domain etc.
				1183
				1184	#. allows each processor to implement the power down sequence mandated in
				1185	its Technical Reference Manual (TRM).
				1186
				1187	#. allows a processor to provide additional information to the developer
				1188	in the event of a crash, for example Cortex-A53 has registers which
				1189	can expose the data cache contents.
				1190
				1191	#. allows a processor to define a function that inspects and reports the status
				1192	of all errata workarounds on that processor.
				1193
				1194	Please note that only 2. is mandated by the TRM.
				1195
				1196	The CPU specific operations framework scales to accommodate a large number of
				1197	different CPUs during power down and reset handling. The platform can specify
				1198	any CPU optimization it wants to enable for each CPU. It can also specify
				1199	the CPU errata workarounds to be applied for each CPU type during reset
				1200	handling by defining CPU errata compile time macros. Details on these macros
				1201	can be found in the `cpu-specific-build-macros.rst`_ file.
				1202
The CPU specific operations framework depends on the ``cpu_ops`` structure which
needs to be exported for each type of CPU in the platform. It is defined in
``include/lib/cpus/aarch64/cpu_macros.S`` and has the following fields: ``midr``,
``reset_func()``, ``cpu_pwr_down_ops`` (array of power down functions) and
``cpu_reg_dump()``.
				1208
The CPU specific files in ``lib/cpus`` export a ``cpu_ops`` data structure with
suitable handlers for that CPU. For example, ``lib/cpus/aarch64/cortex_a53.S``
exports the ``cpu_ops`` for the Cortex-A53 CPU. According to the platform
configuration, these CPU specific files must be included in the build by
the platform makefile. The generic CPU specific operations framework code exists
in ``lib/cpus/aarch64/cpu_helpers.S``.
				1215
				1216	CPU specific Reset Handling
				1217	~~~~~~~~~~~~~~~~~~~~~~~~~~~
				1218
After a reset, the state of the CPU when it calls the generic reset handler is:
MMU turned off, both instruction and data caches turned off and not part
of any coherency domain.
				1222
				1223	The BL entrypoint code first invokes ``plat_reset_handler()`` to allow
				1224	the platform to perform any system initialization required and any system
				1225	errata workarounds that need to be applied. ``get_cpu_ops_ptr()`` then reads
				1226	the current CPU MIDR, finds the matching ``cpu_ops`` entry in the ``cpu_ops``
				1227	array and returns it. Note that only the part number and implementer fields
				1228	in the MIDR are used to find the matching ``cpu_ops`` entry. The ``reset_func()`` in
				1229	the returned ``cpu_ops`` is then invoked, which executes the required reset
				1230	handling for that CPU and also any errata workarounds enabled by the platform.
				1231	This function must preserve the values of general purpose registers x20 to x29.
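
The matching rule can be sketched in C as follows. The real lookup is written
in assembly; the table, the ``cpu_ops_model_t`` type and ``find_cpu_ops()`` are
hypothetical names used only to show that the variant and revision fields are
ignored during the comparison.

```c
#include <stddef.h>
#include <stdint.h>

/* MIDR layout: implementer in bits [31:24], part number in bits [15:4]. */
#define MIDR_IMPL_MASK  0xFF000000u
#define MIDR_PN_MASK    0x0000FFF0u
#define MIDR_MATCH_MASK (MIDR_IMPL_MASK | MIDR_PN_MASK)

typedef struct cpu_ops_model {
	uint32_t midr;
	const char *name;       /* hypothetical, for illustration only */
} cpu_ops_model_t;

static const cpu_ops_model_t cpu_ops_table[] = {
	{ 0x410FD030u, "cortex_a53" },
	{ 0x410FD070u, "cortex_a57" },
};

/* Returns the entry whose implementer and part number match, or NULL. */
static const cpu_ops_model_t *find_cpu_ops(uint32_t midr)
{
	size_t i;

	for (i = 0; i < sizeof(cpu_ops_table) / sizeof(cpu_ops_table[0]); i++) {
		/* XOR leaves a set bit wherever the two values differ. */
		if (((cpu_ops_table[i].midr ^ midr) & MIDR_MATCH_MASK) == 0u)
			return &cpu_ops_table[i];
	}
	return NULL;
}
```

With this rule, a Cortex-A53 at any revision resolves to the same entry, which
is why revision specific behaviour is left to the errata check functions.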
				1232
				1233	Refer to Section "Guidelines for Reset Handlers" for general guidelines
				1234	regarding placement of code in a reset handler.
				1235
				1236	CPU specific power down sequence
				1237	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				1238
				1239	During the BL31 initialization sequence, the pointer to the matching ``cpu_ops``
				1240	entry is stored in per-CPU data by ``init_cpu_ops()`` so that it can be quickly
				1241	retrieved during power down sequences.
				1242
				1243	Various CPU drivers register handlers to perform power down at certain power
				1244	levels for that specific CPU. The PSCI service, upon receiving a power down
				1245	request, determines the highest power level at which to execute the power down
				1246	sequence for a particular CPU. It uses the ``prepare_cpu_pwr_dwn()`` function to
				1247	pick the right power down handler for the requested level. The function
				1248	retrieves the ``cpu_ops`` pointer member of the per-CPU data, and from that, further
				1249	retrieves the ``cpu_pwr_down_ops`` array, and indexes into the required level. If the
				1250	requested power level is higher than what a CPU driver supports, the handler
				1251	registered for the highest level is invoked.
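
The handler selection described above amounts to a clamped array index. The
following C sketch assumes a driver with two handlers (core and cluster); the
names ``MAX_PWR_DWN_OPS``, ``pwr_down_ops`` and ``prepare_pwr_dwn_model()`` are
illustrative, and a global table stands in for the per-CPU data lookup.

```c
#define MAX_PWR_DWN_OPS 2u      /* hypothetical: core and cluster levels */

typedef void (*pwr_dwn_handler_t)(void);

/* Records which handler ran, so the behaviour is observable. */
static int last_level_run = -1;
static void core_pwr_dwn(void)    { last_level_run = 0; }
static void cluster_pwr_dwn(void) { last_level_run = 1; }

/* Stand-in for the cpu_pwr_down_ops array reached through per-CPU data. */
static const pwr_dwn_handler_t pwr_down_ops[MAX_PWR_DWN_OPS] = {
	core_pwr_dwn, cluster_pwr_dwn,
};

static void prepare_pwr_dwn_model(unsigned int power_level)
{
	/* Requests above the supported range use the highest handler. */
	if (power_level >= MAX_PWR_DWN_OPS)
		power_level = MAX_PWR_DWN_OPS - 1u;

	pwr_down_ops[power_level]();
}
```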
				1252
				1253	At runtime the platform hooks for power down are invoked by the PSCI service to
				1254	perform platform specific operations during a power down sequence, for example
				1255	turning off CCI coherency during a cluster power down.
				1256
				1257	CPU specific register reporting during crash
				1258	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				1259
				1260	If crash reporting is enabled in BL31, when a crash occurs, the crash
				1261	reporting framework calls ``do_cpu_reg_dump``, which retrieves the matching
				1262	``cpu_ops`` using the ``get_cpu_ops_ptr()`` function. The ``cpu_reg_dump()`` in
				1263	``cpu_ops`` is invoked, which then returns the CPU specific register values to
				1264	be reported and a pointer to the ASCII list of register names in a format
				1265	expected by the crash reporting framework.
				1266
				1267	CPU errata status reporting
				1268	~~~~~~~~~~~~~~~~~~~~~~~~~~~
				1269
				1270	Errata workarounds for CPUs supported in ARM Trusted Firmware are applied during
				1271	both cold and warm boots, shortly after reset. Individual errata workarounds are
				1272	enabled as build options. Some errata workarounds have potential run-time
				1273	implications; therefore some are enabled by default, others not. Platform ports
				1274	shall override build options to enable or disable errata as appropriate. The CPU
				1275	drivers take care of applying errata workarounds that are enabled and applicable
				1276	to a given CPU. Refer to the section titled CPU Errata Workarounds in `CPUBM`_
				1277	for more information.
				1278
				1279	Functions in CPU drivers that apply errata workarounds must follow the
				1280	conventions listed below.
				1281
				1282	The errata workaround must be authored as two separate functions:
				1283
				1284	- One that checks for errata. This function must determine whether that errata
				1285	applies to the current CPU. Typically this involves matching the current
				1286	CPU's revision and variant against a value that's known to be affected by the
				1287	errata. If the function determines that the errata applies to this CPU, it
				1288	must return ``ERRATA_APPLIES``; otherwise, it must return
				1289	``ERRATA_NOT_APPLIES``. The utility functions ``cpu_get_rev_var`` and
				1290	``cpu_rev_var_ls`` may come in handy for this purpose.
				1291
				1292	For an errata identified as ``E``, the check function must be named
				1293	``check_errata_E``.
				1294
				1295	This function will be invoked at different times, both from assembly and from
				1296	C run time. Therefore it must follow AAPCS, and must not use the stack.
				1297
				1298	- Another one that applies the errata workaround. This function calls the
				1299	check function described above, and applies the errata workaround if required.
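
The two-function convention can be modelled in C as below. Real workarounds
are written in assembly and operate on system registers; here the erratum
number ``123456``, the affected range, the numeric values of the return codes
and the helper names ``rev_var_from_midr()`` and ``rev_var_ls()`` are all
assumptions chosen to keep the sketch self-contained.

```c
#include <stdint.h>

#define ERRATA_NOT_APPLIES 0    /* numeric values are assumptions */
#define ERRATA_APPLIES     1

/* Pack variant (MIDR bits [23:20]) and revision (bits [3:0]) into one byte. */
static uint32_t rev_var_from_midr(uint32_t midr)
{
	uint32_t variant = (midr >> 20) & 0xFu;
	uint32_t revision = midr & 0xFu;

	return (variant << 4) | revision;
}

/* Applies when the CPU is at or below the last affected revision-variant. */
static int rev_var_ls(uint32_t rev_var, uint32_t last_affected)
{
	return (rev_var <= last_affected) ? ERRATA_APPLIES : ERRATA_NOT_APPLIES;
}

/* Check function for a made-up erratum 123456 affecting r0p0 to r0p2. */
static int check_errata_123456(uint32_t midr)
{
	return rev_var_ls(rev_var_from_midr(midr), 0x02u);
}

static int workaround_applied;

/* Workaround function: calls the check and acts only if required. */
static void errata_123456_wa(uint32_t midr)
{
	if (check_errata_123456(midr) == ERRATA_APPLIES)
		workaround_applied = 1; /* stands in for the register writes */
}
```

Keeping the check separate from the workaround lets the same check be reused
later for status reporting without re-applying the workaround.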
				1300
				1301	CPU drivers that apply errata workarounds can optionally implement an assembly
				1302	function that reports the status of errata workarounds pertaining to that CPU.
				1303	For a driver that registers the CPU, for example ``cpux``, via the ``declare_cpu_ops``
				1304	macro, the errata reporting function, if it exists, must be named
				1305	``cpux_errata_report``. This function will always be called with MMU enabled; it
				1306	must follow AAPCS and may use the stack.
				1307
				1308	In a debug build of ARM Trusted Firmware, on a CPU that comes out of reset, both
				1309	BL1 and the run time firmware (BL31 in AArch64, and BL32 in AArch32) will invoke
				1310	the errata status reporting function, if one exists, for that type of CPU.
				1311
				1312	To report the status of each errata workaround, the function shall use the
				1313	assembler macro ``report_errata``, passing it:
				1314
				1315	- The build option that enables the errata;
				1316
				1317	- The name of the CPU: this must be the same identifier that the CPU driver
				1318	registered itself with, using ``declare_cpu_ops``;
				1319
				1320	- And the errata identifier: the identifier must match what's used in the
				1321	errata's check function described above.
				1322
				1323	The errata status reporting function will be called once per CPU type/errata
				1324	combination during the software's active life time.
				1325
				1326	It's expected that whenever an errata workaround is submitted to ARM Trusted
				1327	Firmware, the errata reporting function is appropriately extended to report its
				1328	status as well.
				1329
				1330	Reporting the status of errata workarounds is for informational purposes only; it
				1331	has no functional significance.
				1332
				1333	Memory layout of BL images
				1334	--------------------------
				1335
				1336	Each bootloader image can be divided into two parts:
				1337
				1338	- the static contents of the image. These are data actually stored in the
				1339	binary on the disk. In the ELF terminology, they are called ``PROGBITS``
				1340	sections;
				1341
				1342	- the run-time contents of the image. These are data that don't occupy any
				1343	space in the binary on the disk. The ELF binary just contains some
				1344	metadata indicating where these data will be stored at run-time and the
				1345	corresponding sections need to be allocated and initialized at run-time.
				1346	In the ELF terminology, they are called ``NOBITS`` sections.
				1347
				1348	All PROGBITS sections are grouped together at the beginning of the image,
				1349	followed by all NOBITS sections. This is true for all Trusted Firmware images
				1350	and it is governed by the linker scripts. This ensures that the raw binary
				1351	images are as small as possible. If a NOBITS section was inserted in between
				1352	PROGBITS sections then the resulting binary file would contain zero bytes in
				1353	place of this NOBITS section, making the image unnecessarily bigger. Smaller
				1354	images allow faster loading from the FIP to the main memory.
				1355
				1356	Linker scripts and symbols
				1357	~~~~~~~~~~~~~~~~~~~~~~~~~~
				1358
				1359	Each bootloader stage image layout is described by its own linker script. The
				1360	linker scripts export some symbols into the program symbol table. Their values
				1361	correspond to particular addresses. The trusted firmware code can refer to these
				1362	symbols to figure out the image memory layout.
				1363
				1364	Linker symbols in the trusted firmware use the following naming convention:
				1365
				1366	- ``__<SECTION>_START__``
				1367
				1368	Start address of a given section named ``<SECTION>``.
				1369
				1370	- ``__<SECTION>_END__``
				1371
				1372	End address of a given section named ``<SECTION>``. If there is an alignment
				1373	constraint on the section's end address then ``__<SECTION>_END__`` corresponds
				1374	to the end address of the section's actual contents, rounded up to the right
				1375	boundary. Refer to the value of ``__<SECTION>_UNALIGNED_END__`` to know the
				1376	actual end address of the section's contents.
				1377
				1378	- ``__<SECTION>_UNALIGNED_END__``
				1379
				1380	End address of a given section named ``<SECTION>`` without any padding or
				1381	rounding up due to some alignment constraint.
				1382
				1383	- ``__<SECTION>_SIZE__``
				1384
				1385	Size (in bytes) of a given section named ``<SECTION>``. If there is an
				1386	alignment constraint on the section's end address then ``__<SECTION>_SIZE__``
				1387	corresponds to the size of the section's actual contents, rounded up to the
				1388	right boundary. In other words, ``__<SECTION>_SIZE__ = __<SECTION>_END__ - __<SECTION>_START__``. Refer to the value of ``__<SECTION>_UNALIGNED_SIZE__``
				1389	to know the actual size of the section's contents.
				1390
				1391	- ``__<SECTION>_UNALIGNED_SIZE__``
				1392
				1393	Size (in bytes) of a given section named ``<SECTION>`` without any padding or
				1394	rounding up due to some alignment constraint. In other words,
				1395	``__<SECTION>_UNALIGNED_SIZE__ = __<SECTION>_UNALIGNED_END__ - __<SECTION>_START__``.
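
The relationships between these five symbols can be checked with a small C
model. In the firmware the values come from the linker script, not from code;
``section_syms_t`` and ``make_section_syms()`` are hypothetical names, and the
alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* One field per linker symbol of a section named <SECTION>. */
typedef struct section_syms {
	uintptr_t start;            /* __<SECTION>_START__ */
	uintptr_t unaligned_end;    /* __<SECTION>_UNALIGNED_END__ */
	uintptr_t end;              /* __<SECTION>_END__ */
	uintptr_t size;             /* __<SECTION>_SIZE__ */
	uintptr_t unaligned_size;   /* __<SECTION>_UNALIGNED_SIZE__ */
} section_syms_t;

/* Derive all symbols from the start, the content size and the alignment. */
static section_syms_t make_section_syms(uintptr_t start,
					uintptr_t content_bytes,
					uintptr_t align)
{
	section_syms_t s;

	assert(align != 0u && (align & (align - 1u)) == 0u); /* power of two */

	s.start = start;
	s.unaligned_end = start + content_bytes;
	/* Round the end address up to the alignment boundary. */
	s.end = (s.unaligned_end + align - 1u) & ~(align - 1u);
	s.size = s.end - s.start;
	s.unaligned_size = s.unaligned_end - s.start;
	return s;
}
```

For a section starting at ``0x04001000`` with ``0x1234`` bytes of content and
a 4KB end alignment, this gives an aligned size of ``0x2000`` and an unaligned
size of ``0x1234``.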
				1396
				1397	Some of the linker symbols are mandatory as the trusted firmware code relies on
				1398	them to be defined. They are listed in the following subsections. Some of them
				1399	must be provided for each bootloader stage and some are specific to a given
				1400	bootloader stage.
				1401
				1402	The linker scripts define some extra, optional symbols. They are not actually
				1403	used by any code but they help in understanding the bootloader images' memory
				1404	layout as they are easy to spot in the link map files.
				1405
				1406	Common linker symbols
				1407	^^^^^^^^^^^^^^^^^^^^^
				1408
				1409	All BL images share the following requirements:
				1410
				1411	- The BSS section must be zero-initialised before executing any C code.
				1412	- The coherent memory section (if enabled) must be zero-initialised as well.
				1413	- The MMU setup code needs to know the extents of the coherent and read-only
				1414	memory regions to set the right memory attributes. When
				1415	``SEPARATE_CODE_AND_RODATA=1``, it needs to know more specifically how the
				1416	read-only memory region is divided between code and data.
				1417
				1418	The following linker symbols are defined for this purpose:
				1419
				1420	- ``__BSS_START__``
				1421	- ``__BSS_SIZE__``
				1422	- ``__COHERENT_RAM_START__`` Must be aligned on a page-size boundary.
				1423	- ``__COHERENT_RAM_END__`` Must be aligned on a page-size boundary.
				1424	- ``__COHERENT_RAM_UNALIGNED_SIZE__``
				1425	- ``__RO_START__``
				1426	- ``__RO_END__``
				1427	- ``__TEXT_START__``
				1428	- ``__TEXT_END__``
				1429	- ``__RODATA_START__``
				1430	- ``__RODATA_END__``
				1431
				1432	BL1's linker symbols
				1433	^^^^^^^^^^^^^^^^^^^^
				1434
				1435	BL1 being the ROM image, it has additional requirements. BL1 resides in ROM and
				1436	it is entirely executed in place but it needs some read-write memory for its
				1437	mutable data. Its ``.data`` section (i.e. its allocated read-write data) must be
				1438	relocated from ROM to RAM before executing any C code.
				1439
				1440	The following additional linker symbols are defined for BL1:
				1441
				1442	- ``__BL1_ROM_END__`` End address of BL1's ROM contents, covering its code
				1443	and ``.data`` section in ROM.
				1444	- ``__DATA_ROM_START__`` Start address of the ``.data`` section in ROM. Must be
				1445	aligned on a 16-byte boundary.
				1446	- ``__DATA_RAM_START__`` Address in RAM where the ``.data`` section should be
				1447	copied over. Must be aligned on a 16-byte boundary.
				1448	- ``__DATA_SIZE__`` Size of the ``.data`` section (in ROM or RAM).
				1449	- ``__BL1_RAM_START__`` Start address of BL1 read-write data.
				1450	- ``__BL1_RAM_END__`` End address of BL1 read-write data.
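
The early memory setup implied by these symbols can be pictured with the C
model below. In the firmware this work happens before any C code runs, so it
cannot itself be ordinary C; the arrays here are stand-ins for the memory
delimited by the linker symbols, and ``early_memory_setup()`` is a
hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for the memory that the linker symbols would delimit. */
static const uint8_t rom_data[16] = { 1, 2, 3, 4 }; /* at __DATA_ROM_START__ */
static uint8_t ram_data[16];                        /* at __DATA_RAM_START__ */
static uint8_t bss[32];                             /* at __BSS_START__ */

/* Work that must complete before any C code relies on global variables. */
static void early_memory_setup(void)
{
	/* Zero-initialise the BSS section (__BSS_SIZE__ bytes). */
	memset(bss, 0, sizeof(bss));

	/* Relocate .data from ROM to RAM (__DATA_SIZE__ bytes). */
	memcpy(ram_data, rom_data, sizeof(rom_data));
}
```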
				1451
				1452	How to choose the right base addresses for each bootloader stage image
				1453	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				1454
				1455	There is currently no support for dynamic image loading in the Trusted Firmware.
				1456	This means that all bootloader images need to be linked against their ultimate
				1457	runtime locations and the base addresses of each image must be chosen carefully
				1458	such that images don't overlap each other in an undesired way. As the code
				1459	grows, the base addresses might need adjustments to cope with the new memory
				1460	layout.
				1461
				1462	The memory layout is completely specific to the platform and so there is no
				1463	general recipe for choosing the right base addresses for each bootloader image.
				1464	However, there are tools to aid in understanding the memory layout. These are
				1465	the link map files: ``build/<platform>/<build-type>/bl<x>/bl<x>.map``, with ``<x>``
				1466	being the bootloader stage. They provide a detailed view of the memory usage of
				1467	each image. Among other useful information, they provide the end address of
				1468	each image.
				1469
				1470	- ``bl1.map`` link map file provides ``__BL1_RAM_END__`` address.
				1471	- ``bl2.map`` link map file provides ``__BL2_END__`` address.
				1472	- ``bl31.map`` link map file provides ``__BL31_END__`` address.
				1473	- ``bl32.map`` link map file provides ``__BL32_END__`` address.
				1474
				1475	For each bootloader image, the platform code must provide its start address
				1476	as well as a limit address that it must not overstep. The latter is used in the
				1477	linker scripts to check that the image doesn't grow past that address. If that
				1478	happens, the linker will issue a message similar to the following:
				1479
				1480	::
				1481
				1482	    aarch64-none-elf-ld: BLx has exceeded its limit.
				1483
				1484	Additionally, if the platform memory layout implies some image overlaying like
				1485	on FVP, BL31 and TSP need to know the limit address that their PROGBITS
				1486	sections must not overstep. The platform code must provide those.
				1487
				1488	When LOAD\_IMAGE\_V2 is disabled, Trusted Firmware provides a mechanism to
				1489	verify at boot time that the memory to load a new image is free to prevent
				1490	overwriting a previously loaded image. For this mechanism to work, the platform
				1491	must specify the memory available in the system as regions, where each region
				1492	consists of base address, total size and the free area within it (as defined
				1493	in the ``meminfo_t`` structure). Trusted Firmware retrieves these memory regions
				1494	by calling the corresponding platform API:
				1495
				1496	- ``meminfo_t *bl1_plat_sec_mem_layout(void)``
				1497	- ``meminfo_t *bl2_plat_sec_mem_layout(void)``
				1498	- ``void bl2_plat_get_scp_bl2_meminfo(meminfo_t *scp_bl2_meminfo)``
				1499	- ``void bl2_plat_get_bl32_meminfo(meminfo_t *bl32_meminfo)``
				1500	- ``void bl2_plat_get_bl33_meminfo(meminfo_t *bl33_meminfo)``
				1501
				1502	For example, in the case of BL1 loading BL2, ``bl1_plat_sec_mem_layout()`` will
				1503	return the region defined by the platform where BL1 intends to load BL2. The
				1504	``load_image()`` function will check that the memory where BL2 will be loaded is
				1505	within the specified region and marked as free.
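
A minimal sketch of such a check is shown below, assuming a region described
by a base, a total size and a single free area. The ``meminfo_model_t`` type,
its field names and ``image_fits_in_free_area()`` are illustrative and are not
taken from the ``meminfo_t`` definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a memory region with one free area inside it. */
typedef struct meminfo_model {
	uintptr_t total_base;
	uintptr_t total_size;
	uintptr_t free_base;
	uintptr_t free_size;
} meminfo_model_t;

/* Returns 1 if [addr, addr + size) lies entirely inside the free area. */
static int image_fits_in_free_area(const meminfo_model_t *mem,
				   uintptr_t addr, uintptr_t size)
{
	uintptr_t free_end, image_end;

	assert(mem != NULL);

	if (size == 0u || mem->free_size == 0u)
		return 0;

	/* Work with inclusive end addresses to avoid overflow at the top. */
	free_end = mem->free_base + (mem->free_size - 1u);
	image_end = addr + (size - 1u);
	if (image_end < addr)   /* address wrap-around */
		return 0;

	return (addr >= mem->free_base) && (image_end <= free_end);
}
```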
				1506
				1507	The actual number of regions and their base addresses and sizes is platform
				1508	specific. The platform may return the same region or define a different one for
				1509	each API. However, the overlap verification mechanism applies only to a single
				1510	region. Hence, it is the platform responsibility to guarantee that different
				1511	regions do not overlap, or that if they do, the overlapping images are not
				1512	accessed at the same time. This could be used, for example, to load temporary
				1513	images (e.g. certificates) or firmware images prior to being transferred to their
				1514	corresponding processor (e.g. the SCP BL2 image).
				1515
				1516	To reduce fragmentation and simplify the tracking of free memory, all the free
				1517	memory within a region is always located in one single buffer defined by its
				1518	base address and size. Trusted Firmware implements a top/bottom load approach:
				1519	after a new image is loaded, it checks how much memory remains free above and
				1520	below the image. The smaller area is marked as unavailable, while the larger
				1521	area becomes the new free memory buffer. Platforms should take this behaviour
				1522	into account when defining the base address for each of the images. For example,
				1523	if an image is loaded near the middle of the region, small changes in image size
				1524	could cause a flip between a top load and a bottom load, which may result in an
				1525	unexpected memory layout.
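
The top/bottom load decision can be sketched as follows. This is a simplified
model under stated assumptions: the image is already known to fit in the free
buffer, ties are resolved in favour of the area above the image, and the names
``free_buf_t`` and ``reserve_image()`` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The single free buffer of a region, defined by base address and size. */
typedef struct free_buf {
	uintptr_t base;
	uintptr_t size;
} free_buf_t;

/*
 * Shrink the free buffer after an image is placed at [addr, addr + size).
 * The larger of the two remaining areas becomes the new free buffer; the
 * smaller one is dropped, which marks it as unavailable.
 */
static void reserve_image(free_buf_t *buf, uintptr_t addr, uintptr_t size)
{
	uintptr_t below, above;

	/* The caller is assumed to have verified that the image fits. */
	assert(addr >= buf->base);
	assert(addr + size <= buf->base + buf->size);

	below = addr - buf->base;
	above = (buf->base + buf->size) - (addr + size);

	if (above >= below) {
		/* Bottom load: keep the area above the image. */
		buf->base = addr + size;
		buf->size = above;
	} else {
		/* Top load: keep the area below the image. */
		buf->size = below;
	}
}
```

An image placed close to the midpoint makes ``above`` and ``below`` nearly
equal, so a small size change can switch which branch is taken.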
				1526
				1527	The following diagram is an example of an image loaded in the bottom part of
				1528	the memory region. The region is initially free (nothing has been loaded yet):
				1529
				1530	::
				1531
				1532	    Memory region
				1533	    +----------+
				1534	    |          |
				1535	    |          |  <<<<<<<<<<<<<  Free
				1536	    |          |
				1537	    |----------|                 +------------+
				1538	    |  image   |  <<<<<<<<<<<<<  |   image    |
				1539	    |----------|                 +------------+
				1540	    | xxxxxxxx |  <<<<<<<<<<<<<  Marked as unavailable
				1541	    +----------+
				1542
				1543	And the following diagram is an example of an image loaded in the top part:
				1544
				1545	::
				1546
				1547	    Memory region
				1548	    +----------+
				1549	    | xxxxxxxx |  <<<<<<<<<<<<<  Marked as unavailable
				1550	    |----------|                 +------------+
				1551	    |  image   |  <<<<<<<<<<<<<  |   image    |
				1552	    |----------|                 +------------+
				1553	    |          |
				1554	    |          |  <<<<<<<<<<<<<  Free
				1555	    |          |
				1556	    +----------+
				1557
				1558	When LOAD\_IMAGE\_V2 is enabled, Trusted Firmware does not provide any mechanism
				1559	to verify at boot time that the memory to load a new image is free to prevent
				1560	overwriting a previously loaded image. The platform must specify the memory
				1561	available in the system for all the relevant BL images to be loaded.
				1562
				1563	For example, in the case of BL1 loading BL2, ``bl1_plat_sec_mem_layout()`` will
				1564	return the region defined by the platform where BL1 intends to load BL2. The
				1565	``load_image()`` function performs bounds check for the image size based on the
				1566	base and maximum image size provided by the platforms. Platforms must take
				1567	this behaviour into account when defining the base/size for each of the images.
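
The bounds check in this mode reduces to comparing the image size against the
maximum size provided by the platform. The sketch below assumes a descriptor
holding a base address and a maximum size; ``image_slot_t``,
``check_image_bounds()`` and the ``-1`` error value are illustrative choices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of where an image may be loaded. */
typedef struct image_slot {
	uintptr_t image_base;       /* load address provided by the platform */
	uint32_t image_max_size;    /* largest image the platform allows here */
} image_slot_t;

/* Returns 0 if the image fits in its slot, -1 otherwise. */
static int check_image_bounds(const image_slot_t *slot, uint32_t image_size)
{
	assert(slot != NULL);

	return (image_size <= slot->image_max_size) ? 0 : -1;
}
```

Because nothing here tracks previously loaded images, keeping the slots of
different images from overlapping is left entirely to the platform.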
				1568
				1569	Memory layout on ARM development platforms
				1570	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				1571
				1572	The following list describes the memory layout on the ARM development platforms:
				1573
				1574	- A 4KB page of shared memory is used for communication between Trusted
				1575	Firmware and the platform's power controller. This is located at the base of
				1576	Trusted SRAM. The amount of Trusted SRAM available to load the bootloader
				1577	images is reduced by the size of the shared memory.
				1578
				1579	The shared memory is used to store the CPUs' entrypoint mailbox. On Juno,
				1580	this is also used for the MHU payload when passing messages to and from the
				1581	SCP.
				1582
				1583	- On FVP, BL1 is originally sitting in the Trusted ROM at address ``0x0``. On
				1584	Juno, BL1 resides in flash memory at address ``0x0BEC0000``. BL1 read-write
				1585	data are relocated to the top of Trusted SRAM at runtime.
				1586
				1587	- EL3 Runtime Software, BL31 for AArch64 and BL32 for AArch32 (e.g. SP\_MIN),
				1588	is loaded at the top of the Trusted SRAM, such that its NOBITS sections will
				1589	overwrite BL1 R/W data. This implies that BL1 global variables remain valid
				1590	only until execution reaches the EL3 Runtime Software entry point during a
				1591	cold boot.
				1592
				1593	- BL2 is loaded below EL3 Runtime Software.
				1594
				1595	- On Juno, SCP\_BL2 is loaded temporarily into the EL3 Runtime Software memory
				1596	region and transferred to the SCP before being overwritten by EL3 Runtime
				1597	Software.
				1598
				1599	- BL32 (for AArch64) can be loaded in one of the following locations:
				1600
				1601	- Trusted SRAM
				1602	- Trusted DRAM (FVP only)
				1603	- Secure region of DRAM (top 16MB of DRAM configured by the TrustZone
				1604	controller)
				1605
				1606	When BL32 (for AArch64) is loaded into Trusted SRAM, its NOBITS sections
				1607	are allowed to overlay BL2. This memory layout is designed to give the
				1608	BL32 image as much memory as possible when it is loaded into Trusted SRAM.
				1609
				1610	When LOAD\_IMAGE\_V2 is disabled, the memory regions for the overlap detection
				1611	mechanism at boot time are defined as follows (shown per API):
				1612
				1613	- ``meminfo_t *bl1_plat_sec_mem_layout(void)``
				1614
				1615	This region corresponds to the whole Trusted SRAM except for the shared
				1616	memory at the base. This region is initially free. At boot time, BL1 will
				1617	mark the BL1(rw) section within this region as occupied. The BL1(rw) section
				1618	is placed at the top of Trusted SRAM.
				1619
				1620	- ``meminfo_t *bl2_plat_sec_mem_layout(void)``
				1621
				1622	This region corresponds to the whole Trusted SRAM as defined by
				1623	``bl1_plat_sec_mem_layout()``, but with the BL1(rw) section marked as
				1624	occupied. This memory region is used to check that BL2 and BL31 do not
				1625	overlap with each other. BL2\_BASE and BL1\_RW\_BASE are carefully chosen so
				1626	that the memory for BL31 is top loaded above BL2.
				1627
				1628	- ``void bl2_plat_get_scp_bl2_meminfo(meminfo_t *scp_bl2_meminfo)``
				1629
				1630	This region is an exact copy of the region defined by
				1631	``bl2_plat_sec_mem_layout()``. Being a disconnected copy means that all the
				1632	changes made to this region by the Trusted Firmware will not be propagated.
				1633	This approach is valid because the SCP BL2 image is loaded temporarily
				1634	while it is being transferred to the SCP, so this memory is reused
				1635	afterwards.
				1636
				1637	- ``void bl2_plat_get_bl32_meminfo(meminfo_t *bl32_meminfo)``
				1638
				1639	This region depends on the location of the BL32 image. Currently, ARM
				1640	platforms support three different locations (detailed below): Trusted SRAM,
				1641	Trusted DRAM and the TZC-Secured DRAM.
				1642
				1643	- ``void bl2_plat_get_bl33_meminfo(meminfo_t *bl33_meminfo)``
				1644
				1645	This region corresponds to the Non-Secure DDR-DRAM, excluding the
				1646	TZC-Secured area.
				1647
				1648	The location of the BL32 image will result in different memory maps. This is
				1649	illustrated for both FVP and Juno in the following diagrams, using the TSP as
				1650	an example.
				1651
				1652	Note: Loading the BL32 image in TZC secured DRAM doesn't change the memory
				1653	layout of the other images in Trusted SRAM.
				1654
				1655	FVP with TSP in Trusted SRAM (default option):
				1656	(These diagrams only cover the AArch64 case)
				1657
				1658	::
				1659
				1660	               Trusted SRAM
				1661	    0x04040000 +----------+  loaded by BL2  ------------------
				1662	               | BL1 (rw) |  <<<<<<<<<<<<<  |  BL31 NOBITS   |
				1663	               |----------|  <<<<<<<<<<<<<  |----------------|
				1664	               |          |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
				1665	               |----------|                 ------------------
				1666	               |   BL2    |  <<<<<<<<<<<<<  |  BL32 NOBITS   |
				1667	               |----------|  <<<<<<<<<<<<<  |----------------|
				1668	               |          |  <<<<<<<<<<<<<  | BL32 PROGBITS  |
				1669	    0x04001000 +----------+                 ------------------
				1670	               |  Shared  |
				1671	    0x04000000 +----------+
				1672
				1673	               Trusted ROM
				1674	    0x04000000 +----------+
				1675	               | BL1 (ro) |
				1676	    0x00000000 +----------+
				1677
				1678	FVP with TSP in Trusted DRAM:
				1679
				1680	::
				1681
				1682	               Trusted DRAM
				1683	    0x08000000 +----------+
				1684	               |  BL32    |
				1685	    0x06000000 +----------+
				1686
				1687	               Trusted SRAM
				1688	    0x04040000 +----------+  loaded by BL2  ------------------
				1689	               | BL1 (rw) |  <<<<<<<<<<<<<  |  BL31 NOBITS   |
				1690	               |----------|  <<<<<<<<<<<<<  |----------------|
				1691	               |          |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
				1692	               |----------|                 ------------------
				1693	               |   BL2    |
				1694	               |----------|
				1695	               |          |
				1696	    0x04001000 +----------+
				1697	               |  Shared  |
				1698	    0x04000000 +----------+
				1699
				1700	               Trusted ROM
				1701	    0x04000000 +----------+
				1702	               | BL1 (ro) |
				1703	    0x00000000 +----------+
				1704
				1705	FVP with TSP in TZC-Secured DRAM:
				1706
				1707	::
				1708
				1709	                   DRAM
				1710	    0xffffffff +----------+
				1711	               |  BL32    |  (secure)
				1712	    0xff000000 +----------+
				1713	               |          |
				1714	               :          :  (non-secure)
				1715	               |          |
				1716	    0x80000000 +----------+
				1717
				1718	               Trusted SRAM
				1719	    0x04040000 +----------+  loaded by BL2  ------------------
				1720	               | BL1 (rw) |  <<<<<<<<<<<<<  |  BL31 NOBITS   |
				1721	               |----------|  <<<<<<<<<<<<<  |----------------|
				1722	               |          |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
				1723	               |----------|                 ------------------
				1724	               |   BL2    |
				1725	               |----------|
				1726	               |          |
				1727	    0x04001000 +----------+
				1728	               |  Shared  |
				1729	    0x04000000 +----------+
				1730
				1731	               Trusted ROM
				1732	    0x04000000 +----------+
				1733	               | BL1 (ro) |
				1734	    0x00000000 +----------+
				1735
				1736	Juno with BL32 in Trusted SRAM (default option):
				1737
				1738	::
				1739
				1740	                  Flash0
				1741	    0x0C000000 +----------+
				1742	               :          :
				1743	    0x0BED0000 |----------|
				1744	               | BL1 (ro) |
				1745	    0x0BEC0000 |----------|
				1746	               :          :
				1747	    0x08000000 +----------+                  BL31 is loaded
				1748	                                             after SCP_BL2 has
				1749	               Trusted SRAM                  been sent to SCP
				1750	    0x04040000 +----------+  loaded by BL2  ------------------
				1751	               | BL1 (rw) |  <<<<<<<<<<<<<  |  BL31 NOBITS   |
				1752	               |----------|  <<<<<<<<<<<<<  |----------------|
				1753	               | SCP_BL2  |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
				1754	               |----------|                 ------------------
				1755	               |   BL2    |  <<<<<<<<<<<<<  |  BL32 NOBITS   |
				1756	               |----------|  <<<<<<<<<<<<<  |----------------|
				1757	               |          |  <<<<<<<<<<<<<  | BL32 PROGBITS  |
				1758	    0x04001000 +----------+                 ------------------
				1759	               |   MHU    |
				1760	    0x04000000 +----------+
				1761
				1762	Juno with BL32 in TZC-secured DRAM:
				1763
				1764	::
				1765
				1766	                   DRAM
				1767	    0xFFE00000 +----------+
				1768	               |  BL32    |  (secure)
				1769	    0xFF000000 |----------|
				1770	               |          |
				1771	               :          :  (non-secure)
				1772	               |          |
				1773	    0x80000000 +----------+
				1774
				1775	                  Flash0
				1776	    0x0C000000 +----------+
				1777	               :          :
				1778	    0x0BED0000 |----------|
				1779	               | BL1 (ro) |
				1780	    0x0BEC0000 |----------|
				1781	               :          :
				1782	    0x08000000 +----------+                  BL31 is loaded
				1783	                                             after SCP_BL2 has
				1784	               Trusted SRAM                  been sent to SCP
				1785	    0x04040000 +----------+  loaded by BL2  ------------------
				1786	               | BL1 (rw) |  <<<<<<<<<<<<<  |  BL31 NOBITS   |
				1787	               |----------|  <<<<<<<<<<<<<  |----------------|
				1788	               | SCP_BL2  |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
				1789	               |----------|                 ------------------
				1790	               |   BL2    |
				1791	               |----------|
				1792	               |          |
				1793	    0x04001000 +----------+
				1794	               |   MHU    |
				1795	    0x04000000 +----------+
				1796
				1797	Firmware Image Package (FIP)
				1798	----------------------------
				1799
				1800	Using a Firmware Image Package (FIP) allows for packing bootloader images (and
				1801	potentially other payloads) into a single archive that can be loaded by the ARM
				1802	Trusted Firmware from non-volatile platform storage. A driver to load images
				1803	from a FIP has been added to the storage layer and allows a package to be read
				1804	from supported platform storage. A tool to create Firmware Image Packages is
				1805	also provided and described below.
				1806
				1807	Firmware Image Package layout
				1808	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				1809
				1810	The FIP layout consists of a table of contents (ToC) followed by payload data.
				1811	The ToC itself has a header followed by one or more table entries. The ToC is
				1812	terminated by an end marker entry. All ToC entries describe some payload data
				1813	that has been appended to the end of the binary package. With the information
				1814	provided in the ToC entry the corresponding payload data can be retrieved.
				1815
				1816	::
				1817
				1818	    ------------------
				1819	    | ToC Header     |
				1820	    |----------------|
				1821	    | ToC Entry 0    |
				1822	    |----------------|
				1823	    | ToC Entry 1    |
				1824	    |----------------|
				1825	    | ToC End Marker |
				1826	    |----------------|
				1827	    |                |
				1828	    |     Data 0     |
				1829	    |                |
				1830	    |----------------|
				1831	    |                |
				1832	    |     Data 1     |
				1833	    |                |
				1834	    ------------------
				1835
				1836	The ToC header and entry formats are described in the header file
				1837	``include/tools_share/firmware_image_package.h``. This file is used by both the
				1838	tool and the ARM Trusted Firmware.
				1839
				1840	The ToC header has the following fields:
				1841
				1842	::
				1843
				1844	    `name`: The name of the ToC. This is currently used to validate the header.
				1845	    `serial_number`: A non-zero number provided by the creation tool
				1846	    `flags`: Flags associated with this data.
				1847	        Bits 0-31: Reserved
				1848	        Bits 32-47: Platform defined
				1849	        Bits 48-63: Reserved
				1850
				1851	A ToC entry has the following fields:
				1852
				1853	::
				1854
				1855	    `uuid`: All files are referred to by a pre-defined Universally Unique
				1856	        IDentifier [UUID]. The UUIDs are defined in
				1857	        `include/tools_share/firmware_image_package.h`. The platform translates
				1858	        the requested image name into the corresponding UUID when accessing the
				1859	        package.
				1860	    `offset_address`: The offset address at which the corresponding payload data
				1861	        can be found. The offset is calculated from the ToC base address.
				1862	    `size`: The size of the corresponding payload data in bytes.
				1863	    `flags`: Flags associated with this entry. None are yet defined.
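
The header and entry fields, together with a lookup over the ToC, might be
modelled in C as shown below. This is a sketch and not the FIP driver: the
field widths, the ``TOC_HEADER_NAME`` value and the all-zero UUID used to
recognise the end marker are assumptions, and ``fip_uuid_t`` and ``fip_find()``
are hypothetical names. The package is assumed to start at an 8-byte aligned
address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOC_HEADER_NAME 0xAA640001u     /* assumed value of the ToC name */

typedef struct fip_uuid { uint8_t bytes[16]; } fip_uuid_t;

typedef struct toc_header {
	uint32_t name;              /* used to validate the header */
	uint32_t serial_number;     /* non-zero, set by the creation tool */
	uint64_t flags;             /* bits 32-47 are platform defined */
} toc_header_t;

typedef struct toc_entry {
	fip_uuid_t uuid;
	uint64_t offset_address;    /* relative to the ToC base address */
	uint64_t size;              /* payload size in bytes */
	uint64_t flags;
} toc_entry_t;

/* Returns the entry with a matching UUID, or NULL if there is none. */
static const toc_entry_t *fip_find(const void *fip, const fip_uuid_t *wanted)
{
	static const fip_uuid_t nil_uuid = { { 0 } };
	const toc_header_t *hdr = fip;
	const toc_entry_t *entry;

	assert(fip != NULL && wanted != NULL);

	if (hdr->name != TOC_HEADER_NAME || hdr->serial_number == 0u)
		return NULL;

	/* Entries follow the header; stop at the assumed end marker. */
	entry = (const toc_entry_t *)(hdr + 1);
	for (; memcmp(&entry->uuid, &nil_uuid, sizeof(nil_uuid)) != 0; entry++) {
		if (memcmp(&entry->uuid, wanted, sizeof(*wanted)) == 0)
			return entry;
	}
	return NULL;
}
```

Once an entry is found, its payload starts ``offset_address`` bytes from the
start of the package and is ``size`` bytes long.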
				1864
				1865	Firmware Image Package creation tool
				1866	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				1867
				1868	The FIP creation tool can be used to pack specified images into a binary package
				1869	that can be loaded by the ARM Trusted Firmware from platform storage. The tool
				1870	currently only supports packing bootloader images. Additional image definitions
				1871	can be added to the tool as required.
				1872
				1873	The tool can be found in ``tools/fiptool``.
				1874
				1875	Loading from a Firmware Image Package (FIP)
				1876	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				1877
				1878	The Firmware Image Package (FIP) driver can load images from a binary package on
				1879	non-volatile platform storage. For the ARM development platforms, this is
				1880	currently NOR FLASH.
				1881
				1882	Bootloader images are loaded according to the platform policy as specified by
				1883	the function ``plat_get_image_source()``. For the ARM development platforms, this
				1884	means the platform will attempt to load images from a Firmware Image Package
				1885	located at the start of NOR FLASH0.
				1886
				1887	The ARM development platforms' policy is to only allow loading of a known set of
				1888	images. The platform policy can be modified to allow additional images.
				1889
				1890	Use of coherent memory in Trusted Firmware
				1891	------------------------------------------
				1892
				1893	There might be loss of coherency when physical memory with mismatched
				1894	shareability, cacheability and memory attributes is accessed by multiple CPUs
				1895	(refer to section B2.9 of `ARM ARM`_ for more details). This possibility occurs
				1896	in Trusted Firmware during power up/down sequences when coherency, MMU and
				1897	caches are turned on/off incrementally.
				1898
				1899	Trusted Firmware defines coherent memory as a region of memory with Device
				1900	nGnRE attributes in the translation tables. The translation granule size in
				1901	Trusted Firmware is 4KB. This is the smallest possible size of the coherent
				1902	memory region.
				1903
By default, all data structures which are susceptible to accesses with
mismatched attributes from various CPUs are allocated in a coherent memory
region (refer to section 2.1 of `Porting Guide`_). Accesses to the coherent
memory region are Outer Shareable and non-cacheable, and the region can be
accessed with the Device nGnRE attributes when the MMU is turned on. Hence, at
the expense of at least an extra page of memory, Trusted Firmware is able to
work around coherency issues due to mismatched memory attributes.
				1911
				1912	The alternative to the above approach is to allocate the susceptible data
				1913	structures in Normal WriteBack WriteAllocate Inner shareable memory. This
				1914	approach requires the data structures to be designed so that it is possible to
				1915	work around the issue of mismatched memory attributes by performing software
				1916	cache maintenance on them.
				1917
				1918	Disabling the use of coherent memory in Trusted Firmware
				1919	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				1920
				1921	It might be desirable to avoid the cost of allocating coherent memory on
				1922	platforms which are memory constrained. Trusted Firmware enables inclusion of
				1923	coherent memory in firmware images through the build flag ``USE_COHERENT_MEM``.
				1924	This flag is enabled by default. It can be disabled to choose the second
				1925	approach described above.
				1926
The sections below analyze the data structures allocated in the coherent memory
region and the changes required to allocate them in normal memory.
				1929
				1930	Coherent memory usage in PSCI implementation
				1931	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				1932
The ``psci_non_cpu_pd_nodes`` data structure stores the platform's power domain
tree information for state management of power domains. By default, this data
structure is allocated in the coherent memory region in the Trusted Firmware
because it can be accessed by multiple CPUs, either with caches enabled or
disabled.
				1938
				1939	.. code:: c
				1940
    typedef struct non_cpu_pwr_domain_node {
        /*
         * Index of the first CPU power domain node level 0 which has this node
         * as its parent.
         */
        unsigned int cpu_start_idx;

        /*
         * Number of CPU power domains which are siblings of the domain indexed
         * by 'cpu_start_idx' i.e. all the domains in the range 'cpu_start_idx
         * -> cpu_start_idx + ncpus' have this node as their parent.
         */
        unsigned int ncpus;

        /*
         * Index of the parent power domain node.
         * TODO: Figure out whether using a pointer is more efficient.
         */
        unsigned int parent_node;

        plat_local_state_t local_state;

        unsigned char level;

        /* For indexing the psci_lock array */
        unsigned char lock_index;
    } non_cpu_pd_node_t;
				1968
In order to move this data structure to normal memory, the use of each of its
fields must be analyzed. Fields like ``cpu_start_idx``, ``ncpus``,
``parent_node``, ``level`` and ``lock_index`` are only written once during cold
boot. Hence removing them from coherent memory involves only doing a clean and
invalidate of the cache lines after these fields are written.
				1974
The field ``local_state`` can be concurrently accessed by multiple CPUs in
different cache states. A Lamport's Bakery lock, ``psci_locks``, is used to
ensure mutual exclusion on this field, and a clean and invalidate is needed
after it is written.
				1979
				1980	Bakery lock data
				1981	~~~~~~~~~~~~~~~~
				1982
				1983	The bakery lock data structure ``bakery_lock_t`` is allocated in coherent memory
				1984	and is accessed by multiple CPUs with mismatched attributes. ``bakery_lock_t`` is
				1985	defined as follows:
				1986
				1987	.. code:: c
				1988
    typedef struct bakery_lock {
        /*
         * The lock_data is a bit-field of 2 members:
         * Bit[0]       : choosing. This field is set when the CPU is
         *                choosing its bakery number.
         * Bits[1 - 15] : number. This is the bakery number allocated.
         */
        volatile uint16_t lock_data[BAKERY_LOCK_MAX_CPUS];
    } bakery_lock_t;
				1998
				1999	It is a characteristic of Lamport's Bakery algorithm that the volatile per-CPU
				2000	fields can be read by all CPUs but only written to by the owning CPU.
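
The bit layout described in the ``lock_data`` comment above can be made
concrete with a few helpers. This is a minimal sketch: the macro and function
names are hypothetical and are not taken from the Trusted Firmware sources.

```c
#include <stdint.h>

/* Bit[0] holds the 'choosing' flag; Bits[1-15] hold the bakery number. */
#define CHOOSING_MASK	0x1U
#define NUMBER_SHIFT	1U
#define NUMBER_MASK	0x7FFFU

/* Pack a 'choosing' flag and a bakery number into one 16-bit value. */
static inline uint16_t make_lock_data(unsigned int choosing,
				      unsigned int number)
{
	return (uint16_t)((choosing & CHOOSING_MASK) |
			  ((number & NUMBER_MASK) << NUMBER_SHIFT));
}

/* Return non-zero while the owning CPU is choosing its number. */
static inline unsigned int lock_is_choosing(uint16_t lock_data)
{
	return lock_data & CHOOSING_MASK;
}

/* Return the bakery number stored in Bits[1-15]. */
static inline unsigned int lock_number(uint16_t lock_data)
{
	return (lock_data >> NUMBER_SHIFT) & NUMBER_MASK;
}
```

Because each CPU writes only its own ``lock_data`` element, a helper such as
``make_lock_data()`` would only ever be applied to the caller's own slot.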
				2001
Depending upon the data cache line size, the per-CPU fields of the
``bakery_lock_t`` structure for multiple CPUs may exist on a single cache line.
These per-CPU fields can be read and written during lock contention by multiple
CPUs with mismatched memory attributes. Since these fields are a part of the
lock implementation, they do not have access to any other locking primitive to
safeguard against the resulting coherency issues. As a result, simple software
cache maintenance is not enough to allow them to be allocated in normal memory.
Consider the following example.
				2010
CPU0 updates its per-CPU field with data cache enabled. This write updates a
local cache line which contains a copy of the fields for other CPUs as well. Now
CPU1 updates its per-CPU field of the ``bakery_lock_t`` structure with data cache
disabled. CPU1 then issues a ``DC IVAC`` operation to invalidate any stale copies
of its field in any other cache line in the system. This operation will
invalidate the update made by CPU0 as well.
				2017
To use bakery locks when ``USE_COHERENT_MEM`` is disabled, the lock data structure
has been redesigned. The changes utilise the characteristic of Lamport's Bakery
algorithm mentioned earlier. The ``bakery_lock`` structure only allocates the
memory for a single CPU. The macro ``DEFINE_BAKERY_LOCK`` allocates all the bakery
locks needed for a CPU into a section ``bakery_lock``. The linker allocates the
memory for other cores by taking the total size allocated for the
``bakery_lock`` section and multiplying it by (``PLATFORM_CORE_COUNT`` - 1). This
enables software to perform software cache maintenance on the lock data
structure without running into coherency issues associated with mismatched
attributes.
				2027
				2028	The bakery lock data structure ``bakery_info_t`` is defined for use when
				2029	``USE_COHERENT_MEM`` is disabled as follows:
				2030
				2031	.. code:: c
				2032
    typedef struct bakery_info {
        /*
         * The lock_data is a bit-field of 2 members:
         * Bit[0]       : choosing. This field is set when the CPU is
         *                choosing its bakery number.
         * Bits[1 - 15] : number. This is the bakery number allocated.
         */
        volatile uint16_t lock_data;
    } bakery_info_t;
				2042
The ``bakery_info_t`` represents a single per-CPU field of one lock and
the combination of corresponding ``bakery_info_t`` structures for all CPUs in the
system represents the complete bakery lock. The view in memory for a system
with N bakery locks is:
				2047
				2048	::
				2049
    bakery_lock section start
    |----------------|
    | `bakery_info_t`| <-- Lock_0 per-CPU field
    |    Lock_0      |     for CPU0
    |----------------|
    | `bakery_info_t`| <-- Lock_1 per-CPU field
    |    Lock_1      |     for CPU0
    |----------------|
    | ....           |
    |----------------|
    | `bakery_info_t`| <-- Lock_N per-CPU field
    |    Lock_N      |     for CPU0
    ------------------
    |    XXXXX       |
    | Padding to     |
    | next Cache WB  | <--- Calculate PERCPU_BAKERY_LOCK_SIZE, allocate
    |  Granule       |       continuous memory for remaining CPUs.
    ------------------
    | `bakery_info_t`| <-- Lock_0 per-CPU field
    |    Lock_0      |     for CPU1
    |----------------|
    | `bakery_info_t`| <-- Lock_1 per-CPU field
    |    Lock_1      |     for CPU1
    |----------------|
    | ....           |
    |----------------|
    | `bakery_info_t`| <-- Lock_N per-CPU field
    |    Lock_N      |     for CPU1
    ------------------
    |    XXXXX       |
    | Padding to     |
    | next Cache WB  |
    |  Granule       |
    ------------------
				2084
Consider a system of 2 CPUs with 'N' bakery locks as shown above. For an
operation on Lock\_N, the corresponding ``bakery_info_t`` in both the CPU0 and
CPU1 ``bakery_lock`` sections needs to be fetched, and appropriate cache
operations need to be performed for each access.
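
The address arithmetic implied by this layout can be sketched as follows. This
is an illustration under stated assumptions, not the Trusted Firmware
implementation: the 64-byte ``GRANULE`` value, the helper names and the use of
plain byte offsets from the section start are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache writeback granule in bytes; the real value is platform defined. */
#define GRANULE		64U

typedef struct bakery_info {
	volatile uint16_t lock_data;
} bakery_info_t;

/* Round the size of one CPU's lock data up to a whole number of granules. */
static inline size_t percpu_size(size_t num_locks)
{
	size_t raw = num_locks * sizeof(bakery_info_t);

	return (raw + GRANULE - 1U) & ~((size_t)GRANULE - 1U);
}

/*
 * Byte offset, from the start of the section, of the bakery_info_t that
 * belongs to CPU 'cpu' for lock 'lock'.
 */
static inline size_t bakery_info_offset(size_t num_locks, unsigned int cpu,
					unsigned int lock)
{
	return (cpu * percpu_size(num_locks)) +
	       (lock * sizeof(bakery_info_t));
}
```

Padding each CPU's block to a granule boundary means that no two CPUs' fields
share a cache line, which is what makes per-CPU cache maintenance safe.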
				2089
On ARM platforms, bakery locks are used in PSCI (``psci_locks``) and in the power
controller driver (``arm_lock``).
				2092
				2093	Non Functional Impact of removing coherent memory
				2094	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				2095
				2096	Removal of the coherent memory region leads to the additional software overhead
				2097	of performing cache maintenance for the affected data structures. However, since
				2098	the memory where the data structures are allocated is cacheable, the overhead is
				2099	mostly mitigated by an increase in performance.
				2100
				2101	There is however a performance impact for bakery locks, due to:
				2102
				2103	- Additional cache maintenance operations, and
				2104	- Multiple cache line reads for each lock operation, since the bakery locks
				2105	for each CPU are distributed across different cache lines.
				2106
The implementation has been optimized to minimize this additional overhead.
Measurements indicate that when bakery locks are allocated in Normal memory, the
minimum latency of acquiring a lock is on average 3-4 microseconds, whereas in
Device memory it is 2 microseconds. The measurements were done on the Juno ARM
development platform.
				2112
As mentioned earlier, almost a page of memory can be saved by disabling
``USE_COHERENT_MEM``. Each platform needs to consider these trade-offs to decide
whether coherent memory should be used. If a platform disables
``USE_COHERENT_MEM`` and needs to use bakery locks in the porting layer, it can
optionally define the macro ``PLAT_PERCPU_BAKERY_LOCK_SIZE`` (see the
`Porting Guide`_). Refer to the reference platform code for examples.
				2119
				2120	Isolating code and read-only data on separate memory pages
				2121	----------------------------------------------------------
				2122
				2123	In the ARMv8 VMSA, translation table entries include fields that define the
				2124	properties of the target memory region, such as its access permissions. The
				2125	smallest unit of memory that can be addressed by a translation table entry is
				2126	a memory page. Therefore, if software needs to set different permissions on two
				2127	memory regions then it needs to map them using different memory pages.
				2128
				2129	The default memory layout for each BL image is as follows:
				2130
				2131	::
				2132
    |        ...        |
    +-------------------+
    |  Read-write data  |
    +-------------------+ Page boundary
    |     <Padding>     |
    +-------------------+
    | Exception vectors |
    +-------------------+ 2 KB boundary
    |     <Padding>     |
    +-------------------+
    |  Read-only data   |
    +-------------------+
    |       Code        |
    +-------------------+ BLx_BASE
				2147
				2148	Note: The 2KB alignment for the exception vectors is an architectural
				2149	requirement.
				2150
				2151	The read-write data start on a new memory page so that they can be mapped with
				2152	read-write permissions, whereas the code and read-only data below are configured
				2153	as read-only.
				2154
However, the read-only data are not aligned on a page boundary. They are
contiguous to the code. Therefore, the end of the code section and the beginning
of the read-only data might share a memory page. This forces both to be
mapped with the same memory attributes. As the code needs to be executable, this
means that the read-only data stored on the same memory page as the code are
executable as well. This could potentially be exploited as part of a security
attack.
				2162
				2163	TF provides the build flag ``SEPARATE_CODE_AND_RODATA`` to isolate the code and
				2164	read-only data on separate memory pages. This in turn allows independent control
				2165	of the access permissions for the code and read-only data. In this case,
				2166	platform code gets a finer-grained view of the image layout and can
				2167	appropriately map the code region as executable and the read-only data as
				2168	execute-never.
				2169
This has an impact on memory footprint, as padding bytes need to be introduced
between the code and read-only data to ensure the segregation of the two. To
limit the memory cost, this flag also changes the memory layout such that the
code and exception vectors are now contiguous, like so:
				2174
				2175	::
				2176
    |        ...        |
    +-------------------+
    |  Read-write data  |
    +-------------------+ Page boundary
    |     <Padding>     |
    +-------------------+
    |  Read-only data   |
    +-------------------+ Page boundary
    |     <Padding>     |
    +-------------------+
    | Exception vectors |
    +-------------------+ 2 KB boundary
    |     <Padding>     |
    +-------------------+
    |       Code        |
    +-------------------+ BLx_BASE
				2193
				2194	With this more condensed memory layout, the separation of read-only data will
				2195	add zero or one page to the memory footprint of each BL image. Each platform
				2196	should consider the trade-off between memory footprint and security.
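
The "zero or one page" cost can be checked with simple alignment arithmetic.
The sketch below is a simplified model, not the linker script logic: it assumes
a 4KB page, folds the exception vectors and their 2KB alignment padding into
the ``code`` size, and uses hypothetical helper names and example sizes.

```c
#include <stddef.h>

#define PAGE_SIZE_SKETCH	4096U	/* assumed translation granule */

/* Round 'size' up to the next page boundary. */
static inline size_t page_align_up(size_t size)
{
	return (size + PAGE_SIZE_SKETCH - 1U) &
	       ~((size_t)PAGE_SIZE_SKETCH - 1U);
}

/* Read-only footprint when code and read-only data share pages. */
static inline size_t footprint_shared(size_t code, size_t rodata)
{
	return page_align_up(code + rodata);
}

/* Read-only footprint when read-only data starts on its own page. */
static inline size_t footprint_separate(size_t code, size_t rodata)
{
	return page_align_up(page_align_up(code) + rodata);
}
```

For example, with 0x1800 bytes of code and 0x600 bytes of read-only data, the
shared layout occupies two pages and the separated layout occupies three. With
0x1000 bytes of code and 0x800 bytes of read-only data, both occupy two pages.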
				2197
				2198	This build flag is disabled by default, minimising memory footprint. On ARM
				2199	platforms, it is enabled.
				2200
				2201	Performance Measurement Framework
				2202	---------------------------------
				2203
				2204	The Performance Measurement Framework (PMF) facilitates collection of
				2205	timestamps by registered services and provides interfaces to retrieve
				2206	them from within the ARM Trusted Firmware. A platform can choose to
				2207	expose appropriate SMCs to retrieve these collected timestamps.
				2208
By default, the global physical counter is used for the timestamp
value and is read via ``CNTPCT_EL0``. The framework allows retrieval of
timestamps captured by other CPUs.
				2212
				2213	Timestamp identifier format
				2214	~~~~~~~~~~~~~~~~~~~~~~~~~~~
				2215
				2216	A PMF timestamp is uniquely identified across the system via the
				2217	timestamp ID or ``tid``. The ``tid`` is composed as follows:
				2218
				2219	::
				2220
				2221	Bits 0-7: The local timestamp identifier.
				2222	Bits 8-9: Reserved.
				2223	Bits 10-15: The service identifier.
				2224	Bits 16-31: Reserved.
				2225
				2226	#. The service identifier. Each PMF service is identified by a
				2227	service name and a service identifier. Both the service name and
				2228	identifier are unique within the system as a whole.
				2229
				2230	#. The local timestamp identifier. This identifier is unique within a given
				2231	service.
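
The ``tid`` bit layout above can be expressed with a pair of shifts and masks.
This is a minimal sketch derived only from the bit positions listed above; the
macro and function names are hypothetical and may differ from those in the PMF
headers.

```c
#include <stdint.h>

/* Field positions taken from the bit layout above; names are hypothetical. */
#define TID_LOCAL_SHIFT	0U
#define TID_LOCAL_MASK	0xFFU	/* Bits 0-7 */
#define TID_SVC_SHIFT	10U
#define TID_SVC_MASK	0x3FU	/* Bits 10-15 */

/* Build a tid from a service identifier and a local timestamp identifier. */
static inline uint32_t make_tid(uint32_t svc_id, uint32_t local_id)
{
	return ((svc_id & TID_SVC_MASK) << TID_SVC_SHIFT) |
	       ((local_id & TID_LOCAL_MASK) << TID_LOCAL_SHIFT);
}

/* Extract the service identifier (Bits 10-15). */
static inline uint32_t tid_service(uint32_t tid)
{
	return (tid >> TID_SVC_SHIFT) & TID_SVC_MASK;
}

/* Extract the local timestamp identifier (Bits 0-7). */
static inline uint32_t tid_local(uint32_t tid)
{
	return (tid >> TID_LOCAL_SHIFT) & TID_LOCAL_MASK;
}
```

For instance, service identifier 1 with local timestamp identifier 3 yields a
``tid`` of ``0x403``.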
				2232
				2233	Registering a PMF service
				2234	~~~~~~~~~~~~~~~~~~~~~~~~~
				2235
				2236	To register a PMF service, the ``PMF_REGISTER_SERVICE()`` macro from ``pmf.h``
				2237	is used. The arguments required are the service name, the service ID,
				2238	the total number of local timestamps to be captured and a set of flags.
				2239
				2240	The ``flags`` field can be specified as a bitwise-OR of the following values:
				2241
				2242	::
				2243
				2244	PMF_STORE_ENABLE: The timestamp is stored in memory for later retrieval.
				2245	PMF_DUMP_ENABLE: The timestamp is dumped on the serial console.
				2246
The ``PMF_REGISTER_SERVICE()`` macro reserves memory to store captured
timestamps in a PMF specific linker section at build time.
Additionally, it defines necessary functions to capture and
retrieve a particular timestamp for the given service at runtime.
				2251
				2252	The macro ``PMF_REGISTER_SERVICE()`` only enables capturing PMF
				2253	timestamps from within ARM Trusted Firmware. In order to retrieve
				2254	timestamps from outside of ARM Trusted Firmware, the
				2255	``PMF_REGISTER_SERVICE_SMC()`` macro must be used instead. This macro
				2256	accepts the same set of arguments as the ``PMF_REGISTER_SERVICE()``
				2257	macro but additionally supports retrieving timestamps using SMCs.
				2258
				2259	Capturing a timestamp
				2260	~~~~~~~~~~~~~~~~~~~~~
				2261
				2262	PMF timestamps are stored in a per-service timestamp region. On a
				2263	system with multiple CPUs, each timestamp is captured and stored
				2264	in a per-CPU cache line aligned memory region.
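
The idea of a per-CPU, cache line aligned timestamp region can be illustrated
as follows. This is a sketch of the general technique only: the layout, the
64-byte ``CACHE_LINE`` value, the CPU and timestamp counts, and all names are
hypothetical and do not reflect the actual PMF storage layout.

```c
#include <stdalign.h>
#include <stdint.h>

#define NUM_CPUS	4U	/* assumed CPU count */
#define NUM_TIMESTAMPS	3U	/* assumed timestamps per service */
#define CACHE_LINE	64U	/* assumed cache line size in bytes */

/* One CPU's timestamps, aligned so that no two CPUs share a cache line. */
typedef struct percpu_timestamps {
	alignas(CACHE_LINE) uint64_t ts[NUM_TIMESTAMPS];
} percpu_timestamps_t;

static percpu_timestamps_t svc_timestamps[NUM_CPUS];

/* Store a timestamp in the slot owned by 'cpu'. */
static inline void store_timestamp(unsigned int cpu, unsigned int local_id,
				   uint64_t value)
{
	svc_timestamps[cpu].ts[local_id] = value;
}

/* Read back the timestamp captured by 'cpu'. */
static inline uint64_t load_timestamp(unsigned int cpu, unsigned int local_id)
{
	return svc_timestamps[cpu].ts[local_id];
}
```

Keeping each CPU's slots on separate cache lines means that cache maintenance
performed by one CPU cannot discard a timestamp written by another.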
				2265
				2266	Having registered the service, the ``PMF_CAPTURE_TIMESTAMP()`` macro can be
				2267	used to capture a timestamp at the location where it is used. The macro
				2268	takes the service name, a local timestamp identifier and a flag as arguments.
				2269
The ``flags`` argument can be zero, or ``PMF_CACHE_MAINT`` which
instructs PMF to do cache maintenance following the capture. Cache
maintenance is required if any of the service's timestamps are captured
with data cache disabled.
				2274
To capture a timestamp in assembly code, the caller should use the
``pmf_calc_timestamp_addr`` macro (defined in ``pmf_asm_macros.S``) to
calculate the address where the timestamp will be stored. The
caller should then read the ``CNTPCT_EL0`` register to obtain the timestamp
and store it at the determined address for later retrieval.
				2280
				2281	Retrieving a timestamp
				2282	~~~~~~~~~~~~~~~~~~~~~~
				2283
From within ARM Trusted Firmware, timestamps for individual CPUs can
be retrieved using either the ``PMF_GET_TIMESTAMP_BY_MPIDR()`` or the
``PMF_GET_TIMESTAMP_BY_INDEX()`` macro. These macros accept the CPU's MPIDR
value, or its ordinal position, respectively.
				2288
				2289	From outside ARM Trusted Firmware, timestamps for individual CPUs can be
				2290	retrieved by calling into ``pmf_smc_handler()``.
				2291
				2292	.. code:: c
				2293
    Interface : pmf_smc_handler()
    Argument  : unsigned int smc_fid, u_register_t x1,
                u_register_t x2, u_register_t x3,
                u_register_t x4, void *cookie,
                void *handle, u_register_t flags
    Return    : uintptr_t

    smc_fid: Holds the SMC identifier which is either `PMF_SMC_GET_TIMESTAMP_32`
        when the caller of the SMC is running in AArch32 mode
        or `PMF_SMC_GET_TIMESTAMP_64` when the caller is running in AArch64 mode.
    x1: Timestamp identifier.
    x2: The `mpidr` of the CPU for which the timestamp has to be retrieved.
        This can be the `mpidr` of a different core to the one initiating
        the SMC. In that case, service specific cache maintenance may be
        required to ensure the updated copy of the timestamp is returned.
    x3: A flags value that is either 0 or `PMF_CACHE_MAINT`. If
        `PMF_CACHE_MAINT` is passed, then the PMF code will perform a
        cache invalidate before reading the timestamp. This ensures
        an updated copy is returned.
				2313
				2314	The remaining arguments, ``x4``, ``cookie``, ``handle`` and ``flags`` are unused
				2315	in this implementation.
				2316
				2317	PMF code structure
				2318	~~~~~~~~~~~~~~~~~~
				2319
				2320	#. ``pmf_main.c`` consists of core functions that implement service registration,
				2321	initialization, storing, dumping and retrieving timestamps.
				2322
				2323	#. ``pmf_smc.c`` contains the SMC handling for registered PMF services.
				2324
				2325	#. ``pmf.h`` contains the public interface to Performance Measurement Framework.
				2326
				2327	#. ``pmf_asm_macros.S`` consists of macros to facilitate capturing timestamps in
				2328	assembly code.
				2329
				2330	#. ``pmf_helpers.h`` is an internal header used by ``pmf.h``.
				2331
ARMv8 Architecture Extensions
-----------------------------
				2334
				2335	ARM Trusted Firmware makes use of ARMv8 Architecture Extensions where
				2336	applicable. This section lists the usage of Architecture Extensions, and build
				2337	flags controlling them.
				2338
In general, and unless individually mentioned, the build options
``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` select the Architecture Extension to
target when building ARM Trusted Firmware. Subsequent ARM Architecture
Extensions are backward compatible with previous versions.
				2343
				2344	The build system only requires that ``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` have a
				2345	valid numeric value. These build options only control whether or not
				2346	Architecture Extension-specific code is included in the build. Otherwise, ARM
				2347	Trusted Firmware targets the base ARMv8.0 architecture; i.e. as if
				2348	``ARM_ARCH_MAJOR`` == 8 and ``ARM_ARCH_MINOR`` == 0, which are also their respective
				2349	default values.
				2350
				2351	See also the Summary of build options in `User Guide`_.
				2352
				2353	For details on the Architecture Extension and available features, please refer
				2354	to the respective Architecture Extension Supplement.
				2355
				2356	ARMv8.1
				2357	~~~~~~~
				2358
This Architecture Extension is targeted when ``ARM_ARCH_MAJOR`` > 8, or when
``ARM_ARCH_MAJOR`` == 8 and ``ARM_ARCH_MINOR`` >= 1.
				2361
				2362	- The Compare and Swap instruction is used to implement spinlocks. Otherwise,
				2363	the load-/store-exclusive instruction pair is used.
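
A version test of this kind can be expressed as a preprocessor check. The
sketch below is illustrative only: ``ARCH_AT_LEAST`` and ``spinlock_flavour()``
are hypothetical names, and ``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` are
defined locally here only so that the example compiles on its own.

```c
/* Normally supplied by the build system; fixed here so the sketch compiles. */
#define ARM_ARCH_MAJOR	8
#define ARM_ARCH_MINOR	1

/* Hypothetical helper: true when the target is at least v<maj>.<min>. */
#define ARCH_AT_LEAST(maj, min) \
	((ARM_ARCH_MAJOR > (maj)) || \
	 ((ARM_ARCH_MAJOR == (maj)) && (ARM_ARCH_MINOR >= (min))))

/* Report which spinlock implementation such a build would select. */
static inline const char *spinlock_flavour(void)
{
#if ARCH_AT_LEAST(8, 1)
	return "compare-and-swap";
#else
	return "load/store-exclusive";
#endif
}
```

With the default values of 8 and 0, the same check would select the
load-/store-exclusive path.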
				2364
				2365	Code Structure
				2366	--------------
				2367
				2368	Trusted Firmware code is logically divided between the three boot loader
				2369	stages mentioned in the previous sections. The code is also divided into the
				2370	following categories (present as directories in the source code):
				2371
- Platform specific. Choice of architecture specific code depends upon
  the platform.
- Common code. This is platform and architecture agnostic code.
- Library code. This code comprises functionality commonly used by all
  other code. The PSCI implementation and other EL3 runtime frameworks reside
  as Library components.
- Stage specific. Code specific to a boot stage.
- Drivers.
- Services. EL3 runtime services (e.g. SPD). Specific SPD services
  reside in the ``services/spd`` directory (e.g. ``services/spd/tspd``).
				2382
				2383	Each boot loader stage uses code from one or more of the above mentioned
				2384	categories. Based upon the above, the code layout looks like this:
				2385
				2386	::
				2387
    Directory    Used by BL1?    Used by BL2?    Used by BL31?
    bl1          Yes             No              No
    bl2          No              Yes             No
    bl31         No              No              Yes
    plat         Yes             Yes             Yes
    drivers      Yes             No              Yes
    common       Yes             Yes             Yes
    lib          Yes             Yes             Yes
    services     No              No              Yes
				2397
The build system provides a non-configurable build option ``IMAGE_BLx`` for each
boot loader stage (where x = BL stage). For example, for BL1, ``IMAGE_BL1`` will
be defined by the build system. This enables the Trusted Firmware to compile
certain code only for specific boot loader stages.
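
Stage-specific compilation of this kind typically looks like the sketch below.
The function ``stage_name()`` is hypothetical, and ``IMAGE_BL1`` is defined
locally only so that the example compiles on its own; in a real build the
definition comes from the build system.

```c
/* Normally defined by the build system for the stage being built. */
#define IMAGE_BL1

/* Return the name of the boot loader stage this file was compiled for. */
static inline const char *stage_name(void)
{
#if defined(IMAGE_BL1)
	return "BL1";
#elif defined(IMAGE_BL2)
	return "BL2";
#elif defined(IMAGE_BL31)
	return "BL31";
#else
	return "unknown";
#endif
}
```

The same source file can therefore be shared between stages while only the
branch matching the stage being built is compiled in.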
				2402
				2403	All assembler files have the ``.S`` extension. The linker source files for each
				2404	boot stage have the extension ``.ld.S``. These are processed by GCC to create the
				2405	linker scripts which have the extension ``.ld``.
				2406
				2407	FDTs provide a description of the hardware platform and are used by the Linux
				2408	kernel at boot time. These can be found in the ``fdts`` directory.
				2409
				2410	References
				2411	----------
				2412
				2413	#. Trusted Board Boot Requirements CLIENT PDD (ARM DEN 0006B-5). Available
				2414	under NDA through your ARM account representative.
				2415
				2416	#. `Power State Coordination Interface PDD`_
				2417
				2418	#. `SMC Calling Convention PDD`_
				2419
				2420	#. `ARM Trusted Firmware Interrupt Management Design guide`_.
				2421
				2422	--------------
				2423
				2424	Copyright (c) 2013-2016, ARM Limited and Contributors. All rights reserved.
				2425
				2426	.. _Reset Design: ./reset-design.rst
				2427	.. _Porting Guide: ./porting-guide.rst
				2428	.. _Firmware Update: ./firmware-update.rst
				2429	.. _PSCI PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf
				2430	.. _SMC calling convention PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
				2431	.. _PSCI Library integration guide: ./psci-lib-integration-guide.rst
				2432	.. _SMCCC: http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
				2433	.. _PSCI: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf
				2434	.. _Power State Coordination Interface PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf
				2435	.. _here: ./psci-lib-integration-guide.rst
				2436	.. _cpu-specific-build-macros.rst: ./cpu-specific-build-macros.rst
				2437	.. _CPUBM: ./cpu-specific-build-macros.rst
				2438	.. _ARM ARM: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0487a.e/index.html
				2439	.. _User Guide: ./user-guide.rst
				2440	.. _SMC Calling Convention PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
				2441	.. _ARM Trusted Firmware Interrupt Management Design guide: ./interrupt-framework-design.rst
				2442
.. |Image 1| image:: diagrams/rt-svc-descs-layout.png?raw=true