+Abstracting a Chain of Trust
+.. contents::
+The aim of this document is to describe the authentication framework
+implemented in Trusted Firmware-A (TF-A). This framework fulfills the
+following requirements:
+#. It should be possible for a platform port to specify the Chain of Trust in
+   terms of certificate hierarchy and the mechanisms used to verify a
+   particular image/certificate.
+#. The framework should distinguish between:
+   -  The mechanism used to encode and transport information, e.g. DER encoded
+      X.509v3 certificates to ferry Subject Public Keys, hashes and non-volatile
+      counters.
+   -  The mechanism used to verify the transported information i.e. the
+      cryptographic libraries.
+The framework has been designed following a modular approach illustrated in the
+next diagram:
+        +---------------+---------------+------------+
+        | Trusted       | Trusted       | Trusted    |
+        | Firmware      | Firmware      | Firmware   |
+        | Generic       | IO Framework  | Platform   |
+        | Code i.e.     | (IO)          | Port       |
+        | BL1/BL2 (GEN) |               | (PP)       |
+        +---------------+---------------+------------+
+               ^               ^               ^
+               |               |               |
+               v               v               v
+         +-----------+   +-----------+   +-----------+
+         |           |   |           |   | Image     |
+         | Crypto    |   | Auth      |   | Parser    |
+         | Module    |<->| Module    |<->| Module    |
+         | (CM)      |   | (AM)      |   | (IPM)     |
+         |           |   |           |   |           |
+         +-----------+   +-----------+   +-----------+
+               ^                               ^
+               |                               |
+               v                               v
+        +----------------+             +-----------------+
+        | Cryptographic  |             | Image Parser    |
+        | Libraries (CL) |             | Libraries (IPL) |
+        +----------------+             +-----------------+
+                      |                |
+                      |                |
+                      |                |
+                      v                v
+                     +-----------------+
+                     | Misc. Libs e.g. |
+                     | ASN.1 decoder   |
+                     |                 |
+                     +-----------------+
+        DIAGRAM 1.
+This document describes the inner details of the authentication framework and
+the abstraction mechanisms available to specify a Chain of Trust.
+Framework design
+This section describes some aspects of the framework design and the rationale
+behind them. These aspects are key to verify a Chain of Trust.
+Chain of Trust
+A CoT is basically a sequence of authentication images which usually starts with
+a root of trust and culminates in a single data image. The following diagram
+illustrates how this maps to a CoT for the BL31 image described in the
+`TBBR-Client specification`_.
+        +------------------+       +-------------------+
+        | ROTPK/ROTPK Hash |------>| Trusted Key       |
+        +------------------+       | Certificate       |
+                                   | (Auth Image)      |
+                                  /+-------------------+
+                                 /            |
+                                /             |
+                               /              |
+                              /               |
+                             L                v
+        +------------------+       +-------------------+
+        | Trusted World    |------>| BL31 Key          |
+        | Public Key       |       | Certificate       |
+        +------------------+       | (Auth Image)      |
+                                   +-------------------+
+                                  /           |
+                                 /            |
+                                /             |
+                               /              |
+                              /               v
+        +------------------+ L     +-------------------+
+        | BL31 Content     |------>| BL31 Content      |
+        | Certificate PK   |       | Certificate       |
+        +------------------+       | (Auth Image)      |
+                                   +-------------------+
+                                  /           |
+                                 /            |
+                                /             |
+                               /              |
+                              /               v
+        +------------------+ L     +-------------------+
+        | BL31 Hash        |------>| BL31 Image        |
+        |                  |       | (Data Image)      |
+        +------------------+       |                   |
+                                   +-------------------+
+        DIAGRAM 2.
+The root of trust is usually a public key (ROTPK) that has been burnt in the
+platform and cannot be modified.
+Image types
+Images in a CoT are categorised as authentication and data images. An
+authentication image contains information to authenticate a data image or
+another authentication image. A data image is usually a boot loader binary, but
+it could be any other data that requires authentication.
+Component responsibilities
+For every image in a Chain of Trust, the following high level operations are
+performed to verify it:
+#. Allocate memory for the image either statically or at runtime.
+#. Identify the image and load it in the allocated memory.
+#. Check the integrity of the image as per its type.
+#. Authenticate the image as per the cryptographic algorithms used.
+#. If the image is an authentication image, extract the information that will
+   be used to authenticate the next image in the CoT.
+In Diagram 1, each component is responsible for one or more of these operations.
+The responsibilities are briefly described below.
+TF-A Generic code and IO framework (GEN/IO)
+These components are responsible for initiating the authentication process for a
+particular image in BL1 or BL2. For each BL image that requires authentication,
+the Generic code recursively asks the Authentication module for the parent
+image until either an authenticated image or the ROT is reached. Then the
+Generic code calls the IO framework to load the image and calls the
+Authentication module to authenticate it, following the CoT from ROT to Image.
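+The recursion described above can be sketched as follows. This is a minimal
+illustration, not the TF-A implementation: ``demo_img_t``, ``demo_io_load()``,
+``demo_auth_verify()`` and ``demo_load_auth_image()`` are hypothetical names
+standing in for the image descriptor, the IO framework and the Authentication
+module.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified image descriptor used only in this sketch. */
typedef struct demo_img {
    const char *name;
    struct demo_img *parent; /* NULL: authenticated against the ROT */
    bool verified;           /* set once the image has been authenticated */
    int load_order;          /* records the order in which images are loaded */
} demo_img_t;

static int demo_load_counter;

/* Stand-in for the IO framework: only records the load order. */
static int demo_io_load(demo_img_t *img)
{
    img->load_order = ++demo_load_counter;
    return 0;
}

/* Stand-in for the Authentication module: a real one checks hashes or
 * signatures using parameters extracted from the parent image. */
static int demo_auth_verify(demo_img_t *img)
{
    if (img->parent != NULL && !img->parent->verified)
        return -1;
    img->verified = true;
    return 0;
}

/* Walk up to the ROT (or to an already verified image), then load and
 * authenticate each image on the way back down. */
int demo_load_auth_image(demo_img_t *img)
{
    int rc;

    if (img->verified)
        return 0;

    if (img->parent != NULL) {
        rc = demo_load_auth_image(img->parent);
        if (rc != 0)
            return rc;
    }

    rc = demo_io_load(img);
    if (rc != 0)
        return rc;

    return demo_auth_verify(img);
}
```

+Because the recursion stops at the first verified ancestor, a certificate
+shared by several chains is loaded and authenticated only once in this sketch.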
+TF-A Platform Port (PP)
+The platform is responsible for:
+#. Specifying the CoT for each image that needs to be authenticated. Details of
+   how a CoT can be specified by the platform are explained later. The platform
+   also specifies the authentication methods and the parsing method used for
+   each image.
+#. Statically allocating memory for each parameter in each image which is
+   used for verifying the CoT, e.g. memory for public keys, hashes etc.
+#. Providing the ROTPK or a hash of it.
+#. Providing additional information to the IPM to enable it to identify and
+   extract authentication parameters contained in an image, e.g. if the
+   parameters are stored as X509v3 extensions, the corresponding OID must be
+   provided.
+#. Fulfilling any other memory requirements of the IPM and the CM (not currently
+   described in this document).
+#. Exporting functions to verify an image which uses an authentication method
+   that cannot be interpreted by the CM, e.g. if an image has to be verified
+   using an NV counter, then the value of the counter to compare with can only
+   be provided by the platform.
+#. Exporting a custom IPM if a proprietary image format is being used (described
+   later).
+Authentication Module (AM)
+It is responsible for:
+#. Providing the necessary abstraction mechanisms to describe a CoT. Amongst
+   other things, the authentication and image parsing methods must be specified
+   by the PP in the CoT.
+#. Verifying the CoT passed by GEN by utilising functionality exported by the
+   PP, IPM and CM.
+#. Tracking which images have been verified. In case an image is a part of
+   multiple CoTs then it should be verified only once e.g. the Trusted World
+   Key Certificate in the TBBR-Client spec. contains information to verify
+   SCP_BL2, BL31 and BL32, each of which has a separate CoT. (This
+   responsibility has not been described in this document but should be
+   trivial to implement).
+#. Reusing memory meant for a data image to verify authentication images e.g.
+   in the CoT described in Diagram 2, each certificate can be loaded and
+   verified in the memory reserved by the platform for the BL31 image. By the
+   time BL31 (the data image) is loaded, all information to authenticate it
+   will have been extracted from the parent image i.e. BL31 content
+   certificate. It is assumed that the size of an authentication image will
+   never exceed the size of a data image. It should be possible to verify this
+   at build time using asserts.
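+The build-time check mentioned in the last item could look like the sketch
+below. The size macros and ``demo_buffer_for_image()`` are assumptions made for
+illustration; a real platform would use its own memory map definitions.

```c
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration. */
#define DEMO_BL31_MAX_SIZE 0x20000u
#define DEMO_CERT_MAX_SIZE 0x1000u

/* Fails the build if a certificate could overflow the data image region. */
_Static_assert(DEMO_CERT_MAX_SIZE <= DEMO_BL31_MAX_SIZE,
               "authentication image larger than data image region");

/* Single region reserved for the data image and reused for its certificates. */
static unsigned char demo_bl31_region[DEMO_BL31_MAX_SIZE];

/* Return the shared region if the image fits, NULL otherwise. */
void *demo_buffer_for_image(size_t img_size)
{
    return (img_size <= sizeof(demo_bl31_region)) ? demo_bl31_region : NULL;
}
```

+Every certificate and the data image receive the same address, which is safe
+only because the parameters needed later are copied out before the next load.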
+Cryptographic Module (CM)
+The CM is responsible for providing an API to:
+#. Verify a digital signature.
+#. Verify a hash.
+The CM does not include any cryptography related code, but it relies on an
+external library to perform the cryptographic operations. A Crypto-Library (CL)
+linking the CM and the external library must be implemented. The following
+functions must be provided by the CL:
+.. code:: c
+    void (*init)(void);
+    int (*verify_signature)(void *data_ptr, unsigned int data_len,
+                            void *sig_ptr, unsigned int sig_len,
+                            void *sig_alg, unsigned int sig_alg_len,
+                            void *pk_ptr, unsigned int pk_len);
+    int (*verify_hash)(void *data_ptr, unsigned int data_len,
+                       void *digest_info_ptr, unsigned int digest_info_len);
+These functions are registered in the CM using the macro:
+.. code:: c
+    REGISTER_CRYPTO_LIB(_name, _init, _verify_signature, _verify_hash);
+``_name`` must be a string containing the name of the CL. This name is used for
+debugging purposes.
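+As a rough sketch of how such a registration could work, the example below
+collects the three functions into a descriptor. The descriptor type, the
+``DEMO_REGISTER_CRYPTO_LIB`` macro and the ``demo_*`` functions are
+hypothetical and are not the framework's definitions. The "hash" check is a
+plain byte comparison used as a placeholder, not cryptography.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor; the framework's real type may differ. */
typedef struct demo_crypto_lib_desc {
    const char *name;
    void (*init)(void);
    int (*verify_signature)(void *data_ptr, unsigned int data_len,
                            void *sig_ptr, unsigned int sig_len,
                            void *sig_alg, unsigned int sig_alg_len,
                            void *pk_ptr, unsigned int pk_len);
    int (*verify_hash)(void *data_ptr, unsigned int data_len,
                       void *digest_info_ptr, unsigned int digest_info_len);
} demo_crypto_lib_desc_t;

/* Hypothetical registration macro: defines the single library descriptor. */
#define DEMO_REGISTER_CRYPTO_LIB(_name, _init, _verify_signature, _verify_hash) \
    const demo_crypto_lib_desc_t demo_crypto_lib_desc = {                      \
        .name = _name,                                                         \
        .init = _init,                                                         \
        .verify_signature = _verify_signature,                                 \
        .verify_hash = _verify_hash                                            \
    }

static int demo_cl_initialised;

static void demo_cl_init(void)
{
    demo_cl_initialised = 1;
}

/* Placeholder: rejects every signature. */
static int demo_cl_verify_signature(void *data_ptr, unsigned int data_len,
                                    void *sig_ptr, unsigned int sig_len,
                                    void *sig_alg, unsigned int sig_alg_len,
                                    void *pk_ptr, unsigned int pk_len)
{
    (void)data_ptr; (void)data_len; (void)sig_ptr; (void)sig_len;
    (void)sig_alg; (void)sig_alg_len; (void)pk_ptr; (void)pk_len;
    return -1;
}

/* Placeholder: byte-wise comparison standing in for a real digest check. */
static int demo_cl_verify_hash(void *data_ptr, unsigned int data_len,
                               void *digest_info_ptr,
                               unsigned int digest_info_len)
{
    if (data_len != digest_info_len)
        return -1;
    return (memcmp(data_ptr, digest_info_ptr, data_len) == 0) ? 0 : -1;
}

DEMO_REGISTER_CRYPTO_LIB("demo", demo_cl_init, demo_cl_verify_signature,
                         demo_cl_verify_hash);
```

+The CM side would then call through ``demo_crypto_lib_desc`` without knowing
+which external library sits behind it.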
+Image Parser Module (IPM)
+The IPM is responsible for:
+#. Checking the integrity of each image loaded by the IO framework.
+#. Extracting parameters used for authenticating an image based upon a
+   description provided by the platform in the CoT descriptor.
+Images may have different formats (for example, authentication images could be
+x509v3 certificates, signed ELF files or any other platform specific format).
+The IPM allows an Image Parser Library (IPL) to be registered for every image format
+used in the CoT. This library must implement the specific methods to parse the
+image. The IPM obtains the image format from the CoT and calls the right IPL to
+check the image integrity and extract the authentication parameters.
+See Section "Describing the image parsing methods" for more details about the
+mechanism the IPM provides to define and register IPLs.
+Authentication methods
+The AM supports the following authentication methods:
+#. Hash
+#. Digital signature
+The platform may specify these methods in the CoT in case it decides to define
+a custom CoT instead of reusing a predefined one.
+If a data image uses multiple methods, then all the methods must be a part of
+the same CoT. The number and type of parameters are method specific. These
+parameters should be obtained from the parent image using the IPM.
+#. Hash
+   Parameters:
+   #. A pointer to data to hash
+   #. Length of the data
+   #. A pointer to the hash
+   #. Length of the hash
+   The hash will be represented by the DER encoding of the following ASN.1
+   type:
+   ::
+       DigestInfo ::= SEQUENCE {
+           digestAlgorithm  DigestAlgorithmIdentifier,
+           digest           Digest
+       }
+   This ASN.1 structure makes it possible to remove any assumption about the
+   type of hash algorithm used as this information accompanies the hash. This
+   should allow the Cryptography Library (CL) to support multiple hash
+   algorithm implementations.
+#. Digital Signature
+   Parameters:
+   #. A pointer to data to sign
+   #. Length of the data
+   #. Public Key Algorithm
+   #. Public Key value
+   #. Digital Signature Algorithm
+   #. Digital Signature value
+   The Public Key parameters will be represented by the DER encoding of the
+   following ASN.1 type:
+   ::
+       SubjectPublicKeyInfo  ::=  SEQUENCE  {
+           algorithm         AlgorithmIdentifier{PUBLIC-KEY,{PublicKeyAlgorithms}},
+           subjectPublicKey  BIT STRING  }
+   The Digital Signature Algorithm will be represented by the DER encoding of
+   the following ASN.1 types.
+   ::
+       AlgorithmIdentifier {ALGORITHM:IOSet } ::= SEQUENCE {
+           algorithm         ALGORITHM.&id({IOSet}),
+           parameters        ALGORITHM.&Type({IOSet}{@algorithm}) OPTIONAL
+       }
+   The digital signature will be represented by:
+   ::
+       signature  ::=  BIT STRING
+The authentication framework will use the image descriptor to extract all the
+information related to authentication.
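+To illustrate why the ``DigestInfo`` encoding removes assumptions about the
+hash algorithm, the sketch below splits a DER-encoded ``DigestInfo`` into its
+algorithm OID and digest. It is a simplified parser written for this example:
+it handles short-form DER lengths only (below 128 bytes), and all ``demo_*``
+names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct demo_digest_info {
    const uint8_t *alg_oid;  /* contents of the algorithm OBJECT IDENTIFIER */
    size_t alg_oid_len;
    const uint8_t *digest;   /* contents of the digest OCTET STRING */
    size_t digest_len;
} demo_digest_info_t;

/* Read one tag/length header with a short-form length and check that the
 * contents fit before 'end'. On success *p points at the contents. */
static int demo_read_tl(const uint8_t **p, const uint8_t *end, uint8_t tag,
                        size_t *len)
{
    if ((size_t)(end - *p) < 2u || (*p)[0] != tag || ((*p)[1] & 0x80u) != 0u)
        return -1;
    *len = (*p)[1];
    *p += 2;
    if ((size_t)(end - *p) < *len)
        return -1;
    return 0;
}

int demo_parse_digest_info(const uint8_t *buf, size_t buf_len,
                           demo_digest_info_t *out)
{
    const uint8_t *p = buf;
    const uint8_t *end = buf + buf_len;
    const uint8_t *alg_end;
    size_t len;

    /* DigestInfo ::= SEQUENCE { digestAlgorithm, digest } */
    if (demo_read_tl(&p, end, 0x30u, &len) != 0 || p + len != end)
        return -1;

    /* digestAlgorithm ::= SEQUENCE { OID, optional parameters } */
    if (demo_read_tl(&p, end, 0x30u, &len) != 0)
        return -1;
    alg_end = p + len;
    if (demo_read_tl(&p, alg_end, 0x06u, &len) != 0)
        return -1;
    out->alg_oid = p;
    out->alg_oid_len = len;
    p = alg_end; /* skip any algorithm parameters */

    /* digest ::= OCTET STRING, must end exactly at the end of the buffer */
    if (demo_read_tl(&p, end, 0x04u, &len) != 0 || p + len != end)
        return -1;
    out->digest = p;
    out->digest_len = len;
    return 0;
}
```

+A CL built along these lines could select the hash implementation from
+``alg_oid`` and then compare its result against ``digest``.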
+Specifying a Chain of Trust
+A CoT can be described as a set of image descriptors linked together in a
+particular order. The order dictates the sequence in which they must be
+verified. Each image has a set of properties which allow the AM to verify it.
+These properties are described below.
+The PP is responsible for defining a single or multiple CoTs for a data image.
+Unless otherwise specified, the data structures described in the following
+sections are populated by the PP statically.
+Describing the image parsing methods
+The parsing method refers to the format of a particular image. For example, an
+authentication image that represents a certificate could be in the X.509v3
+format. A data image that represents a boot loader stage could be in raw binary
+or ELF format. The IPM supports three parsing methods. An image has to use one
+of the three methods described below. An IPL is responsible for interpreting a
+single parsing method. There has to be one IPL for every method used by the
+platform.
+#. Raw format: This format is effectively a nop as an image using this method
+   is treated as being in raw binary format e.g. boot loader images used by
+   TF-A. This method should only be used by data images.
+#. X509V3 method: This method uses industry standards like X.509 to represent
+   PKI certificates (authentication images). It is expected that open source
+   libraries will be available which can be used to parse an image represented
+   by this method. Such libraries can be used to write the corresponding IPL
+   e.g. the X.509 parsing library code in mbed TLS.
+#. Platform defined method: This method caters for platform specific
+   proprietary standards to represent authentication or data images. For
+   example, the signature of a data image could be appended to the data image
+   raw binary. A header could be prepended to the combined blob to specify the
+   extents of each component. The platform will have to implement the
+   corresponding IPL to interpret such a format.
+The following enum can be used to define these three methods.
+.. code:: c
+    typedef enum img_type_enum {
+        IMG_RAW,            /* Binary image */
+        IMG_PLAT,           /* Platform specific format */
+        IMG_CERT,           /* X509v3 certificate */
+        IMG_MAX_TYPES,
+    } img_type_t;
+An IPL must provide functions with the following prototypes:
+.. code:: c
+    void init(void);
+    int check_integrity(void *img, unsigned int img_len);
+    int get_auth_param(const auth_param_type_desc_t *type_desc,
+                          void *img, unsigned int img_len,
+                          void **param, unsigned int *param_len);
+An IPL for each type must be registered using the following macro:
+.. code:: c
+    REGISTER_IMG_PARSER_LIB(_type, _name, _init, _check_int, _get_param)
+-  ``_type``: one of the types described above.
+-  ``_name``: a string containing the IPL name for debugging purposes.
+-  ``_init``: initialization function pointer.
+-  ``_check_int``: check image integrity function pointer.
+-  ``_get_param``: extract authentication parameter function pointer.
+The ``init()`` function will be used to initialize the IPL.
+The ``check_integrity()`` function is passed a pointer to the memory where the
+image has been loaded by the IO framework and the image length. It should ensure
+that the image is in the format corresponding to the parsing method and has not
+been tampered with. For example, RFC-2459 describes a validation sequence for an
+X.509 certificate.
+The ``get_auth_param()`` function is passed a parameter descriptor
+(``type_desc``) containing the ``type`` and ``cookie`` needed to identify and
+extract the data corresponding to that parameter from an image. This data will
+be used to verify either the current or the next image in the CoT sequence.
+Each image in the CoT will specify the parsing method it uses. This information
+will be used by the IPM to find the right parser descriptor for the image.
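+The lookup from parsing method to IPL could be organised as a table indexed by
+image type, as in the sketch below. The ``demo_*`` types and functions are
+hypothetical and only the raw format is registered here; this is not the
+framework's registration mechanism.

```c
#include <stddef.h>

typedef enum demo_img_type {
    DEMO_IMG_RAW,
    DEMO_IMG_PLAT,
    DEMO_IMG_CERT,
    DEMO_IMG_MAX_TYPES
} demo_img_type_t;

/* Hypothetical parser descriptor holding one IPL entry point. */
typedef struct demo_ipl {
    const char *name;
    int (*check_integrity)(void *img, unsigned int img_len);
} demo_ipl_t;

/* Raw images have no structure to validate beyond being present. */
static int demo_raw_check_integrity(void *img, unsigned int img_len)
{
    return (img != NULL && img_len != 0u) ? 0 : -1;
}

static const demo_ipl_t demo_raw_ipl = { "raw", demo_raw_check_integrity };

/* One slot per parsing method; NULL means no IPL has been registered. */
static const demo_ipl_t *const demo_ipl_table[DEMO_IMG_MAX_TYPES] = {
    [DEMO_IMG_RAW] = &demo_raw_ipl,
};

/* Dispatch to the IPL registered for the image's parsing method. */
int demo_img_check_integrity(demo_img_type_t type, void *img,
                             unsigned int img_len)
{
    if ((unsigned int)type >= (unsigned int)DEMO_IMG_MAX_TYPES ||
        demo_ipl_table[type] == NULL)
        return -1;
    return demo_ipl_table[type]->check_integrity(img, img_len);
}
```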
+Describing the authentication method(s)
+As part of the CoT, each image has to specify one or more authentication methods
+which will be used to verify it. The Section "Authentication methods" describes
+the methods supported by the AM; each one is identified by a value of the
+following type.
+.. code:: c
+    typedef enum {
+    } auth_method_type_t;
+The AM defines the type of each parameter used by an authentication method. It
+uses this information to:
+#. Specify to the ``get_auth_param()`` function exported by the IPM, which
+   parameter should be extracted from an image.
+#. Correctly marshall the parameters while calling the verification function
+   exported by the CM and PP.
+#. Extract authentication parameters from a parent image in order to verify a
+   child image e.g. to verify the certificate image, the public key has to be
+   obtained from the parent image.
+.. code:: c
+    typedef enum {
+        AUTH_PARAM_RAW_DATA,        /* Raw image data */
+        AUTH_PARAM_SIG,         /* The image signature */
+        AUTH_PARAM_SIG_ALG,     /* The image signature algorithm */
+        AUTH_PARAM_HASH,        /* A hash (including the algorithm) */
+        AUTH_PARAM_PUB_KEY,     /* A public key */
+    } auth_param_type_t;
+The AM defines the following structure to identify an authentication parameter
+required to verify an image.
+.. code:: c
+    typedef struct auth_param_type_desc_s {
+        auth_param_type_t type;
+        void *cookie;
+    } auth_param_type_desc_t;
+``cookie`` is used by the platform to specify additional information to the IPM
+which enables it to uniquely identify the parameter that should be extracted
+from an image. For example, the hash of a BL3x image in its corresponding
+content certificate is stored in an X509v3 custom extension field. An extension
+field can only be identified using an OID. In this case, the ``cookie`` could
+contain the pointer to the OID defined by the platform for the hash extension
+field while the ``type`` field could be set to ``AUTH_PARAM_HASH``. A value of 0 for
+the ``cookie`` field means that it is not used.
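+A short sketch of how a platform might fill in these descriptors is shown
+below. The two types are copied from this section; the OID string, the
+descriptor variables and ``demo_param_matches_oid()`` are invented for
+illustration and do not correspond to real platform definitions.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    AUTH_PARAM_RAW_DATA,
    AUTH_PARAM_SIG,
    AUTH_PARAM_SIG_ALG,
    AUTH_PARAM_HASH,
    AUTH_PARAM_PUB_KEY,
} auth_param_type_t;

typedef struct auth_param_type_desc_s {
    auth_param_type_t type;
    void *cookie;
} auth_param_type_desc_t;

/* Hypothetical OID of the extension that carries the firmware hash. */
static char demo_fw_hash_oid[] = "1.2.3.4.5";

/* The hash lives in a certificate extension, so the cookie names its OID. */
static auth_param_type_desc_t demo_fw_hash = {
    .type = AUTH_PARAM_HASH,
    .cookie = demo_fw_hash_oid,
};

/* Raw image data needs no extra information: the cookie is unused (0). */
static auth_param_type_desc_t demo_raw_data = {
    .type = AUTH_PARAM_RAW_DATA,
    .cookie = NULL,
};

/* How an IPL might decide whether an extension holds the wanted parameter. */
int demo_param_matches_oid(const auth_param_type_desc_t *desc,
                           const char *ext_oid)
{
    if (desc->cookie == NULL)
        return 0;
    return strcmp((const char *)desc->cookie, ext_oid) == 0;
}
```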
+For each method, the AM defines a structure with the parameters required to
+verify the image.
+.. code:: c
+    /*
+     * Parameters for authentication by hash matching
+     */
+    typedef struct auth_method_param_hash_s {
+        auth_param_type_desc_t *data;   /* Data to hash */
+        auth_param_type_desc_t *hash;   /* Hash to match with */
+    } auth_method_param_hash_t;
+    /*
+     * Parameters for authentication by signature
+     */
+    typedef struct auth_method_param_sig_s {
+        auth_param_type_desc_t *pk; /* Public key */
+        auth_param_type_desc_t *sig;    /* Signature to check */
+        auth_param_type_desc_t *alg;    /* Signature algorithm */
+        auth_param_type_desc_t *data;   /* Data signed */
+    } auth_method_param_sig_t;
+The AM defines the following structure to describe an authentication method for
+verifying an image.
+.. code:: c
+    /*
+     * Authentication method descriptor
+     */
+    typedef struct auth_method_desc_s {
+        auth_method_type_t type;
+        union {
+            auth_method_param_hash_t hash;
+            auth_method_param_sig_t sig;
+        } param;
+    } auth_method_desc_t;
+Using the method type specified in the ``type`` field, the AM determines which
+field of the ``param`` union to access.
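+The tagged-union access can be sketched as below. The types are reduced,
+hypothetical versions of the structures in this section (``demo_*`` names),
+and the helper simply returns the descriptor of the data that the selected
+method operates on.

```c
#include <stddef.h>

/* Reduced stand-ins for the descriptors defined in this section. */
typedef struct demo_param_desc {
    int type;
    void *cookie;
} demo_param_desc_t;

typedef enum demo_method_type {
    DEMO_METHOD_HASH,
    DEMO_METHOD_SIG
} demo_method_type_t;

typedef struct demo_hash_params {
    demo_param_desc_t *data;
    demo_param_desc_t *hash;
} demo_hash_params_t;

typedef struct demo_sig_params {
    demo_param_desc_t *pk;
    demo_param_desc_t *sig;
    demo_param_desc_t *alg;
    demo_param_desc_t *data;
} demo_sig_params_t;

typedef struct demo_method_desc {
    demo_method_type_t type;
    union {
        demo_hash_params_t hash;
        demo_sig_params_t sig;
    } param;
} demo_method_desc_t;

/* The 'type' tag decides which union member is valid to read. */
const demo_param_desc_t *demo_method_data(const demo_method_desc_t *m)
{
    switch (m->type) {
    case DEMO_METHOD_HASH:
        return m->param.hash.data;
    case DEMO_METHOD_SIG:
        return m->param.sig.data;
    default:
        return NULL;
    }
}
```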
+Storing Authentication parameters
+A parameter described by ``auth_param_type_desc_t`` to verify an image could be
+obtained from either the image itself or its parent image. The memory allocated
+for loading the parent image will be reused for loading the child image. Hence
+parameters which are obtained from the parent for verifying a child image need
+to have memory allocated for them separately where they can be stored. This
+memory must be statically allocated by the platform port.
+The AM defines the following structure to store the data corresponding to an
+authentication parameter.
+.. code:: c
+    typedef struct auth_param_data_desc_s {
+        void *auth_param_ptr;
+        unsigned int auth_param_len;
+    } auth_param_data_desc_t;
+The ``auth_param_ptr`` field is initialized by the platform. The ``auth_param_len``
+field is used to specify the length of the data in the memory.
+For parameters that can be obtained from the child image itself, the IPM is
+responsible for populating the ``auth_param_ptr`` and ``auth_param_len`` fields
+while executing the ``img_get_auth_param()`` function.
+The AM defines the following structure to enable an image to describe the
+parameters that should be extracted from it and used to verify the next image
+(child) in a CoT.
+.. code:: c
+    typedef struct auth_param_desc_s {
+        auth_param_type_desc_t type_desc;
+        auth_param_data_desc_t data;
+    } auth_param_desc_t;
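+The need for separate storage can be illustrated with the sketch below, which
+copies a parameter out of the parent image into a statically allocated buffer
+before the image memory is reused. ``auth_param_data_desc_t`` is taken from
+this section; the buffer, its size and ``demo_store_param()`` are assumptions
+made for the example.

```c
#include <string.h>

typedef struct auth_param_data_desc_s {
    void *auth_param_ptr;
    unsigned int auth_param_len;
} auth_param_data_desc_t;

/* Hypothetical buffer size, chosen only for illustration. */
#define DEMO_HASH_BUF_LEN 51u

/* Statically allocated by the platform port. */
static unsigned char demo_hash_buf[DEMO_HASH_BUF_LEN];

static auth_param_data_desc_t demo_hash_store = {
    .auth_param_ptr = demo_hash_buf,
    .auth_param_len = DEMO_HASH_BUF_LEN,
};

/* Copy a parameter found in the parent image into its own storage.
 * Fails if the parameter does not fit in the reserved buffer. */
int demo_store_param(const auth_param_data_desc_t *dst, const void *src,
                     unsigned int src_len)
{
    if (src_len > dst->auth_param_len)
        return -1;
    memcpy(dst->auth_param_ptr, src, src_len);
    return 0;
}
```

+Once the copy has been made, the memory holding the parent image can be
+overwritten by the child image without losing the parameter.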
+Describing an image in a CoT
+An image in a CoT is a consolidation of the following aspects of a CoT described
+above:
+#. A unique identifier specified by the platform which allows the IO framework
+   to locate the image in a FIP and load it in the memory reserved for the data
+   image in the CoT.
+#. A parsing method which is used by the AM to find the appropriate IPM.
+#. Authentication methods and their parameters as described in the previous
+   section. These are used to verify the current image.
+#. Parameters which are used to verify the next image in the current CoT. These
+   parameters are specified only by authentication images and can be extracted
+   from the current image once it has been verified.
+The following data structure describes an image in a CoT.
+.. code:: c
+    typedef struct auth_img_desc_s {
+        unsigned int img_id;
+        const struct auth_img_desc_s *parent;
+        img_type_t img_type;
+        const auth_method_desc_t *const img_auth_methods;
+        const auth_param_desc_t *const authenticated_data;
+    } auth_img_desc_t;
+A CoT is defined as an array of pointers to ``auth_img_desc_t`` structures
+linked together by the ``parent`` field. Those nodes with no parent must be
+authenticated using the ROTPK stored in the platform.
+Implementation example
+This section is a detailed guide explaining a trusted boot implementation using
+the authentication framework. This example corresponds to the Applicative
+Functional Mode (AFM) as specified in the TBBR-Client document. It is
+recommended to read this guide along with the source code.
+The CoT can be found in ``drivers/auth/tbbr/tbbr_cot.c``. This CoT consists of
+an array of pointers to image descriptors and it is registered in the framework
+using the macro ``REGISTER_COT(cot_desc)``, where ``cot_desc`` must be the name
+of the array (passing a pointer or any other type of indirection will cause the
+registration process to fail).
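+One plausible reason for this restriction is that a registration macro needs
+the array's size, which ``sizeof`` can only provide when it is given the array
+name itself. The sketch below shows the idea with a hypothetical
+``REG_REGISTER_COT`` macro and reduced descriptor type; it is not the
+definition used by the framework.

```c
#include <stddef.h>

/* Reduced, hypothetical image descriptor. */
typedef struct reg_img_desc {
    unsigned int img_id;
    const struct reg_img_desc *parent;
} reg_img_desc_t;

/* Hypothetical registration macro: records the array and its element count.
 * sizeof(_cot) yields the whole array size only if _cot is an array name. */
#define REG_REGISTER_COT(_cot)                                   \
    const reg_img_desc_t *const *const reg_cot_ptr = (_cot);     \
    const size_t reg_cot_size = sizeof(_cot) / sizeof((_cot)[0])

static const reg_img_desc_t reg_root_cert = { 0u, NULL };
static const reg_img_desc_t reg_data_image = { 1u, &reg_root_cert };

/* The CoT: an array of pointers to image descriptors. */
static const reg_img_desc_t *const reg_cot_desc[] = {
    [0] = &reg_root_cert,
    [1] = &reg_data_image,
};

REG_REGISTER_COT(reg_cot_desc);
```

+With a pointer instead of the array name, ``sizeof`` in this sketch would
+measure the pointer rather than the array, so the recorded size would be wrong.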
+The number of images participating in the boot process depends on the CoT.
+There is, however, a minimum set of images that are mandatory in TF-A and thus
+all CoTs must present:
+-  ``BL2``
+-  ``SCP_BL2`` (platform specific)
+-  ``BL31``
+-  ``BL32`` (optional)
+-  ``BL33``
+The TBBR specifies the additional certificates that must accompany these images
+for a proper authentication. Details about the TBBR CoT may be found in the
+`Trusted Board Boot`_ document.
+Following the `Platform Porting Guide`_, a platform must provide unique
+identifiers for all the images and certificates that will be loaded during the
+boot process. If a platform is using the TBBR as a reference for trusted boot,
+these identifiers can be obtained from ``include/common/tbbr/tbbr_img_def.h``.
+Arm platforms include this file in ``include/plat/arm/common/arm_def.h``. Other
+platforms may also include this file or provide their own identifiers.
+**Important**: the authentication module uses these identifiers to index the
+CoT array, so each descriptor's position in the array must match its identifier.
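+The indexing requirement can be made explicit with designated initialisers,
+and checked with a small helper, as sketched below. The identifiers,
+descriptor type and ``idx_cot_is_consistent()`` are hypothetical and exist
+only for this example.

```c
#include <stddef.h>

/* Hypothetical image identifiers. */
enum {
    IDX_KEY_CERT_ID = 0,
    IDX_CONTENT_CERT_ID = 1,
    IDX_BL31_ID = 2,
    IDX_NUM_IDS
};

/* Reduced, hypothetical image descriptor. */
typedef struct idx_img_desc {
    unsigned int img_id;
} idx_img_desc_t;

static const idx_img_desc_t idx_key_cert = { IDX_KEY_CERT_ID };
static const idx_img_desc_t idx_content_cert = { IDX_CONTENT_CERT_ID };
static const idx_img_desc_t idx_bl31 = { IDX_BL31_ID };

/* Designated initialisers place each descriptor at the slot of its id. */
static const idx_img_desc_t *const idx_cot[IDX_NUM_IDS] = {
    [IDX_KEY_CERT_ID] = &idx_key_cert,
    [IDX_CONTENT_CERT_ID] = &idx_content_cert,
    [IDX_BL31_ID] = &idx_bl31,
};

/* Return 1 if every populated slot holds the descriptor whose id equals the
 * slot index, 0 otherwise. */
int idx_cot_is_consistent(const idx_img_desc_t *const *cot, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (cot[i] != NULL && cot[i]->img_id != i)
            return 0;
    }
    return 1;
}
```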
+Each image descriptor must specify:
+-  ``img_id``: the corresponding image unique identifier defined by the platform.
+-  ``img_type``: the image parser module uses the image type to call the proper
+   parsing library to check the image integrity and extract the required
+   authentication parameters. Three types of images are currently supported:
+   -  ``IMG_RAW``: image is a raw binary. No parsing functions are available,
+      other than reading the whole image.
+   -  ``IMG_PLAT``: image format is platform specific. The platform may use this
+      type for custom images not directly supported by the authentication
+      framework.
+   -  ``IMG_CERT``: image is an x509v3 certificate.
+-  ``parent``: pointer to the parent image descriptor. The parent will contain
+   the information required to authenticate the current image. If the parent
+   is NULL, the authentication parameters will be obtained from the platform
+   (e.g. the BL2 and Trusted Key certificates are signed with the ROT private
+   key, whose public part is stored in the platform).
+-  ``img_auth_methods``: this points to an array which defines the
+   authentication methods that must be checked to consider an image
+   authenticated. Each method consists of a type and a list of parameter
+   descriptors. A parameter descriptor consists of a type and a cookie which
+   will point to specific information required to extract that parameter from
+   the image (e.g. if the parameter is stored in an x509v3 extension, the
+   cookie will point to the extension OID). Depending on the method type, a
+   different number of parameters must be specified. This pointer should not be
+   NULL.
+   Supported methods are:
+   -  ``AUTH_METHOD_HASH``: the hash of the image must match the hash extracted
+      from the parent image. The following parameter descriptors must be
+      specified:
+      -  ``data``: data to be hashed (obtained from current image)
+      -  ``hash``: reference hash (obtained from parent image)
+   -  ``AUTH_METHOD_SIG``: the image (usually a certificate) must be signed with
+      the private key whose public part is extracted from the parent image (or
+      the platform if the parent is NULL). The following parameter descriptors
+      must be specified:
+      -  ``pk``: the public key (obtained from parent image)
+      -  ``sig``: the digital signature (obtained from current image)
+      -  ``alg``: the signature algorithm used (obtained from current image)
+      -  ``data``: the data to be signed (obtained from current image)
+-  ``authenticated_data``: this array pointer indicates what authentication
+   parameters must be extracted from an image once it has been authenticated.
+   Each parameter consists of a parameter descriptor and the buffer
+   address/size to store the parameter. The CoT is responsible for allocating
+   the required memory to store the parameters. This pointer may be NULL.
+In the ``tbbr_cot.c`` file, a set of buffers is allocated to store the parameters
+extracted from the certificates. In the case of the TBBR CoT, these parameters
+are hashes and public keys. In DER format, an RSA-2048 public key requires 294
+bytes, and a SHA-256 hash requires 51 bytes. Depending on the CoT and the authentication
+process, some of the buffers may be reused at different stages during the boot.
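+The two buffer sizes quoted above can be reproduced by adding up the DER
+tag, length and content bytes. The sketch below does this arithmetic under
+stated assumptions: the hash is SHA-256 with NULL algorithm parameters, and
+the RSA key uses the public exponent 65537. The helper names are hypothetical.

```c
#include <stddef.h>

/* Number of bytes DER needs to encode a length value (up to 65535). */
static size_t der_len_octets(size_t len)
{
    if (len < 128u)
        return 1u;          /* short form */
    if (len < 256u)
        return 2u;          /* 0x81, one length byte */
    return 3u;              /* 0x82, two length bytes */
}

/* Total size of one DER element: tag + length field + contents. */
static size_t der_tlv_size(size_t content_len)
{
    return 1u + der_len_octets(content_len) + content_len;
}

/* DigestInfo for SHA-256: SEQUENCE { SEQUENCE { OID, NULL }, OCTET STRING }. */
size_t demo_sha256_digest_info_len(void)
{
    size_t oid = der_tlv_size(9u);          /* SHA-256 OID is 9 bytes */
    size_t null_params = der_tlv_size(0u);
    size_t alg = der_tlv_size(oid + null_params);
    size_t digest = der_tlv_size(32u);      /* 256-bit digest */

    return der_tlv_size(alg + digest);
}

/* SubjectPublicKeyInfo for RSA-2048 with public exponent 65537. */
size_t demo_rsa2048_spki_len(void)
{
    size_t modulus = der_tlv_size(1u + 256u);  /* leading 0x00 + 2048 bits */
    size_t exponent = der_tlv_size(3u);        /* 0x010001 */
    size_t rsa_key = der_tlv_size(modulus + exponent);
    size_t bit_string = der_tlv_size(1u + rsa_key); /* unused-bits byte */
    size_t alg = der_tlv_size(der_tlv_size(9u) + der_tlv_size(0u));

    return der_tlv_size(alg + bit_string);
}
```

+A different hash algorithm or key size changes these totals, so buffer sizes
+derived this way apply only to the stated algorithm choices.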
+Next in that file, the parameter descriptors are defined. These descriptors will
+be used to extract the parameter data from the corresponding image.
+Example: the BL31 Chain of Trust
+Four image descriptors form the BL31 Chain of Trust:
+.. code:: c
+    static const auth_img_desc_t trusted_key_cert = {
+            .img_id = TRUSTED_KEY_CERT_ID,
+            .img_type = IMG_CERT,
+            .parent = NULL,
+            .img_auth_methods =  (const auth_method_desc_t[AUTH_METHOD_NUM]) {
+                    [0] = {
+                            .type = AUTH_METHOD_SIG,
+                            .param.sig = {
+                                    .pk = &subject_pk,
+                                    .sig = &sig,
+                                    .alg = &sig_alg,
+                                    .data = &raw_data
+                            }
+                    },
+                    [1] = {
+                            .type = AUTH_METHOD_NV_CTR,
+                            .param.nv_ctr = {
+                                    .cert_nv_ctr = &trusted_nv_ctr,
+                                    .plat_nv_ctr = &trusted_nv_ctr
+                            }
+                    }
+            },
+            .authenticated_data = (const auth_param_desc_t[COT_MAX_VERIFIED_PARAMS]) {
+                    [0] = {
+                            .type_desc = &trusted_world_pk,
+                            .data = {
+                                    .ptr = (void *)trusted_world_pk_buf,
+                                    .len = (unsigned int)PK_DER_LEN
+                            }
+                    },
+                    [1] = {
+                            .type_desc = &non_trusted_world_pk,
+                            .data = {
+                                    .ptr = (void *)non_trusted_world_pk_buf,
+                                    .len = (unsigned int)PK_DER_LEN
+                            }
+                    }
+            }
+    };
+    static const auth_img_desc_t soc_fw_key_cert = {
+            .img_id = SOC_FW_KEY_CERT_ID,
+            .img_type = IMG_CERT,
+            .parent = &trusted_key_cert,
+            .img_auth_methods =  (const auth_method_desc_t[AUTH_METHOD_NUM]) {
+                    [0] = {
+                            .type = AUTH_METHOD_SIG,
+                            .param.sig = {
+                                    .pk = &trusted_world_pk,
+                                    .sig = &sig,
+                                    .alg = &sig_alg,
+                                    .data = &raw_data
+                            }
+                    },
+                    [1] = {
+                            .type = AUTH_METHOD_NV_CTR,
+                            .param.nv_ctr = {
+                                    .cert_nv_ctr = &trusted_nv_ctr,
+                                    .plat_nv_ctr = &trusted_nv_ctr
+                            }
+                    }
+            },
+            .authenticated_data = (const auth_param_desc_t[COT_MAX_VERIFIED_PARAMS]) {
+                    [0] = {
+                            .type_desc = &soc_fw_content_pk,
+                            .data = {
+                                    .ptr = (void *)content_pk_buf,
+                                    .len = (unsigned int)PK_DER_LEN
+                            }
+                    }
+            }
+    };
+    static const auth_img_desc_t soc_fw_content_cert = {
+            .img_id = SOC_FW_CONTENT_CERT_ID,
+            .img_type = IMG_CERT,
+            .parent = &soc_fw_key_cert,
+            .img_auth_methods =  (const auth_method_desc_t[AUTH_METHOD_NUM]) {
+                    [0] = {
+                            .type = AUTH_METHOD_SIG,
+                            .param.sig = {
+                                    .pk = &soc_fw_content_pk,
+                                    .sig = &sig,
+                                    .alg = &sig_alg,
+                                    .data = &raw_data
+                            }
+                    },
+                    [1] = {
+                            .type = AUTH_METHOD_NV_CTR,
+                            .param.nv_ctr = {
+                                    .cert_nv_ctr = &trusted_nv_ctr,
+                                    .plat_nv_ctr = &trusted_nv_ctr
+                            }
+                    }
+            },
+            .authenticated_data = (const auth_param_desc_t[COT_MAX_VERIFIED_PARAMS]) {
+                    [0] = {
+                            .type_desc = &soc_fw_hash,
+                            .data = {
+                                    .ptr = (void *)soc_fw_hash_buf,
+                                    .len = (unsigned int)HASH_DER_LEN
+                            }
+                    },
+                    [1] = {
+                            .type_desc = &soc_fw_config_hash,
+                            .data = {
+                                    .ptr = (void *)soc_fw_config_hash_buf,
+                                    .len = (unsigned int)HASH_DER_LEN
+                            }
+                    }
+            }
+    };
+    static const auth_img_desc_t bl31_image = {
+            .img_id = BL31_IMAGE_ID,
+            .img_type = IMG_RAW,
+            .parent = &soc_fw_content_cert,
+            .img_auth_methods =  (const auth_method_desc_t[AUTH_METHOD_NUM]) {
+                    [0] = {
+                            .type = AUTH_METHOD_HASH,
+                            .param.hash = {
+                                    .data = &raw_data,
+                                    .hash = &soc_fw_hash
+                            }
+                    }
+            }
+    };
+The **Trusted Key certificate** is signed with the ROT private key and contains
+the Trusted World public key and the Non-Trusted World public key as x509v3
+extensions. This must be specified in the image descriptor using the
+``img_auth_methods`` and ``authenticated_data`` arrays, respectively.
+The Trusted Key certificate is authenticated by checking its digital signature
+using the ROTPK. Four parameters are required to check a signature: the public
+key, the algorithm, the signature and the data that has been signed. Therefore,
+four parameter descriptors must be specified with the authentication method:
+-  ``subject_pk``: parameter descriptor of type ``AUTH_PARAM_PUB_KEY``. This type
+   is used to extract a public key from the parent image. If the cookie is an
+   OID, the key is extracted from the corresponding x509v3 extension. If the
+   cookie is NULL, the subject public key is retrieved. In this case, because
+   the parent image is NULL, the public key is obtained from the platform
+   (this key will be the ROTPK).
+-  ``sig``: parameter descriptor of type ``AUTH_PARAM_SIG``. It is used to extract
+   the signature from the certificate.
+-  ``sig_alg``: parameter descriptor of type ``AUTH_PARAM_SIG_ALG``. It is used to
+   extract the signature algorithm from the certificate.
+-  ``raw_data``: parameter descriptor of type ``AUTH_PARAM_RAW_DATA``. It is used
+   to extract the data that was signed from the certificate.
+Once the signature has been checked and the certificate authenticated, the
+Trusted World public key needs to be extracted from the certificate. A new entry
+is created in the ``authenticated_data`` array for that purpose. In that entry,
+the corresponding parameter descriptor must be specified along with the buffer
+address to store the parameter value. In this case, the ``trusted_world_pk``
+descriptor is used to extract the public key from an x509v3 extension with OID
+``TRUSTED_WORLD_PK_OID``. The SoC Firmware Key certificate will use this
+descriptor as parameter in the signature authentication method. The key is
+stored in the buffer referenced by that ``authenticated_data`` entry.
+The **SoC Firmware Key certificate** (``soc_fw_key_cert``) is authenticated by
+checking its digital signature using the Trusted World public key obtained
+previously from the Trusted Key certificate. In the image descriptor, we specify
+an authentication method by signature whose public key is ``trusted_world_pk``,
+followed by a non-volatile counter check. Once this certificate has been
+authenticated, we have to extract the SoC firmware content public key, stored in
+the extension specified by ``soc_fw_content_pk``. This key will be copied to the
+``content_pk_buf`` buffer.
+The **SoC Firmware Content certificate** (``soc_fw_content_cert``) is
+authenticated by checking its digital signature using the public key obtained
+previously from the SoC Firmware Key certificate. We specify the authentication
+method using ``soc_fw_content_pk`` as public key. After authentication, we need
+to extract the BL31 hash, stored in the extension specified by ``soc_fw_hash``.
+This hash will be copied to the ``soc_fw_hash_buf`` buffer.
+The **BL31 image** is authenticated by calculating its hash and matching it
+with the hash obtained from the SoC Firmware Content certificate. The image
+descriptor contains a single authentication method by hash. The parameters to
+the hash method are the reference hash, ``soc_fw_hash``, and the data to be
+hashed. In this case, it is the whole image, so we specify ``raw_data``.
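+The parent links in the descriptors above determine the order in which images
+are authenticated: an image is only checked once every ancestor up to the root
+has been checked. The following is a minimal sketch of that ordering, not the
+framework's own code. ``demo_img_desc_t``, ``demo_check_methods()`` and
+``demo_authenticate()`` are hypothetical names, and the signature, hash and
+counter checks are replaced by a placeholder that always succeeds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an image descriptor: only the
 * fields needed to show how the parent links are followed. */
typedef struct demo_img_desc {
    unsigned int img_id;
    struct demo_img_desc *parent;   /* NULL: root, checked against the ROTPK */
    int authenticated;              /* set once this image has been verified */
} demo_img_desc_t;

/* Placeholder for the real signature/hash/NV-counter checks (assumption:
 * always succeeds so that only the traversal order is illustrated). */
static int demo_check_methods(const demo_img_desc_t *img)
{
    (void)img;
    return 0;
}

/* Authenticate an image after all of its ancestors, root first.
 * Returns the number of images newly authenticated, or -1 on failure. */
int demo_authenticate(demo_img_desc_t *img)
{
    int count = 0;

    assert(img != NULL);

    if (img->authenticated)
        return 0;

    if (img->parent != NULL) {
        count = demo_authenticate(img->parent);
        if (count < 0)
            return -1;
    }

    if (demo_check_methods(img) != 0)
        return -1;

    img->authenticated = 1;
    return count + 1;
}
```

+With a four-level chain such as the one described above, authenticating the
+leaf image first authenticates the three certificates above it.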
+The image parser library
+The image parser module relies on libraries to check the image integrity and
+extract the authentication parameters. The number and type of parser libraries
+depend on the images used in the CoT. Raw images do not need a library, so
+only an x509v3 library is required for the TBBR CoT.
+Arm platforms will use an x509v3 library based on mbed TLS. This library may be
+found in ``drivers/auth/mbedtls/mbedtls_x509_parser.c``. It exports three
+functions:
+.. code:: c
+    void init(void);
+    int check_integrity(void *img, unsigned int img_len);
+    int get_auth_param(const auth_param_type_desc_t *type_desc,
+                       void *img, unsigned int img_len,
+                       void **param, unsigned int *param_len);
+The library is registered in the framework using the macro
+``REGISTER_IMG_PARSER_LIB()``. Each time the image parser module needs to access
+an image of type ``IMG_CERT``, it will call the corresponding function exported
+in this file.
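+To make the shape of such a library concrete, the following is a minimal
+sketch of a parser for a trivial image format in which the whole image is the
+only parameter. It is not the mbed TLS based parser: the ``demo_`` names, the
+``DEMO_PARAM_RAW_DATA`` value and the ``demo_param_type_desc_t`` type are
+hypothetical stand-ins so that the example compiles on its own, and the
+registration macro is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PARAM_RAW_DATA 1   /* hypothetical parameter type identifier */

/* Hypothetical stand-in for the framework's parameter type descriptor. */
typedef struct demo_param_type_desc {
    int type;
    void *cookie;
} demo_param_type_desc_t;

static int demo_ready;

/* One-time library initialisation. */
void demo_init(void)
{
    demo_ready = 1;
}

/* Returns 0 if the image looks well formed, a negative value otherwise.
 * This trivial format only requires a non-empty image. */
int demo_check_integrity(void *img, unsigned int img_len)
{
    return (demo_ready && img != NULL && img_len > 0U) ? 0 : -1;
}

/* Returns a pointer to the requested parameter inside the image.
 * Only the raw data parameter (the whole image) is supported here. */
int demo_get_auth_param(const demo_param_type_desc_t *type_desc,
                        void *img, unsigned int img_len,
                        void **param, unsigned int *param_len)
{
    assert(type_desc != NULL);
    assert(param != NULL);
    assert(param_len != NULL);

    if (type_desc->type != DEMO_PARAM_RAW_DATA)
        return -1;

    *param = img;
    *param_len = img_len;
    return 0;
}
```

+A real parser would decode the image in ``check_integrity()`` and use the
+descriptor's cookie to locate the requested parameter.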
+The build system must be updated to include the corresponding library and
+mbed TLS sources. Arm platforms use the ```` file to pull the sources.
+The cryptographic library
+The cryptographic module relies on a library to perform the required operations,
+i.e. verify a hash or a digital signature. Arm platforms will use a library
+based on mbed TLS, which can be found in
+``drivers/auth/mbedtls/mbedtls_crypto.c``. This library is registered in the
+authentication framework using the macro ``REGISTER_CRYPTO_LIB()`` and exports
+three functions:
+.. code:: c
+    void init(void);
+    int verify_signature(void *data_ptr, unsigned int data_len,
+                         void *sig_ptr, unsigned int sig_len,
+                         void *sig_alg, unsigned int sig_alg_len,
+                         void *pk_ptr, unsigned int pk_len);
+    int verify_hash(void *data_ptr, unsigned int data_len,
+                    void *digest_info_ptr, unsigned int digest_info_len);
+The mbed TLS library algorithm support is configured by the
+``TF_MBEDTLS_KEY_ALG`` variable, which can take one of three values: ``rsa``,
+``ecdsa`` or ``rsa+ecdsa``. This variable allows the Makefile to include the
+corresponding sources in the build for the various algorithms. Setting the
+variable to ``rsa+ecdsa`` enables support for both the RSA and ECDSA algorithms
+in the mbed TLS library.
+Note: If code size is a concern, the build option ``MBEDTLS_SHA256_SMALLER`` can
+be defined in the platform Makefile. It will make mbed TLS use an implementation
+of SHA-256 with a smaller memory footprint (~1.5 KB less) but slower (~30%).
+*Copyright (c) 2017-2019, Arm Limited and Contributors. All rights reserved.*
+.. _Trusted Board Boot: ./trusted-board-boot.rst
+.. _Platform Porting Guide: ../getting_started/porting-guide.rst
+.. _TBBR-Client specification:
diff --git a/docs/design/cpu-specific-build-macros.rst b/docs/design/cpu-specific-build-macros.rst
new file mode 100644
index 0000000..d099ebe
--- /dev/null
+++ b/docs/design/cpu-specific-build-macros.rst
@@ -0,0 +1,298 @@
+Arm CPU Specific Build Macros
+.. contents::
+This document describes the various build options present in the CPU specific
+operations framework to enable errata workarounds and to enable optimizations
+for a specific CPU on a platform.
+Security Vulnerability Workarounds
+TF-A exports a series of build flags which control which security
+vulnerability workarounds should be applied at runtime.
+-  ``WORKAROUND_CVE_2017_5715``: Enables the security workaround for
+   `CVE-2017-5715`_. This flag can be set to 0 by the platform if none
+   of the PEs in the system need the workaround. Setting this flag to 0 provides
+   no performance benefit for non-affected platforms; it only helps to comply
+   with the recommendation in the spec regarding workaround discovery.
+   Defaults to 1.
+-  ``WORKAROUND_CVE_2018_3639``: Enables the security workaround for
+   `CVE-2018-3639`_. Defaults to 1. The TF-A project recommends keeping
+   the default value of 1 even on platforms that are unaffected by
+   CVE-2018-3639, in order to comply with the recommendation in the spec
+   regarding workaround discovery.
+-  ``DYNAMIC_WORKAROUND_CVE_2018_3639``: Enables dynamic mitigation for
+   `CVE-2018-3639`_. This build option should be set to 1 if the target
+   platform contains at least 1 CPU that requires dynamic mitigation.
+   Defaults to 0.
+CPU Errata Workarounds
+TF-A exports a series of build flags which control the errata workarounds that
+are applied to each CPU by the reset handler. The errata details can be found
+in the CPU specific errata documents published by Arm:
+-  `Cortex-A53 MPCore Software Developers Errata Notice`_
+-  `Cortex-A57 MPCore Software Developers Errata Notice`_
+-  `Cortex-A72 MPCore Software Developers Errata Notice`_
+The errata workarounds are implemented for a particular revision or a set of
+processor revisions. This is checked by the reset handler at runtime. Each
+errata workaround is identified by its ``ID`` as specified in the processor's
+errata notice document. The format of the define used to enable/disable the
+errata workaround is ``ERRATA_<Processor name>_<ID>``, where the ``Processor name``
+is for example ``A57`` for the ``Cortex_A57`` CPU.
+Refer to the section *CPU errata status reporting* in
+`Firmware Design guide`_ for information on how to write errata workaround
+functions.
+All workarounds are disabled by default. The platform is responsible for
+enabling these workarounds according to its requirement by defining the
+errata workaround build flags in the platform specific makefile. If a
+workaround is enabled for a CPU revision that it does not apply to, the
+workaround is not applied. In the DEBUG build, this is indicated by
+printing a warning to the crash console.
+In the current implementation, a platform which has more than 1 variant
+with different revisions of a processor has no runtime mechanism available
+for it to specify which errata workarounds should be enabled or not.
+The value of each build flag is 0 by default, that is, disabled. Setting a
+flag to 1 enables the corresponding workaround.
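+The revision conditions listed for each flag below (for example "revision <=
+r0p2") can be pictured as an integer comparison. The following is a minimal
+sketch under stated assumptions, not the reset handler's code: the ``demo_``
+functions are hypothetical, and the packing of an rXpY revision into a single
+value is an illustrative choice.

```c
#include <assert.h>

/* Pack an rXpY revision as (X << 4) | Y so that revisions order as
 * integers. Hypothetical encoding chosen for this illustration. */
unsigned int demo_rev(unsigned int variant, unsigned int patch)
{
    assert(variant <= 0xfU);
    assert(patch <= 0xfU);
    return (variant << 4) | patch;
}

/* Returns 1 if a workaround documented as "revision <= limit" applies to
 * the running CPU, 0 otherwise. */
int demo_erratum_applies_le(unsigned int cpu_rev, unsigned int limit_rev)
{
    return cpu_rev <= limit_rev;
}
```

+For a workaround documented as "revision <= r0p2", a CPU at r0p2 is covered
+while a CPU at r1p0 is not, so the workaround would be skipped on the latter
+even if its build flag is set.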
+For Cortex-A9, the following errata build flags are defined:
+-  ``ERRATA_A9_794073``: This applies errata 794073 workaround to Cortex-A9
+   CPU. This needs to be enabled for all revisions of the CPU.
+For Cortex-A15, the following errata build flags are defined:
+-  ``ERRATA_A15_816470``: This applies errata 816470 workaround to Cortex-A15
+   CPU. This needs to be enabled only for revision >= r3p0 of the CPU.
+-  ``ERRATA_A15_827671``: This applies errata 827671 workaround to Cortex-A15
+   CPU. This needs to be enabled only for revision >= r3p0 of the CPU.
+For Cortex-A17, the following errata build flags are defined:
+-  ``ERRATA_A17_852421``: This applies errata 852421 workaround to Cortex-A17
+   CPU. This needs to be enabled only for revision <= r1p2 of the CPU.
+-  ``ERRATA_A17_852423``: This applies errata 852423 workaround to Cortex-A17
+   CPU. This needs to be enabled only for revision <= r1p2 of the CPU.
+For Cortex-A35, the following errata build flags are defined:
+-  ``ERRATA_A35_855472``: This applies errata 855472 workaround to Cortex-A35
+   CPUs. This needs to be enabled only for revision r0p0 of Cortex-A35.
+For Cortex-A53, the following errata build flags are defined:
+-  ``ERRATA_A53_819472``: This applies errata 819472 workaround to all
+   CPUs. This needs to be enabled only for revision <= r0p1 of Cortex-A53.
+-  ``ERRATA_A53_824069``: This applies errata 824069 workaround to all
+   CPUs. This needs to be enabled only for revision <= r0p2 of Cortex-A53.
+-  ``ERRATA_A53_826319``: This applies errata 826319 workaround to Cortex-A53
+   CPU. This needs to be enabled only for revision <= r0p2 of the CPU.
+-  ``ERRATA_A53_827319``: This applies errata 827319 workaround to all
+   CPUs. This needs to be enabled only for revision <= r0p2 of Cortex-A53.
+-  ``ERRATA_A53_835769``: This applies erratum 835769 workaround at compile and
+   link time to Cortex-A53 CPU. This needs to be enabled for some variants of
+   revision <= r0p4. This workaround can lead the linker to create ``*.stub``
+   sections.
+-  ``ERRATA_A53_836870``: This applies errata 836870 workaround to Cortex-A53
+   CPU. This needs to be enabled only for revision <= r0p3 of the CPU. From
+   r0p4 onwards, the workaround for this erratum is enabled by default in hardware.
+-  ``ERRATA_A53_843419``: This applies erratum 843419 workaround at link time
+   to Cortex-A53 CPU.  This needs to be enabled for some variants of revision
+   <= r0p4. This workaround can lead the linker to emit ``*.stub`` sections
+   which are 4kB aligned.
+-  ``ERRATA_A53_855873``: This applies errata 855873 workaround to Cortex-A53
+   CPUs. Though the erratum is present in every revision of the CPU,
+   this workaround is only applied to CPUs from r0p3 onwards, which feature
+   a chicken bit in CPUACTLR_EL1 to enable a hardware workaround.
+   Earlier revisions of the CPU have other errata which require the same
+   workaround in software, so they should be covered anyway.
+For Cortex-A55, the following errata build flags are defined:
+-  ``ERRATA_A55_768277``: This applies errata 768277 workaround to Cortex-A55
+   CPU. This needs to be enabled only for revision r0p0 of the CPU.
+-  ``ERRATA_A55_778703``: This applies errata 778703 workaround to Cortex-A55
+   CPU. This needs to be enabled only for revision r0p0 of the CPU.
+-  ``ERRATA_A55_798797``: This applies errata 798797 workaround to Cortex-A55
+   CPU. This needs to be enabled only for revision r0p0 of the CPU.
+-  ``ERRATA_A55_846532``: This applies errata 846532 workaround to Cortex-A55
+   CPU. This needs to be enabled only for revision <= r0p1 of the CPU.
+-  ``ERRATA_A55_903758``: This applies errata 903758 workaround to Cortex-A55
+   CPU. This needs to be enabled only for revision <= r0p1 of the CPU.
+For Cortex-A57, the following errata build flags are defined:
+-  ``ERRATA_A57_806969``: This applies errata 806969 workaround to Cortex-A57
+   CPU. This needs to be enabled only for revision r0p0 of the CPU.
+-  ``ERRATA_A57_813419``: This applies errata 813419 workaround to Cortex-A57
+   CPU. This needs to be enabled only for revision r0p0 of the CPU.
+-  ``ERRATA_A57_813420``: This applies errata 813420 workaround to Cortex-A57
+   CPU. This needs to be enabled only for revision r0p0 of the CPU.
+-  ``ERRATA_A57_814670``: This applies errata 814670 workaround to Cortex-A57
+   CPU. This needs to be enabled only for revision r0p0 of the CPU.
+-  ``ERRATA_A57_817169``: This applies errata 817169 workaround to Cortex-A57
+   CPU. This needs to be enabled only for revision <= r0p1 of the CPU.
+-  ``ERRATA_A57_826974``: This applies errata 826974 workaround to Cortex-A57
+   CPU. This needs to be enabled only for revision <= r1p1 of the CPU.
+-  ``ERRATA_A57_826977``: This applies errata 826977 workaround to Cortex-A57
+   CPU. This needs to be enabled only for revision <= r1p1 of the CPU.
+-  ``ERRATA_A57_828024``: This applies errata 828024 workaround to Cortex-A57
+   CPU. This needs to be enabled only for revision <= r1p1 of the CPU.
+-  ``ERRATA_A57_829520``: This applies errata 829520 workaround to Cortex-A57
+   CPU. This needs to be enabled only for revision <= r1p2 of the CPU.
+-  ``ERRATA_A57_833471``: This applies errata 833471 workaround to Cortex-A57
+   CPU. This needs to be enabled only for revision <= r1p2 of the CPU.
+-  ``ERRATA_A57_859972``: This applies errata 859972 workaround to Cortex-A57
+   CPU. This needs to be enabled only for revision <= r1p3 of the CPU.
+For Cortex-A72, the following errata build flags are defined:
+-  ``ERRATA_A72_859971``: This applies errata 859971 workaround to Cortex-A72
+   CPU. This needs to be enabled only for revision <= r0p3 of the CPU.
+For Cortex-A73, the following errata build flags are defined:
+-  ``ERRATA_A73_852427``: This applies errata 852427 workaround to Cortex-A73
+   CPU. This needs to be enabled only for revision r0p0 of the CPU.
+-  ``ERRATA_A73_855423``: This applies errata 855423 workaround to Cortex-A73
+   CPU. This needs to be enabled only for revision <= r0p1 of the CPU.
+For Cortex-A75, the following errata build flags are defined:
+-  ``ERRATA_A75_764081``: This applies errata 764081 workaround to Cortex-A75
+   CPU. This needs to be enabled only for revision r0p0 of the CPU.
+-  ``ERRATA_A75_790748``: This applies errata 790748 workaround to Cortex-A75
+   CPU. This needs to be enabled only for revision r0p0 of the CPU.
+For Cortex-A76, the following errata build flags are defined:
+-  ``ERRATA_A76_1073348``: This applies errata 1073348 workaround to Cortex-A76
+   CPU. This needs to be enabled only for revision <= r1p0 of the CPU.
+-  ``ERRATA_A76_1130799``: This applies errata 1130799 workaround to Cortex-A76
+   CPU. This needs to be enabled only for revision <= r2p0 of the CPU.
+-  ``ERRATA_A76_1220197``: This applies errata 1220197 workaround to Cortex-A76
+   CPU. This needs to be enabled only for revision <= r2p0 of the CPU.
+-  ``ERRATA_A76_1257314``: This applies errata 1257314 workaround to Cortex-A76
+   CPU. This needs to be enabled only for revision <= r3p0 of the CPU.
+-  ``ERRATA_A76_1262606``: This applies errata 1262606 workaround to Cortex-A76
+   CPU. This needs to be enabled only for revision <= r3p0 of the CPU.
+-  ``ERRATA_A76_1262888``: This applies errata 1262888 workaround to Cortex-A76
+   CPU. This needs to be enabled only for revision <= r3p0 of the CPU.
+-  ``ERRATA_A76_1275112``: This applies errata 1275112 workaround to Cortex-A76
+   CPU. This needs to be enabled only for revision <= r3p0 of the CPU.
+DSU Errata Workarounds
+Similar to CPU errata, TF-A also implements workarounds for DSU (DynamIQ
+Shared Unit) errata. The DSU errata details can be found in the respective Arm
+documentation:
+- `Arm DSU Software Developers Errata Notice`_.
+Each erratum is identified by an ``ID``, as defined in the DSU errata notice
+document. Thus, the build flags which enable/disable the errata workarounds
+have the format ``ERRATA_DSU_<ID>``. The implementation and application logic
+of DSU errata workarounds are similar to `CPU errata workarounds`_.
+For DSU errata, the following build flags are defined:
+-  ``ERRATA_DSU_798953``: This applies errata 798953 workaround for the
+   affected DSU configurations. This erratum applies only to DSUs whose
+   revision is r0p0 (it is fixed in r0p1). However, please note that this
+   workaround results in increased DSU power consumption on idle.
+-  ``ERRATA_DSU_936184``: This applies errata 936184 workaround for the
+   affected DSU configurations. This erratum applies only to DSUs that
+   contain the ACP interface **and** whose revision is older than r2p0 (it is
+   fixed in r2p0). However, please note that this workaround results in
+   increased DSU power consumption on idle.
+CPU Specific optimizations
+This section describes some of the optimizations allowed by the CPU micro
+architecture that can be enabled by the platform as desired.
+-  ``SKIP_A57_L1_FLUSH_PWR_DWN``: This flag enables an optimization in the
+   Cortex-A57 cluster power down sequence by not flushing the Level 1 data
+   cache. The L1 data cache and the L2 unified cache are inclusive. A flush
+   of the L2 by set/way flushes any dirty lines from the L1 as well. This
+   is a known safe deviation from the Cortex-A57 TRM defined power down
+   sequence. Each Cortex-A57 based platform must make its own decision on
+   whether to use the optimization.
+-  ``A53_DISABLE_NON_TEMPORAL_HINT``: This flag disables the cache non-temporal
+   hint. The LDNP/STNP instructions as implemented on Cortex-A53 do not behave
+   in a way most programmers expect, and will most probably result in a
+   significant speed degradation to any code that employs them. The Armv8-A
+   architecture (see Arm DDI 0487A.h, section D3.4.3) allows cores to ignore
+   the non-temporal hint and treat LDNP/STNP as LDP/STP instead. Enabling this
+   flag enforces this behaviour. This needs to be enabled only for revisions
+   <= r0p3 of the CPU and is enabled by default.
+-  ``A57_DISABLE_NON_TEMPORAL_HINT``: This flag has the same behaviour as
+   ``A53_DISABLE_NON_TEMPORAL_HINT`` but for Cortex-A57. This needs to be
+   enabled only for revisions <= r1p2 of the CPU and is enabled by default,
+   as recommended in section "4.7 Non-Temporal Loads/Stores" of the
+   `Cortex-A57 Software Optimization Guide`_.
+*Copyright (c) 2014-2019, Arm Limited and Contributors. All rights reserved.*
+.. _CVE-2017-5715:
+.. _CVE-2018-3639:
+.. _Cortex-A53 MPCore Software Developers Errata Notice:
+.. _Cortex-A57 MPCore Software Developers Errata Notice:
+.. _Cortex-A72 MPCore Software Developers Errata Notice:
+.. _Firmware Design guide: firmware-design.rst
+.. _Cortex-A57 Software Optimization Guide:
+.. _Arm DSU Software Developers Errata Notice:
diff --git a/docs/design/firmware-design.rst b/docs/design/firmware-design.rst
new file mode 100644
index 0000000..e7107ba
--- /dev/null
+++ b/docs/design/firmware-design.rst
@@ -0,0 +1,2687 @@
+Trusted Firmware-A design
+.. contents::
+Trusted Firmware-A (TF-A) implements a subset of the Trusted Board Boot
+Requirements (TBBR) Platform Design Document (PDD) [1]_ for Arm reference
+platforms. The TBB sequence starts when the platform is powered on and runs up
+to the stage where it hands off control to firmware running in the normal
+world in DRAM. This is the cold boot path.
+TF-A also implements the Power State Coordination Interface PDD [2]_ as a
+runtime service. PSCI is the interface from normal world software to firmware
+implementing power management use-cases (for example, secondary CPU boot,
+hotplug and idle). Normal world software can access TF-A runtime services via
+the Arm SMC (Secure Monitor Call) instruction. The SMC instruction must be
+used as mandated by the SMC Calling Convention [3]_.
+TF-A implements a framework for configuring and managing interrupts generated
+in either security state. The details of the interrupt management framework
+and its design can be found in TF-A Interrupt Management Design guide [4]_.
+TF-A also implements a library for setting up and managing the translation
+tables. The details of this library can be found in `Xlat_tables design`_.
+TF-A can be built to support either AArch64 or AArch32 execution state.
+Cold boot
+The cold boot path starts when the platform is physically turned on. If
+``COLD_BOOT_SINGLE_CPU=0``, one of the CPUs released from reset is chosen as the
+primary CPU, and the remaining CPUs are considered secondary CPUs. The primary
+CPU is chosen through platform-specific means. The cold boot path is mainly
+executed by the primary CPU, other than essential CPU initialization executed by
+all CPUs. The secondary CPUs are kept in a safe platform-specific state until
+the primary CPU has performed enough initialization to boot them.
+Refer to the `Reset Design`_ for more information on the effect of the
+``COLD_BOOT_SINGLE_CPU`` platform build option.
+The cold boot path in this implementation of TF-A depends on the execution
+state. For AArch64, it is divided into five steps (in order of execution):
+-  Boot Loader stage 1 (BL1) *AP Trusted ROM*
+-  Boot Loader stage 2 (BL2) *Trusted Boot Firmware*
+-  Boot Loader stage 3-1 (BL31) *EL3 Runtime Software*
+-  Boot Loader stage 3-2 (BL32) *Secure-EL1 Payload* (optional)
+-  Boot Loader stage 3-3 (BL33) *Non-trusted Firmware*
+For AArch32, it is divided into four steps (in order of execution):
+-  Boot Loader stage 1 (BL1) *AP Trusted ROM*
+-  Boot Loader stage 2 (BL2) *Trusted Boot Firmware*
+-  Boot Loader stage 3-2 (BL32) *EL3 Runtime Software*
+-  Boot Loader stage 3-3 (BL33) *Non-trusted Firmware*
+Arm development platforms (Fixed Virtual Platforms (FVPs) and Juno) implement a
+combination of the following types of memory regions. Each bootloader stage uses
+one or more of these memory regions.
+-  Regions accessible from both non-secure and secure states. For example,
+   non-trusted SRAM, ROM and DRAM.
+-  Regions accessible from only the secure state. For example, trusted SRAM and
+   ROM. The FVPs also implement the trusted DRAM which is statically
+   configured. Additionally, the Base FVPs and Juno development platform
+   configure the TrustZone Controller (TZC) to create a region in the DRAM
+   which is accessible only from the secure state.
+The sections below provide the following details:
+-  dynamic configuration of Boot Loader stages
+-  initialization and execution of the first three stages during cold boot
+-  specification of the EL3 Runtime Software (BL31 for AArch64 and BL32 for
+   AArch32) entrypoint requirements for use by alternative Trusted Boot
+   Firmware in place of the provided BL1 and BL2
+Dynamic Configuration during cold boot
+Each of the Boot Loader stages may be dynamically configured if required by the
+platform. The Boot Loader stage may optionally specify a firmware
+configuration file and/or hardware configuration file as listed below:
+-  HW_CONFIG - The hardware configuration file. Can be shared by all Boot Loader
+   stages and also by the Normal World Rich OS.
+-  TB_FW_CONFIG - Trusted Boot Firmware configuration file. Shared between BL1
+   and BL2.
+-  SOC_FW_CONFIG - SoC Firmware configuration file. Used by BL31.
+-  TOS_FW_CONFIG - Trusted OS Firmware configuration file. Used by Trusted OS
+   (BL32).
+-  NT_FW_CONFIG - Non Trusted Firmware configuration file. Used by Non-trusted
+   firmware (BL33).
+The Arm development platforms use the Flattened Device Tree format for the
+dynamic configuration files.
+Each Boot Loader stage can pass up to 4 arguments via registers to the next
+stage.  BL2 passes the list of the next images to execute to the *EL3 Runtime
+Software* (BL31 for AArch64 and BL32 for AArch32) via ``arg0``. All the other
+arguments are platform defined. The Arm development platforms use the following
+convention:
+-  BL1 passes the address of a meminfo_t structure to BL2 via ``arg1``. This
+   structure contains the memory layout available to BL2.
+-  When dynamic configuration files are present, the firmware configuration for
+   the next Boot Loader stage is populated in the first available argument and
+   the generic hardware configuration is passed in the next available argument.
+   For example,
+   -  If TB_FW_CONFIG is loaded by BL1, then its address is passed in ``arg0``
+      to BL2.
+   -  If HW_CONFIG is loaded by BL1, then its address is passed in ``arg2`` to
+      BL2. Note, ``arg1`` is already used for meminfo_t.
+   -  If SOC_FW_CONFIG is loaded by BL2, then its address is passed in ``arg1``
+      to BL31. Note, ``arg0`` is used to pass the list of executable images.
+   -  Similarly, if HW_CONFIG is loaded by BL1 or BL2, then its address is
+      passed in ``arg2`` to BL31.
+   -  For other BL3x images, if the firmware configuration file is loaded by
+      BL2, then its address is passed in ``arg0`` and if HW_CONFIG is loaded
+      then its address is passed in ``arg1``.
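+The register assignments listed above for the Arm development platforms can be
+summarised in code. This is a sketch for illustration only: ``demo_boot_args_t``
+and both helper functions are hypothetical, and a value of 0 is assumed to
+stand for a configuration file that was not loaded.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four register arguments. */
typedef struct demo_boot_args {
    uintptr_t arg0;
    uintptr_t arg1;
    uintptr_t arg2;
    uintptr_t arg3;
} demo_boot_args_t;

/* BL1 -> BL2: TB_FW_CONFIG in arg0, meminfo_t in arg1, HW_CONFIG in arg2. */
demo_boot_args_t demo_bl1_to_bl2_args(uintptr_t tb_fw_config,
                                      uintptr_t meminfo,
                                      uintptr_t hw_config)
{
    demo_boot_args_t args = { 0 };

    assert(meminfo != 0U);      /* BL1 always describes BL2's memory layout */

    args.arg0 = tb_fw_config;   /* 0 if no TB_FW_CONFIG was loaded */
    args.arg1 = meminfo;
    args.arg2 = hw_config;      /* 0 if no HW_CONFIG was loaded */
    return args;
}

/* BL2 -> BL31: image list in arg0, SOC_FW_CONFIG in arg1, HW_CONFIG in arg2. */
demo_boot_args_t demo_bl2_to_bl31_args(uintptr_t image_list,
                                       uintptr_t soc_fw_config,
                                       uintptr_t hw_config)
{
    demo_boot_args_t args = { 0 };

    assert(image_list != 0U);   /* BL2 always passes the next images to run */

    args.arg0 = image_list;
    args.arg1 = soc_fw_config;  /* 0 if no SOC_FW_CONFIG was loaded */
    args.arg2 = hw_config;      /* 0 if no HW_CONFIG was loaded */
    return args;
}
```

+In both hand-offs ``arg3`` is left unused, which matches the list above where
+no use of a fourth argument is described.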
+BL1
+This stage begins execution from the platform's reset vector at EL3. The reset
+address is platform dependent but it is usually located in a Trusted ROM area.
+The BL1 data section is copied to trusted SRAM at runtime.
+On the Arm development platforms, BL1 code starts execution from the reset
+vector defined by the constant ``BL1_RO_BASE``. The BL1 data section is copied
+to the top of trusted SRAM as defined by the constant ``BL1_RW_BASE``.
+The functionality implemented by this stage is as follows.
+Determination of boot path
+Whenever a CPU is released from reset, BL1 needs to distinguish between a warm
+boot and a cold boot. This is done using platform-specific mechanisms (see the
+``plat_get_my_entrypoint()`` function in the `Porting Guide`_). In the case of a
+warm boot, a CPU is expected to continue execution from a separate
+entrypoint. In the case of a cold boot, the secondary CPUs are placed in a safe
+platform-specific state (see the ``plat_secondary_cold_boot_setup()`` function in
+the `Porting Guide`_) while the primary CPU executes the remaining cold boot path
+as described in the following sections.
+This step only applies when ``PROGRAMMABLE_RESET_ADDRESS=0``. Refer to the
+`Reset Design`_ for more information on the effect of the
+``PROGRAMMABLE_RESET_ADDRESS`` platform build option.
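+The decision described in this section can be sketched as follows. This is an
+illustration rather than BL1's implementation: ``demo_boot_path_t`` and
+``demo_boot_path()`` are hypothetical, and the sketch assumes that the platform
+reports a cold boot by returning a zero warm boot entrypoint.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    DEMO_WARM_BOOT,             /* continue from the warm boot entrypoint */
    DEMO_COLD_BOOT_PRIMARY,     /* run the remaining cold boot path */
    DEMO_COLD_BOOT_SECONDARY    /* wait in a safe platform-specific state */
} demo_boot_path_t;

/* warm_entrypoint: value reported by the platform for this CPU, assumed to
 * be 0 on a cold boot. is_primary: 1 for the primary CPU, 0 otherwise. */
demo_boot_path_t demo_boot_path(uintptr_t warm_entrypoint, int is_primary)
{
    assert(is_primary == 0 || is_primary == 1);

    if (warm_entrypoint != 0U)
        return DEMO_WARM_BOOT;

    return is_primary ? DEMO_COLD_BOOT_PRIMARY : DEMO_COLD_BOOT_SECONDARY;
}
```

+Only the primary CPU on a cold boot goes on to the architectural and platform
+initialization described next.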
+Architectural initialization
+BL1 performs minimal architectural initialization as follows.
+-  Exception vectors
+   BL1 sets up simple exception vectors for both synchronous and asynchronous
+   exceptions. The default behavior upon receiving an exception is to populate
+   a status code in the general purpose register ``X0/R0`` and call the
+   ``plat_report_exception()`` function (see the `Porting Guide`_). The status
+   code is one of:
+   For AArch64:
+   ::
+       0x0 : Synchronous exception from Current EL with SP_EL0
+       0x1 : IRQ exception from Current EL with SP_EL0
+       0x2 : FIQ exception from Current EL with SP_EL0
+       0x3 : System Error exception from Current EL with SP_EL0
+       0x4 : Synchronous exception from Current EL with SP_ELx
+       0x5 : IRQ exception from Current EL with SP_ELx
+       0x6 : FIQ exception from Current EL with SP_ELx
+       0x7 : System Error exception from Current EL with SP_ELx
+       0x8 : Synchronous exception from Lower EL using aarch64
+       0x9 : IRQ exception from Lower EL using aarch64
+       0xa : FIQ exception from Lower EL using aarch64
+       0xb : System Error exception from Lower EL using aarch64
+       0xc : Synchronous exception from Lower EL using aarch32
+       0xd : IRQ exception from Lower EL using aarch32
+       0xe : FIQ exception from Lower EL using aarch32
+       0xf : System Error exception from Lower EL using aarch32
+   For AArch32:
+   ::
+       0x10 : User mode
+       0x11 : FIQ mode
+       0x12 : IRQ mode
+       0x13 : SVC mode
+       0x16 : Monitor mode
+       0x17 : Abort mode
+       0x1a : Hypervisor mode
+       0x1b : Undefined mode
+       0x1f : System mode
+   The ``plat_report_exception()`` implementation on the Arm FVP port programs
+   the Versatile Express System LED register in the following format to
+   indicate the occurrence of an unexpected exception:
+   ::
+       SYS_LED[0]   - Security state (Secure=0/Non-Secure=1)
+       SYS_LED[2:1] - Exception Level (EL3=0x3, EL2=0x2, EL1=0x1, EL0=0x0)
+                      For AArch32 it is always 0x0
+       SYS_LED[7:3] - Exception Class (Sync/Async & origin). This is the value
+                      of the status code
+   A write to the LED register reflects in the System LEDs (S6LED0..7) in the
+   CLCD window of the FVP.
+   BL1 does not expect to receive any exceptions other than the SMC exception.
+   For the latter, BL1 installs a simple stub. The stub expects to receive a
+   limited set of SMC types (determined by their function IDs in the general
+   purpose register ``X0/R0``):
+   -  ``BL1_SMC_RUN_IMAGE``: This SMC is raised by BL2 to make BL1 pass control
+      to EL3 Runtime Software.
+   -  All SMCs listed in section "BL1 SMC Interface" in the `Firmware Update`_
+      Design Guide are supported for AArch64 only. These SMCs are currently
+      not supported when BL1 is built for AArch32.
+   Any other SMC leads to an assertion failure.
+-  CPU initialization
+   BL1 calls the ``reset_handler()`` function which in turn calls the CPU
+   specific reset handler function (see the section: "CPU specific operations
+   framework").
+-  Control register setup (for AArch64)
+   -  ``SCTLR_EL3``. Instruction cache is enabled by setting the ``SCTLR_EL3.I``
+      bit. Alignment and stack alignment checking are enabled by setting the
+      ``SCTLR_EL3.A`` and ``SCTLR_EL3.SA`` bits. Exception endianness is set to
+      little-endian by clearing the ``SCTLR_EL3.EE`` bit.
+   -  ``SCR_EL3``. The register width of the next lower exception level is set
+      to AArch64 by setting the ``SCR_EL3.RW`` bit. The ``SCR_EL3.EA`` bit is set
+      to trap both External Aborts and SError Interrupts in EL3. The
+      ``SCR_EL3.SIF`` bit is also set to disable instruction fetches from
+      Non-secure memory when in secure state.
+   -  ``CPTR_EL3``. Accesses to the ``CPACR_EL1`` register from EL1 or EL2, or the
+      ``CPTR_EL2`` register from EL2 are configured to not trap to EL3 by
+      clearing the ``CPTR_EL3.TCPAC`` bit. Access to the trace functionality is
+      configured not to trap to EL3 by clearing the ``CPTR_EL3.TTA`` bit.
+      Instructions that access the registers associated with Floating Point
+      and Advanced SIMD execution are configured to not trap to EL3 by
+      clearing the ``CPTR_EL3.TFP`` bit.
+   -  ``DAIF``. The SError interrupt is enabled by clearing the SError interrupt
+      mask bit.
+   -  ``MDCR_EL3``. The trap controls, ``MDCR_EL3.TDOSA``, ``MDCR_EL3.TDA`` and
+      ``MDCR_EL3.TPM``, are set so that accesses to the registers they control
+      do not trap to EL3. AArch64 Secure self-hosted debug is disabled by
+      setting the ``MDCR_EL3.SDD`` bit. Also ``MDCR_EL3.SPD32`` is set to
+      disable AArch32 Secure self-hosted privileged debug from S-EL1.
+-  Control register setup (for AArch32)
+   -  ``SCTLR``. Instruction cache is enabled by setting the ``SCTLR.I`` bit.
+      Alignment checking is enabled by setting the ``SCTLR.A`` bit.
+      Exception endianness is set to little-endian by clearing the
+      ``SCTLR.EE`` bit.
+   -  ``SCR``. The ``SCR.SIF`` bit is set to disable instruction fetches from
+      Non-secure memory when in secure state.
+   -  ``CPACR``. Allow execution of Advanced SIMD instructions at PL0 and PL1,
+      by clearing the ``CPACR.ASEDIS`` bit. Access to the trace functionality
+      is configured not to trap to undefined mode by clearing the
+      ``CPACR.TRCDIS`` bit.
+   -  ``NSACR``. Enable non-secure access to Advanced SIMD functionality and
+      system register access to implemented trace registers.
+   -  ``FPEXC``. Enable access to the Advanced SIMD and floating-point
+      functionality from all Exception levels.
+   -  ``CPSR.A``. The Asynchronous data abort interrupt is enabled by clearing
+      the Asynchronous data abort interrupt mask bit.
+   -  ``SDCR``. The ``SDCR.SPD`` field is set to disable AArch32 Secure
+      self-hosted privileged debug.
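The ``SYS_LED`` layout described earlier for the FVP ``plat_report_exception()`` implementation can be illustrated with a small packing helper. This is only a sketch of the bit layout given in the text; ``sys_led_encode()`` is a hypothetical name and is not part of TF-A, and the real implementation writes the value to the Versatile Express register rather than returning it.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not TF-A code): packs the fields of the SYS_LED
 * value using the layout documented for the FVP port.
 */
static inline uint32_t sys_led_encode(unsigned int non_secure,
                                      unsigned int exception_level,
                                      unsigned int status_code)
{
    uint32_t led = 0;

    led |= (non_secure & 0x1u);            /* SYS_LED[0]:   security state   */
    led |= (exception_level & 0x3u) << 1;  /* SYS_LED[2:1]: exception level  */
    led |= (status_code & 0x1fu) << 3;     /* SYS_LED[7:3]: status code      */

    return led;
}
```

For example, a synchronous exception from the current EL with ``SP_ELx`` (status code ``0x4``) taken in Secure EL3 would be reported as ``0x26`` under this layout.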
+Platform initialization
+On Arm platforms, BL1 performs the following platform initializations:
+-  Enable the Trusted Watchdog.
+-  Initialize the console.
+-  Configure the Interconnect to enable hardware coherency.
+-  Enable the MMU and map the memory it needs to access.
+-  Configure any required platform storage to load the next bootloader image
+   (BL2).
+-  If the BL1 dynamic configuration file, ``TB_FW_CONFIG``, is available, then
+   load it to the platform defined address and make it available to BL2 via
+   ``arg0``.
+-  Configure the system timer and program the ``CNTFRQ_EL0`` register for use by
+   NS-BL1U and NS-BL2U firmware update images.
+Firmware Update detection and execution
+After performing platform setup, BL1 common code calls
+``bl1_plat_get_next_image_id()`` to determine whether `Firmware Update`_ is
+required or the normal boot process should proceed. If the platform code returns
+``BL2_IMAGE_ID`` then the normal boot sequence is executed as described in the
+next section, else BL1 assumes that `Firmware Update`_ is required and execution
+passes to the first image in the `Firmware Update`_ process. In either case, BL1
+retrieves a descriptor of the next image by calling ``bl1_plat_get_image_desc()``.
+The image descriptor contains an ``entry_point_info_t`` structure, which BL1
+uses to initialize the execution state of the next image.
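The decision described above can be sketched as follows. The types, image ID values, addresses and the ``sketch_`` function names are all assumptions made for illustration; the real definitions and the actual ``bl1_plat_get_next_image_id()`` and ``bl1_plat_get_image_desc()`` hooks live in the TF-A sources and may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder image IDs for this sketch only. */
#define BL2_IMAGE_ID      1u
#define NS_BL1U_IMAGE_ID  16u

/* Simplified stand-ins for entry_point_info_t and the image descriptor. */
typedef struct {
    uintptr_t pc;     /* entry address of the image */
    uint32_t  spsr;   /* execution state to enter it with */
} sketch_ep_info_t;

typedef struct {
    unsigned int     image_id;
    sketch_ep_info_t ep_info;
} sketch_image_desc_t;

/* Made-up descriptor table; a platform would provide its own. */
static sketch_image_desc_t sketch_image_descs[] = {
    { BL2_IMAGE_ID,     { 0x04020000u, 0u } },
    { NS_BL1U_IMAGE_ID, { 0x0beb8000u, 0u } },
};

/* Plays the role of the platform's image descriptor lookup. */
static sketch_image_desc_t *sketch_get_image_desc(unsigned int image_id)
{
    size_t count = sizeof(sketch_image_descs) / sizeof(sketch_image_descs[0]);

    for (size_t i = 0; i < count; i++) {
        if (sketch_image_descs[i].image_id == image_id)
            return &sketch_image_descs[i];
    }
    return NULL;
}

/*
 * Given the ID returned by the platform, report whether Firmware Update
 * is required and return the entry point information of the next image.
 */
static const sketch_ep_info_t *sketch_next_image(unsigned int next_id,
                                                 bool *fwu_required)
{
    sketch_image_desc_t *desc = sketch_get_image_desc(next_id);

    *fwu_required = (next_id != BL2_IMAGE_ID);
    return (desc != NULL) ? &desc->ep_info : NULL;
}
```

Any ID other than ``BL2_IMAGE_ID`` is treated as the start of the Firmware Update flow, which mirrors the "else" branch in the description.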
+BL2 image load and execution
+In the normal boot flow, BL1 execution continues as follows:
+#. BL1 prints the following string from the primary CPU to indicate successful
+   execution of the BL1 stage:
+   ::
+       "Booting Trusted Firmware"
+#. BL1 loads a BL2 raw binary image from platform storage, at a
+   platform-specific base address. Prior to the load, BL1 invokes
+   ``bl1_plat_handle_pre_image_load()`` which allows the platform to update or
+   use the image information. If the BL2 image file is not present or if
+   there is not enough free trusted SRAM, the following error message is
+   printed:
+   ::
+       "Failed to load BL2 firmware."
+#. BL1 invokes ``bl1_plat_handle_post_image_load()`` which again is intended
+   for platforms to take further action after image load. This function must
+   populate the necessary arguments for BL2, which may also include the memory
+   layout. Further description of the memory layout can be found later
+   in this document.
+#. BL1 passes control to the BL2 image at Secure EL1 (for AArch64) or at
+   Secure SVC mode (for AArch32), starting from its load address.
+BL1 loads and passes control to BL2 at Secure-EL1 (for AArch64) or at Secure
+SVC mode (for AArch32). BL2 is linked against and loaded at a platform-specific
+base address (more information can be found later in this document).
+The functionality implemented by BL2 is as follows.
+Architectural initialization
+For AArch64, BL2 performs the minimal architectural initialization required
+for subsequent stages of TF-A and normal world software. EL1 and EL0 are given
+access to Floating Point and Advanced SIMD registers by setting the
+``CPACR_EL1.FPEN`` bits.
+For AArch32, the minimal architectural initialization required for subsequent
+stages of TF-A and normal world software is taken care of in BL1 as both BL1
+and BL2 execute at PL1.
+Platform initialization
+On Arm platforms, BL2 performs the following platform initializations:
+-  Initialize the console.
+-  Configure any required platform storage to allow loading further bootloader
+   images.
+-  Enable the MMU and map the memory it needs to access.
+-  Perform platform security setup to allow access to controlled components.
+-  Reserve some memory for passing information to the next bootloader image
+   (EL3 Runtime Software), and populate it.
+-  Define the extents of memory available for loading each subsequent
+   bootloader image.
+-  If BL1 has passed the ``TB_FW_CONFIG`` dynamic configuration file in
+   ``arg0``, then parse it.
+Image loading in BL2
+BL2 generic code loads the images based on the list of loadable images
+provided by the platform. BL2 passes the list of executable images
+provided by the platform to the next handover BL image.
+The list of loadable images provided by the platform may also contain
+dynamic configuration files. The files are loaded and can be parsed as
+needed in the ``bl2_plat_handle_post_image_load()`` function. These
+configuration files can be passed to next Boot Loader stages as arguments
+by updating the corresponding entrypoint information in this function.
+SCP_BL2 (System Control Processor Firmware) image load
+Some systems have a separate System Control Processor (SCP) for power, clock,
+reset and system control. BL2 loads the optional SCP_BL2 image from platform
+storage into a platform-specific region of secure memory. The subsequent
+handling of SCP_BL2 is platform specific. For example, on the Juno Arm
+development platform port the image is transferred into SCP's internal memory
+using the Boot Over MHU (BOM) protocol after being loaded in the trusted SRAM
+memory. The SCP executes SCP_BL2 and signals to the Application Processor (AP)
+for BL2 execution to continue.
+EL3 Runtime Software image load
+BL2 loads the EL3 Runtime Software image from platform storage into a
+platform-specific address in trusted SRAM. If there is not enough memory to
+load the image, or the image is missing, an assertion failure results.
+AArch64 BL32 (Secure-EL1 Payload) image load
+BL2 loads the optional BL32 image from platform storage into a platform-
+specific region of secure memory. The image executes in the secure world. BL2
+relies on BL31 to pass control to the BL32 image, if present. Hence, BL2
+populates a platform-specific area of memory with the entrypoint/load-address
+of the BL32 image. The value of the Saved Program Status Register (``SPSR``)
+for entry into BL32 is not determined by BL2; it is initialized by the
+Secure-EL1 Payload Dispatcher (see later) within BL31, which is responsible for
+managing interaction with BL32. The entrypoint information is passed to BL31.
+BL33 (Non-trusted Firmware) image load
+BL2 loads the BL33 image (e.g. UEFI or other test or boot software) from
+platform storage into non-secure memory as defined by the platform.
+BL2 relies on EL3 Runtime Software to pass control to BL33 once secure state
+initialization is complete. Hence, BL2 populates a platform-specific area of
+memory with the entrypoint and Saved Program Status Register (``SPSR``) of the
+normal world software image. The entrypoint is the load address of the BL33
+image. The ``SPSR`` is determined as specified in Section 5.13 of the
+`PSCI PDD`_. This information is passed to the EL3 Runtime Software.
+AArch64 BL31 (EL3 Runtime Software) execution
+BL2 execution continues as follows:
+#. BL2 passes control back to BL1 by raising an SMC, providing BL1 with the
+   BL31 entrypoint. The exception is handled by the SMC exception handler
+   installed by BL1.
+#. BL1 turns off the MMU and flushes the caches. It clears the
+   ``SCTLR_EL3.M/I/C`` bits, flushes the data cache to the point of coherency
+   and invalidates the TLBs.
+#. BL1 passes control to BL31 at the specified entrypoint at EL3.
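As a minimal sketch of the second step, the helper below clears the ``M``, ``C`` and ``I`` bits in a copy of the ``SCTLR_EL3`` value. It models only the bit manipulation; the function name is hypothetical, and the real sequence also needs the system register write, barriers, the data cache flush and the TLB invalidation described above.

```c
#include <stdint.h>

/* SCTLR_EL3 bit positions as defined by the Arm architecture. */
#define SCTLR_M_BIT  (1u << 0)   /* MMU enable               */
#define SCTLR_C_BIT  (1u << 2)   /* data cache enable        */
#define SCTLR_I_BIT  (1u << 12)  /* instruction cache enable */

/* Hypothetical helper: returns the value with MMU and caches disabled. */
static inline uint64_t sketch_sctlr_disable_mmu_caches(uint64_t sctlr)
{
    return sctlr & ~(uint64_t)(SCTLR_M_BIT | SCTLR_C_BIT | SCTLR_I_BIT);
}
```

All other control bits, such as the alignment checking bits set during BL1 architectural initialization, are left unchanged.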
+Running BL2 at EL3 execution level
+Some platforms have a non-TF-A Boot ROM that expects the next boot stage
+to execute at EL3. On these platforms, TF-A BL1 is a waste of memory
+as its only purpose is to ensure TF-A BL2 is entered at S-EL1. To avoid
+this waste, a special mode enables BL2 to execute at EL3, which allows
+a non-TF-A Boot ROM to load and jump directly to BL2. This mode is selected
+when the build flag ``BL2_AT_EL3`` is enabled. The main differences in this
+mode are:
+#. BL2 includes the reset code and the mailbox mechanism to differentiate
+   cold boot and warm boot. It runs at EL3 doing the arch
+   initialization required for EL3.
+#. BL2 does not receive the meminfo information from BL1 anymore. This
+   information can be passed by the Boot ROM or be internal to the
+   BL2 image.
+#. Since BL2 executes at EL3, BL2 jumps directly to the next image,
+   instead of invoking the RUN_IMAGE SMC call.
+We assume three different types of Boot ROM support on the platform:
+#. The Boot ROM always jumps to the same address, for both cold
+   and warm boot. In this case, we will need to keep a resident part
+   of BL2 whose memory cannot be reclaimed by any other image. The
+   linker script defines the symbols ``__TEXT_RESIDENT_START__`` and
+   ``__TEXT_RESIDENT_END__`` that allow the platform to configure
+   the memory map correctly.
+#. The platform has some mechanism to indicate the jump address to the
+   Boot ROM. Platform code can then program the jump address with
+   ``psci_warmboot_entrypoint`` during cold boot.
+#. The platform has some mechanism to program the reset address using
+   the ``PROGRAMMABLE_RESET_ADDRESS`` feature. Platform code can then
+   program the reset address with ``psci_warmboot_entrypoint`` during
+   cold boot, bypassing the Boot ROM for warm boot.
+In the last 2 cases, no part of BL2 needs to remain resident at
+runtime. In the first 2 cases, we expect the Boot ROM to be able to
+differentiate between warm and cold boot, to avoid loading BL2 again
+during warm boot.
+This functionality can be tested with FVP loading the image directly
+in memory and changing the address where the system jumps at reset.
+For example:
+::
+	-C cluster0.cpu0.RVBAR=0x4022000
+	--data cluster0.cpu0=bl2.bin@0x4022000
+With this configuration, FVP is like a platform of the first case,
+where the Boot ROM always jumps to the same address. For simplification,
+BL32 is loaded in DRAM in this case, to avoid other images reclaiming
+BL2 memory.
+AArch64 BL31
+The image for this stage is loaded by BL2 and BL1 passes control to BL31 at
+EL3. BL31 executes solely in trusted SRAM. BL31 is linked against and
+loaded at a platform-specific base address (more information can be found later
+in this document). The functionality implemented by BL31 is as follows.
+Architectural initialization
+Currently, BL31 performs a similar architectural initialization to BL1 as
+far as system register settings are concerned. Since BL1 code resides in ROM,
+architectural initialization in BL31 allows override of any previous
+initialization done by BL1.
+BL31 initializes the per-CPU data framework, which provides a cache of
+frequently accessed per-CPU data optimised for fast, concurrent manipulation
+on different CPUs. This buffer includes pointers to per-CPU contexts, crash
+buffer, CPU reset and power down operations, PSCI data, platform data and so on.
+It then replaces the exception vectors populated by BL1 with its own. BL31
+exception vectors implement more elaborate support for handling SMCs since this
+is the only mechanism to access the runtime services implemented by BL31 (PSCI
+for example). BL31 checks each SMC for validity as specified by the
+`SMC calling convention PDD`_ before passing control to the required SMC
+handler routine.
+BL31 programs the ``CNTFRQ_EL0`` register with the clock frequency of the system
+counter, which is provided by the platform.
+Platform initialization
+BL31 performs detailed platform initialization, which enables normal world
+software to function correctly.
+On Arm platforms, this consists of the following:
+-  Initialize the console.
+-  Configure the Interconnect to enable hardware coherency.
+-  Enable the MMU and map the memory it needs to access.
+-  Initialize the generic interrupt controller.
+-  Initialize the power controller device.
+-  Detect the system topology.
+Runtime services initialization
+BL31 is responsible for initializing the runtime services. One of them is PSCI.
+As part of the PSCI initializations, BL31 detects the system topology. It also
+initializes the data structures that implement the state machine used to track
+the state of power domain nodes. The state can be one of ``OFF``, ``RUN`` or
+``RETENTION``. All secondary CPUs are initially in the ``OFF`` state. The cluster
+that the primary CPU belongs to is ``ON``; any other cluster is ``OFF``. It also
+initializes the locks that protect them. BL31 accesses the state of a CPU or
+cluster immediately after reset and before the data cache is enabled in the
+warm boot path. Spinlocks based on 'exclusive' accesses cannot currently be
+used at that point, so BL31 uses locks based on Lamport's Bakery algorithm
+instead.
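For readers unfamiliar with Lamport's Bakery algorithm, the following is a minimal sketch of the idea and is not the TF-A implementation. It assumes a fixed number of CPUs (``SKETCH_NUM_CPUS``), uses hypothetical ``sketch_`` names, and relies on C11 sequentially consistent atomics to stand in for the barriers and cache maintenance that real firmware would need. The point of the algorithm is that it needs only ordinary loads and stores, not exclusive accesses.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NUM_CPUS 4u   /* assumption for this sketch */

typedef struct {
    atomic_bool choosing[SKETCH_NUM_CPUS]; /* CPU is picking a ticket   */
    atomic_uint ticket[SKETCH_NUM_CPUS];   /* 0 means "not contending"  */
} sketch_bakery_lock_t;

static void sketch_bakery_acquire(sketch_bakery_lock_t *lock, unsigned int me)
{
    unsigned int max = 0;

    /* Take a ticket one larger than any ticket currently held. */
    atomic_store(&lock->choosing[me], true);
    for (unsigned int i = 0; i < SKETCH_NUM_CPUS; i++) {
        unsigned int t = atomic_load(&lock->ticket[i]);
        if (t > max)
            max = t;
    }
    atomic_store(&lock->ticket[me], max + 1u);
    atomic_store(&lock->choosing[me], false);

    /* Wait for every CPU with a smaller (ticket, id) pair to finish. */
    for (unsigned int other = 0; other < SKETCH_NUM_CPUS; other++) {
        if (other == me)
            continue;

        while (atomic_load(&lock->choosing[other])) {
            /* spin until the other CPU has chosen its ticket */
        }

        for (;;) {
            unsigned int theirs = atomic_load(&lock->ticket[other]);
            unsigned int mine = atomic_load(&lock->ticket[me]);

            if (theirs == 0u || theirs > mine ||
                (theirs == mine && other > me))
                break;
        }
    }
}

static void sketch_bakery_release(sketch_bakery_lock_t *lock, unsigned int me)
{
    atomic_store(&lock->ticket[me], 0u);
}
```

Ties between equal tickets are broken by CPU index, so two CPUs that pick the same number still enter the critical section one at a time.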
+The runtime service framework and its initialization are described in more
+detail in the "EL3 runtime services framework" section below.
+Details about the status of the PSCI implementation are provided in the
+"Power State Coordination Interface" section below.
+AArch64 BL32 (Secure-EL1 Payload) image initialization
+If a BL32 image is present then there must be a matching Secure-EL1 Payload
+Dispatcher (SPD) service (see later for details). During initialization
+that service must register a function to carry out initialization of BL32
+once the runtime services are fully initialized. BL31 invokes such a
+registered function to initialize BL32 before running BL33. This initialization
+is not necessary for AArch32 SPs.
+Details on BL32 initialization and the SPD's role are described in the
+"Secure-EL1 Payloads and Dispatchers" section below.
+BL33 (Non-trusted Firmware) execution
+EL3 Runtime Software initializes the EL2 or EL1 processor context for normal-
+world cold boot, ensuring that no secure state information finds its way into
+the non-secure execution state. EL3 Runtime Software uses the entrypoint
+information provided by BL2 to jump to the Non-trusted firmware image (BL33)
+at the highest available Exception Level (EL2 if available, otherwise EL1).
+Using alternative Trusted Boot Firmware in place of BL1 & BL2 (AArch64 only)
+Some platforms have existing implementations of Trusted Boot Firmware that
+would like to use TF-A BL31 for the EL3 Runtime Software. To enable this
+firmware architecture it is important to provide a fully documented and stable
+interface between the Trusted Boot Firmware and BL31.
+Future changes to the BL31 interface will be done in a backwards compatible
+way, and this enables these firmware components to be independently enhanced/
+updated to develop and exploit new functionality.
+Required CPU state when calling ``bl31_entrypoint()`` during cold boot
+This function must only be called by the primary CPU.
+On entry to this function the calling primary CPU must be executing in AArch64
+EL3, little-endian data access, and all interrupt sources masked:
+::
+    PSTATE.EL = 3
+    PSTATE.RW = 1
+    PSTATE.DAIF = 0xf
+    SCTLR_EL3.EE = 0
+X0 and X1 can be used to pass information from the Trusted Boot Firmware to the
+platform code in BL31:
+::
+    X0 : Reserved for common TF-A information
+    X1 : Platform specific information
+BL31 zero-init sections (e.g. ``.bss``) should not contain valid data on entry,
+because they are zero filled prior to invoking platform setup code.
+Use of the X0 and X1 parameters
+The parameters are platform specific and passed from ``bl31_entrypoint()`` to
+``bl31_early_platform_setup()``. The value of these parameters is never directly
+used by the common BL31 code.
+The convention is that ``X0`` conveys information regarding the BL31, BL32 and
+BL33 images from the Trusted Boot firmware and ``X1`` can be used for other
+platform specific purposes. This convention allows platforms which use TF-A's
+BL1 and BL2 images to transfer additional platform specific information from
+Secure Boot without conflicting with future evolution of TF-A using ``X0`` to
+pass a ``bl31_params`` structure.
+BL31 common and SPD initialization code depends on image and entrypoint
+information about BL33 and BL32, which is provided via BL31 platform APIs.
+This information is required until the start of execution of BL33. This
+information can be provided in a platform defined manner, e.g. compiled into
+the platform code in BL31, or provided in a platform defined memory location
+by the Trusted Boot firmware, or passed from the Trusted Boot Firmware via the
+Cold boot Initialization parameters. This data may need to be cleaned out of
+the CPU caches if it is provided by an earlier boot stage and then accessed by
+BL31 platform code before the caches are enabled.
+TF-A's BL2 implementation passes a ``bl31_params`` structure in
+``X0`` and the Arm development platforms interpret this in the BL31 platform
+code.
+MMU, Data caches & Coherency
+BL31 does not depend on the enabled state of the MMU, data caches or
+interconnect coherency on entry to ``bl31_entrypoint()``. If these are disabled
+on entry, these should be enabled during ``bl31_plat_arch_setup()``.
+Data structures used in the BL31 cold boot interface
+These structures are designed to support compatibility and independent
+evolution of the structures and the firmware images. For example, a version of
+BL31 that can interpret the BL3x image information from different versions of
+BL2, a platform that uses an extended ``entry_point_info`` structure to convey
+additional register information to BL31, or an ELF image loader that can convey
+more details about the firmware images.
+To support these scenarios the structures are versioned and sized, which enables
+BL31 to detect which information is present and respond appropriately. The
+``param_header`` is defined to capture this information:
+.. code:: c
+    typedef struct param_header {
+        uint8_t type;       /* type of the structure */
+        uint8_t version;    /* version of this structure */
+        uint16_t size;      /* size of this structure in bytes */
+        uint32_t attr;      /* attributes: unused bits SBZ */
+    } param_header_t;
+The structures using this format are ``entry_point_info``, ``image_info`` and
+``bl31_params``. The code that allocates and populates these structures must set
+the header fields appropriately, and the ``SET_PARAM_HEAD()`` macro is defined
+to simplify this action.
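The versioned and sized header lets a consumer decide how much of a structure it can trust. The sketch below shows one way a producer could fill in the header and a consumer could check it. The type code, version value, the simplified entry point structure and the ``sketch_`` helpers are assumptions for illustration; they are not the definitions or the ``SET_PARAM_HEAD()`` macro used by TF-A.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct param_header {
    uint8_t  type;     /* type of the structure */
    uint8_t  version;  /* version of this structure */
    uint16_t size;     /* size of this structure in bytes */
    uint32_t attr;     /* attributes: unused bits SBZ */
} param_header_t;

/* Hypothetical type code and version for this sketch. */
#define SKETCH_PARAM_EP   0x01u
#define SKETCH_VERSION_1  0x01u

/* Simplified structure that embeds the header as its first member. */
typedef struct {
    param_header_t h;
    uintptr_t      pc;
    uint32_t       spsr;
} sketch_entry_point_info_t;

/* Producer side: record what kind of structure this is and how big. */
static void sketch_set_header(param_header_t *h, uint8_t type,
                              uint8_t version, uint16_t size, uint32_t attr)
{
    h->type = type;
    h->version = version;
    h->size = size;
    h->attr = attr;
}

/*
 * Consumer side: accept the structure only if it has the expected type
 * and is at least as new and as large as the consumer requires.
 */
static bool sketch_header_matches(const param_header_t *h, uint8_t type,
                                  uint8_t min_version, uint16_t min_size)
{
    return h->type == type &&
           h->version >= min_version &&
           h->size >= min_size;
}
```

A newer producer can append fields and bump ``version`` and ``size``; an older consumer still finds the fields it knows about at the same offsets.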
+Required CPU state for BL31 Warm boot initialization
+When requesting a CPU power-on, or suspending a running CPU, TF-A provides
+the platform power management code with a Warm boot initialization
+entry-point, to be invoked by the CPU immediately after the reset handler.
+On entry to the Warm boot initialization function the calling CPU must be in
+AArch64 EL3, little-endian data access and all interrupt sources masked:
+::
+    PSTATE.EL = 3
+    PSTATE.RW = 1
+    PSTATE.DAIF = 0xf
+    SCTLR_EL3.EE = 0
+The PSCI implementation will initialize the processor state and ensure that the
+platform power management code is then invoked as required to initialize all
+necessary system, cluster and CPU resources.
+AArch32 EL3 Runtime Software entrypoint interface
+To enable this firmware architecture it is important to provide a fully
+documented and stable interface between the Trusted Boot Firmware and the
+AArch32 EL3 Runtime Software.
+Future changes to the entrypoint interface will be done in a backwards
+compatible way, and this enables these firmware components to be independently
+enhanced/updated to develop and exploit new functionality.
+Required CPU state when entering during cold boot
+This function must only be called by the primary CPU.
+On entry to this function the calling primary CPU must be executing in AArch32
+EL3, little-endian data access, and all interrupt sources masked:
+::
+    PSTATE.AIF = 0x7
+    SCTLR.EE = 0
+R0 and R1 are used to pass information from the Trusted Boot Firmware to the
+platform code in AArch32 EL3 Runtime Software:
+::
+    R0 : Reserved for common TF-A information
+    R1 : Platform specific information
+Use of the R0 and R1 parameters
+The parameters are platform specific and the convention is that ``R0`` conveys
+information regarding the BL3x images from the Trusted Boot firmware and ``R1``
+can be used for other platform specific purposes. This convention allows
+platforms which use TF-A's BL1 and BL2 images to transfer additional platform
+specific information from Secure Boot without conflicting with future
+evolution of TF-A using ``R0`` to pass a ``bl_params`` structure.
+The AArch32 EL3 Runtime Software is responsible for entry into BL33. This
+information can be obtained in a platform defined manner, e.g. compiled into
+the AArch32 EL3 Runtime Software, or provided in a platform defined memory
+location by the Trusted Boot firmware, or passed from the Trusted Boot Firmware
+via the Cold boot Initialization parameters. This data may need to be cleaned
+out of the CPU caches if it is provided by an earlier boot stage and then
+accessed by AArch32 EL3 Runtime Software before the caches are enabled.
+When using AArch32 EL3 Runtime Software, the Arm development platforms pass a
+``bl_params`` structure in ``R0`` from BL2 to be interpreted by AArch32 EL3 Runtime
+Software platform code.
+MMU, Data caches & Coherency
+AArch32 EL3 Runtime Software must not depend on the enabled state of the MMU,
+data caches or interconnect coherency in its entrypoint. They must be explicitly
+enabled if required.
+Data structures used in cold boot interface
+The AArch32 EL3 Runtime Software cold boot interface uses ``bl_params`` instead
+of ``bl31_params``. The ``bl_params`` structure is based on the convention
+described in the AArch64 BL31 cold boot interface section.
+Required CPU state for warm boot initialization
+When requesting a CPU power-on, or suspending a running CPU, AArch32 EL3
+Runtime Software must ensure execution of a warm boot initialization entrypoint.
+If TF-A BL1 is used and the ``PROGRAMMABLE_RESET_ADDRESS`` build flag is false,
+then AArch32 EL3 Runtime Software must ensure that BL1 branches to the warm
+boot entrypoint by arranging for the BL1 platform function,
+``plat_get_my_entrypoint()``, to return a non-zero value.
+In this case, the warm boot entrypoint must be in AArch32 EL3, little-endian
+data access and all interrupt sources masked:
+::
+    PSTATE.AIF = 0x7
+    SCTLR.EE = 0
+The warm boot entrypoint may be implemented by using the TF-A
+``psci_warmboot_entrypoint()`` function. In that case, the platform must fulfil
+the pre-requisites mentioned in the `PSCI Library integration guide`_.
+EL3 runtime services framework
+Software executing in the non-secure state and in the secure state at exception
+levels lower than EL3 will request runtime services using the Secure Monitor
+Call (SMC) instruction. These requests will follow the convention described in
+the SMC Calling Convention PDD (`SMCCC`_). The `SMCCC`_ assigns function
+identifiers to each SMC request and describes how arguments are passed and
+returned.
+The EL3 runtime services framework enables the development of services by
+different providers that can be easily integrated into final product firmware.
+The following sections describe the framework which facilitates the
+registration, initialization and use of runtime services in EL3 Runtime
+Software (BL31).
+The design of the runtime services depends heavily on the concepts and
+definitions described in the `SMCCC`_, in particular SMC Function IDs, Owning
+Entity Numbers (OEN), Fast and Yielding calls, and the SMC32 and SMC64 calling
+conventions. Please refer to that document for more detailed explanation of
+these terms.
+The following runtime services are expected to be implemented first. They have
+not all been instantiated in the current implementation.
+#. Standard service calls
+   This service is for management of the entire system. The Power State
+   Coordination Interface (`PSCI`_) is the first set of standard service calls
+   defined by Arm (see PSCI section later).
+#. Secure-EL1 Payload Dispatcher service
+   If a system runs a Trusted OS or other Secure-EL1 Payload (SP) then
+   it also requires a *Secure Monitor* at EL3 to switch the EL1 processor
+   context between the normal world (EL1/EL2) and trusted world (Secure-EL1).
+   The Secure Monitor will make these world switches in response to SMCs. The
+   `SMCCC`_ provides for such SMCs with the Trusted OS Call and Trusted
+   Application Call OEN ranges.
+   The interface between the EL3 Runtime Software and the Secure-EL1 Payload is
+   not defined by the `SMCCC`_ or any other standard. As a result, each
+   Secure-EL1 Payload requires a specific Secure Monitor that runs as a runtime
+   service - within TF-A this service is referred to as the Secure-EL1 Payload
+   Dispatcher (SPD).
+   TF-A provides a Test Secure-EL1 Payload (TSP) and its associated Dispatcher
+   (TSPD). Details of SPD design and TSP/TSPD operation are described in the
+   "Secure-EL1 Payloads and Dispatchers" section below.
+#. CPU implementation service
+   This service will provide an interface to CPU implementation specific
+   services for a given platform e.g. access to processor errata workarounds.
+   This service is currently unimplemented.
+Additional services for Arm Architecture, SiP and OEM calls can be implemented.
+Each implemented service handles a range of SMC function identifiers as
+described in the `SMCCC`_.
+A runtime service is registered using the ``DECLARE_RT_SVC()`` macro, specifying
+the name of the service, the range of OENs covered, the type of service and
+initialization and call handler functions. This macro instantiates a
+``const struct rt_svc_desc`` for the service with these details (see
+``runtime_svc.h``).
+This structure is allocated in a special ELF section ``rt_svc_descs``, enabling
+the framework to find all service descriptors included into BL31.
+The specific service for an SMC Function is selected based on the OEN and call
+type of the Function ID, and the framework uses that information in the service
+descriptor to identify the handler for the SMC Call.
+The service descriptors do not include information to identify the precise set
+of SMC function identifiers supported by this service implementation, the
+security state from which such calls are valid nor the capability to support
+64-bit and/or 32-bit callers (using SMC32 or SMC64). Responding appropriately
+to these aspects of an SMC call is the responsibility of the service
+implementation; the framework is focused on integration of services from
+different providers and minimizing the time taken by the framework before the
+service handler is invoked.
+Details of the parameters, requirements and behavior of the initialization and
+call handling functions are provided in the following sections.
+``runtime_svc_init()`` in ``runtime_svc.c`` initializes the runtime services
+framework running on the primary CPU during cold boot as part of the BL31
+initialization. This happens prior to initializing a Trusted OS and running
+Normal world boot firmware that might in turn use these services.
+Initialization involves validating each of the declared runtime service
+descriptors, calling the service initialization function and populating the
+index used for runtime lookup of the service.
+The BL31 linker script collects all of the declared service descriptors into a
+single array and defines symbols that allow the framework to locate and traverse
+the array, and determine its size.
+The framework does basic validation of each descriptor to halt firmware
+initialization if service declaration errors are detected. The framework does
+not check descriptors for the following error conditions, and may behave in an
+unpredictable manner under such scenarios:
+#. Overlapping OEN ranges
+#. Multiple descriptors for the same range of OENs and ``call_type``
+#. Incorrect range of owning entity numbers for a given ``call_type``
+Once validated, the service ``init()`` callback is invoked. This function carries
+out any essential EL3 initialization before servicing requests. The ``init()``
+function is only invoked on the primary CPU during cold boot. If the service
+uses per-CPU data this must either be initialized for all CPUs during this call,
+or be done lazily when a CPU first issues an SMC call to that service. If
+``init()`` returns anything other than ``0``, this is treated as an initialization
+error and the service is ignored: this does not cause the firmware to halt.
+The OEN and call type fields present in the SMC Function ID cover a total of
+128 distinct services, but in practice a single descriptor can cover a range of
+OENs, e.g. SMCs to call a Trusted OS function. To optimize the lookup of a
+service handler, the framework uses an array of 128 indices that map every
+distinct OEN/call-type combination either to one of the declared services or to
+indicate the service is not handled. This ``rt_svc_descs_indices[]`` array is
+populated for all of the OENs covered by a service after the service ``init()``
+function has reported success. So a service that fails to initialize will never
+have its ``handle()`` function invoked.
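+The initialization sequence described above can be sketched in C as follows.
+This is a simplified illustration, not the TF-A source: ``svc_desc_t``,
+``svc_index[]``, ``svc_framework_init()`` and the two stub ``init()`` functions
+are hypothetical names, and the index layout (call type in bit 6, OEN in
+bits 5:0) is an assumption made for the example.
+.. code:: c
+    #include <stddef.h>
+    #include <stdint.h>
+    #define MAX_SVCS   128          /* 2 call types x 64 OENs */
+    #define NO_HANDLER (-1)         /* marks an unhandled OEN/call-type slot */
+    /* Hypothetical, simplified service descriptor. */
+    typedef struct {
+        uint8_t start_oen;          /* first OEN covered (inclusive) */
+        uint8_t end_oen;            /* last OEN covered (inclusive) */
+        uint8_t call_type;          /* 0 = yielding, 1 = fast */
+        int32_t (*init)(void);      /* returns 0 on success */
+    } svc_desc_t;
+    static int8_t svc_index[MAX_SVCS];
+    /* Example stubs: one service that initializes and one that fails. */
+    static int32_t demo_ok_init(void)  { return 0; }
+    static int32_t demo_bad_init(void) { return -1; }
+    /* Call each init() and index only the services that succeed. */
+    static void svc_framework_init(const svc_desc_t *descs, size_t count)
+    {
+        for (size_t i = 0; i < MAX_SVCS; i++)
+            svc_index[i] = NO_HANDLER;
+        for (size_t d = 0; d < count; d++) {
+            if (descs[d].init != NULL && descs[d].init() != 0)
+                continue;           /* failed service is ignored, boot continues */
+            for (unsigned int oen = descs[d].start_oen; oen <= descs[d].end_oen; oen++)
+                svc_index[((descs[d].call_type & 1u) << 6) | (oen & 0x3fu)] = (int8_t)d;
+        }
+    }
+With this sketch, a descriptor whose ``init()`` fails leaves all of its slots
+marked as unhandled, which mirrors the behaviour described in the text.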
+The following figure shows how the ``rt_svc_descs_indices[]`` index maps the SMC
+Function ID call type and OEN onto a specific service handler in the
+``rt_svc_descs[]`` array.
+|Image 1|
+Handling an SMC
+When the EL3 runtime services framework receives a Secure Monitor Call, the SMC
+Function ID is passed in W0 from the lower exception level (as per the
+`SMCCC`_). If the calling register width is AArch32, it is invalid to invoke an
+SMC Function which indicates the SMC64 calling convention: such calls are
+ignored and return the Unknown SMC Function Identifier result code ``0xFFFFFFFF``
+in R0/X0.
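+The calling-convention check above can be illustrated with a short C sketch.
+It relies on the SMCCC encoding in which bit[30] of the Function ID selects
+SMC64; the macro and function names are hypothetical and are not taken from
+the TF-A source.
+.. code:: c
+    #include <stdbool.h>
+    #include <stdint.h>
+    #define SMC_UNK        0xFFFFFFFFu  /* Unknown SMC Function Identifier */
+    #define FUNCID_CC_BIT  (1u << 30)   /* set for SMC64, clear for SMC32 */
+    /* Returns true if the call must be rejected with SMC_UNK. */
+    static bool smc_is_invalid_for_caller(uint32_t smc_fid, bool caller_is_aarch32)
+    {
+        return caller_is_aarch32 && (smc_fid & FUNCID_CC_BIT) != 0u;
+    }
+    /* Result placed in R0/X0 when the call is rejected; 0 stands for "continue". */
+    static uint32_t smc_precheck(uint32_t smc_fid, bool caller_is_aarch32)
+    {
+        return smc_is_invalid_for_caller(smc_fid, caller_is_aarch32) ? SMC_UNK : 0u;
+    }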
+Bit[31] (fast/yielding call) and bits[29:24] (owning entity number) of the SMC
+Function ID are combined to index into the ``rt_svc_descs_indices[]`` array. The
+resulting value might indicate a service that has no handler; in this case the
+framework will also report an Unknown SMC Function ID. Otherwise, the value is
+used as a further index into the ``rt_svc_descs[]`` array to locate the required
+service and handler.
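+As a rough illustration of the lookup described above, the following C sketch
+extracts the call type and OEN from a Function ID and combines them into a
+7-bit index. The helper names and the exact bit order of the combined index
+(call type in bit 6, OEN in bits 5:0) are assumptions made for this example.
+.. code:: c
+    #include <stdint.h>
+    /* Field positions as described above: bit 31 = call type, bits 29:24 = OEN. */
+    static inline uint32_t smc_call_type(uint32_t fid) { return (fid >> 31) & 0x1u; }
+    static inline uint32_t smc_oen(uint32_t fid)       { return (fid >> 24) & 0x3fu; }
+    /* Assumed layout of the 128-entry index: call type in bit 6, OEN in bits 5:0. */
+    static inline uint32_t smc_index(uint32_t fid)
+    {
+        return (smc_call_type(fid) << 6) | smc_oen(fid);
+    }
+For example, under these assumptions a fast call with OEN 4 (Function ID
+``0x84000000``) maps to slot 68, and a yielding call with OEN 50 (Function ID
+``0x32000000``) maps to slot 50.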
+The service's ``handle()`` callback is provided with five of the SMC parameters
+directly; the others are saved into memory for retrieval (if needed) by the
+handler. The handler is also provided with an opaque ``handle`` for use with the
+supporting library for parameter retrieval, setting return values and context
+manipulation; and with ``flags`` indicating the security state of the caller. The
+framework finally sets up the execution stack for the handler, and invokes the
+service's ``handle()`` function.
+On return from the handler the result registers are populated in X0-X3 before
+restoring the stack and CPU state and returning from the original SMC.
+Exception Handling Framework
+Please refer to the `Exception Handling Framework`_ document.
+Power State Coordination Interface
+TODO: Provide design walkthrough of PSCI implementation.
+The PSCI v1.1 specification categorizes APIs as optional and mandatory. All the
+mandatory APIs in PSCI v1.1, PSCI v1.0 and in PSCI v0.2 draft specification
+`Power State Coordination Interface PDD`_ are implemented. The table lists
+the PSCI v1.1 APIs and their support in generic code.
+An API implementation might have a dependency on platform code e.g. CPU_SUSPEND
+requires the platform to export a part of the implementation. Hence the level
+of support of the mandatory APIs depends upon the support exported by the
+platform port as well. The Juno and FVP (all variants) platforms export all the
+required support.
+| PSCI v1.1 API               | Supported   | Comments                      |
+| ``PSCI_VERSION``            | Yes         | The version returned is 1.1   |
+| ``CPU_SUSPEND``             | Yes\*       |                               |
+| ``CPU_OFF``                 | Yes\*       |                               |
+| ``CPU_ON``                  | Yes\*       |                               |
+| ``AFFINITY_INFO``           | Yes         |                               |
+| ``MIGRATE``                 | Yes\*\*     |                               |
+| ``MIGRATE_INFO_TYPE``       | Yes\*\*     |                               |
+| ``MIGRATE_INFO_CPU``        | Yes\*\*     |                               |
+| ``SYSTEM_OFF``              | Yes\*       |                               |
+| ``SYSTEM_RESET``            | Yes\*       |                               |
+| ``PSCI_FEATURES``           | Yes         |                               |
+| ``CPU_FREEZE``              | No          |                               |
+| ``CPU_DEFAULT_SUSPEND``     | No          |                               |
+| ``NODE_HW_STATE``           | Yes\*       |                               |
+| ``SYSTEM_SUSPEND``          | Yes\*       |                               |
+| ``PSCI_SET_SUSPEND_MODE``   | No          |                               |
+| ``PSCI_STAT_RESIDENCY``     | Yes\*       |                               |
+| ``PSCI_STAT_COUNT``         | Yes\*       |                               |
+| ``SYSTEM_RESET2``           | Yes\*       |                               |
+| ``MEM_PROTECT``             | Yes\*       |                               |
+| ``MEM_PROTECT_CHECK_RANGE`` | Yes\*       |                               |
+\*Note : These PSCI APIs require platform power management hooks to be
+registered with the generic PSCI code to be supported.
+\*\*Note : These PSCI APIs require appropriate Secure Payload Dispatcher
+hooks to be registered with the generic PSCI code to be supported.
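+As a small aid for callers of ``PSCI_VERSION``, the sketch below decodes the
+returned value, using the encoding from the PSCI specification (major version
+in bits[31:16], minor version in bits[15:0]). The helper names are
+hypothetical.
+.. code:: c
+    #include <stdint.h>
+    /* PSCI_VERSION return value: major in bits [31:16], minor in bits [15:0]. */
+    static inline uint32_t psci_ver_major(uint32_t ver) { return ver >> 16; }
+    static inline uint32_t psci_ver_minor(uint32_t ver) { return ver & 0xffffu; }
+With this encoding, version 1.1 is reported as ``0x00010001``.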
+The PSCI implementation in TF-A is a library which can be integrated with
+AArch64 or AArch32 EL3 Runtime Software for Armv8-A systems. A guide to
+integrating PSCI library with AArch32 EL3 Runtime Software can be found
+Secure-EL1 Payloads and Dispatchers
+On a production system that includes a Trusted OS running in Secure-EL1/EL0,
+the Trusted OS is coupled with a companion runtime service in the BL31
+firmware. This service is responsible for the initialisation of the Trusted
+OS and all communications with it. The Trusted OS is the BL32 stage of the
+boot flow in TF-A. The firmware will attempt to locate, load and execute a
+BL32 image.
+TF-A uses a more general term for the BL32 software that runs at Secure-EL1 -
+the *Secure-EL1 Payload* - as it is not always a Trusted OS.
+TF-A provides a Test Secure-EL1 Payload (TSP) and a Test Secure-EL1 Payload
+Dispatcher (TSPD) service as an example of how a Trusted OS is supported on a
+production system using the Runtime Services Framework. On such a system, the
+Test BL32 image and service are replaced by the Trusted OS and its dispatcher
+service. The TF-A build system expects that the dispatcher will define the
+build flag ``NEED_BL32`` to enable it to include the BL32 in the build either
+as a binary or to compile from source depending on whether the ``BL32`` build
+option is specified or not.
+The TSP runs in Secure-EL1. It is designed to demonstrate synchronous
+communication with the normal-world software running in EL1/EL2. Communication
+is initiated by the normal-world software
+-  either directly through a Fast SMC (as defined in the `SMCCC`_)
+-  or indirectly through a `PSCI`_ SMC. The `PSCI`_ implementation in turn
+   informs the TSPD about the requested power management operation. This allows
+   the TSP to prepare for or respond to the power state change
+The TSPD service is responsible for:
+-  Initializing the TSP
+-  Routing requests and responses between the secure and the non-secure
+   states during the two types of communications just described
+Initializing a BL32 Image
+The Secure-EL1 Payload Dispatcher (SPD) service is responsible for initializing
+the BL32 image. It needs access to the information passed by BL2 to BL31 to do
+so. This is provided by:
+.. code:: c
+    entry_point_info_t *bl31_plat_get_next_image_ep_info(uint32_t);
+which returns a reference to the ``entry_point_info`` structure corresponding to
+the image which will be run in the specified security state. The SPD uses this
+API to get entry point information for the SECURE image, BL32.
+In the absence of a BL32 image, BL31 passes control to the normal world
+bootloader image (BL33). When the BL32 image is present, it is typical
+that the SPD wants control to be passed to BL32 first and then later to BL33.
+To do this the SPD has to register a BL32 initialization function during
+initialization of the SPD service. The BL32 initialization function has this
+prototype:
+.. code:: c
+    int32_t init(void);
+and is registered using the ``bl31_register_bl32_init()`` function.
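+The registration flow can be modelled with the self-contained sketch below.
+All names in it (``register_bl32_init()``, ``demo_spd_setup()``,
+``demo_bl31_main()`` and so on) are stand-ins invented for illustration; they
+only mimic the shape of the mechanism and are not the TF-A implementation.
+The meaning of the hook's return value is left out of the sketch.
+.. code:: c
+    #include <stddef.h>
+    #include <stdint.h>
+    /* Stand-in for the pointer BL31 keeps for the registered BL32 init hook. */
+    static int32_t (*bl32_init_fn)(void);
+    static void register_bl32_init(int32_t (*fn)(void)) { bl32_init_fn = fn; }
+    static int bl32_ran;            /* records whether the hook was invoked */
+    /* Hypothetical SPD hook that would enter BL32 and wait for it to return. */
+    static int32_t demo_spd_bl32_init(void) { bl32_ran = 1; return 0; }
+    /* Hypothetical SPD setup: registers the hook during service initialization. */
+    static int32_t demo_spd_setup(void)
+    {
+        register_bl32_init(demo_spd_bl32_init);
+        return 0;
+    }
+    /* Simplified boot flow: run BL32 first only if an SPD registered a hook. */
+    static void demo_bl31_main(void)
+    {
+        if (bl32_init_fn != NULL)
+            (void)bl32_init_fn();
+        /* ... then prepare the exit to the normal world firmware (BL33) ... */
+    }
+If no SPD registers a hook, the sketch skips BL32 entirely, which corresponds
+to BL31 passing control straight to BL33.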
+TF-A supports two approaches for the SPD to pass control to BL32 before
+returning through EL3 and running the non-trusted firmware (BL33):
+#. In the BL32 setup function, use ``bl31_set_next_image_type()`` to
+   request that the exit from ``bl31_main()`` is to the BL32 entrypoint in
+   Secure-EL1. BL31 will exit to BL32 using the asynchronous method by
+   calling ``bl31_prepare_next_image_entry()`` and ``el3_exit()``.
+   When the BL32 has completed initialization at Secure-EL1, it returns to
+   BL31 by issuing an SMC, using a Function ID allocated to the SPD. On
+   receipt of this SMC, the SPD service handler should switch the CPU context
+   from trusted to normal world and use the ``bl31_set_next_image_type()`` and
+   ``bl31_prepare_next_image_entry()`` functions to set up the initial return to
+   the normal world firmware BL33. On return from the handler the framework
+   will exit to EL2 and run BL33.
+#. The BL32 setup function registers an initialization function using
+   ``bl31_register_bl32_init()`` which provides an SPD-defined mechanism to
+   invoke a 'world-switch synchronous call' to Secure-EL1 to run the BL32
+   entrypoint.
+   NOTE: The Test SPD service included with TF-A provides one implementation
+   of such a mechanism.
+   On completion BL32 returns control to BL31 via an SMC, and on receipt the
+   SPD service handler invokes the synchronous call return mechanism to return
+   to the BL32 initialization function. On return from this function,
+   ``bl31_main()`` will set up the return to the normal world firmware BL33 and
+   continue the boot process in the normal world.
+Crash Reporting in BL31
+BL31 implements a scheme for reporting the processor state when an unhandled
+exception is encountered. The reporting mechanism attempts to preserve all the
+register contents and report them via a dedicated UART (PL011 console). BL31
+reports the general purpose, EL3, Secure EL1 and some EL2 state registers.
+A dedicated per-CPU crash stack is maintained by BL31 and this is retrieved via
+the per-CPU pointer cache. The implementation attempts to minimise the memory
+required for this feature. The file ``crash_reporting.S`` contains the
+implementation for crash reporting.
+The sample crash output is shown below.
+    x0  :0x000000004F00007C
+    x1  :0x0000000007FFFFFF
+    x2  :0x0000000004014D50
+    x3  :0x0000000000000000
+    x4  :0x0000000088007998
+    x5  :0x00000000001343AC
+    x6  :0x0000000000000016
+    x7  :0x00000000000B8A38
+    x8  :0x00000000001343AC
+    x9  :0x00000000000101A8
+    x10 :0x0000000000000002
+    x11 :0x000000000000011C
+    x12 :0x00000000FEFDC644
+    x13 :0x00000000FED93FFC
+    x14 :0x0000000000247950
+    x15 :0x00000000000007A2
+    x16 :0x00000000000007A4
+    x17 :0x0000000000247950
+    x18 :0x0000000000000000
+    x19 :0x00000000FFFFFFFF
+    x20 :0x0000000004014D50
+    x21 :0x000000000400A38C
+    x22 :0x0000000000247950
+    x23 :0x0000000000000010
+    x24 :0x0000000000000024
+    x25 :0x00000000FEFDC868
+    x26 :0x00000000FEFDC86A
+    x27 :0x00000000019EDEDC
+    x28 :0x000000000A7CFDAA
+    x29 :0x0000000004010780
+    x30 :0x000000000400F004
+    scr_el3 :0x0000000000000D3D
+    sctlr_el3   :0x0000000000C8181F
+    cptr_el3    :0x0000000000000000
+    tcr_el3 :0x0000000080803520
+    daif    :0x00000000000003C0
+    mair_el3    :0x00000000000004FF
+    spsr_el3    :0x00000000800003CC
+    elr_el3 :0x000000000400C0CC
+    ttbr0_el3   :0x00000000040172A0
+    esr_el3 :0x0000000096000210
+    sp_el3  :0x0000000004014D50
+    far_el3 :0x000000004F00007C
+    spsr_el1    :0x0000000000000000
+    elr_el1 :0x0000000000000000
+    spsr_abt    :0x0000000000000000
+    spsr_und    :0x0000000000000000
+    spsr_irq    :0x0000000000000000
+    spsr_fiq    :0x0000000000000000
+    sctlr_el1   :0x0000000030C81807
+    actlr_el1   :0x0000000000000000
+    cpacr_el1   :0x0000000000300000
+    csselr_el1  :0x0000000000000002
+    sp_el1  :0x0000000004028800
+    esr_el1 :0x0000000000000000
+    ttbr0_el1   :0x000000000402C200
+    ttbr1_el1   :0x0000000000000000
+    mair_el1    :0x00000000000004FF
+    amair_el1   :0x0000000000000000
+    tcr_el1 :0x0000000000003520
+    tpidr_el1   :0x0000000000000000
+    tpidr_el0   :0x0000000000000000
+    tpidrro_el0 :0x0000000000000000
+    dacr32_el2  :0x0000000000000000
+    ifsr32_el2  :0x0000000000000000
+    par_el1 :0x0000000000000000
+    far_el1 :0x0000000000000000
+    afsr0_el1   :0x0000000000000000
+    afsr1_el1   :0x0000000000000000
+    contextidr_el1  :0x0000000000000000
+    vbar_el1    :0x0000000004027000
+    cntp_ctl_el0    :0x0000000000000000
+    cntp_cval_el0   :0x0000000000000000
+    cntv_ctl_el0    :0x0000000000000000
+    cntv_cval_el0   :0x0000000000000000
+    cntkctl_el1 :0x0000000000000000
+    sp_el0  :0x0000000004010780
+Guidelines for Reset Handlers
+TF-A implements a framework that allows CPU and platform ports to perform
+actions very early after a CPU is released from reset in both the cold and warm
+boot paths. This is done by calling the ``reset_handler()`` function in both
+the BL1 and BL31 images. It in turn calls the platform and CPU specific reset
+handling functions.
+Details for implementing a CPU specific reset handler can be found in
+Section 8. Details for implementing a platform specific reset handler can be
+found in the `Porting Guide`_ (see the ``plat_reset_handler()`` function).
+When adding functionality to a reset handler, keep in mind that if a different
+reset handling behavior is required between the first and the subsequent
+invocations of the reset handling code, this should be detected at runtime.
+In other words, the reset handler should be able to detect whether an action has
+already been performed and act as appropriate. Possible courses of action
+include skipping the action the second time, or undoing/redoing it.
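+The runtime-detection guideline above can be illustrated as follows. Real reset
+handlers run very early and are usually written in assembly; this C sketch
+only shows the "check before acting" pattern. The register, the enable bit
+and the function name are all hypothetical.
+.. code:: c
+    #include <stdint.h>
+    #define FEATURE_EN (1u << 0)        /* hypothetical enable bit */
+    static uint32_t fake_ctrl_reg;      /* stands in for a hardware register */
+    static unsigned int enable_count;   /* counts how often the action really ran */
+    /* Safe to call on every cold or warm boot: acts only if not already done. */
+    static void demo_reset_action(void)
+    {
+        if ((fake_ctrl_reg & FEATURE_EN) != 0u)
+            return;                     /* already performed: skip */
+        fake_ctrl_reg |= FEATURE_EN;
+        enable_count++;
+    }
+Calling ``demo_reset_action()`` from both BL1 and BL31 in this model performs
+the action once, because the second call detects the bit that is already set.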
+Configuring secure interrupts
+The GIC driver is responsible for performing initial configuration of secure
+interrupts on the platform. To this end, the platform is expected to provide the
+GIC driver (either GICv2 or GICv3, as selected by the platform) with the
+interrupt configuration during the driver initialisation.
+Secure interrupt configurations are specified in an array of secure interrupt
+properties. In this scheme, in both GICv2 and GICv3 driver data structures, the
+``interrupt_props`` member points to an array of interrupt properties. Each
+element of the array specifies the interrupt number and its attributes
+(priority, group, configuration). Each element of the array shall be populated
+by the macro ``INTR_PROP_DESC()``. The macro takes the following arguments:
+- 10-bit interrupt number,
+- 8-bit interrupt priority,
+- Interrupt type (one of ``INTR_TYPE_EL3``, ``INTR_TYPE_S_EL1``,
+  ``INTR_TYPE_NS``),
+- Interrupt configuration (either ``GIC_INTR_CFG_LEVEL`` or
+CPU specific operations framework
+Certain aspects of the Armv8-A architecture are implementation defined,
+that is, certain behaviours are not architecturally defined, but must be
+defined and documented by individual processor implementations. TF-A
+implements a framework which categorises the common implementation defined
+behaviours and allows a processor to export its implementation of that
+behaviour. The categories are:
+#. Processor specific reset sequence.
+#. Processor specific power down sequences.
+#. Processor specific register dumping as a part of crash reporting.
+#. Errata status reporting.
+Each of the above categories fulfils a different requirement.
+#. allows any processor specific initialization before the caches and MMU
+   are turned on, like implementation of errata workarounds, entry into
+   the intra-cluster coherency domain etc.
+#. allows each processor to implement the power down sequence mandated in
+   its Technical Reference Manual (TRM).
+#. allows a processor to provide additional information to the developer
+   in the event of a crash, for example Cortex-A53 has registers which
+   can expose the data cache contents.
+#. allows a processor to define a function that inspects and reports the status
+   of all errata workarounds on that processor.
+Please note that only 2. is mandated by the TRM.
+The CPU specific operations framework scales to accommodate a large number of
+different CPUs during power down and reset handling. The platform can specify
+any CPU optimization it wants to enable for each CPU. It can also specify
+the CPU errata workarounds to be applied for each CPU type during reset
+handling by defining CPU errata compile time macros. Details on these macros
+can be found in the `cpu-specific-build-macros.rst`_ file.
+The CPU specific operations framework depends on the ``cpu_ops`` structure which
+needs to be exported for each type of CPU in the platform. It is defined in
+``include/lib/cpus/aarch64/cpu_macros.S`` and has the following fields: ``midr``,
+``reset_func()``, ``cpu_pwr_down_ops`` (array of power down functions) and
+The CPU specific files in ``lib/cpus`` export a ``cpu_ops`` data structure with
+suitable handlers for that CPU. For example, ``lib/cpus/aarch64/cortex_a53.S``
+exports the ``cpu_ops`` for Cortex-A53 CPU. According to the platform
+configuration, these CPU specific files must be included in the build by
+the platform makefile. The generic CPU specific operations framework code exists
+in ``lib/cpus/aarch64/cpu_helpers.S``.
+CPU specific Reset Handling
+After a reset, the state of the CPU when it calls the generic reset handler is:
+MMU turned off, both instruction and data caches turned off and not part
+of any coherency domain.
+The BL entrypoint code first invokes the ``plat_reset_handler()`` to allow
+the platform to perform any system initialization required and any system
+errata workarounds that need to be applied. The ``get_cpu_ops_ptr()`` reads
+the current CPU midr, finds the matching ``cpu_ops`` entry in the ``cpu_ops``
+array and returns it. Note that only the part number and implementer fields
+in midr are used to find the matching ``cpu_ops`` entry. The ``reset_func()`` in
+the returned ``cpu_ops`` is then invoked which executes the required reset
+handling for that CPU and also any errata workarounds enabled by the platform.
+This function must preserve the values of general purpose registers x20 to x29.
+Refer to Section "Guidelines for Reset Handlers" for general guidelines
+regarding placement of code in a reset handler.
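+The MIDR-based lookup described above can be sketched in C as shown below. The
+real lookup is implemented in assembly; the table, type and function names
+here are hypothetical. The mask follows the architectural MIDR layout, where
+the implementer occupies bits[31:24] and the part number bits[15:4], so the
+variant and revision fields are ignored when matching.
+.. code:: c
+    #include <stddef.h>
+    #include <stdint.h>
+    /* MIDR: implementer in bits [31:24], part number in bits [15:4]. */
+    #define MIDR_MATCH_MASK 0xFF00FFF0u
+    typedef struct {
+        uint32_t midr;                  /* reference MIDR for this CPU type */
+        const char *name;               /* illustrative only */
+    } demo_cpu_ops_t;
+    static const demo_cpu_ops_t demo_cpu_ops[] = {
+        { 0x410FD030u, "cortex_a53" },
+        { 0x410FD070u, "cortex_a57" },
+    };
+    /* Returns the matching entry, or NULL if the CPU is not in the table. */
+    static const demo_cpu_ops_t *demo_get_cpu_ops(uint32_t midr)
+    {
+        for (size_t i = 0; i < sizeof(demo_cpu_ops) / sizeof(demo_cpu_ops[0]); i++) {
+            if (((midr ^ demo_cpu_ops[i].midr) & MIDR_MATCH_MASK) == 0u)
+                return &demo_cpu_ops[i];
+        }
+        return NULL;
+    }
+Because only the implementer and part number are compared, every revision of a
+given CPU type resolves to the same entry.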
+CPU specific power down sequence
+During the BL31 initialization sequence, the pointer to the matching ``cpu_ops``
+entry is stored in per-CPU data by ``init_cpu_ops()`` so that it can be quickly
+retrieved during power down sequences.
+Various CPU drivers register handlers to perform power down at certain power
+levels for that specific CPU. The PSCI service, upon receiving a power down
+request, determines the highest power level at which to execute the power down
+sequence for a particular CPU. It uses the ``prepare_cpu_pwr_dwn()`` function to
+pick the right power down handler for the requested level. The function
+retrieves the ``cpu_ops`` pointer member of the per-CPU data, and from that,
+further retrieves the ``cpu_pwr_down_ops`` array, and indexes into the required
+level. If the requested power level is higher than what a CPU driver supports,
+the handler registered for the highest level is invoked.
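+The handler selection described above, including the clamping of the requested
+level, can be sketched as follows. The number of levels, the handler names and
+``demo_prepare_cpu_pwr_dwn()`` are assumptions made for this example and do
+not reproduce the TF-A assembly implementation.
+.. code:: c
+    #define DEMO_MAX_PWR_DWN_OPS 2u     /* assumed: core and cluster levels */
+    typedef void (*pwr_dwn_fn)(void);
+    static int last_level = -1;         /* records which handler ran */
+    static void demo_core_pwr_dwn(void)    { last_level = 0; }
+    static void demo_cluster_pwr_dwn(void) { last_level = 1; }
+    /* Per-CPU-type array of power down handlers, indexed by power level. */
+    static const pwr_dwn_fn demo_pwr_down_ops[DEMO_MAX_PWR_DWN_OPS] = {
+        demo_core_pwr_dwn, demo_cluster_pwr_dwn,
+    };
+    /* Clamp the requested level to the highest one the CPU driver supports. */
+    static void demo_prepare_cpu_pwr_dwn(unsigned int power_level)
+    {
+        if (power_level >= DEMO_MAX_PWR_DWN_OPS)
+            power_level = DEMO_MAX_PWR_DWN_OPS - 1u;
+        demo_pwr_down_ops[power_level]();
+    }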
+At runtime the platform hooks for power down are invoked by the PSCI service to
+perform platform specific operations during a power down sequence, for example
+turning off CCI coherency during a cluster power down.
+CPU specific register reporting during crash
+If the crash reporting is enabled in BL31, when a crash occurs, the crash
+reporting framework calls ``do_cpu_reg_dump`` which retrieves the matching
+``cpu_ops`` using the ``get_cpu_ops_ptr()`` function. The ``cpu_reg_dump()`` in
+``cpu_ops`` is invoked, which then returns the CPU specific register values to
+be reported and a pointer to the ASCII list of register names in a format
+expected by the crash reporting framework.
+CPU errata status reporting
+Errata workarounds for CPUs supported in TF-A are applied during both cold and
+warm boots, shortly after reset. Individual Errata workarounds are enabled as
+build options. Some errata workarounds have potential run-time implications;
+therefore some are enabled by default, others not. Platform ports shall
+override build options to enable or disable errata as appropriate. The CPU
+drivers take care of applying errata workarounds that are enabled and applicable
+to a given CPU. Refer to the section titled *CPU Errata Workarounds* in `CPUBM`_
+for more information.
+Functions in CPU drivers that apply errata workarounds must follow the
+conventions listed below.
+The errata workaround must be authored as two separate functions:
+-  One that checks for errata. This function must determine whether that errata
+   applies to the current CPU. Typically this involves matching the current
+   CPU's revision and variant against a value that's known to be affected by the
+   errata. If the function determines that the errata applies to this CPU, it
+   must return ``ERRATA_APPLIES``; otherwise, it must return
+   ``ERRATA_NOT_APPLIES``. The utility functions ``cpu_get_rev_var`` and
+   ``cpu_rev_var_ls`` may come in handy for this purpose.
+For an errata identified as ``E``, the check function must be named
+This function will be invoked at different times, both from assembly and from
+C run time. Therefore it must follow AAPCS, and must not use stack.
+-  Another one that applies the errata workaround. This function calls the
+   check function described above, and applies the errata workaround if required.
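+The check/apply split described above can be modelled in C as shown below. In
+TF-A these functions live in the assembly CPU drivers, so this is only a
+sketch of the logic: the ``DEMO_`` constants, the packing used by
+``demo_rev_var()``, the affected revision range (up to r0p2) and the counter
+that stands in for the actual register write are all assumptions.
+.. code:: c
+    #include <stdint.h>
+    enum { DEMO_ERRATA_NOT_APPLIES = 0, DEMO_ERRATA_APPLIES = 1 };
+    /* Pack MIDR variant [23:20] and revision [3:0] into one comparable byte. */
+    static inline uint32_t demo_rev_var(uint32_t midr)
+    {
+        return (((midr >> 20) & 0xfu) << 4) | (midr & 0xfu);
+    }
+    /* Check function: erratum assumed to affect revisions up to and including r0p2. */
+    static int demo_check_errata(uint32_t midr)
+    {
+        return (demo_rev_var(midr) <= 0x02u) ? DEMO_ERRATA_APPLIES
+                                             : DEMO_ERRATA_NOT_APPLIES;
+    }
+    static unsigned int workaround_applied;
+    /* Workaround function: calls the check first, then applies if required. */
+    static void demo_errata_wa(uint32_t midr)
+    {
+        if (demo_check_errata(midr) == DEMO_ERRATA_APPLIES)
+            workaround_applied++;       /* a real driver would write a CPU register */
+    }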
+CPU drivers that apply errata workarounds can optionally implement an assembly
+function that reports the status of errata workarounds pertaining to that CPU.
+For a driver that registers the CPU, for example, ``cpux`` via ``declare_cpu_ops``
+macro, the errata reporting function, if it exists, must be named
+``cpux_errata_report``. This function will always be called with MMU enabled; it
+must follow AAPCS and may use stack.
+In a debug build of TF-A, on a CPU that comes out of reset, both BL1 and the
+runtime firmware (BL31 in AArch64, and BL32 in AArch32) will invoke the errata
+status reporting function, if one exists, for that type of CPU.
+To report the status of each errata workaround, the function shall use the
+assembler macro ``report_errata``, passing it:
+-  The build option that enables the errata;
+-  The name of the CPU: this must be the same identifier that CPU driver
+   registered itself with, using ``declare_cpu_ops``;
+-  And the errata identifier: the identifier must match what's used in the
+   errata's check function described above.
+The errata status reporting function will be called once per CPU type/errata
+combination during the software's active life time.
+It's expected that whenever an errata workaround is submitted to TF-A, the
+errata reporting function is appropriately extended to report its status as
+well.
+Reporting the status of errata workarounds is for informational purposes only; it
+has no functional significance.
+Memory layout of BL images
+Each bootloader image can be divided into two parts:
+-  the static contents of the image. These are data actually stored in the
+   binary on the disk. In the ELF terminology, they are called ``PROGBITS``
+   sections;
+-  the run-time contents of the image. These are data that don't occupy any
+   space in the binary on the disk. The ELF binary just contains some
+   metadata indicating where these data will be stored at run-time and the
+   corresponding sections need to be allocated and initialized at run-time.
+   In the ELF terminology, they are called ``NOBITS`` sections.
+All PROGBITS sections are grouped together at the beginning of the image,
+followed by all NOBITS sections. This is true for all TF-A images and it is
+governed by the linker scripts. This ensures that the raw binary images are
+as small as possible. If a NOBITS section was inserted in between PROGBITS
+sections then the resulting binary file would contain zero bytes in place of
+this NOBITS section, making the image unnecessarily bigger. Smaller images
+allow faster loading from the FIP to the main memory.
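+The distinction can be seen in ordinary C code. In the sketch below, and
+assuming a typical ELF toolchain with default settings, the initialised
+variable ends up in ``.data`` (a ``PROGBITS`` section that occupies space in
+the binary) while the zero-initialised buffer ends up in ``.bss`` (a
+``NOBITS`` section that only records its size). The variable names are
+illustrative.
+.. code:: c
+    #include <stdint.h>
+    /* Non-zero initial value: stored in the binary (.data, PROGBITS). */
+    static uint32_t demo_boot_count = 5u;
+    /* No explicit initial value: only sized in the binary (.bss, NOBITS),
+     * and zero-filled at run-time before any C code uses it. */
+    static uint8_t demo_scratch[4096];
+Growing ``demo_scratch`` in this model increases the run-time footprint of the
+image but not the size of the raw binary, provided it stays after all
+``PROGBITS`` sections.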
+Linker scripts and symbols
+Each bootloader stage image layout is described by its own linker script. The
+linker scripts export some symbols into the program symbol table. Their values
+correspond to particular addresses. TF-A code can refer to these symbols to
+figure out the image memory layout.
+Linker symbols use the following naming convention in TF-A:
+-  ``__<SECTION>_START__``
+   Start address of a given section named ``<SECTION>``.
+-  ``__<SECTION>_END__``
+   End address of a given section named ``<SECTION>``. If there is an alignment
+   constraint on the section's end address then ``__<SECTION>_END__`` corresponds
+   to the end address of the section's actual contents, rounded up to the right
+   boundary. Refer to the value of ``__<SECTION>_UNALIGNED_END__`` to know the
+   actual end address of the section's contents.
+-  ``__<SECTION>_UNALIGNED_END__``
+   End address of a given section named ``<SECTION>`` without any padding or
+   rounding up due to some alignment constraint.
+-  ``__<SECTION>_SIZE__``
+   Size (in bytes) of a given section named ``<SECTION>``. If there is an
+   alignment constraint on the section's end address then ``__<SECTION>_SIZE__``
+   corresponds to the size of the section's actual contents, rounded up to the
+   right boundary. In other words,
+   ``__<SECTION>_SIZE__ = __<SECTION>_END__ - __<SECTION>_START__``. Refer to the
+   value of ``__<SECTION>_UNALIGNED_SIZE__``
+   to know the actual size of the section's contents.
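+-  ``__<SECTION>_UNALIGNED_SIZE__``
+   Size (in bytes) of a given section named ``<SECTION>`` without any padding or
+   rounding up due to some alignment constraint. In other words,
+   ``__<SECTION>_UNALIGNED_SIZE__ = __<SECTION>_UNALIGNED_END__ - __<SECTION>_START__``.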
+Some of the linker symbols are mandatory as TF-A code relies on them to be
+defined. They are listed in the following subsections. Some of them must be
+provided for each bootloader stage and some are specific to a given bootloader
+stage.
+The linker scripts define some extra, optional symbols. They are not actually
+used by any code but they help in understanding the bootloader images' memory
+layout as they are easy to spot in the link map files.
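+The relationship between the aligned and unaligned variants of the end and
+size symbols can be checked with a small calculation. The sketch below derives
+the five values for one section from its start address, the end of its actual
+contents and its alignment constraint. The struct and function names are
+hypothetical; in a real build these values are produced by the linker script,
+not computed in C.
+.. code:: c
+    #include <stdint.h>
+    /* Round addr up to the next multiple of align (align must be a power of two). */
+    static inline uintptr_t demo_align_up(uintptr_t addr, uintptr_t align)
+    {
+        return (addr + align - 1u) & ~(align - 1u);
+    }
+    typedef struct {
+        uintptr_t start, unaligned_end, end;
+        uintptr_t unaligned_size, size;
+    } demo_section_syms_t;
+    /* Derive the symbol values for one section from its start and contents end. */
+    static demo_section_syms_t demo_section_syms(uintptr_t start,
+                                                 uintptr_t contents_end,
+                                                 uintptr_t align)
+    {
+        demo_section_syms_t s;
+        s.start = start;
+        s.unaligned_end = contents_end;
+        s.end = demo_align_up(contents_end, align);
+        s.unaligned_size = s.unaligned_end - s.start;
+        s.size = s.end - s.start;
+        return s;
+    }
+For a section starting at ``0x4001000`` whose contents end at ``0x4001234``
+with a 4KB alignment constraint, this gives an aligned end of ``0x4002000``, a
+size of ``0x1000`` and an unaligned size of ``0x234``.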
+Common linker symbols
+All BL images share the following requirements:
+-  The BSS section must be zero-initialised before executing any C code.
+-  The coherent memory section (if enabled) must be zero-initialised as well.
+-  The MMU setup code needs to know the extents of the coherent and read-only
+   memory regions to set the right memory attributes. When
+   ``SEPARATE_CODE_AND_RODATA=1``, it needs to know more specifically how the
+   read-only memory region is divided between code and data.
+The following linker symbols are defined for this purpose:
+-  ``__BSS_START__``
+-  ``__BSS_SIZE__``
+-  ``__COHERENT_RAM_START__`` Must be aligned on a page-size boundary.
+-  ``__COHERENT_RAM_END__`` Must be aligned on a page-size boundary.
+-  ``__RO_START__``
+-  ``__RO_END__``
+-  ``__TEXT_START__``
+-  ``__TEXT_END__``
+-  ``__RODATA_START__``
+-  ``__RODATA_END__``
+BL1's linker symbols
+BL1 being the ROM image, it has additional requirements. BL1 resides in ROM and
+it is entirely executed in place but it needs some read-write memory for its
+mutable data. Its ``.data`` section (i.e. its allocated read-write data) must be
+relocated from ROM to RAM before executing any C code.
+The following additional linker symbols are defined for BL1:
+-  ``__BL1_ROM_END__`` End address of BL1's ROM contents, covering its code
+   and ``.data`` section in ROM.
+-  ``__DATA_ROM_START__`` Start address of the ``.data`` section in ROM. Must be
+   aligned on a 16-byte boundary.
+-  ``__DATA_RAM_START__`` Address in RAM where the ``.data`` section should be
+   copied over. Must be aligned on a 16-byte boundary.
+-  ``__DATA_SIZE__`` Size of the ``.data`` section (in ROM or RAM).
+-  ``__BL1_RAM_START__`` Start address of BL1 read-write data.
+-  ``__BL1_RAM_END__`` End address of BL1 read-write data.
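+The early memory initialization that these symbols support can be modelled as
+follows. The arrays stand in for the regions that the linker symbols would
+describe, and ``demo_early_mem_init()`` is a hypothetical name; in TF-A this
+work is done by early entrypoint code before any C code runs.
+.. code:: c
+    #include <stdint.h>
+    #include <string.h>
+    /* Stand-ins for the regions the BL1 linker symbols would describe. */
+    static const uint8_t demo_data_rom[16] = { 1, 2, 3, 4 };   /* __DATA_ROM_START__ */
+    static uint8_t demo_data_ram[16];                          /* __DATA_RAM_START__ */
+    static uint8_t demo_bss[32];                               /* __BSS_START__ */
+    /* Copy .data from ROM to RAM, then zero BSS, before any C code that
+     * depends on global variables runs. */
+    static void demo_early_mem_init(void)
+    {
+        memcpy(demo_data_ram, demo_data_rom, sizeof(demo_data_rom)); /* __DATA_SIZE__ */
+        memset(demo_bss, 0, sizeof(demo_bss));                       /* __BSS_SIZE__ */
+    }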
+How to choose the right base addresses for each bootloader stage image
+There is currently no support for dynamic image loading in TF-A. This means
+that all bootloader images need to be linked against their ultimate runtime
+locations and the base addresses of each image must be chosen carefully such
+that images don't overlap each other in an undesired way. As the code grows,
+the base addresses might need adjustments to cope with the new memory layout.
+The memory layout is completely specific to the platform and so there is no
+general recipe for choosing the right base addresses for each bootloader image.
+However, there are tools to aid in understanding the memory layout. These are
+the link map files: ``build/<platform>/<build-type>/bl<x>/bl<x>.map``, with ``<x>``
+being the stage bootloader. They provide a detailed view of the memory usage of
+each image. Among other useful information, they provide the end address of
+each image.
+-  ``bl1.map`` link map file provides ``__BL1_RAM_END__`` address.
+-  ``bl2.map`` link map file provides ``__BL2_END__`` address.
+-  ``bl31.map`` link map file provides ``__BL31_END__`` address.
+-  ``bl32.map`` link map file provides ``__BL32_END__`` address.
+For each bootloader image, the platform code must provide its start address
+as well as a limit address that it must not overstep. The latter is used in the
+linker scripts to check that the image doesn't grow past that address. If that
+happens, the linker will issue a message similar to the following:
+    aarch64-none-elf-ld: BLx has exceeded its limit.
+Additionally, if the platform memory layout implies some image overlaying like
+on FVP, BL31 and TSP need to know the limit address that their PROGBITS
+sections must not overstep. The platform code must provide those.
+TF-A does not provide any mechanism to verify at boot time that the memory
+to load a new image is free to prevent overwriting a previously loaded image.
+The platform must specify the memory available in the system for all the
+relevant BL images to be loaded.
+For example, in the case of BL1 loading BL2, ``bl1_plat_sec_mem_layout()`` will
+return the region defined by the platform where BL1 intends to load BL2. The
+``load_image()`` function performs a bounds check for the image size based on the
+base and maximum image size provided by the platforms. Platforms must take
+this behaviour into account when defining the base/size for each of the images.
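+A bounds check of this kind can be sketched as shown below. This is not the
+``load_image()`` implementation: ``demo_image_fits()`` and its parameters are
+hypothetical, and the extra wrap-around test is an addition made for the
+example.
+.. code:: c
+    #include <stdbool.h>
+    #include <stddef.h>
+    #include <stdint.h>
+    /* True if an image of image_size bytes loaded at image_base stays within
+     * the maximum size allowed by the platform and does not wrap the address
+     * space. */
+    static bool demo_image_fits(uintptr_t image_base, size_t image_size,
+                                size_t image_max_size)
+    {
+        if (image_size > image_max_size)
+            return false;
+        return image_size <= UINTPTR_MAX - image_base;
+    }
+Note that a check like this only compares the image against its own limits; it
+cannot tell whether the target region already holds another image, which is
+why the platform must choose non-conflicting base addresses and sizes.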
+Memory layout on Arm development platforms
+The following list describes the memory layout on the Arm development platforms:
+-  A 4KB page of shared memory is used for communication between Trusted
+   Firmware and the platform's power controller. This is located at the base of
+   Trusted SRAM. The amount of Trusted SRAM available to load the bootloader
+   images is reduced by the size of the shared memory.
+   The shared memory is used to store the CPUs' entrypoint mailbox. On Juno,
+   this is also used for the MHU payload when passing messages to and from the
+   SCP.
+-  Another 4 KB page is reserved for passing memory layout between BL1 and BL2
+   and also the dynamic firmware configurations.
+-  On FVP, BL1 is originally sitting in the Trusted ROM at address ``0x0``. On
+   Juno, BL1 resides in flash memory at address ``0x0BEC0000``. BL1 read-write
+   data are relocated to the top of Trusted SRAM at runtime.
+-  BL2 is loaded below BL1 RW.
+-  EL3 Runtime Software, BL31 for AArch64 and BL32 for AArch32 (e.g. SP_MIN),
+   is loaded at the top of the Trusted SRAM, such that its NOBITS sections will
+   overwrite BL1 R/W data and BL2. This implies that BL1 global variables
+   remain valid only until execution reaches the EL3 Runtime Software entry
+   point during a cold boot.
+-  On Juno, SCP_BL2 is loaded temporarily into the EL3 Runtime Software memory
+   region and transferred to the SCP before being overwritten by EL3 Runtime
+   Software.
+-  BL32 (for AArch64) can be loaded in one of the following locations:
+   -  Trusted SRAM
+   -  Trusted DRAM (FVP only)
+   -  Secure region of DRAM (top 16MB of DRAM configured by the TrustZone
+      controller)
+   When BL32 (for AArch64) is loaded into Trusted SRAM, it is loaded below
+   BL31.
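+The effect of the first two list items can be worked through with the FVP
+Trusted SRAM addresses that appear in the diagrams in this section. This is
+only a sketch: the macro and function names are made up for illustration, and
+it assumes one 4 KB shared page followed by one 4 KB configuration page.

```c
#include <stdint.h>

/* Values taken from the FVP diagrams; the names are illustrative only. */
#define SKETCH_TRUSTED_SRAM_BASE  UINT64_C(0x04000000)
#define SKETCH_TRUSTED_SRAM_END   UINT64_C(0x04040000)
#define SKETCH_PAGE_SIZE          UINT64_C(0x1000)

/* First address usable for BL images, above the shared and config pages. */
static uint64_t sketch_bl_region_base(void)
{
    return SKETCH_TRUSTED_SRAM_BASE + 2U * SKETCH_PAGE_SIZE;
}

/* Number of bytes of Trusted SRAM left for loading BL images. */
static uint64_t sketch_bl_region_size(void)
{
    return SKETCH_TRUSTED_SRAM_END - sketch_bl_region_base();
}
```

+With these assumptions the loadable region starts at ``0x04002000`` and spans
+``0x3E000`` bytes of the 256 KB Trusted SRAM.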
+The location of the BL32 image will result in different memory maps. This is
+illustrated for both FVP and Juno in the following diagrams, using the TSP as
+an example.
+Note: Loading the BL32 image in TZC secured DRAM doesn't change the memory
+layout of the other images in Trusted SRAM.
+CONFIG section in memory layouts shown below contains:
+    +--------------------+
+    |bl2_mem_params_descs|
+    |--------------------|
+    |     fw_configs     |
+    +--------------------+
+``bl2_mem_params_descs`` contains parameters passed from BL2 to the next
+BL image during boot.
+``fw_configs`` includes soc_fw_config, tos_fw_config and tb_fw_config.
+**FVP with TSP in Trusted SRAM with firmware configs :**
+(These diagrams only cover the AArch64 case)
+                   DRAM
+    0xffffffff +----------+
+               :          :
+               |----------|
+               |HW_CONFIG |
+    0x83000000 |----------|  (non-secure)
+               |          |
+    0x80000000 +----------+
+               Trusted SRAM
+    0x04040000 +----------+  loaded by BL2  +----------------+
+               | BL1 (rw) |  <<<<<<<<<<<<<  |                |
+               |----------|  <<<<<<<<<<<<<  |  BL31 NOBITS   |
+               |   BL2    |  <<<<<<<<<<<<<  |                |
+               |----------|  <<<<<<<<<<<<<  |----------------|
+               |          |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
+               |          |  <<<<<<<<<<<<<  |----------------|
+               |          |  <<<<<<<<<<<<<  |     BL32       |
+    0x04002000 +----------+                 +----------------+
+               |  CONFIG  |
+    0x04001000 +----------+
+               |  Shared  |
+    0x04000000 +----------+
+               Trusted ROM
+    0x04000000 +----------+
+               | BL1 (ro) |
+    0x00000000 +----------+
+**FVP with TSP in Trusted DRAM with firmware configs (default option):**
+                     DRAM
+    0xffffffff +--------------+
+               :              :
+               |--------------|
+               |  HW_CONFIG   |
+    0x83000000 |--------------|  (non-secure)
+               |              |
+    0x80000000 +--------------+
+                Trusted DRAM
+    0x08000000 +--------------+
+               |     BL32     |
+    0x06000000 +--------------+
+                 Trusted SRAM
+    0x04040000 +--------------+  loaded by BL2  +----------------+
+               |   BL1 (rw)   |  <<<<<<<<<<<<<  |                |
+               |--------------|  <<<<<<<<<<<<<  |  BL31 NOBITS   |
+               |     BL2      |  <<<<<<<<<<<<<  |                |
+               |--------------|  <<<<<<<<<<<<<  |----------------|
+               |              |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
+               |              |                 +----------------+
+    0x04002000 +--------------+
+               |    CONFIG    |
+    0x04001000 +--------------+
+               |    Shared    |
+    0x04000000 +--------------+
+                 Trusted ROM
+    0x04000000 +--------------+
+               |   BL1 (ro)   |
+    0x00000000 +--------------+
+**FVP with TSP in TZC-Secured DRAM with firmware configs :**
+                   DRAM
+    0xffffffff +----------+
+               |  BL32    |  (secure)
+    0xff000000 +----------+
+               |          |
+               |----------|
+               |HW_CONFIG |
+    0x83000000 |----------|  (non-secure)
+               |          |
+    0x80000000 +----------+
+               Trusted SRAM
+    0x04040000 +----------+  loaded by BL2  +----------------+
+               | BL1 (rw) |  <<<<<<<<<<<<<  |                |
+               |----------|  <<<<<<<<<<<<<  |  BL31 NOBITS   |
+               |   BL2    |  <<<<<<<<<<<<<  |                |
+               |----------|  <<<<<<<<<<<<<  |----------------|
+               |          |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
+               |          |                 +----------------+
+    0x04002000 +----------+
+               |  CONFIG  |
+    0x04001000 +----------+
+               |  Shared  |
+    0x04000000 +----------+
+               Trusted ROM
+    0x04000000 +----------+
+               | BL1 (ro) |
+    0x00000000 +----------+
+**Juno with BL32 in Trusted SRAM :**
+                  Flash0
+    0x0C000000 +----------+
+               :          :
+    0x0BED0000 |----------|
+               | BL1 (ro) |
+    0x0BEC0000 |----------|
+               :          :
+    0x08000000 +----------+                  BL31 is loaded
+                                             after SCP_BL2 has
+               Trusted SRAM                  been sent to SCP
+    0x04040000 +----------+  loaded by BL2  +----------------+
+               | BL1 (rw) |  <<<<<<<<<<<<<  |                |
+               |----------|  <<<<<<<<<<<<<  |  BL31 NOBITS   |
+               |   BL2    |  <<<<<<<<<<<<<  |                |
+               |----------|  <<<<<<<<<<<<<  |----------------|
+               | SCP_BL2  |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
+               |----------|  <<<<<<<<<<<<<  |----------------|
+               |          |  <<<<<<<<<<<<<  |     BL32       |
+               |          |                 +----------------+
+               |          |
+    0x04001000 +----------+
+               |   MHU    |
+    0x04000000 +----------+
+**Juno with BL32 in TZC-secured DRAM :**
+                   DRAM
+    0xFFE00000 +----------+
+               |  BL32    |  (secure)
+    0xFF000000 |----------|
+               |          |
+               :          :  (non-secure)
+               |          |
+    0x80000000 +----------+
+                  Flash0
+    0x0C000000 +----------+
+               :          :
+    0x0BED0000 |----------|
+               | BL1 (ro) |
+    0x0BEC0000 |----------|
+               :          :
+    0x08000000 +----------+                  BL31 is loaded
+                                             after SCP_BL2 has
+               Trusted SRAM                  been sent to SCP
+    0x04040000 +----------+  loaded by BL2  +----------------+
+               | BL1 (rw) |  <<<<<<<<<<<<<  |                |
+               |----------|  <<<<<<<<<<<<<  |  BL31 NOBITS   |
+               |   BL2    |  <<<<<<<<<<<<<  |                |
+               |----------|  <<<<<<<<<<<<<  |----------------|
+               | SCP_BL2  |  <<<<<<<<<<<<<  | BL31 PROGBITS  |
+               |----------|                 +----------------+
+    0x04001000 +----------+
+               |   MHU    |
+    0x04000000 +----------+
+Library at ROM
+Please refer to the `ROMLIB Design`_ document.
+Firmware Image Package (FIP)
+Using a Firmware Image Package (FIP) allows for packing bootloader images (and
+potentially other payloads) into a single archive that can be loaded by TF-A
+from non-volatile platform storage. A driver to load images from a FIP has
+been added to the storage layer and allows a package to be read from supported
+platform storage. A tool to create Firmware Image Packages is also provided
+and described below.
+Firmware Image Package layout
+The FIP layout consists of a table of contents (ToC) followed by payload data.
+The ToC itself has a header followed by one or more table entries. The ToC is
+terminated by an end marker entry; since the size recorded in the end marker is
+0 bytes, its offset equals the total size of the FIP file. All ToC entries
+describe some payload data that has been appended to the end of the binary
+package. With the information provided in the ToC entry the corresponding
+payload data can be retrieved.
+    ------------------
+    | ToC Header     |
+    |----------------|
+    | ToC Entry 0    |
+    |----------------|
+    | ToC Entry 1    |
+    |----------------|
+    | ToC End Marker |
+    |----------------|
+    |                |
+    |     Data 0     |
+    |                |
+    |----------------|
+    |                |
+    |     Data 1     |
+    |                |
+    ------------------
+The ToC header and entry formats are described in the header file
+``include/tools_share/firmware_image_package.h``. This file is used by both the
+tool and TF-A.
+The ToC header has the following fields:
+    `name`: The name of the ToC. This is currently used to validate the header.
+    `serial_number`: A non-zero number provided by the creation tool
+    `flags`: Flags associated with this data.
+        Bits 0-31: Reserved
+        Bits 32-47: Platform defined
+        Bits 48-63: Reserved
+A ToC entry has the following fields:
+    `uuid`: All files are referred to by a pre-defined Universally Unique
+        IDentifier [UUID]. The UUIDs are defined in
+        `include/tools_share/firmware_image_package.h`. The platform translates
+        the requested image name into the corresponding UUID when accessing the
+        package.
+    `offset_address`: The offset address at which the corresponding payload data
+        can be found. The offset is calculated from the ToC base address.
+    `size`: The size of the corresponding payload data in bytes.
+    `flags`: Flags associated with this entry. None are yet defined.
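+The field lists above can be pictured with the following sketch, which also
+walks a table of entries until it reaches the end marker. The type and function
+names, the field widths, and the use of an all-zero UUID as the end marker are
+assumptions made for illustration; the authoritative definitions are in
+``include/tools_share/firmware_image_package.h``.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative types only; field widths are assumed, not quoted. */
typedef struct sketch_uuid {
    uint8_t b[16];
} sketch_uuid_t;

typedef struct sketch_toc_header {
    uint32_t name;           /* used to validate the header */
    uint32_t serial_number;  /* non-zero, set by the creation tool */
    uint64_t flags;          /* bits 32-47 are platform defined */
} sketch_toc_header_t;

typedef struct sketch_toc_entry {
    sketch_uuid_t uuid;
    uint64_t offset_address; /* measured from the ToC base address */
    uint64_t size;           /* payload size in bytes */
    uint64_t flags;
} sketch_toc_entry_t;

/* Extract the platform-defined bits (32-47) from the header flags. */
static uint16_t sketch_platform_flags(uint64_t header_flags)
{
    return (uint16_t)((header_flags >> 32) & 0xFFFFU);
}

/*
 * Return the entry whose UUID matches 'wanted', or NULL if the end marker
 * (assumed here to be an all-zero UUID) or 'max_entries' is reached first.
 */
static const sketch_toc_entry_t *sketch_find_entry(
    const sketch_toc_entry_t *entries, size_t max_entries,
    const sketch_uuid_t *wanted)
{
    static const sketch_uuid_t null_uuid = { { 0 } };

    for (size_t i = 0; i < max_entries; i++) {
        if (memcmp(&entries[i].uuid, &null_uuid, sizeof(null_uuid)) == 0) {
            return NULL; /* end marker: no further entries */
        }
        if (memcmp(&entries[i].uuid, wanted, sizeof(*wanted)) == 0) {
            return &entries[i];
        }
    }
    return NULL;
}
```

+Once an entry is found, the payload would start ``offset_address`` bytes from
+the ToC base and occupy ``size`` bytes.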
+Firmware Image Package creation tool
+The FIP creation tool can be used to pack specified images into a binary
+package that can be loaded by TF-A from platform storage. The tool currently
+only supports packing bootloader images. Additional image definitions can be
+added to the tool as required.
+The tool can be found in ``tools/fiptool``.
+Loading from a Firmware Image Package (FIP)
+The Firmware Image Package (FIP) driver can load images from a binary package on
+non-volatile platform storage. For the Arm development platforms, this is
+currently NOR FLASH.
+Bootloader images are loaded according to the platform policy as specified by
+the function ``plat_get_image_source()``. For the Arm development platforms, this
+means the platform will attempt to load images from a Firmware Image Package
+located at the start of NOR FLASH0.
+The Arm development platforms' policy is to only allow loading of a known set of
+images. The platform policy can be modified to allow additional images.
+Use of coherent memory in TF-A
+There might be loss of coherency when physical memory with mismatched
+shareability, cacheability and memory attributes is accessed by multiple CPUs
+(refer to section B2.9 of `Arm ARM`_ for more details). This possibility occurs
+in TF-A during power up/down sequences when coherency, MMU and caches are
+turned on/off incrementally.
+TF-A defines coherent memory as a region of memory with Device nGnRE attributes
+in the translation tables. The translation granule size in TF-A is 4KB. This
+is the smallest possible size of the coherent memory region.
+By default, all data structures which are susceptible to accesses with
+mismatched attributes from various CPUs are allocated in a coherent memory
+region (refer to section 2.1 of `Porting Guide`_). The coherent memory region
+accesses are Outer Shareable, non-cacheable and they can be accessed
+with the Device nGnRE attributes when the MMU is turned on. Hence, at the
+expense of at least an extra page of memory, TF-A is able to work around
+coherency issues due to mismatched memory attributes.
+The alternative to the above approach is to allocate the susceptible data
+structures in Normal WriteBack WriteAllocate Inner shareable memory. This
+approach requires the data structures to be designed so that it is possible to
+work around the issue of mismatched memory attributes by performing software
+cache maintenance on them.
+Disabling the use of coherent memory in TF-A
+It might be desirable to avoid the cost of allocating coherent memory on
+platforms which are memory constrained. TF-A enables inclusion of coherent
+memory in firmware images through the build flag ``USE_COHERENT_MEM``.
+This flag is enabled by default. It can be disabled to choose the second
+approach described above.
+The below sections analyze the data structures allocated in the coherent memory
+region and the changes required to allocate them in normal memory.
+Coherent memory usage in PSCI implementation
+The ``psci_non_cpu_pd_nodes`` data structure stores the platform's power domain
+tree information for state management of power domains. By default, this data
+structure is allocated in the coherent memory region in TF-A because it can be
+accessed by multiple CPUs, either with caches enabled or disabled.
+.. code:: c
+    typedef struct non_cpu_pwr_domain_node {
+        /*
+         * Index of the first CPU power domain node level 0 which has this node
+         * as its parent.
+         */
+        unsigned int cpu_start_idx;
+        /*
+         * Number of CPU power domains which are siblings of the domain indexed
+         * by 'cpu_start_idx' i.e. all the domains in the range 'cpu_start_idx
+         * -> cpu_start_idx + ncpus' have this node as their parent.
+         */
+        unsigned int ncpus;
+        /*
+         * Index of the parent power domain node.
+         */
+        unsigned int parent_node;
+        plat_local_state_t local_state;
+        unsigned char level;
+        /* For indexing the psci_lock array*/
+        unsigned char lock_index;
+    } non_cpu_pd_node_t;
+In order to move this data structure to normal memory, the use of each of its
+fields must be analyzed. Fields like ``cpu_start_idx``, ``ncpus``, ``parent_node``,
+``level`` and ``lock_index`` are only written once during cold boot. Hence removing
+them from coherent memory involves only doing a clean and invalidate of the
+cache lines after these fields are written.
+The field ``local_state`` can be concurrently accessed by multiple CPUs in
+different cache states. A Lamport's Bakery lock ``psci_locks`` is used to ensure
+mutual exclusion to this field and a clean and invalidate is needed after it
+is written.
+Bakery lock data
+The bakery lock data structure ``bakery_lock_t`` is allocated in coherent memory
+and is accessed by multiple CPUs with mismatched attributes. ``bakery_lock_t`` is
+defined as follows:
+.. code:: c
+    typedef struct bakery_lock {
+        /*
+         * The lock_data is a bit-field of 2 members:
+         * Bit[0]       : choosing. This field is set when the CPU is
+         *                choosing its bakery number.
+         * Bits[1 - 15] : number. This is the bakery number allocated.
+         */
+        volatile uint16_t lock_data[BAKERY_LOCK_MAX_CPUS];
+    } bakery_lock_t;
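+The bit layout described in the comment above can be made concrete with a few
+helpers that pack and unpack one ``lock_data`` element. This is a minimal
+sketch with hypothetical function names; it is not taken from the TF-A
+sources.

```c
#include <stdint.h>

/* Pack the 'choosing' flag (bit 0) and the bakery number (bits 1-15). */
static inline uint16_t sketch_make_lock_data(unsigned int choosing,
                                             unsigned int number)
{
    return (uint16_t)((choosing & 0x1U) | ((number & 0x7FFFU) << 1));
}

/* Non-zero while the owning CPU is choosing its bakery number. */
static inline unsigned int sketch_is_choosing(uint16_t lock_data)
{
    return lock_data & 0x1U;
}

/* The bakery number currently held; 0 is treated as "no ticket" here. */
static inline unsigned int sketch_ticket_number(uint16_t lock_data)
{
    return (lock_data >> 1) & 0x7FFFU;
}
```

+For example, a CPU that is choosing and currently holds number 5 would store
+the value ``0xB`` in its element.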
+It is a characteristic of Lamport's Bakery algorithm that the volatile per-CPU
+fields can be read by all CPUs but only written to by the owning CPU.
+Depending upon the data cache line size, the per-CPU fields of the
+``bakery_lock_t`` structure for multiple CPUs may exist on a single cache line.
+These per-CPU fields can be read and written during lock contention by multiple
+CPUs with mismatched memory attributes. Since these fields are a part of the
+lock implementation, they do not have access to any other locking primitive to
+safeguard against the resulting coherency issues. As a result, simple software
+cache maintenance is not enough to allocate them in normal memory. Consider
+the following example.
+CPU0 updates its per-CPU field with data cache enabled. This write updates a
+local cache line which contains a copy of the fields for other CPUs as well. Now
+CPU1 updates its per-CPU field of the ``bakery_lock_t`` structure with data cache
+disabled. CPU1 then issues a DCIVAC operation to invalidate any stale copies of
+its field in any other cache line in the system. This operation will invalidate
+the update made by CPU0 as well.
+To use bakery locks when ``USE_COHERENT_MEM`` is disabled, the lock data structure
+has been redesigned. The changes utilise the characteristic of Lamport's Bakery
+algorithm mentioned earlier. The bakery_lock structure only allocates the memory
+for a single CPU. The macro ``DEFINE_BAKERY_LOCK`` allocates all the bakery locks
+needed for a CPU into a section ``bakery_lock``. The linker allocates the memory
+for other cores by using the total size allocated for the bakery_lock section
+and multiplying it by (PLATFORM_CORE_COUNT - 1). This enables software to
+perform software cache maintenance on the lock data structure without running
+into coherency issues associated with mismatched attributes.
+The bakery lock data structure ``bakery_info_t`` is defined for use when
+``USE_COHERENT_MEM`` is disabled as follows:
+.. code:: c
+    typedef struct bakery_info {
+        /*
+         * The lock_data is a bit-field of 2 members:
+         * Bit[0]       : choosing. This field is set when the CPU is
+         *                choosing its bakery number.
+         * Bits[1 - 15] : number. This is the bakery number allocated.
+         */
+         volatile uint16_t lock_data;
+    } bakery_info_t;
+The ``bakery_info_t`` represents a single per-CPU field of one lock and
+the combination of corresponding ``bakery_info_t`` structures for all CPUs in the
+system represents the complete bakery lock. The view in memory for a system
+with n bakery locks is:
+    bakery_lock section start
+    |----------------|
+    | `bakery_info_t`| <-- Lock_0 per-CPU field
+    |    Lock_0      |     for CPU0
+    |----------------|
+    | `bakery_info_t`| <-- Lock_1 per-CPU field
+    |    Lock_1      |     for CPU0
+    |----------------|
+    | ....           |
+    |----------------|
+    | `bakery_info_t`| <-- Lock_N per-CPU field
+    |    Lock_N      |     for CPU0
+    ------------------
+    |    XXXXX       |
+    | Padding to     |
+    | next Cache WB  | <--- Calculate PERCPU_BAKERY_LOCK_SIZE, allocate
+    |  Granule       |       continuous memory for remaining CPUs.
+    ------------------
+    | `bakery_info_t`| <-- Lock_0 per-CPU field
+    |    Lock_0      |     for CPU1
+    |----------------|
+    | `bakery_info_t`| <-- Lock_1 per-CPU field
+    |    Lock_1      |     for CPU1
+    |----------------|
+    | ....           |
+    |----------------|
+    | `bakery_info_t`| <-- Lock_N per-CPU field
+    |    Lock_N      |     for CPU1
+    ------------------
+    |    XXXXX       |
+    | Padding to     |
+    | next Cache WB  |
+    |  Granule       |
+    ------------------
+Consider a system of 2 CPUs with 'N' bakery locks as shown above. For an
+operation on Lock_N, the corresponding ``bakery_info_t`` in both the CPU0 and CPU1
+``bakery_lock`` sections need to be fetched and appropriate cache operations need
+to be performed for each access.
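+The address arithmetic implied by the memory view above can be sketched as
+follows. The function names and the 64-byte cache writeback granule are
+assumptions for illustration; a real platform would use its own granule and
+the per-CPU size computed at link time.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache writeback granule in bytes; must be a power of two. */
#define SKETCH_CACHE_WB_GRANULE 64U

/* Size of one CPU's copy of the section, padded to the next granule. */
static size_t sketch_percpu_size(size_t section_size)
{
    const size_t g = SKETCH_CACHE_WB_GRANULE;

    assert((g & (g - 1U)) == 0U);
    return (section_size + g - 1U) & ~(g - 1U);
}

/* Total memory needed when every CPU has its own padded copy. */
static size_t sketch_total_size(size_t percpu_size, unsigned int cpu_count)
{
    return percpu_size * cpu_count;
}

/*
 * Offset from the section start of the per-CPU field of one lock.
 * 'lock_offset' is the offset of that lock's CPU0 field.
 */
static size_t sketch_bakery_info_offset(size_t lock_offset, unsigned int cpu,
                                        size_t percpu_size)
{
    return lock_offset + (size_t)cpu * percpu_size;
}
```

+Because each CPU's fields start on their own granule, cache maintenance on one
+CPU's copy does not touch the lines that hold another CPU's fields.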
+On Arm platforms, bakery locks are used in PSCI (``psci_locks``) and the power
+controller driver (``arm_lock``).
+Non Functional Impact of removing coherent memory
+Removal of the coherent memory region leads to the additional software overhead
+of performing cache maintenance for the affected data structures. However, since
+the memory where the data structures are allocated is cacheable, the overhead is
+mostly mitigated by an increase in performance.
+There is however a performance impact for bakery locks, due to:
+-  Additional cache maintenance operations, and
+-  Multiple cache line reads for each lock operation, since the bakery locks
+   for each CPU are distributed across different cache lines.
+The implementation has been optimized to minimize this additional overhead.
+Measurements indicate that when bakery locks are allocated in Normal memory, the
+minimum latency of acquiring a lock is on average 3-4 microseconds, whereas
+in Device memory it is 2 microseconds. The measurements were done on the
+Juno Arm development platform.
+As mentioned earlier, almost a page of memory can be saved by disabling
+``USE_COHERENT_MEM``. Each platform needs to consider these trade-offs to decide
+whether coherent memory should be used. If a platform disables
+``USE_COHERENT_MEM`` and needs to use bakery locks in the porting layer, it can
+optionally define macro ``PLAT_PERCPU_BAKERY_LOCK_SIZE`` (see the
+`Porting Guide`_). Refer to the reference platform code for examples.
+Isolating code and read-only data on separate memory pages
+In the Armv8-A VMSA, translation table entries include fields that define the
+properties of the target memory region, such as its access permissions. The
+smallest unit of memory that can be addressed by a translation table entry is
+a memory page. Therefore, if software needs to set different permissions on two
+memory regions then it needs to map them using different memory pages.
+The default memory layout for each BL image is as follows:
+       |        ...        |
+       +-------------------+
+       |  Read-write data  |
+       +-------------------+ Page boundary
+       |     <Padding>     |
+       +-------------------+
+       | Exception vectors |
+       +-------------------+ 2 KB boundary
+       |     <Padding>     |
+       +-------------------+
+       |  Read-only data   |
+       +-------------------+
+       |       Code        |
+       +-------------------+ BLx_BASE
+Note: The 2KB alignment for the exception vectors is an architectural
+requirement.
+The read-write data start on a new memory page so that they can be mapped with
+read-write permissions, whereas the code and read-only data below are configured
+as read-only.
+However, the read-only data are not aligned on a page boundary. They are
+contiguous to the code. Therefore, the end of the code section and the beginning
+of the read-only data section might share a memory page. This forces both to be
+mapped with the same memory attributes. As the code needs to be executable, this
+means that the read-only data stored on the same memory page as the code are
+executable as well. This could potentially be exploited as part of a security
+attack.
+TF-A provides the build flag ``SEPARATE_CODE_AND_RODATA`` to isolate the code and
+read-only data on separate memory pages. This in turn allows independent control
+of the access permissions for the code and read-only data. In this case,
+platform code gets a finer-grained view of the image layout and can
+appropriately map the code region as executable and the read-only data as
+execute-never.
+This has an impact on memory footprint, as padding bytes need to be introduced
+between the code and read-only data to ensure the segregation of the two. To
+limit the memory cost, this flag also changes the memory layout such that the
+code and exception vectors are now contiguous, like so:
+       |        ...        |
+       +-------------------+
+       |  Read-write data  |
+       +-------------------+ Page boundary
+       |     <Padding>     |
+       +-------------------+
+       |  Read-only data   |
+       +-------------------+ Page boundary
+       |     <Padding>     |
+       +-------------------+
+       | Exception vectors |
+       +-------------------+ 2 KB boundary
+       |     <Padding>     |
+       +-------------------+
+       |       Code        |
+       +-------------------+ BLx_BASE
+With this more condensed memory layout, the separation of read-only data will
+add zero or one page to the memory footprint of each BL image. Each platform
+should consider the trade-off between memory footprint and security.
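+The cost can be estimated by computing where the read-write data would start in
+each of the two layouts shown above. The sketch below assumes 4 KB pages and
+2 KB of exception vectors aligned on a 2 KB boundary; the function names are
+hypothetical and the section sizes are inputs chosen by the reader.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE     4096U
#define SKETCH_VECTOR_ALIGN  2048U
#define SKETCH_VECTOR_SIZE   2048U /* assumed size of the vector table */

/* Round 'value' up to a multiple of 'align' (a power of two). */
static size_t sketch_align_up(size_t value, size_t align)
{
    return (value + align - 1U) & ~(align - 1U);
}

/* Default layout: code, read-only data, vectors, then read-write data. */
static size_t sketch_rw_start_default(size_t code, size_t rodata)
{
    size_t vectors = sketch_align_up(code + rodata, SKETCH_VECTOR_ALIGN);

    return sketch_align_up(vectors + SKETCH_VECTOR_SIZE, SKETCH_PAGE_SIZE);
}

/* Separated layout: code, vectors, read-only data, then read-write data. */
static size_t sketch_rw_start_separate(size_t code, size_t rodata)
{
    size_t vectors = sketch_align_up(code, SKETCH_VECTOR_ALIGN);
    size_t ro_start = sketch_align_up(vectors + SKETCH_VECTOR_SIZE,
                                      SKETCH_PAGE_SIZE);

    return sketch_align_up(ro_start + rodata, SKETCH_PAGE_SIZE);
}
```

+With 4000 bytes each of code and read-only data, both layouts place the
+read-write data at offset 12288, so the separation costs nothing. With 1000
+bytes of code and 500 bytes of read-only data, the offsets are 4096 and 8192
+respectively, so one extra page is used.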
+This build flag is disabled by default, minimising memory footprint. On Arm
+platforms, it is enabled.
+Publish and Subscribe Framework
+The Publish and Subscribe Framework allows EL3 components to define and publish
+events, to which other EL3 components can subscribe.
+The following macros are provided by the framework:
+-  ``REGISTER_PUBSUB_EVENT(event)``: Defines an event, and takes one argument,
+   the event name, which must be a valid C identifier. All calls to
+   ``REGISTER_PUBSUB_EVENT`` macro must be placed in the file
+   ``pubsub_events.h``.
+-  ``PUBLISH_EVENT_ARG(event, arg)``: Publishes a defined event, by iterating
+   subscribed handlers and calling them in turn. The handlers will be passed the
+   parameter ``arg``. The expected use-case is to broadcast an event.
+-  ``PUBLISH_EVENT(event)``: Like ``PUBLISH_EVENT_ARG``, except that the value
+   ``NULL`` is passed to subscribed handlers.
+-  ``SUBSCRIBE_TO_EVENT(event, handler)``: Registers the ``handler`` to
+   subscribe to ``event``. The handler will be executed whenever the ``event``
+   is published.
+-  ``for_each_subscriber(event, subscriber)``: Iterates through all handlers
+   subscribed for ``event``. ``subscriber`` must be a local variable of type
+   ``pubsub_cb_t *``, and will point to each subscribed handler in turn during
+   iteration. This macro can be used for those patterns that none of the
+   ``PUBLISH_EVENT_*()`` macros cover.
+Publishing an event that wasn't defined using ``REGISTER_PUBSUB_EVENT`` will
+result in a build error. Subscribing to an undefined event, however, won't.
+Subscribed handlers must be of type ``pubsub_cb_t``, with the following function
+signature:
+   typedef void* (*pubsub_cb_t)(const void *arg);
+There may be an arbitrary number of handlers registered to the same event. The
+order in which subscribed handlers are notified when that event is published is
+not defined. Subscribed handlers may be executed in any order; handlers should
+not assume any relative ordering amongst them.
+Publishing an event on a PE will result in subscribed handlers executing on that
+PE only; it won't cause handlers to execute on a different PE.
+Note that publishing an event on a PE blocks until all the subscribed handlers
+finish executing on the PE.
+TF-A generic code publishes and subscribes to some events within. Platform
+ports are discouraged from subscribing to them. These events may be withdrawn,
+renamed, or have their semantics altered in the future. Platforms may however
+register, publish, and subscribe to platform-specific events.
+Publish and Subscribe Example
+A publisher that wants to publish event ``foo`` would:
+-  Define the event ``foo`` in the ``pubsub_events.h``.
+   ::
+       REGISTER_PUBSUB_EVENT(foo);
+-  Depending on the nature of event, use one of ``PUBLISH_EVENT_*()`` macros to
+   publish the event at the appropriate path and time of execution.
+A subscriber that wants to subscribe to event ``foo`` published above would
+implement:
+.. code:: c
+    void *foo_handler(const void *arg)
+    {
+         void *result = NULL;
+         /* Do handling ... */
+         return result;
+    }
+    SUBSCRIBE_TO_EVENT(foo, foo_handler);
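+To experiment with the publishing semantics outside TF-A, the behaviour can be
+modelled with a plain array of handlers. This is only a host-side model under
+stated assumptions: the array and the ``sketch_`` names are hypothetical, and
+the real framework does not necessarily store its subscribers this way. The
+model does keep two documented properties: handlers run on the calling PE, and
+the publish call returns only after every handler has finished.

```c
#include <stddef.h>

/* Same handler signature as the framework's pubsub_cb_t. */
typedef void *(*pubsub_cb_t)(const void *arg);

/*
 * Hypothetical model of publishing: call every subscriber in turn with
 * 'arg' and return how many handlers were invoked.
 */
static size_t sketch_publish(const pubsub_cb_t *subscribers, size_t count,
                             const void *arg)
{
    size_t called = 0;

    for (size_t i = 0; i < count; i++) {
        (void)subscribers[i](arg); /* return value ignored in this model */
        called++;
    }
    return called;
}

static int sketch_call_count;
static int sketch_sum;

/* Example handler: counts how often the event was delivered. */
static void *sketch_count_handler(const void *arg)
{
    (void)arg;
    sketch_call_count++;
    return NULL;
}

/* Example handler: accumulates an int payload, if one was passed. */
static void *sketch_sum_handler(const void *arg)
{
    if (arg != NULL) {
        sketch_sum += *(const int *)arg;
    }
    return NULL;
}
```

+Since the notification order is not defined, handlers written against the real
+framework should behave the same whichever position they occupy in such a list.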
+Reclaiming the BL31 initialization code
+A significant amount of the code used for the initialization of BL31 is never
+needed again after boot time. In order to reduce the runtime memory
+footprint, the memory used for this code can be reclaimed after initialization
+has finished and be used for runtime data.
+The build option ``RECLAIM_INIT_CODE`` can be set to mark this boot time code
+with a ``.text.init.*`` attribute which can be filtered and placed suitably
+within the BL image for later reclamation by the platform. The platform can
+specify the filter and the memory region for this init section in BL31 via the
+plat.ld.S linker script. For example, on the FVP, this section is placed
+overlapping the secondary CPU stacks so that after the cold boot is done, this
+memory can be reclaimed for the stacks. The init memory section is initially
+mapped with ``RO``, ``EXECUTE`` attributes. After BL31 initialization has
+completed, the FVP changes the attributes of this section to ``RW``,
+``EXECUTE_NEVER`` allowing it to be used for runtime data. The memory attributes
+are changed within the ``bl31_plat_runtime_setup`` platform hook. The init
+section can be reclaimed for any data which is accessed after cold
+boot initialization and it is up to the platform to make the decision.
+Performance Measurement Framework
+The Performance Measurement Framework (PMF) facilitates collection of
+timestamps by registered services and provides interfaces to retrieve them
+from within TF-A. A platform can choose to expose appropriate SMCs to
+retrieve these collected timestamps.
+By default, the global physical counter is used for the timestamp
+value and is read via ``CNTPCT_EL0``. The framework also allows timestamps
+captured by other CPUs to be retrieved.
+Timestamp identifier format
+A PMF timestamp is uniquely identified across the system via the
+timestamp ID or ``tid``. The ``tid`` is composed as follows:
+    Bits 0-7: The local timestamp identifier.
+    Bits 8-9: Reserved.
+    Bits 10-15: The service identifier.
+    Bits 16-31: Reserved.
+#. The service identifier. Each PMF service is identified by a
+   service name and a service identifier. Both the service name and
+   identifier are unique within the system as a whole.
+#. The local timestamp identifier. This identifier is unique within a given
+   service.
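+The ``tid`` layout above can be expressed with a few helpers. This is a sketch
+with hypothetical names, derived only from the bit positions listed above; it
+is not copied from ``pmf.h``.

```c
#include <stdint.h>

#define SKETCH_TID_LOCAL_MASK  0xFFU /* bits 0-7: local timestamp id */
#define SKETCH_TID_SVC_SHIFT   10U   /* bits 10-15: service id */
#define SKETCH_TID_SVC_MASK    0x3FU

/* Compose a tid from a service id and a local timestamp id. */
static inline uint32_t sketch_make_tid(uint32_t svc_id, uint32_t local_id)
{
    return ((svc_id & SKETCH_TID_SVC_MASK) << SKETCH_TID_SVC_SHIFT) |
           (local_id & SKETCH_TID_LOCAL_MASK);
}

/* Extract the service identifier from a tid. */
static inline uint32_t sketch_tid_service(uint32_t tid)
{
    return (tid >> SKETCH_TID_SVC_SHIFT) & SKETCH_TID_SVC_MASK;
}

/* Extract the local timestamp identifier from a tid. */
static inline uint32_t sketch_tid_local(uint32_t tid)
{
    return tid & SKETCH_TID_LOCAL_MASK;
}
```

+For instance, service 1 with local timestamp 2 yields the ``tid`` value
+``0x402``; the reserved bits stay zero.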
+Registering a PMF service
+To register a PMF service, the ``PMF_REGISTER_SERVICE()`` macro from ``pmf.h``
+is used. The arguments required are the service name, the service ID,
+the total number of local timestamps to be captured and a set of flags.
+The ``flags`` field can be specified as a bitwise-OR of the following values:
+    PMF_STORE_ENABLE: The timestamp is stored in memory for later retrieval.
+    PMF_DUMP_ENABLE: The timestamp is dumped on the serial console.
+The ``PMF_REGISTER_SERVICE()`` reserves memory to store captured
+timestamps in a PMF specific linker section at build time.
+Additionally, it defines necessary functions to capture and
+retrieve a particular timestamp for the given service at runtime.
+The macro ``PMF_REGISTER_SERVICE()`` only enables capturing PMF timestamps
+from within TF-A. In order to retrieve timestamps from outside of TF-A, the
+``PMF_REGISTER_SERVICE_SMC()`` macro must be used instead. This macro
+accepts the same set of arguments as the ``PMF_REGISTER_SERVICE()``
+macro but additionally supports retrieving timestamps using SMCs.
+Capturing a timestamp
+PMF timestamps are stored in a per-service timestamp region. On a
+system with multiple CPUs, each timestamp is captured and stored
+in a per-CPU cache line aligned memory region.
+Having registered the service, the ``PMF_CAPTURE_TIMESTAMP()`` macro can be
+used to capture a timestamp at the location where it is used. The macro
+takes the service name, a local timestamp identifier and a flag as arguments.
+The ``flags`` field argument can be zero, or ``PMF_CACHE_MAINT`` which
+instructs PMF to do cache maintenance following the capture. Cache
+maintenance is required if any of the service's timestamps are captured
+with data cache disabled.
+To capture a timestamp in assembly code, the caller should use
+``pmf_calc_timestamp_addr`` macro (defined in ``pmf_asm_macros.S``) to
+calculate the address of where the timestamp would be stored. The
+caller should then read ``CNTPCT_EL0`` register to obtain the timestamp
+and store it at the determined address for later retrieval.
+Retrieving a timestamp
+From within TF-A, timestamps for individual CPUs can be retrieved using either
+These macros accept the CPU's MPIDR value, or its ordinal position
+From outside TF-A, timestamps for individual CPUs can be retrieved by calling
+into ``pmf_smc_handler()``.
+.. code:: c
+    Interface : pmf_smc_handler()
+    Argument  : unsigned int smc_fid, u_register_t x1,
+                u_register_t x2, u_register_t x3,
+                u_register_t x4, void *cookie,
+                void *handle, u_register_t flags
+    Return    : uintptr_t
+    smc_fid: Holds the SMC identifier which is either `PMF_SMC_GET_TIMESTAMP_32`
+        when the caller of the SMC is running in AArch32 mode
+        or `PMF_SMC_GET_TIMESTAMP_64` when the caller is running in AArch64 mode.
+    x1: Timestamp identifier.
+    x2: The `mpidr` of the CPU for which the timestamp has to be retrieved.
+        This can be the `mpidr` of a different core to the one initiating
+        the SMC.  In that case, service specific cache maintenance may be
+        required to ensure the updated copy of the timestamp is returned.
+    x3: A flags value that is either 0 or `PMF_CACHE_MAINT`.  If
+        `PMF_CACHE_MAINT` is passed, then the PMF code will perform a
+        cache invalidate before reading the timestamp.  This ensures
+        an updated copy is returned.
+The remaining arguments, ``x4``, ``cookie``, ``handle`` and ``flags`` are unused
+in this implementation.
+PMF code structure
+#. ``pmf_main.c`` consists of core functions that implement service registration,
+   initialization, storing, dumping and retrieving timestamps.
+#. ``pmf_smc.c`` contains the SMC handling for registered PMF services.
+#. ``pmf.h`` contains the public interface to Performance Measurement Framework.
+#. ``pmf_asm_macros.S`` consists of macros to facilitate capturing timestamps in
+   assembly code.
+#. ``pmf_helpers.h`` is an internal header used by ``pmf.h``.
+Armv8-A Architecture Extensions
+TF-A makes use of Armv8-A Architecture Extensions where applicable. This
+section lists the usage of Architecture Extensions, and build flags
+controlling them.
+In general, and unless individually mentioned, the build options
+``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` select the Architecture Extension to
+target when building TF-A. Subsequent Arm Architecture Extensions are backward
+compatible with previous versions.
+The build system only requires that ``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` have a
+valid numeric value. These build options only control whether or not
+Architecture Extension-specific code is included in the build. Otherwise, TF-A
+targets the base Armv8.0-A architecture; i.e. as if ``ARM_ARCH_MAJOR`` == 8
+and ``ARM_ARCH_MINOR`` == 0, which are also their respective default values.
+See also the *Summary of build options* in `User Guide`_.
+For details on the Architecture Extension and available features, please refer
+to the respective Architecture Extension Supplement.
+This Architecture Extension is targeted when ``ARM_ARCH_MAJOR`` > 8, or when
+``ARM_ARCH_MAJOR`` == 8 and ``ARM_ARCH_MINOR`` >= 1.
+-  The Compare and Swap instruction is used to implement spinlocks. Otherwise,
+   the load-/store-exclusive instruction pair is used.
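+The version test that gates Architecture Extension-specific code can be written
+as a small comparison. This is a minimal sketch; ``sketch_arch_at_least()`` is a
+hypothetical helper that takes the values of ``ARM_ARCH_MAJOR`` and
+``ARM_ARCH_MINOR`` as ordinary arguments, so it is not the build system's own
+mechanism.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: true when the architecture version (major, minor)
 * is at least (req_major, req_minor).
 */
static bool sketch_arch_at_least(unsigned int major, unsigned int minor,
                                 unsigned int req_major, unsigned int req_minor)
{
    return (major > req_major) ||
           ((major == req_major) && (minor >= req_minor));
}
```

+For example, a spinlock implementation could select Compare and Swap when
+``sketch_arch_at_least(major, minor, 8, 1)`` holds and fall back to the
+load-/store-exclusive pair otherwise.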
+-  The presence of ARMv8.2-TTCNP is detected at runtime. When it is present, the
+   Common not Private (TTBRn_ELx.CnP) bit is enabled to indicate that multiple
+   Processing Elements in the same Inner Shareable domain use the same
+   translation table entries for a given stage of translation for a particular
+   translation regime.
+-  Pointer authentication features of Armv8.3-A are unconditionally enabled in
+   the Non-secure world so that lower ELs are allowed to use them without
+   causing a trap to EL3.
+   In order to enable the Secure world to use it, ``CTX_INCLUDE_PAUTH_REGS``
+   must be set to 1. This will add all pointer authentication system registers
+   to the context that is saved when doing a world switch.
+   TF-A itself supports pointer authentication at runtime. It
+   can be enabled by setting both options ``ENABLE_PAUTH`` and
+   ``CTX_INCLUDE_PAUTH_REGS`` to 1. This enables pointer authentication in BL1,
+   BL2, BL31, and the TSP if it is used.
+   These options are experimental features.
+   Note that Pointer Authentication is enabled for Non-secure world irrespective
+   of the value of these build flags if the CPU supports it.
+   If ``ARM_ARCH_MAJOR == 8`` and ``ARM_ARCH_MINOR >= 3`` the code footprint of
+   enabling PAuth is lower because the compiler will use the optimized
+   PAuth instructions rather than the backwards-compatible ones.
+This Architecture Extension is targeted when ``ARM_ARCH_MAJOR`` == 7.
+There are several Armv7-A extensions available. The TrustZone
+extension is mandatory to support the TF-A bootloader and runtime services.
+A platform implementing an Armv7-A system can define its target
+Cortex-A architecture through ``ARM_CORTEX_A<X> = yes`` in its
+```` script. For example, ``ARM_CORTEX_A15=yes`` for a
+Cortex-A15 target.
+A platform can also set ``ARM_WITH_NEON=yes`` to enable NEON support.
+Note that using NEON at runtime has constraints on the non-secure world context.
+TF-A does not yet provide VFP context management.
+The ``ARM_CORTEX_A<x>`` and ``ARM_WITH_NEON`` directives are used to set
+the toolchain target architecture directive.
+Alternatively, a platform may set the toolchain target architecture
+directive directly by defining ``MARCH32_DIRECTIVE``.
+   MARCH32_DIRECTIVE := -march=armv7-a
+Code Structure
+TF-A code is logically divided between the three boot loader stages mentioned
+in the previous sections. The code is also divided into the following
+categories (present as directories in the source code):
+-  **Platform specific.** Choice of architecture specific code depends upon
+   the platform.
+-  **Common code.** This is platform and architecture agnostic code.
+-  **Library code.** This code comprises functionality commonly used by all
+   other code. The PSCI implementation and other EL3 runtime frameworks reside
+   as Library components.
+-  **Stage specific.** Code specific to a boot stage.
+-  **Drivers.**
+-  **Services.** EL3 runtime services (e.g. SPD). Specific SPD services
+   reside in the ``services/spd`` directory (e.g. ``services/spd/tspd``).
+Each boot loader stage uses code from one or more of the above mentioned
+categories. Based upon the above, the code layout looks like this:
+    Directory    Used by BL1?    Used by BL2?    Used by BL31?
+    bl1          Yes             No              No
+    bl2          No              Yes             No
+    bl31         No              No              Yes
+    plat         Yes             Yes             Yes
+    drivers      Yes             No              Yes
+    common       Yes             Yes             Yes
+    lib          Yes             Yes             Yes
+    services     No              No              Yes
+The build system provides a non-configurable build option ``IMAGE_BLx`` for each
+boot loader stage (where x = BL stage), e.g. for BL1, ``IMAGE_BL1`` will be
+defined by the build system. This enables TF-A to compile certain code only
+for specific boot loader stages.
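+Stage-specific compilation with these options can look like the following
+sketch. ``sketch_stage_name()`` is a hypothetical function; only the
+``IMAGE_BL1``, ``IMAGE_BL2`` and ``IMAGE_BL31`` macro names follow the
+``IMAGE_BLx`` pattern described above.

```c
#include <string.h>

/*
 * Hypothetical function: returns the name of the boot loader stage this
 * translation unit was built for. Exactly one branch is compiled in.
 */
static const char *sketch_stage_name(void)
{
#if defined(IMAGE_BL1)
    return "BL1";
#elif defined(IMAGE_BL2)
    return "BL2";
#elif defined(IMAGE_BL31)
    return "BL31";
#else
    /* No IMAGE_BLx macro defined, e.g. when compiled outside the build system. */
    return "unknown";
#endif
}
```

+When the file is compiled without any ``IMAGE_BLx`` definition, the fallback
+branch is selected.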
+All assembler files have the ``.S`` extension. The linker source files for each
+boot stage have the extension ``.ld.S``. These are processed by GCC to create the
+linker scripts which have the extension ``.ld``.
+FDTs provide a description of the hardware platform and are used by the Linux
+kernel at boot time. These can be found in the ``fdts`` directory.
+.. [#] `Trusted Board Boot Requirements CLIENT (TBBR-CLIENT) Armv8-A (ARM DEN0006D)`_
+.. [#] `Power State Coordination Interface PDD`_
+.. [#] `SMC Calling Convention PDD`_
+.. [#] `TF-A Interrupt Management Design guide`_.
+*Copyright (c) 2013-2019, Arm Limited and Contributors. All rights reserved.*
+.. _Reset Design: ./reset-design.rst
+.. _Porting Guide: ../getting_started/porting-guide.rst
+.. _Firmware Update: ./firmware-update.rst
+.. _PSCI PDD:
+.. _SMC calling convention PDD:
+.. _PSCI Library integration guide: ../getting_started/psci-lib-integration-guide.rst
+.. _SMCCC:
+.. _PSCI:
+.. _Power State Coordination Interface PDD:
+.. _here: ../getting_started/psci-lib-integration-guide.rst
+.. _cpu-specific-build-macros.rst: ./cpu-specific-build-macros.rst
+.. _CPUBM: ./cpu-specific-build-macros.rst
+.. _Arm ARM:
+.. _User Guide: ../getting_started/user-guide.rst
+.. _SMC Calling Convention PDD:
+.. _TF-A Interrupt Management Design guide: ./interrupt-framework-design.rst
+.. _Xlat_tables design: xlat-tables-lib-v2-design.rst
+.. _Exception Handling Framework: exception-handling.rst
+.. _ROMLIB Design: romlib-design.rst
+.. _Trusted Board Boot Requirements CLIENT (TBBR-CLIENT) Armv8-A (ARM DEN0006D):
+.. |Image 1| image:: diagrams/rt-svc-descs-layout.png?raw=true
diff --git a/docs/design/index.rst b/docs/design/index.rst
new file mode 100644
index 0000000..a51a4eb
--- /dev/null
+++ b/docs/design/index.rst
@@ -0,0 +1,15 @@
+System Design
+.. toctree::
+   :maxdepth: 1
+   :caption: Contents
+   :numbered:
+   auth-framework
+   cpu-specific-build-macros
+   firmware-design
+   interrupt-framework-design
+   psci-pd-tree
+   reset-design
+   trusted-board-boot
diff --git a/docs/design/interrupt-framework-design.rst b/docs/design/interrupt-framework-design.rst
new file mode 100644
index 0000000..e4ec65a
--- /dev/null
+++ b/docs/design/interrupt-framework-design.rst
@@ -0,0 +1,1024 @@
+Trusted Firmware-A interrupt management design guide
+.. contents::
+This framework is responsible for managing interrupts routed to EL3. It also
+allows EL3 software to configure the interrupt routing behavior. Its main
+objective is to implement the following two requirements.
+#. It should be possible to route interrupts meant to be handled by secure
+   software (Secure interrupts) to EL3, when execution is in non-secure state
+   (normal world). The framework should then take care of handing control of
+   the interrupt to either software in EL3 or Secure-EL1 depending upon the
+   software configuration and the GIC implementation. This requirement ensures
+   that secure interrupts are under the control of the secure software with
+   respect to their delivery and handling without the possibility of
+   intervention from non-secure software.
+#. It should be possible to route interrupts meant to be handled by
+   non-secure software (Non-secure interrupts) to the last executed exception
+   level in the normal world when the execution is in secure world at
+   exception levels lower than EL3. This could be done with or without the
+   knowledge of software executing in Secure-EL1/Secure-EL0. The choice of
+   approach should be governed by the secure software. This requirement
+   ensures that non-secure software is able to execute in tandem with the
+   secure software without overriding it.
+Interrupt types
+The framework categorises an interrupt to be one of the following depending upon
+the exception level(s) it is handled in.
+#. Secure EL1 interrupt. This type of interrupt can be routed to EL3 or
+   Secure-EL1 depending upon the security state of the current execution
+   context. It is always handled in Secure-EL1.
+#. Non-secure interrupt. This type of interrupt can be routed to EL3,
+   Secure-EL1, Non-secure EL1 or EL2 depending upon the security state of the
+   current execution context. It is always handled in either Non-secure EL1
+   or EL2.
+#. EL3 interrupt. This type of interrupt can be routed to EL3 or Secure-EL1
+   depending upon the security state of the current execution context. It is
+   always handled in EL3.
+The following constants define the various interrupt types in the framework
+    #define INTR_TYPE_S_EL1      0
+    #define INTR_TYPE_EL3        1
+    #define INTR_TYPE_NS         2
+Routing model
+A type of interrupt can be either generated as an FIQ or an IRQ. The target
+exception level of an interrupt type is configured through the FIQ and IRQ bits
+in the Secure Configuration Register at EL3 (``SCR_EL3.FIQ`` and ``SCR_EL3.IRQ``
+bits). When ``SCR_EL3.FIQ``\ =1, FIQs are routed to EL3. Otherwise they are routed
+to the First Exception Level (FEL) capable of handling interrupts. When
+``SCR_EL3.IRQ``\ =1, IRQs are routed to EL3. Otherwise they are routed to the
+FEL. This register is configured independently by EL3 software for each security
+state prior to entry into a lower exception level in that security state.
+A routing model for a type of interrupt (generated as FIQ or IRQ) is defined as
+its target exception level for each security state. It is represented by a
+single bit for each security state. A value of ``0`` means that the interrupt
+should be routed to the FEL. A value of ``1`` means that the interrupt should be
+routed to EL3. A routing model is applicable only when execution is not in EL3.
+The default routing model for an interrupt type is to route it to the FEL in
+either security state.
+Valid routing models
+The framework considers certain routing models for each type of interrupt to be
+incorrect as they conflict with the requirements mentioned in Section 1. The
+following sub-sections describe all the possible routing models and specify
+which ones are valid or invalid. EL3 interrupts are currently supported only
+for GIC version 3.0 (Arm GICv3) and only the Secure-EL1 and Non-secure interrupt
+types are supported for GIC version 2.0 (Arm GICv2) (see `Assumptions in
+Interrupt Management Framework`_). The terminology used in the following
+sub-sections is explained below.
+#. **CSS**. Current Security State. ``0`` when secure and ``1`` when non-secure
+#. **TEL3**. Target Exception Level 3. ``0`` when targeted to the FEL. ``1`` when
+   targeted to EL3.
+Secure-EL1 interrupts
+#. **CSS=0, TEL3=0**. Interrupt is routed to the FEL when execution is in
+   secure state. This is a valid routing model as secure software is in
+   control of handling secure interrupts.
+#. **CSS=0, TEL3=1**. Interrupt is routed to EL3 when execution is in secure
+   state. This is a valid routing model as secure software in EL3 can
+   handover the interrupt to Secure-EL1 for handling.
+#. **CSS=1, TEL3=0**. Interrupt is routed to the FEL when execution is in
+   non-secure state. This is an invalid routing model as a secure interrupt
+   is not visible to the secure software which violates the motivation behind
+   the Arm Security Extensions.
+#. **CSS=1, TEL3=1**. Interrupt is routed to EL3 when execution is in
+   non-secure state. This is a valid routing model as secure software in EL3
+   can handover the interrupt to Secure-EL1 for handling.
+Non-secure interrupts
+#. **CSS=0, TEL3=0**. Interrupt is routed to the FEL when execution is in
+   secure state. This allows the secure software to trap non-secure
+   interrupts, perform its book-keeping and hand the interrupt to the
+   non-secure software through EL3. This is a valid routing model as secure
+   software is in control of how its execution is preempted by non-secure
+   interrupts.
+#. **CSS=0, TEL3=1**. Interrupt is routed to EL3 when execution is in secure
+   state. This is a valid routing model as secure software in EL3 can save
+   the state of software in Secure-EL1/Secure-EL0 before handing the
+   interrupt to non-secure software. This model requires additional
+   coordination between Secure-EL1 and EL3 software to ensure that the
+   former's state is correctly saved by the latter.
+#. **CSS=1, TEL3=0**. Interrupt is routed to FEL when execution is in
+   non-secure state. This is a valid routing model as a non-secure interrupt
+   is handled by non-secure software.
+#. **CSS=1, TEL3=1**. Interrupt is routed to EL3 when execution is in
+   non-secure state. This is an invalid routing model as there is no valid
+   reason to route the interrupt to EL3 software and then hand it back to
+   non-secure software for handling.
+EL3 interrupts
+#. **CSS=0, TEL3=0**. Interrupt is routed to the FEL when execution is in
+   Secure-EL1/Secure-EL0. This is a valid routing model as secure software
+   in Secure-EL1/Secure-EL0 is in control of how its execution is preempted
+   by EL3 interrupt and can handover the interrupt to EL3 for handling.
+   However, when ``EL3_EXCEPTION_HANDLING`` is ``1``, this routing model is
+   invalid as EL3 interrupts are unconditionally routed to EL3, and EL3
+   interrupts will always preempt Secure EL1/EL0 execution. See `exception
+   handling`__ documentation.
+   .. __: exception-handling.rst#interrupt-handling
+#. **CSS=0, TEL3=1**. Interrupt is routed to EL3 when execution is in
+   Secure-EL1/Secure-EL0. This is a valid routing model as secure software
+   in EL3 can handle the interrupt.
+#. **CSS=1, TEL3=0**. Interrupt is routed to the FEL when execution is in
+   non-secure state. This is an invalid routing model as a secure interrupt
+   is not visible to the secure software which violates the motivation behind
+   the Arm Security Extensions.
+#. **CSS=1, TEL3=1**. Interrupt is routed to EL3 when execution is in
+   non-secure state. This is a valid routing model as secure software in EL3
+   can handle the interrupt.
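+The validity rules listed above for the three interrupt types can be condensed
+into one predicate. This is a sketch under stated assumptions:
+``sketch_rm_is_valid()`` and the ``SKETCH_`` constants are hypothetical names,
+and the ``el3_exception_handling`` argument models the ``EL3_EXCEPTION_HANDLING``
+build option as a runtime value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same numeric values as the interrupt type constants in this document. */
#define SKETCH_INTR_TYPE_S_EL1  0U
#define SKETCH_INTR_TYPE_EL3    1U
#define SKETCH_INTR_TYPE_NS     2U

/*
 * css:  0 = secure, 1 = non-secure (current security state).
 * tel3: 0 = routed to the FEL, 1 = routed to EL3.
 */
static bool sketch_rm_is_valid(uint32_t type, uint32_t css, uint32_t tel3,
                               bool el3_exception_handling)
{
    if (css > 1U || tel3 > 1U)
        return false;

    switch (type) {
    case SKETCH_INTR_TYPE_S_EL1:
        /* Secure interrupt must not stay in the non-secure FEL. */
        return !(css == 1U && tel3 == 0U);
    case SKETCH_INTR_TYPE_NS:
        /* No reason to route a non-secure interrupt to EL3 from normal world. */
        return !(css == 1U && tel3 == 1U);
    case SKETCH_INTR_TYPE_EL3:
        if (css == 1U && tel3 == 0U)
            return false;
        /* With EL3 exception handling, EL3 interrupts always go to EL3. */
        if (el3_exception_handling && css == 0U && tel3 == 0U)
            return false;
        return true;
    default:
        return false;
    }
}
```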
+Mapping of interrupt type to signal
+The framework is meant to work with any interrupt controller implemented by a
+platform. An interrupt controller could generate a type of interrupt as either an
+FIQ or IRQ signal to the CPU depending upon the current security state. The
+mapping between the type and signal is known only to the platform. The framework
+uses this information to determine whether the IRQ or the FIQ bit should be
+programmed in ``SCR_EL3`` while applying the routing model for a type of
+interrupt. The platform provides this information through the
+``plat_interrupt_type_to_line()`` API (described in the
+`Porting Guide`_). For example, on the FVP port when the platform uses an Arm GICv2
+interrupt controller, Secure-EL1 interrupts are signaled through the FIQ signal
+while Non-secure interrupts are signaled through the IRQ signal. This applies
+when execution is in either security state.
+Effect of mapping of several interrupt types to one signal
+It should be noted that if more than one interrupt type maps to a single
+interrupt signal, and if any one of the interrupt types sets **TEL3=1** for a
+particular security state, then the interrupt signal will be routed to EL3 when in
+that security state. This means that all the other interrupt types using the
+same interrupt signal will be forced to the same routing model. This should be
+borne in mind when choosing the routing model for an interrupt type.
+For example, in Arm GICv3, when the execution context is Secure-EL1/
+Secure-EL0, both the EL3 and the non-secure interrupt types map to the FIQ
+signal. So if either one of the interrupt types sets the routing model so
+that **TEL3=1** when **CSS=0**, the FIQ bit in ``SCR_EL3`` will be programmed to
+route the FIQ signal to EL3 when executing in Secure-EL1/Secure-EL0, thereby
+effectively routing the other interrupt type also to EL3.
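+The effect of sharing a signal can be illustrated by combining the routing
+choices of several interrupt types into ``SCR_EL3`` bits for one security state.
+In this sketch the structure, function and table names are hypothetical; the
+bit positions used are those of ``SCR_EL3.IRQ`` (bit 1) and ``SCR_EL3.FIQ``
+(bit 2).

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SCR_IRQ_BIT  (1U << 1)  /* SCR_EL3.IRQ */
#define SKETCH_SCR_FIQ_BIT  (1U << 2)  /* SCR_EL3.FIQ */

/* Routing choice of one interrupt type in one security state. */
typedef struct {
    uint32_t line_bit;  /* signal the type maps to: IRQ or FIQ bit */
    uint32_t tel3;      /* 1 = route to EL3, 0 = route to the FEL */
} sketch_type_route_t;

/* A signal is routed to EL3 as soon as any type mapped to it sets TEL3=1. */
static uint32_t sketch_scr_routing_bits(const sketch_type_route_t *routes,
                                        size_t count)
{
    uint32_t scr = 0U;

    for (size_t i = 0U; i < count; i++) {
        if (routes[i].tel3 != 0U)
            scr |= routes[i].line_bit;
    }
    return scr;
}

/* GICv3-like case in secure state: EL3 and non-secure types share the FIQ. */
static const sketch_type_route_t sketch_shared_fiq[] = {
    { SKETCH_SCR_FIQ_BIT, 1U },  /* EL3 interrupts: TEL3=1 */
    { SKETCH_SCR_FIQ_BIT, 0U },  /* non-secure interrupts: TEL3=0 requested */
};

/* GICv2-like case: Secure-EL1 on FIQ (TEL3=0), non-secure on IRQ (TEL3=1). */
static const sketch_type_route_t sketch_split_lines[] = {
    { SKETCH_SCR_FIQ_BIT, 0U },
    { SKETCH_SCR_IRQ_BIT, 1U },
};
```

+In the shared-FIQ table the non-secure type asked for the FEL, yet the FIQ bit
+ends up set, so both types are routed to EL3.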
+Assumptions in Interrupt Management Framework
+The framework makes the following assumptions to simplify its implementation.
+#. Although the framework has support for 2 types of secure interrupts (EL3
+   and Secure-EL1 interrupt), only interrupt controller architectures
+   like Arm GICv3 have architectural support for EL3 interrupts in the form of
+   Group 0 interrupts. In Arm GICv2, all secure interrupts are assumed to be
+   handled in Secure-EL1. They can be delivered to Secure-EL1 via EL3 but they
+   cannot be handled in EL3.
+#. Interrupt exceptions (``PSTATE.I`` and ``F`` bits) are masked during execution
+   in EL3.
+#. Interrupt management: the following sections describe how interrupts are
+   managed by the interrupt handling framework. This entails:
+   #. Providing an interface to allow registration of a handler and
+      specification of the routing model for a type of interrupt.
+   #. Implementing support to hand control of an interrupt type to its
+      registered handler when the interrupt is generated.
+Both aspects of interrupt management involve various components in the secure
+software stack spanning from EL3 to Secure-EL1. These components are described
+in the section `Software components`_. The framework stores information
+associated with each type of interrupt in the following data structure.
+.. code:: c
+    typedef struct intr_type_desc {
+            interrupt_type_handler_t handler;
+            uint32_t flags;
+            uint32_t scr_el3[2];
+    } intr_type_desc_t;
+The ``flags`` field stores the routing model for the interrupt type in
+bits[1:0]. Bit[0] stores the routing model when execution is in the secure
+state. Bit[1] stores the routing model when execution is in the non-secure
+state. As mentioned in Section `Routing model`_, a value of ``0`` implies that
+the interrupt should be targeted to the FEL. A value of ``1`` implies that it
+should be targeted to EL3. The remaining bits are reserved and SBZ. The helper
+macro ``set_interrupt_rm_flag()`` should be used to set the bits in the
+``flags`` parameter.
+The ``scr_el3[2]`` field also stores the routing model but as a mapping of the
+model in the ``flags`` field to the corresponding bit in the ``SCR_EL3`` for each
+security state.
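+The bit layout of the ``flags`` field can be sketched with two helpers. The
+``sketch_`` functions and constants are hypothetical and only mirror the
+description above (bit[0] for the secure state, bit[1] for the non-secure
+state); they are not the framework's own macros.

```c
#include <stdint.h>

#define SKETCH_SECURE      0U  /* bit[0]: routing model in secure state */
#define SKETCH_NON_SECURE  1U  /* bit[1]: routing model in non-secure state */

/* Mark the interrupt type as routed to EL3 (TEL3=1) in the given state. */
static uint32_t sketch_set_rm_flag(uint32_t flags, uint32_t security_state)
{
    return flags | (1U << security_state);
}

/* Return 1 if routed to EL3 in the given state, 0 if routed to the FEL. */
static uint32_t sketch_get_rm_flag(uint32_t flags, uint32_t security_state)
{
    return (flags >> security_state) & 1U;
}
```

+Starting from ``0`` (the default model, FEL in both states), setting the
+non-secure bit yields ``0x2``, which is the model **CSS=0, TEL3=0** and
+**CSS=1, TEL3=1**.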
+The framework also depends upon the platform port to configure the interrupt
+controller to distinguish between secure and non-secure interrupts. The platform
+is expected to be aware of the secure devices present in the system and their
+associated interrupt numbers. It should configure the interrupt controller to
+enable the secure interrupts, ensure that their priority is always higher than
+the non-secure interrupts and target them to the primary CPU. It should also
+export the interface described in the `Porting Guide`_ to enable
+handling of interrupts.
+In the remainder of this document, for the sake of simplicity an Arm GICv2 system
+is considered and it is assumed that the FIQ signal is used to generate Secure-EL1
+interrupts and the IRQ signal is used to generate non-secure interrupts in either
+security state. EL3 interrupts are not considered.
+Software components
+Roles and responsibilities for interrupt management are sub-divided between the
+following components of software running in EL3 and Secure-EL1. Each component is
+briefly described below.
+#. EL3 Runtime Firmware. This component is common to all ports of TF-A.
+#. Secure Payload Dispatcher (SPD) service. This service interfaces with the
+   Secure Payload (SP) software which runs in Secure-EL1/Secure-EL0 and is
+   responsible for switching execution between secure and non-secure states.
+   A switch is triggered by a Secure Monitor Call and it uses the APIs
+   exported by the Context management library to implement this functionality.
+   Switching execution between the two security states is a requirement for
+   interrupt management as well. This results in a significant dependency on
+   the SPD service. TF-A implements an example Test Secure Payload Dispatcher
+   (TSPD) service.
+   An SPD service plugs into the EL3 runtime firmware and could be common to
+   some ports of TF-A.
+#. Secure Payload (SP). On a production system, the Secure Payload corresponds
+   to a Secure OS which runs in Secure-EL1/Secure-EL0. It interfaces with the
+   SPD service to manage communication with non-secure software. TF-A
+   implements an example secure payload called Test Secure Payload (TSP)
+   which runs only in Secure-EL1.
+   A Secure payload implementation could be common to some ports of TF-A,
+   just like the SPD service.
+Interrupt registration
+This section describes in detail the role of each software component (see
+`Software components`_) during the registration of a handler for an interrupt
+EL3 runtime firmware
+This component declares the following prototype for a handler of an interrupt type.
+.. code:: c
+        typedef uint64_t (*interrupt_type_handler_t)(uint32_t id,
+                                                     uint32_t flags,
+                                                     void *handle,
+                                                     void *cookie);
+The ``id`` parameter is reserved and could be used in the future for passing
+the interrupt id of the highest pending interrupt only if there is a foolproof
+way of determining the id. Currently it contains ``INTR_ID_UNAVAILABLE``.
+The ``flags`` parameter contains miscellaneous information as follows.
+#. Security state, bit[0]. This bit indicates the security state of the lower
+   exception level when the interrupt was generated. A value of ``1`` means
+   that it was in the non-secure state. A value of ``0`` indicates that it was
+   in the secure state. This bit can be used by the handler to ensure that
+   the interrupt was generated and routed as per the routing model specified
+   during registration.
+#. Reserved, bits[31:1]. The remaining bits are reserved for future use.
+The ``handle`` parameter points to the ``cpu_context`` structure of the current CPU
+for the security state specified in the ``flags`` parameter.
+Once the handler routine completes, execution will return to either the secure
+or non-secure state. The handler routine must return a pointer to
+``cpu_context`` structure of the current CPU for the target security state. On
+AArch64, this return value is currently ignored by the caller as the
+appropriate ``cpu_context`` to be used is expected to be set by the handler
+via the context management library APIs.
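+A handler following this prototype might check the security state bit before
+doing any work. The sketch below is illustrative only: all ``sketch_`` names are
+hypothetical, the context is a dummy object, and the handler assumes it was
+registered so that its interrupt type reaches EL3 only from the non-secure
+state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SECURE      0U
#define SKETCH_NON_SECURE  1U

typedef uint64_t (*sketch_handler_t)(uint32_t id, uint32_t flags,
                                     void *handle, void *cookie);

/* Dummy object standing in for a cpu_context structure. */
static int sketch_dummy_context;

/* Bit[0] of 'flags' holds the security state of the interrupted context. */
static uint32_t sketch_get_security_state(uint32_t flags)
{
    return flags & 1U;
}

/* Example handler: treats a routing model violation as a critical error. */
static uint64_t sketch_sel1_handler(uint32_t id, uint32_t flags,
                                    void *handle, void *cookie)
{
    (void)id;      /* reserved */
    (void)cookie;  /* unused in this sketch */

    assert(sketch_get_security_state(flags) == SKETCH_NON_SECURE);
    assert(handle != NULL);

    /* Return the context pointer, as the prototype requires. */
    return (uint64_t)(uintptr_t)handle;
}

/* The handler is compatible with the function pointer type. */
static const sketch_handler_t sketch_registered = sketch_sel1_handler;
```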
+A portable interrupt handler implementation must set the target context both in
+the structure pointed to by the returned pointer and via the context management
+library APIs. The handler should treat all error conditions as critical errors
+and take appropriate action within its implementation e.g. use assertion
+The runtime firmware provides the following API for registering a handler for a
+particular type of interrupt. A Secure Payload Dispatcher service should use
+this API to register a handler for Secure-EL1 and optionally for non-secure
+interrupts. This API also requires the caller to specify the routing model for
+the type of interrupt.
+.. code:: c
+    int32_t register_interrupt_type_handler(uint32_t type,
+                                            interrupt_type_handler_t handler,
+                                            uint64_t flags);
+The ``type`` parameter can be one of the three interrupt types listed above i.e.
+``INTR_TYPE_S_EL1``, ``INTR_TYPE_NS`` & ``INTR_TYPE_EL3``. The ``flags`` parameter
+is as described in Section 2.
+The function will return ``0`` upon a successful registration. It will return
+``-EALREADY`` in case a handler for the interrupt type has already been
+registered. If the ``type`` is unrecognised or the ``flags`` or the ``handler`` are
+invalid it will return ``-EINVAL``.
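+The return values described above can be modelled with a small table of
+handlers. This is a sketch, not the runtime firmware's code: the ``sketch_``
+names are hypothetical, routing model validation is reduced to a check that the
+reserved bits of ``flags`` are zero, and the order of the checks is an
+assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_INTR_TYPES  3U    /* S-EL1, EL3 and non-secure */
#define SKETCH_RM_FLAGS_MASK   0x3U  /* bits[1:0] hold the routing model */

typedef uint64_t (*sketch_handler_t)(uint32_t id, uint32_t flags,
                                     void *handle, void *cookie);

static sketch_handler_t sketch_handlers[SKETCH_MAX_INTR_TYPES];

/* Returns 0 on success, -EINVAL on bad arguments, -EALREADY if taken. */
static int32_t sketch_register_handler(uint32_t type,
                                       sketch_handler_t handler,
                                       uint32_t flags)
{
    if (type >= SKETCH_MAX_INTR_TYPES || handler == NULL)
        return -EINVAL;

    /* Reserved bits must be zero. */
    if ((flags & ~SKETCH_RM_FLAGS_MASK) != 0U)
        return -EINVAL;

    if (sketch_handlers[type] != NULL)
        return -EALREADY;

    sketch_handlers[type] = handler;
    return 0;
}

/* Placeholder handler used only to exercise the registration logic. */
static uint64_t sketch_noop_handler(uint32_t id, uint32_t flags,
                                    void *handle, void *cookie)
{
    (void)id; (void)flags; (void)handle; (void)cookie;
    return 0U;
}
```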
+Interrupt routing is governed by the configuration of the ``SCR_EL3.FIQ/IRQ`` bits
+prior to entry into a lower exception level in either security state. The
+context management library maintains a copy of the ``SCR_EL3`` system register for
+each security state in the ``cpu_context`` structure of each CPU. It exports the
+following APIs to let EL3 Runtime Firmware program and retrieve the routing
+model for each security state for the current CPU. The value of ``SCR_EL3`` stored
+in the ``cpu_context`` is used by the ``el3_exit()`` function to program the
+``SCR_EL3`` register prior to returning from the EL3 exception level.
+.. code:: c
+        uint32_t cm_get_scr_el3(uint32_t security_state);
+        void cm_write_scr_el3_bit(uint32_t security_state,
+                                  uint32_t bit_pos,
+                                  uint32_t value);
+``cm_get_scr_el3()`` returns the value of the ``SCR_EL3`` register for the specified
+security state of the current CPU. ``cm_write_scr_el3_bit()`` writes a ``0`` or ``1`` to
+the bit specified by ``bit_pos``. ``register_interrupt_type_handler()`` invokes
+the ``set_routing_model()`` API which programs the ``SCR_EL3`` according to the routing
+model using the ``cm_get_scr_el3()`` and ``cm_write_scr_el3_bit()`` APIs.
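+The behaviour of the two context management APIs can be modelled as a saved
+copy of ``SCR_EL3`` per security state. In this sketch the ``sketch_`` functions
+are hypothetical stand-ins that operate on a plain array for a single CPU
+rather than on the ``cpu_context`` structure.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECURE      0U
#define SKETCH_NON_SECURE  1U

/* Saved SCR_EL3 value for each security state of one CPU. */
static uint32_t sketch_scr_el3[2];

/* Return the saved SCR_EL3 value for the given security state. */
static uint32_t sketch_get_scr_el3(uint32_t security_state)
{
    assert(security_state <= SKETCH_NON_SECURE);
    return sketch_scr_el3[security_state];
}

/* Write 'value' (0 or 1) to bit 'bit_pos' of the saved SCR_EL3 copy. */
static void sketch_write_scr_el3_bit(uint32_t security_state,
                                     uint32_t bit_pos,
                                     uint32_t value)
{
    assert(security_state <= SKETCH_NON_SECURE);
    assert(bit_pos < 32U);
    assert(value <= 1U);

    sketch_scr_el3[security_state] &= ~(1U << bit_pos);
    sketch_scr_el3[security_state] |= value << bit_pos;
}
```

+Writing ``1`` to bit 2 (the FIQ bit) for the non-secure state leaves the secure
+copy untouched, which is how the routing model can differ per security state.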
+It is worth noting that in the current implementation of the framework, the EL3
+runtime firmware is responsible for programming the routing model. The SPD is
+responsible for ensuring that the routing model has been adhered to upon
+receiving an interrupt.
+.. _spd-int-registration:
+Secure payload dispatcher
+An SPD service is responsible for determining and maintaining the interrupt
+routing model supported by itself and the Secure Payload. It is also responsible
+for ferrying interrupts between secure and non-secure software depending upon
+the routing model. It could determine the routing model at build time or at
+runtime. It must use this information to register a handler for each interrupt
+type using the ``register_interrupt_type_handler()`` API in EL3 runtime firmware.
+If the routing model is not known to the SPD service at build time, then it must
+be provided by the SP as the result of its initialisation. The SPD should
+program the routing model only after SP initialisation has completed e.g. in the
+SPD initialisation function pointed to by the ``bl32_init`` variable.
+The SPD should determine the mechanism to pass control to the Secure Payload
+after receiving an interrupt from the EL3 runtime firmware. This information
+could either be provided to the SPD service at build time or by the SP at
+Test secure payload dispatcher behavior
+**Note:** where this document discusses ``TSP_NS_INTR_ASYNC_PREEMPT`` as being
+``1``, the same results also apply when ``EL3_EXCEPTION_HANDLING`` is ``1``.
+The TSPD only handles Secure-EL1 interrupts and is provided with the following
+routing model at build time.
+-  Secure-EL1 interrupts are routed to EL3 when execution is in non-secure
+   state and are routed to the FEL when execution is in the secure state,
+   i.e. **CSS=0, TEL3=0** and **CSS=1, TEL3=1** for Secure-EL1 interrupts.
+-  When the build flag ``TSP_NS_INTR_ASYNC_PREEMPT`` is zero, the default routing
+   model is used for non-secure interrupts. They are routed to the FEL in
+   either security state, i.e. **CSS=0, TEL3=0** and **CSS=1, TEL3=0** for
+   non-secure interrupts.
+-  When the build flag ``TSP_NS_INTR_ASYNC_PREEMPT`` is defined to 1, then the
+   non-secure interrupts are routed to EL3 when execution is in secure state,
+   i.e. **CSS=0, TEL3=1** for non-secure interrupts. This effectively preempts
+   Secure-EL1. The default routing model is used for non-secure interrupts in
+   non-secure state, i.e. **CSS=1, TEL3=0**.
+It performs the following actions in the ``tspd_init()`` function to fulfill the
+requirements mentioned earlier.
+#. It passes control to the Test Secure Payload to perform its
+   initialisation. The TSP provides the address of the vector table
+   ``tsp_vectors`` in the SP which also includes the handler for Secure-EL1
+   interrupts in the ``sel1_intr_entry`` field. The TSPD passes control to the TSP at
+   this address when it receives a Secure-EL1 interrupt.
+   The handover agreement between the TSP and the TSPD requires that the TSPD
+   masks all interrupts (``PSTATE.DAIF`` bits) when it calls
+   ``tsp_sel1_intr_entry()``. The TSP has to preserve the callee saved general
+   purpose, SP_EL1/Secure-EL0, LR, VFP and system registers. It can use
+   ``x0-x18`` to enable its C runtime.
+#. The TSPD implements a handler function for Secure-EL1 interrupts. This
+   function is registered with the EL3 runtime firmware using the
+   ``register_interrupt_type_handler()`` API as follows
+   .. code:: c
+       /* Forward declaration */
+       interrupt_type_handler tspd_secure_el1_interrupt_handler;
+       int32_t rc, flags = 0;
+       set_interrupt_rm_flag(flags, NON_SECURE);
+       rc = register_interrupt_type_handler(INTR_TYPE_S_EL1,
+                                        tspd_secure_el1_interrupt_handler,
+                                        flags);
+       if (rc)
+           panic();
+#. When the build flag ``TSP_NS_INTR_ASYNC_PREEMPT`` is defined to 1, the TSPD
+   implements a handler function for non-secure interrupts. This function is
+   registered with the EL3 runtime firmware using the
+   ``register_interrupt_type_handler()`` API as follows
+   .. code:: c
+       /* Forward declaration */
+       interrupt_type_handler tspd_ns_interrupt_handler;
+       int32_t rc, flags = 0;
+       set_interrupt_rm_flag(flags, SECURE);
+       rc = register_interrupt_type_handler(INTR_TYPE_NS,
+                                       tspd_ns_interrupt_handler,
+                                       flags);
+       if (rc)
+           panic();
+.. _sp-int-registration:
+Secure payload
+A Secure Payload must implement an interrupt handling framework at Secure-EL1
+(Secure-EL1 IHF) to support its chosen interrupt routing model. Secure payload
+execution will alternate between the cases below.
+#. In the code where IRQ, FIQ or both interrupts are enabled, if an interrupt
+   type is targeted to the FEL, then it will be routed to the Secure-EL1
+   exception vector table. This is defined as the **asynchronous mode** of
+   handling interrupts. This mode applies to both Secure-EL1 and non-secure
+   interrupts.
+#. In the code where both interrupts are disabled, if an interrupt type is
+   targeted to the FEL, then execution will eventually migrate to the
+   non-secure state. Any non-secure interrupts will be handled as described
+   in the routing model where **CSS=1 and TEL3=0**. Secure-EL1 interrupts
+   will be routed to EL3 (as per the routing model where **CSS=1 and
+   TEL3=1**) where the SPD service will hand them to the SP. This is defined
+   as the **synchronous mode** of handling interrupts.
+The interrupt handling framework implemented by the SP should support one or
+both of these interrupt handling models depending upon the chosen routing model.
+The following list briefly describes how the choice of a valid routing model
+(see `Valid routing models`_) affects the implementation of the Secure-EL1
+IHF. If the choice of the interrupt routing model is not known to the SPD
+service at compile time, then the SP should pass this information to the SPD
+service at runtime during its initialisation phase.
+As mentioned earlier, an Arm GICv2 system is considered and it is assumed that
+the FIQ signal is used to generate Secure-EL1 interrupts and the IRQ signal
+is used to generate non-secure interrupts in either security state.
+Secure payload IHF design w.r.t secure-EL1 interrupts
+#. **CSS=0, TEL3=0**. If ``PSTATE.F=0``, Secure-EL1 interrupts will be
+   triggered at one of the Secure-EL1 FIQ exception vectors. The Secure-EL1
+   IHF should implement support for handling FIQ interrupts asynchronously.
+   If ``PSTATE.F=1`` then Secure-EL1 interrupts will be handled as per the
+   synchronous interrupt handling model. The SP could implement this scenario
+   by exporting a separate entrypoint for Secure-EL1 interrupts to the SPD
+   service during the registration phase. The SPD service would also need to
+   know the state of the system, general purpose and the ``PSTATE`` registers
+   in which it should arrange to return execution to the SP. The SP should
+   provide this information in an implementation defined way during the
+   registration phase if it is not known to the SPD service at build time.
+#. **CSS=1, TEL3=1**. Interrupts are routed to EL3 when execution is in
+   non-secure state. They should be handled through the synchronous interrupt
+   handling model as described in 1. above.
+#. **CSS=0, TEL3=1**. Secure-EL1 interrupts are routed to EL3 when execution
+   is in secure state. They will not be visible to the SP. The ``PSTATE.F`` bit
+   in Secure-EL1/Secure-EL0 will not mask FIQs. The EL3 runtime firmware will
+   call the handler registered by the SPD service for Secure-EL1 interrupts.
+   Secure-EL1 IHF should then handle all Secure-EL1 interrupts through the
+   synchronous interrupt handling model described in 1. above.
+Secure payload IHF design w.r.t non-secure interrupts
+#. **CSS=0, TEL3=0**. If ``PSTATE.I=0``, non-secure interrupts will be
+   triggered at one of the Secure-EL1 IRQ exception vectors. The Secure-EL1
+   IHF should co-ordinate with the SPD service to transfer execution to the
+   non-secure state where the interrupt should be handled, e.g. the SP could
+   allocate a function identifier to issue a SMC64 or SMC32 to the SPD
+   service which indicates that the SP execution has been preempted by a
+   non-secure interrupt. If this function identifier is not known to the SPD
+   service at compile time then the SP could provide it during the
+   registration phase.
+   If ``PSTATE.I=1`` then the non-secure interrupt will pend until execution
+   resumes in the non-secure state.
+#. **CSS=0, TEL3=1**. Non-secure interrupts are routed to EL3. They will not
+   be visible to the SP. The ``PSTATE.I`` bit in Secure-EL1/Secure-EL0 will
+   have no effect. The SPD service should register a non-secure interrupt
+   handler which should save the SP state correctly and resume execution in
+   the non-secure state where the interrupt will be handled. The Secure-EL1
+   IHF does not need to take any action.
+#. **CSS=1, TEL3=0**. Non-secure interrupts are handled in the FEL in
+   non-secure state (EL1/EL2) and are not visible to the SP. This routing
+   model does not affect the SP behavior.
+A Secure Payload must also ensure that all Secure-EL1 interrupts are correctly
+configured at the interrupt controller by the platform port of the EL3 runtime
+firmware. It should configure any additional Secure-EL1 interrupts which the EL3
+runtime firmware is not aware of through its platform port.
+Test secure payload behavior
+The routing model for Secure-EL1 and non-secure interrupts chosen by the TSP is
+described in Section `Secure Payload Dispatcher`__. It is known to the TSPD
+service at build time.
+.. __: #spd-int-registration
+The TSP implements an entrypoint (``tsp_sel1_intr_entry()``) for handling Secure-EL1
+interrupts taken in non-secure state and routed through the TSPD service
+(synchronous handling model). It passes the reference to this entrypoint via
+``tsp_vectors`` to the TSPD service.
+The TSP also replaces the default exception vector table referenced through the
+``early_exceptions`` variable, with a vector table capable of handling FIQ and IRQ
+exceptions taken at the same (Secure-EL1) exception level. This table is
+referenced through the ``tsp_exceptions`` variable and programmed into the
+VBAR_EL1. It caters for the asynchronous handling model.
+The TSP also programs the Secure Physical Timer in the Arm Generic Timer block
+to raise a periodic interrupt (every half a second) for the purpose of testing
+interrupt management across all the software components listed in
+`Software components`_.
+Interrupt handling
+This section describes in detail the role of each software component (see
+Section `Software components`_) in handling an interrupt of a particular type.
+EL3 runtime firmware
+The EL3 runtime firmware populates the IRQ and FIQ exception vectors referenced
+by the ``runtime_exceptions`` variable as follows.
+#. IRQ and FIQ exceptions taken from the current exception level with
+   ``SP_EL0`` or ``SP_EL3`` are reported as irrecoverable error conditions. As
+   mentioned earlier, EL3 runtime firmware always executes with the
+   ``PSTATE.I`` and ``PSTATE.F`` bits set.
+#. The following text describes how the IRQ and FIQ exceptions taken from a
+   lower exception level using AArch64 or AArch32 are handled.
+When an interrupt is generated, the vector for each interrupt type is
+responsible for:
+#. Saving the entire general purpose register context (x0-x30) immediately
+   upon exception entry. The registers are saved in the per-cpu ``cpu_context``
+   data structure referenced by the ``SP_EL3`` register.
+#. Saving the ``ELR_EL3``, ``SP_EL0`` and ``SPSR_EL3`` system registers in the
+   per-cpu ``cpu_context`` data structure referenced by the ``SP_EL3`` register.
+#. Switching to the C runtime stack by restoring the ``CTX_RUNTIME_SP`` value
+   from the per-cpu ``cpu_context`` data structure in ``SP_EL0`` and
+   executing the ``msr spsel, #0`` instruction.
+#. Determining the type of interrupt. Secure-EL1 interrupts will be signaled
+   at the FIQ vector. Non-secure interrupts will be signaled at the IRQ
+   vector. The platform should implement the following API to determine the
+   type of the pending interrupt.
+   .. code:: c
+       uint32_t plat_ic_get_interrupt_type(void);
+   It should return either ``INTR_TYPE_S_EL1`` or ``INTR_TYPE_NS``.
+#. Determining the handler for the type of interrupt that has been generated.
+   The following API has been added for this purpose.
+   .. code:: c
+       interrupt_type_handler get_interrupt_type_handler(uint32_t interrupt_type);
+   It returns the reference to the registered handler for this interrupt
+   type. The ``handler`` is retrieved from the ``intr_type_desc_t`` structure as
+   described in Section 2. ``NULL`` is returned if no handler has been
+   registered for this type of interrupt. This scenario is reported as an
+   irrecoverable error condition.
+#. Calling the registered handler function for the interrupt type generated.
+   The ``id`` parameter is set to ``INTR_ID_UNAVAILABLE`` currently. The id along
+   with the current security state and a reference to the ``cpu_context_t``
+   structure for the current security state are passed to the handler function
+   as its arguments.
+   The handler function returns a reference to the per-cpu ``cpu_context_t``
+   structure for the target security state.
+#. Calling ``el3_exit()`` to return from EL3 into a lower exception level in
+   the security state determined by the handler routine. The ``el3_exit()``
+   function is responsible for restoring the register context from the
+   ``cpu_context_t`` data structure for the target security state.
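+The handler lookup and call described in the steps above amount to a table
+lookup followed by an indirect call. The following is a minimal sketch of that
+flow, not the TF-A implementation: every ``sketch_`` name, the two-entry table
+and the simplified handler signature are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INTR_TYPE_S_EL1     0u
#define SKETCH_INTR_TYPE_NS        1u
#define SKETCH_MAX_INTR_TYPES      2u
#define SKETCH_INTR_ID_UNAVAILABLE 0xFFFFFFFFu

/* Simplified handler: receives the id, the source security state and the
 * current context, and returns the context to resume. */
typedef void *(*sketch_handler_t)(uint32_t id, uint32_t src_state, void *ctx);

static sketch_handler_t sketch_handlers[SKETCH_MAX_INTR_TYPES];

/* Register one handler per interrupt type; reject duplicates. */
int sketch_register_handler(uint32_t type, sketch_handler_t handler)
{
    if (type >= SKETCH_MAX_INTR_TYPES || handler == NULL)
        return -1;
    if (sketch_handlers[type] != NULL)
        return -2;
    sketch_handlers[type] = handler;
    return 0;
}

/* Look up and call the handler. NULL models the irrecoverable error case
 * where no handler has been registered for the interrupt type. */
void *sketch_dispatch(uint32_t type, uint32_t src_state, void *ctx)
{
    if (type >= SKETCH_MAX_INTR_TYPES || sketch_handlers[type] == NULL)
        return NULL;
    return sketch_handlers[type](SKETCH_INTR_ID_UNAVAILABLE, src_state, ctx);
}

/* Example handler that resumes the context it was given. */
void *sketch_echo_handler(uint32_t id, uint32_t src_state, void *ctx)
{
    (void)id;
    (void)src_state;
    return ctx;
}
```

+In real firmware the error case does not return to the caller; the ``NULL``
+result is only a convenient way to observe it in a sketch.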
+Secure payload dispatcher
+Interrupt entry
+The SPD service begins handling an interrupt when the EL3 runtime firmware calls
+the handler function for that type of interrupt. The SPD service is responsible
+for the following:
+#. Validating the interrupt. This involves ensuring that the interrupt was
+   generated according to the interrupt routing model specified by the SPD
+   service during registration. It should use the security state of the
+   exception level (passed in the ``flags`` parameter of the handler) where
+   the interrupt was taken from to determine this. If the interrupt is not
+   recognised then the handler should treat it as an irrecoverable error
+   condition.
+   An SPD service can register a handler for Secure-EL1 and/or non-secure
+   interrupts. A non-secure interrupt should never be routed to EL3 from
+   non-secure state. Also, if a routing model is chosen where Secure-EL1
+   interrupts are routed to S-EL1 when execution is in secure state, then an
+   S-EL1 interrupt should never be routed to EL3 from secure state. The handler
+   could use the security state flag to check this.
+#. Determining whether a context switch is required. This depends upon the
+   routing model and interrupt type. For non-secure and S-EL1 interrupts,
+   if the security state of the execution context where the interrupt was
+   generated is not the same as the security state required for handling
+   the interrupt, a context switch is required. The following 2 cases
+   require a context switch from secure to non-secure or vice-versa:
+   #. A Secure-EL1 interrupt taken from the non-secure state should be
+      routed to the Secure Payload.
+   #. A non-secure interrupt taken from the secure state should be routed
+      to the last known non-secure exception level.
+   The SPD service must save the system register context of the current
+   security state. It must then restore the system register context of the
+   target security state. It should use the ``cm_set_next_eret_context()`` API
+   to ensure that the next ``cpu_context`` to be restored is of the target
+   security state.
+   If the target state is secure then execution should be handed to the SP as
+   per the synchronous interrupt handling model it implements. A Secure-EL1
+   interrupt can be routed to EL3 while execution is in the SP. This implies
+   that SP execution can be preempted while handling an interrupt by
+   another higher priority Secure-EL1 interrupt or an EL3 interrupt. The SPD
+   service should be able to handle this preemption or manage secure interrupt
+   priorities before handing control to the SP.
+#. Setting the return value of the handler to the per-cpu ``cpu_context`` if
+   the interrupt has been successfully validated and is ready to be handled at
+   a lower exception level.
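+The validation and context switch decisions in the list above can be expressed
+as two small predicates. This is a sketch under stated assumptions: the routing
+model is taken to be a bit mask with one bit per source security state (set
+when the interrupt type is routed to EL3 from that state), and all ``sketch_``
+names are hypothetical rather than TF-A definitions.

```c
#include <stdbool.h>
#include <stdint.h>

enum sketch_state { SKETCH_SECURE = 0, SKETCH_NON_SECURE = 1 };
enum sketch_intr  { SKETCH_INTR_S_EL1 = 0, SKETCH_INTR_NS = 1 };

/* An interrupt may only arrive at EL3 from 'src' if the routing model
 * registered for its type routes it to EL3 from that state. */
bool sketch_intr_is_valid(uint32_t rm_flags, enum sketch_state src)
{
    return ((rm_flags >> (unsigned int)src) & 1u) != 0u;
}

/* Security state in which each interrupt type is ultimately handled. */
enum sketch_state sketch_target_state(enum sketch_intr type)
{
    return (type == SKETCH_INTR_S_EL1) ? SKETCH_SECURE : SKETCH_NON_SECURE;
}

/* A context switch is needed when the interrupt was taken in a state other
 * than the one that has to handle it. */
bool sketch_needs_context_switch(enum sketch_intr type, enum sketch_state src)
{
    return sketch_target_state(type) != src;
}
```

+With a Secure-EL1 routing model that only sets the non-secure bit, a
+Secure-EL1 interrupt arriving from secure state fails validation, which
+matches the irrecoverable error case described in step 1.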
+The routing model allows non-secure interrupts to interrupt Secure-EL1 when in
+secure state if it has been configured to do so. The SPD service and the SP
+should implement a mechanism for routing these interrupts to the last known
+exception level in the non-secure state. The former should save the SP context,
+restore the non-secure context and arrange for entry into the non-secure state
+so that the interrupt can be handled.
+Interrupt exit
+When the Secure Payload has finished handling a Secure-EL1 interrupt, it could
+return control back to the SPD service through a SMC32 or SMC64. The SPD service
+should handle this secure monitor call so that execution resumes in the
+exception level and the security state from where the Secure-EL1 interrupt was
+originally taken.
+Test secure payload dispatcher Secure-EL1 interrupt handling
+The example TSPD service registers a handler for Secure-EL1 interrupts taken
+from the non-secure state. During execution in S-EL1, the TSPD expects that the
+Secure-EL1 interrupts are handled in S-EL1 by TSP. Its handler
+``tspd_secure_el1_interrupt_handler()`` expects only to be invoked for Secure-EL1
+interrupts originating from the non-secure state. It takes the following actions
+upon being invoked:
+#. It uses the security state provided in the ``flags`` parameter to ensure
+   that the secure interrupt originated from the non-secure state. It asserts
+   if this is not the case.
+#. It saves the system register context for the non-secure state by calling
+   ``cm_el1_sysregs_context_save(NON_SECURE);``.
+#. It sets the ``ELR_EL3`` system register to ``tsp_sel1_intr_entry`` and sets the
+   ``SPSR_EL3.DAIF`` bits in the secure CPU context. It sets ``x0`` to
+   ``TSP_HANDLE_SEL1_INTR_AND_RETURN``. If the TSP was preempted earlier by a
+   non-secure interrupt during ``yielding`` SMC processing, it saves the
+   registers that will be overwritten, namely ``ELR_EL3`` and ``SPSR_EL3``, so
+   that it can re-enter the TSP for Secure-EL1 interrupt processing. It does not need to
+   save any other secure context since the TSP is expected to preserve it
+   (see section `Test secure payload dispatcher behavior`_).
+#. It restores the system register context for the secure state by calling
+   ``cm_el1_sysregs_context_restore(SECURE);``.
+#. It ensures that the secure CPU context is used to program the next
+   exception return from EL3 by calling ``cm_set_next_eret_context(SECURE);``.
+#. It returns the per-cpu ``cpu_context`` to indicate that the interrupt can
+   now be handled by the SP. ``x1`` is written with the value of ``elr_el3``
+   register for the non-secure state. This information is used by the SP for
+   debugging purposes.
+The figure below describes how the interrupt handling is implemented by the TSPD
+when a Secure-EL1 interrupt is generated while execution is in the non-secure
+state.
+|Image 1|
+The TSP issues an SMC with ``TSP_HANDLED_S_EL1_INTR`` as the function identifier to
+signal completion of interrupt handling.
+The TSPD service takes the following actions in ``tspd_smc_handler()`` function
+upon receiving an SMC with ``TSP_HANDLED_S_EL1_INTR`` as the function identifier:
+#. It ensures that the call originated from the secure state; otherwise,
+   execution returns to the non-secure state with ``SMC_UNK`` in ``x0``.
+#. It restores the saved ``ELR_EL3`` and ``SPSR_EL3`` system registers back to
+   the secure CPU context (see step 3 above) in case the TSP had been preempted
+   by a non-secure interrupt earlier.
+#. It restores the system register context for the non-secure state by
+   calling ``cm_el1_sysregs_context_restore(NON_SECURE)``.
+#. It ensures that the non-secure CPU context is used to program the next
+   exception return from EL3 by calling ``cm_set_next_eret_context(NON_SECURE)``.
+#. ``tspd_smc_handler()`` returns a reference to the non-secure ``cpu_context``
+   as the return value.
+Test secure payload dispatcher non-secure interrupt handling
+The TSP in Secure-EL1 can be preempted by a non-secure interrupt during
+``yielding`` SMC processing or by a higher priority EL3 interrupt during
+Secure-EL1 interrupt processing. When ``EL3_EXCEPTION_HANDLING`` is ``0``, only
+non-secure interrupts can cause preemption of TSP since there are no EL3
+interrupts in the system. With ``EL3_EXCEPTION_HANDLING=1`` however, any EL3
+interrupt may preempt Secure execution.
+It should be noted that while TSP is preempted, the TSPD only allows entry into
+the TSP either for Secure-EL1 interrupt handling or for resuming the preempted
+``yielding`` SMC in response to the ``TSP_FID_RESUME`` SMC from the normal world.
+(See Section `Implication of preempted SMC on Non-Secure Software`_).
+The non-secure interrupt triggered in Secure-EL1 during ``yielding`` SMC
+processing can be routed to either EL3 or Secure-EL1 and is controlled by build
+option ``TSP_NS_INTR_ASYNC_PREEMPT`` (see Section `Test secure payload
+dispatcher behavior`_). If the build option is set, the TSPD will set the
+routing model for the non-secure interrupt to be routed to EL3 from secure state
+i.e. **TEL3=1, CSS=0** and registers ``tspd_ns_interrupt_handler()`` as the
+non-secure interrupt handler. The ``tspd_ns_interrupt_handler()`` on being
+invoked ensures that the interrupt originated from the secure state and disables
+routing of non-secure interrupts from secure state to EL3. This is to prevent
+further preemption (by a non-secure interrupt) when TSP is reentered for
+handling Secure-EL1 interrupts that triggered while execution was in the normal
+world. The ``tspd_ns_interrupt_handler()`` then invokes
+``tspd_handle_sp_preemption()`` for further handling.
+If the ``TSP_NS_INTR_ASYNC_PREEMPT`` build option is zero (default), the default
+routing model for non-secure interrupt in secure state is in effect
+i.e. **TEL3=0, CSS=0**. During ``yielding`` SMC processing, the IRQ
+exceptions are unmasked i.e. ``PSTATE.I=0``, and a non-secure interrupt will
+trigger at Secure-EL1 IRQ exception vector. The TSP saves the general purpose
+register context and issues an SMC with ``TSP_PREEMPTED`` as the function
+identifier to signal preemption of TSP. The TSPD SMC handler,
+``tspd_smc_handler()``, ensures that the SMC call originated from the
+secure state; otherwise, execution returns to the non-secure state with
+``SMC_UNK`` in ``x0``. It then invokes ``tspd_handle_sp_preemption()`` for
+further handling.
+The ``tspd_handle_sp_preemption()`` function takes the following actions upon
+being invoked:
+#. It saves the system register context for the secure state by calling
+   ``cm_el1_sysregs_context_save(SECURE)``.
+#. It restores the system register context for the non-secure state by
+   calling ``cm_el1_sysregs_context_restore(NON_SECURE)``.
+#. It ensures that the non-secure CPU context is used to program the next
+   exception return from EL3 by calling ``cm_set_next_eret_context(NON_SECURE)``.
+#. It sets ``SMC_PREEMPTED`` in ``x0`` and returns to the non-secure state
+   after restoring the non-secure context.
+The Normal World is expected to resume the TSP after the ``yielding`` SMC
+preemption by issuing an SMC with ``TSP_FID_RESUME`` as the function identifier
+(see section `Implication of preempted SMC on Non-Secure Software`_).  The TSPD
+service takes the following actions in ``tspd_smc_handler()`` function upon
+receiving this SMC:
+#. It ensures that the call originated from the non-secure state. An
+   assertion is raised otherwise.
+#. It checks whether the TSP needs to be resumed, i.e. whether it was
+   preempted. It then saves the system register context for the non-secure
+   state by calling ``cm_el1_sysregs_context_save(NON_SECURE)``.
+#. It restores the secure context by calling
+   ``cm_el1_sysregs_context_restore(SECURE)``.
+#. It ensures that the secure CPU context is used to program the next
+   exception return from EL3 by calling ``cm_set_next_eret_context(SECURE)``.
+#. ``tspd_smc_handler()`` returns a reference to the secure ``cpu_context`` as the
+   return value.
+The figure below describes how the TSP/TSPD handle a non-secure interrupt when
+it is generated during execution in the TSP with ``PSTATE.I`` = 0 when the
+``TSP_NS_INTR_ASYNC_PREEMPT`` build flag is 0.
+|Image 2|
+Secure payload
+The SP should implement one or both of the synchronous and asynchronous
+interrupt handling models depending upon the interrupt routing model it has
+chosen (as described in section `Secure Payload`__).
+.. __: #sp-int-registration
+In the synchronous model, it should begin handling a Secure-EL1 interrupt after
+receiving control from the SPD service at an entrypoint agreed upon during build
+time or during the registration phase. Before handling the interrupt, the SP
+should save any Secure-EL1 system register context which is needed for resuming
+normal execution in the SP later e.g. ``SPSR_EL1``, ``ELR_EL1``. After handling
+the interrupt, the SP could return control back to the exception level and
+security state where the interrupt was originally taken from. The SP should use
+an SMC32 or SMC64 to ask the SPD service to do this.
+In the asynchronous model, the Secure Payload is responsible for handling
+non-secure and Secure-EL1 interrupts at the IRQ and FIQ vectors in its exception
+vector table when ``PSTATE.I`` and ``PSTATE.F`` bits are 0. As described earlier,
+when a non-secure interrupt is generated, the SP should coordinate with the SPD
+service to pass control back to the non-secure state in the last known exception
+level. This will allow the non-secure interrupt to be handled in the non-secure
+state.
+Test secure payload behavior
+The TSPD hands control of a Secure-EL1 interrupt to the TSP at the
+``tsp_sel1_intr_entry()`` entrypoint. The TSP handles the interrupt while ensuring that the
+handover agreement described in Section `Test secure payload dispatcher
+behavior`_ is maintained. It updates some statistics by calling
+``tsp_update_sync_sel1_intr_stats()``. It then calls
+``tsp_common_int_handler()`` which:
+#. Checks whether the interrupt is the secure physical timer interrupt. It
+   uses the platform API ``plat_ic_get_pending_interrupt_id()`` to get the
+   interrupt number. If it is not the secure physical timer interrupt, then
+   a higher priority interrupt has preempted it. In that case, it invokes
+   ``tsp_handle_preemption()`` to hand control back to EL3 by issuing
+   an SMC with ``TSP_PREEMPTED`` as the function identifier.
+#. Handles the secure timer interrupt by acknowledging it using the
+   ``plat_ic_acknowledge_interrupt()`` platform API, calling
+   ``tsp_generic_timer_handler()`` to reprogram the secure physical generic
+   timer and calling the ``plat_ic_end_of_interrupt()`` platform API to signal
+   end of interrupt processing.
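+The two steps above can be sketched as a single function that first inspects
+the pending interrupt and only acknowledges it when it is the timer interrupt.
+This is an illustrative sketch, not the TSP source: the ``sketch_`` and
+``mock_`` names, the interrupt ID value and the function-pointer table that
+stands in for the platform APIs are all assumptions.

```c
#include <stdint.h>

#define SKETCH_TIMER_IRQ 29u  /* hypothetical secure physical timer ID */

enum sketch_result { SKETCH_HANDLED, SKETCH_PREEMPTED };

/* Hypothetical platform hooks, passed in so the sketch is self-contained. */
struct sketch_plat_ops {
    uint32_t (*get_pending_id)(void);
    uint32_t (*acknowledge)(void);
    void (*reprogram_timer)(void);
    void (*end_of_interrupt)(uint32_t id);
};

enum sketch_result sketch_common_int_handler(const struct sketch_plat_ops *ops)
{
    /* Step 1: anything other than the timer means we were preempted; the
     * caller would then issue the preemption SMC. */
    if (ops->get_pending_id() != SKETCH_TIMER_IRQ)
        return SKETCH_PREEMPTED;

    /* Step 2: acknowledge, service the timer, signal end of interrupt. */
    uint32_t id = ops->acknowledge();
    ops->reprogram_timer();
    ops->end_of_interrupt(id);
    return SKETCH_HANDLED;
}

/* Mock platform used to exercise the handler. */
static uint32_t mock_pending = SKETCH_TIMER_IRQ;
static unsigned int mock_eoi_count;

static uint32_t mock_get_pending(void) { return mock_pending; }
static uint32_t mock_ack(void) { return mock_pending; }
static void mock_timer(void) { }
static void mock_eoi(uint32_t id) { (void)id; mock_eoi_count++; }

static const struct sketch_plat_ops sketch_mock_ops = {
    mock_get_pending, mock_ack, mock_timer, mock_eoi
};
```

+Checking the pending ID before acknowledging matters here: a preempting
+interrupt is left pending so that the exception level that owns it can
+acknowledge it later.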
+The TSP passes control back to the TSPD by issuing an SMC64 with
+``TSP_HANDLED_S_EL1_INTR`` as the function identifier.
+The TSP handles interrupts under the asynchronous model as follows.
+#. Secure-EL1 interrupts are handled by calling the ``tsp_common_int_handler()``
+   function. The function has been described above.
+#. Non-secure interrupts are handled by calling the ``tsp_common_int_handler()``
+   function which ends up invoking ``tsp_handle_preemption()`` and issuing an
+   SMC64 with ``TSP_PREEMPTED`` as the function identifier. Execution resumes at
+   the instruction that follows this SMC instruction when the TSPD hands control
+   to the TSP in response to an SMC with ``TSP_FID_RESUME`` as the function
+   identifier from the non-secure state (see section `Test secure payload
+   dispatcher non-secure interrupt handling`_).
+Other considerations
+Implication of preempted SMC on Non-Secure Software
+A ``yielding`` SMC call to Secure payload can be preempted by a non-secure
+interrupt and the execution can return to the non-secure world for handling
+the interrupt (for details on ``yielding`` SMCs, refer to `SMC calling convention`_).
+In this case, the SMC call has not completed its execution and the execution
+must return back to the secure payload to resume the preempted SMC call.
+This can be achieved by issuing an SMC call which instructs to resume the
+preempted SMC.
+A ``fast`` SMC cannot be preempted and hence this case will not happen for
+a fast SMC call.
+In the Test Secure Payload implementation, ``TSP_FID_RESUME`` is designated
+as the resume SMC FID. It is important to note that ``TSP_FID_RESUME`` is a
+``yielding`` SMC which means it too can be preempted. The typical non-secure
+software sequence for issuing a ``yielding`` SMC would look like this,
+assuming ``PSTATE.I=0`` in the non-secure state:
+.. code:: c
+    int rc;
+    rc = smc(TSP_YIELD_SMC_FID, ...);     /* Issue a Yielding SMC call */
+    /* The pending non-secure interrupt is handled by the interrupt handler
+       and returns back here. */
+    while (rc == SMC_PREEMPTED) {       /* Check if the SMC call is preempted */
+        rc = smc(TSP_FID_RESUME);       /* Issue resume SMC call */
+    }
+``TSP_YIELD_SMC_FID`` is any ``yielding`` SMC function identifier and the ``smc()``
+function invokes an SMC call with the required arguments. The pending non-secure
+interrupt causes an IRQ exception and the IRQ handler registered at the
+exception vector handles the non-secure interrupt and returns. The return value
+from the SMC call is tested for ``SMC_PREEMPTED`` to check whether it is
+preempted. If it is, then the resume SMC call ``TSP_FID_RESUME`` is issued. The
+return value of the SMC call is tested again to check if it is preempted.
+This is done in a loop till the SMC call succeeds or fails. If a ``yielding``
+SMC is preempted, until it is resumed using ``TSP_FID_RESUME`` SMC and
+completed, the current TSPD prevents any other SMC call from re-entering
+TSP by returning ``SMC_UNK`` error.
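+The resume loop and the re-entry restriction can be exercised without secure
+firmware by replacing the SMC with a small model of the dispatcher. The sketch
+below is only a model: the ``sketch_`` names, the numeric return codes and the
+choice to reject a resume request when nothing is preempted are assumptions,
+not the TSPD implementation.

```c
#include <stdint.h>

#define SKETCH_SMC_OK          0
#define SKETCH_SMC_UNK       (-1)
#define SKETCH_SMC_PREEMPTED (-2)

#define SKETCH_FID_YIELD  1u
#define SKETCH_FID_RESUME 2u
#define SKETCH_FID_OTHER  3u

/* Model state: pending_irqs counts the non-secure interrupts that will
 * still preempt the yielding call before it can complete. */
struct sketch_tspd {
    int yield_active;
    int pending_irqs;
};

int sketch_smc(struct sketch_tspd *s, uint32_t fid)
{
    if (s->yield_active) {
        if (fid != SKETCH_FID_RESUME)
            return SKETCH_SMC_UNK;   /* re-entry blocked while preempted */
    } else if (fid == SKETCH_FID_RESUME) {
        return SKETCH_SMC_UNK;       /* assumption: nothing to resume */
    } else if (fid != SKETCH_FID_YIELD) {
        return SKETCH_SMC_OK;        /* unrelated call completes at once */
    }

    /* Start or continue the yielding call. */
    if (s->pending_irqs > 0) {
        s->pending_irqs--;
        s->yield_active = 1;
        return SKETCH_SMC_PREEMPTED;
    }
    s->yield_active = 0;
    return SKETCH_SMC_OK;
}

/* Non-secure caller: keep resuming until the call succeeds or fails. */
int sketch_run_yielding_smc(struct sketch_tspd *s)
{
    int rc = sketch_smc(s, SKETCH_FID_YIELD);

    while (rc == SKETCH_SMC_PREEMPTED)
        rc = sketch_smc(s, SKETCH_FID_RESUME);
    return rc;
}
```

+The loop terminates on any return value other than the preempted code, so a
+failure from the secure side is reported to the caller instead of being
+retried.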
+*Copyright (c) 2014-2019, Arm Limited and Contributors. All rights reserved.*
+.. _Porting Guide: ../getting_started/porting-guide.rst
+.. _SMC calling convention:
+.. |Image 1| image:: diagrams/sec-int-handling.png?raw=true
+.. |Image 2| image:: diagrams/non-sec-int-handling.png?raw=true
diff --git a/docs/design/psci-pd-tree.rst b/docs/design/psci-pd-tree.rst
new file mode 100644
index 0000000..2e2163a
--- /dev/null
+++ b/docs/design/psci-pd-tree.rst
@@ -0,0 +1,311 @@
+PSCI Power Domain Tree design
+.. contents::
+#. A platform must export the ``plat_get_aff_count()`` and
+   ``plat_get_aff_state()`` APIs to enable the generic PSCI code to
+   populate a tree that describes the hierarchy of power domains in the
+   system. This approach is inflexible because a change to the topology
+   requires a change in the code.
+   It would be much simpler for the platform to describe its power domain tree
+   in a data structure.
+#. The generic PSCI code generates MPIDRs in order to populate the power domain
+   tree. It also uses an MPIDR to find a node in the tree. The assumption that
+   a platform will use exactly the same MPIDRs as generated by the generic PSCI
+   code is not scalable. The use of an MPIDR also restricts the number of
+   levels in the power domain tree to four.
+   Therefore, there is a need to decouple allocation of MPIDRs from the
+   mechanism used to populate the power domain topology tree.
+#. The current arrangement of the power domain tree requires a binary search
+   over the sibling nodes at a particular level to find a specified power
+   domain node. During a power management operation, the tree is traversed from
+   a 'start' to an 'end' power level. The binary search is required to find the
+   node at each level. The natural way to perform this traversal is to
+   start from a leaf node and follow the parent node pointer to reach the end
+   level.
+   Therefore, there is a need to define data structures that implement the tree in
+   a way which facilitates such a traversal.
+#. The attributes of a core power domain differ from the attributes of power
+   domains at higher levels. For example, only a core power domain can be identified
+   using an MPIDR. There is no requirement to perform state coordination while
+   performing a power management operation on the core power domain.
+   Therefore, there is a need to implement the tree in a way which facilitates this
+   distinction between a leaf and non-leaf node and any associated
+   optimizations.
+Describing a power domain tree
+To fulfill requirement 1., the existing platform APIs
+``plat_get_aff_count()`` and ``plat_get_aff_state()`` have been
+removed. A platform must define an array of unsigned chars such that:
+#. The first entry in the array specifies the number of power domains at the
+   highest power level implemented in the platform. This caters for platforms
+   where the power domain tree does not have a single root node, for example,
+   the FVP has two cluster power domains at the highest level (1).
+#. Each subsequent entry corresponds to a power domain and contains the number
+   of power domains that are its direct children.
+#. The size of the array minus the first entry will be equal to the number of
+   non-leaf power domains.
+#. The value in each entry in the array is used to find the number of entries
+   to consider at the next level. The sum of the values (number of children) of
+   all the entries at a level specifies the number of entries in the array for
+   the next level.
+The following example power domain topology tree will be used to describe the
+above text further. The leaf and non-leaf nodes in this tree have been numbered
+separately.
+                                         +-+
+                                         |0|
+                                         +-+
+                                        /   \
+                                       /     \
+                                      /       \
+                                     /         \
+                                    /           \
+                                   /             \
+                                  /               \
+                                 /                 \
+                                /                   \
+                               /                     \
+                            +-+                       +-+
+                            |1|                       |2|
+                            +-+                       +-+
+                           /   \                     /   \
+                          /     \                   /     \
+                         /       \                 /       \
+                        /         \               /         \
+                     +-+           +-+         +-+           +-+
+                     |3|           |4|         |5|           |6|
+                     +-+           +-+         +-+           +-+
+            +---+-----+    +----+----+     +----+----+     +----+-----+-----+
+            |   |     |    |    |    |     |    |    |     |    |     |     |
+            |   |     |    |    |    |     |    |    |     |    |     |     |
+            v   v     v    v    v    v     v    v    v     v    v     v     v
+          +-+  +-+   +-+  +-+  +-+  +-+   +-+  +-+  +-+   +-+  +--+  +--+  +--+
+          |0|  |1|   |2|  |3|  |4|  |5|   |6|  |7|  |8|   |9|  |10|  |11|  |12|
+          +-+  +-+   +-+  +-+  +-+  +-+   +-+  +-+  +-+   +-+  +--+  +--+  +--+
+This tree is defined by the platform as the array described above as follows:
+        #define PLAT_NUM_POWER_DOMAINS       20
+        #define PLATFORM_CORE_COUNT          13
+        #define PSCI_NUM_NON_CPU_PWR_DOMAINS \
+                           (PLAT_NUM_POWER_DOMAINS - PLATFORM_CORE_COUNT)
+        unsigned char plat_power_domain_tree_desc[] = { 1, 2, 2, 2, 3, 3, 3, 4};
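+The relationship between the array and the two counts can be checked by
+walking the descriptor level by level, as rule 4 describes. The helper below
+is a hypothetical sketch (its name and signature are not part of the PSCI
+code); it assumes the caller knows how many non-leaf levels the tree has.

```c
#include <stddef.h>

/* Walk a descriptor laid out as described above. Returns the total number
 * of power domains (non-leaf plus core domains) and stores the number of
 * core domains in *core_count, or returns 0 if the array length does not
 * match the levels it describes. */
unsigned int sketch_count_power_domains(const unsigned char *desc, size_t len,
                                        unsigned int non_leaf_levels,
                                        unsigned int *core_count)
{
    if (len == 0 || non_leaf_levels == 0)
        return 0;

    size_t idx = 1;                /* first entry of the current level */
    unsigned int nodes = desc[0];  /* domains at the current level */
    unsigned int total = 0;

    for (unsigned int lvl = 0; lvl < non_leaf_levels; lvl++) {
        unsigned int children = 0;

        if (nodes > len - idx)
            return 0;              /* array too short for this level */
        for (unsigned int i = 0; i < nodes; i++)
            children += desc[idx + i];

        total += nodes;            /* count this level's domains */
        idx += nodes;
        nodes = children;          /* the children form the next level */
    }
    if (idx != len)
        return 0;                  /* unexpected trailing entries */

    *core_count = nodes;           /* the last level reached is the cores */
    return total + nodes;
}
```

+For the example array and three non-leaf levels, the walk visits 1, 2 and 4
+non-leaf domains and ends with 13 core domains, which gives the 20 power
+domains and 13 cores defined above.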
+Removing assumptions about MPIDRs used in a platform
+To fulfill requirement 2., it is assumed that the platform assigns a
+unique number (core index) between ``0`` and ``PLATFORM_CORE_COUNT - 1`` to each core
+power domain. MPIDRs could be allocated in any manner and will not be used to
+populate the tree.
+``plat_core_pos_by_mpidr(mpidr)`` will return the core index for the core
+corresponding to the MPIDR. It will return an error (-1) if an MPIDR is passed
+which is not allocated or corresponds to an absent core. The semantics of this
+platform API have changed since it is required to validate the passed MPIDR. It
+has been made a mandatory API as a result.
+Another mandatory API, ``plat_my_core_pos()`` has been added to return the core
+index for the calling core. This API provides a more lightweight mechanism to get
+the index since there is no need to validate the MPIDR of the calling core.
+The platform should assign the core indices (as illustrated in the diagram above)
+such that, if the core nodes are numbered from left to right, then the index
+for a core domain will be the same as the index returned by
+``plat_core_pos_by_mpidr()`` or ``plat_my_core_pos()`` for that core. This
+relationship allows the core nodes to be allocated in a separate array
+(requirement 4.) during ``psci_setup()`` in such an order that the index of the
+core in the array is the same as the return value from these APIs.
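The relationship between MPIDRs and core indices can be sketched as follows for the example tree. This is an illustration only: it assumes a hypothetical MPIDR layout in which Aff1 (bits [15:8]) selects one of the four level 1 domains and Aff0 (bits [7:0]) selects the core within it, and it assumes the caller passes only the affinity fields. ``example_core_pos_by_mpidr()`` is a made-up name and does not reproduce any platform port.

```c
#include <stdint.h>

/* Core counts of the four level 1 domains in the example tree. */
#define EXAMPLE_CLUSTER_COUNT 4U
static const unsigned int example_cores_per_cluster[EXAMPLE_CLUSTER_COUNT] = {
    3, 3, 3, 4
};

/*
 * Hypothetical MPIDR-to-index conversion.
 * Assumed layout: Aff0 = core within cluster, Aff1 = cluster, all other
 * bits zero. Returns the core index, or -1 for an unallocated MPIDR.
 */
int example_core_pos_by_mpidr(uint64_t mpidr)
{
    unsigned int core = (unsigned int)(mpidr & 0xffU);
    unsigned int cluster = (unsigned int)((mpidr >> 8) & 0xffU);
    unsigned int pos = 0;

    /* Reject anything outside the two affinity fields used here. */
    if ((mpidr & ~(uint64_t)0xffffU) != 0)
        return -1;

    if (cluster >= EXAMPLE_CLUSTER_COUNT)
        return -1;

    if (core >= example_cores_per_cluster[cluster])
        return -1;

    /* Cores are numbered left to right, so skip the preceding clusters. */
    for (unsigned int i = 0; i < cluster; i++)
        pos += example_cores_per_cluster[i];

    return (int)(pos + core);
}
```

With this numbering, the value returned for a core is also its position among the leaf nodes when they are counted from left to right, which is the property the text requires.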
+Dealing with holes in MPIDR allocation
+For platforms where the number of allocated MPIDRs is equal to the number of
+core power domains, for example, Juno and FVPs, the logic to convert an MPIDR to
+a core index should remain unchanged. Both Juno and FVP use a simple
+collision-proof hash function to do this.
+It is possible that on some platforms, the allocation of MPIDRs is not
+contiguous or certain cores have been disabled. This essentially means that the
+MPIDRs have been sparsely allocated, that is, the size of the range of MPIDRs
+used by the platform is not equal to the number of core power domains.
+The platform could adopt one of the following approaches to deal with this
+situation:
+#. Implement more complex logic to convert a valid MPIDR to a core index while
+   maintaining the relationship described earlier. This means that the power
+   domain tree descriptor will not describe any core power domains which are
+   disabled or absent. Entries will not be allocated in the tree for these
+   domains.
+#. Treat unallocated MPIDRs and disabled cores as absent but still describe them
+   in the power domain descriptor, that is, the number of core nodes described
+   is equal to the size of the range of MPIDRs allocated. This approach wastes
+   memory, since entries are allocated in the tree for absent cores, but it
+   allows simpler logic to convert an MPIDR to a core index.
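The first approach can be sketched with a lookup table that lists only the MPIDRs that are actually present, in left-to-right order. The table contents and the name ``example_sparse_core_pos()`` are hypothetical. A real port would likely prefer something faster than a linear search.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sparse allocation: core 1 of cluster 0 and core 0 of
 * cluster 1 are absent. The position of an MPIDR in this table is its
 * core index, so no entries are needed for the missing cores.
 */
static const uint64_t example_valid_mpidrs[] = {
    0x000, 0x002, 0x003, 0x101, 0x102
};

#define EXAMPLE_VALID_COUNT \
    (sizeof(example_valid_mpidrs) / sizeof(example_valid_mpidrs[0]))

/* Returns the core index for a present core, or -1 otherwise. */
int example_sparse_core_pos(uint64_t mpidr)
{
    for (size_t i = 0; i < EXAMPLE_VALID_COUNT; i++) {
        if (example_valid_mpidrs[i] == mpidr)
            return (int)i;
    }
    return -1;
}
```

In this sketch the core count is 5 even though the MPIDR range spans more values. The second approach would instead size the tree for the whole range and mark the missing cores as absent.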
+Traversing through and distinguishing between core and non-core power domains
+To fulfill requirements 3 and 4, separate data structures have been defined
+to represent leaf and non-leaf power domain nodes in the tree.
+.. code:: c
+    /*******************************************************************************
+     * The following two data structures implement the power domain tree. The tree
+     * is used to track the state of all the nodes i.e. power domain instances
+     * described by the platform. The tree consists of nodes that describe CPU power
+     * domains i.e. leaf nodes and all other power domains which are parents of a
+     * CPU power domain i.e. non-leaf nodes.
+     ******************************************************************************/
+    typedef struct non_cpu_pwr_domain_node {
+        /*
+         * Index of the first CPU power domain node level 0 which has this node
+         * as its parent.
+         */
+        unsigned int cpu_start_idx;
+        /*
+         * Number of CPU power domains which are siblings of the domain indexed
+         * by 'cpu_start_idx' i.e. all the domains in the range 'cpu_start_idx
+         * -> cpu_start_idx + ncpus' have this node as their parent.
+         */
+        unsigned int ncpus;
+        /* Index of the parent power domain node */
+        unsigned int parent_node;
+        -----
+    } non_cpu_pd_node_t;
+    typedef struct cpu_pwr_domain_node {
+        u_register_t mpidr;
+        /* Index of the parent power domain node */
+        unsigned int parent_node;
+        -----
+    } cpu_pd_node_t;
+The power domain tree is implemented as a combination of the following data
+structures:
+    non_cpu_pd_node_t psci_non_cpu_pd_nodes[PSCI_NUM_NON_CPU_PWR_DOMAINS];
+    cpu_pd_node_t psci_cpu_pd_nodes[PLATFORM_CORE_COUNT];
+Populating the power domain tree
+The ``populate_power_domain_tree()`` function in ``psci_setup.c`` implements the
+algorithm to parse the power domain descriptor exported by the platform to
+populate the two arrays. It is essentially a breadth-first-search. The nodes for
+each level starting from the root are laid out one after another in the
+``psci_non_cpu_pd_nodes`` and ``psci_cpu_pd_nodes`` arrays as follows:
+    psci_non_cpu_pd_nodes -> [[Level 3 nodes][Level 2 nodes][Level 1 nodes]]
+    psci_cpu_pd_nodes -> [Level 0 nodes]
+For the example power domain tree illustrated above, the ``psci_cpu_pd_nodes``
+will be populated as follows. The value in each entry is the index of the parent
+node. Other fields have been ignored for simplicity.
+                          +-------------+     ^
+                    CPU0  |      3      |     |
+                          +-------------+     |
+                    CPU1  |      3      |     |
+                          +-------------+     |
+                    CPU2  |      3      |     |
+                          +-------------+     |
+                    CPU3  |      4      |     |
+                          +-------------+     |
+                    CPU4  |      4      |     |
+                          +-------------+     |
+                    CPU5  |      4      |     | PLATFORM_CORE_COUNT
+                          +-------------+     |
+                    CPU6  |      5      |     |
+                          +-------------+     |
+                    CPU7  |      5      |     |
+                          +-------------+     |
+                    CPU8  |      5      |     |
+                          +-------------+     |
+                    CPU9  |      6      |     |
+                          +-------------+     |
+                    CPU10 |      6      |     |
+                          +-------------+     |
+                    CPU11 |      6      |     |
+                          +-------------+     |
+                    CPU12 |      6      |     v
+                          +-------------+
+The ``psci_non_cpu_pd_nodes`` array will be populated as follows. The value in
+each entry is the index of the parent node.
+                          +-------------+     ^
+                    PD0   |      -1     |     |
+                          +-------------+     |
+                    PD1   |      0      |     |
+                          +-------------+     |
+                    PD2   |      0      |     |
+                          +-------------+     |
+                    PD3   |      1      |     | PLAT_NUM_POWER_DOMAINS -
+                          +-------------+     | PLATFORM_CORE_COUNT
+                    PD4   |      1      |     |
+                          +-------------+     |
+                    PD5   |      2      |     |
+                          +-------------+     |
+                    PD6   |      2      |     |
+                          +-------------+     v
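The two tables above can be reproduced with a short breadth-first walk over the example descriptor. This is a simplified sketch and not the TF-A ``populate_power_domain_tree()`` implementation: it records parent indices only, assumes a maximum power level of 3, and trusts the descriptor to match the array sizes. All ``example_`` and ``EXAMPLE_`` names are hypothetical.

```c
/* Assumptions taken from the example tree in this document. */
#define EXAMPLE_MAX_PWR_LVL   3
#define EXAMPLE_NON_CPU_COUNT 7
#define EXAMPLE_CORE_COUNT    13

static const unsigned char example_tree_desc[] = { 1, 2, 2, 2, 3, 3, 3, 4 };

/* Parent index of each non-core node and of each core node. */
int example_non_cpu_parent[EXAMPLE_NON_CPU_COUNT];
int example_cpu_parent[EXAMPLE_CORE_COUNT];

/*
 * Descriptor entry k holds the child count of non-core node k - 1, where
 * entry 0 belongs to an implicit parent (index -1) of the root nodes.
 * No bounds checking is done: the descriptor is trusted in this sketch.
 */
void example_populate_parents(void)
{
    unsigned int desc_idx = 0;      /* descriptor entry being consumed */
    unsigned int nodes_at_lvl = 1;  /* starts with the implicit parent */
    unsigned int node_idx = 0;      /* next free slot in current array */
    int lvl;

    for (lvl = EXAMPLE_MAX_PWR_LVL; lvl >= 0; lvl--) {
        int *table = (lvl == 0) ? example_cpu_parent
                                : example_non_cpu_parent;
        unsigned int nodes_at_next_lvl = 0;

        /* Core nodes live in their own array, so restart the index. */
        if (lvl == 0)
            node_idx = 0;

        for (unsigned int i = 0; i < nodes_at_lvl; i++) {
            unsigned int children = example_tree_desc[desc_idx];

            for (unsigned int c = 0; c < children; c++)
                table[node_idx++] = (int)desc_idx - 1;

            nodes_at_next_lvl += children;
            desc_idx++;
        }
        nodes_at_lvl = nodes_at_next_lvl;
    }
}
```

After ``example_populate_parents()`` runs, ``example_cpu_parent`` holds 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6 and ``example_non_cpu_parent`` holds -1, 0, 0, 1, 1, 2, 2, matching the diagrams.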
+Each core can find its node in the ``psci_cpu_pd_nodes`` array using the
+``plat_my_core_pos()`` function. When a core is turned on, the normal world
+provides an MPIDR. The ``plat_core_pos_by_mpidr()`` function is used to validate
+the MPIDR before using it to find the corresponding core node. The non-core power
+domain nodes do not need to be identified.
+*Copyright (c) 2017-2018, Arm Limited and Contributors. All rights reserved.*
diff --git a/docs/design/reset-design.rst b/docs/design/reset-design.rst
new file mode 100644
index 0000000..1473851
--- /dev/null
+++ b/docs/design/reset-design.rst
@@ -0,0 +1,165 @@
+Trusted Firmware-A reset design
+.. contents::
+This document describes the high-level design of the framework to handle CPU
+resets in Trusted Firmware-A (TF-A). It also describes how the platform
+integrator can tailor this code to the system configuration to some extent,
+resulting in a simplified and more optimised boot flow.
+This document should be used in conjunction with the `Firmware Design`_, which
+provides greater implementation details around the reset code, specifically
+for the cold boot path.
+General reset code flow
+The TF-A reset code is implemented in BL1 by default. The following high-level
+diagram illustrates this:
+|Default reset code flow|
+This diagram shows the default, unoptimised reset flow. Depending on the system
+configuration, some of these steps might be unnecessary. The following sections
+guide the platform integrator by indicating which build options exclude which
+steps, depending on the capability of the platform.
+Note: If BL31 is used as the TF-A entry point instead of BL1, the diagram
+above is still relevant, as all these operations will occur in BL31 in
+this case. Please refer to section 6 "Using BL31 entrypoint as the reset
+address" for more information.
+Programmable CPU reset address
+By default, TF-A assumes that the CPU reset address is not programmable.
+Therefore, all CPUs start at the same address (typically address 0) whenever
+they reset. Further logic is then required to identify whether it is a cold or
+warm boot to direct CPUs to the right execution path.
+If the reset vector address (reflected in the reset vector base address register
+``RVBAR_EL3``) is programmable then it is possible to make each CPU start directly
+at the right address, both on a cold and warm reset. Therefore, the boot type
+detection can be skipped, resulting in the following boot flow:
+|Reset code flow with programmable reset address|
+To enable this boot flow, compile TF-A with ``PROGRAMMABLE_RESET_ADDRESS=1``.
+This option only affects the TF-A reset image, which is BL1 by default or BL31 if
+``RESET_TO_BL31=1``.
+On both the FVP and Juno platforms, the reset vector address is not programmable
+so both ports use ``PROGRAMMABLE_RESET_ADDRESS=0``.
+Cold boot on a single CPU
+By default, TF-A assumes that several CPUs may be released out of reset.
+Therefore, the cold boot code has to arbitrate access to hardware resources
+shared amongst CPUs. This is done by nominating one of the CPUs as the primary,
+which is responsible for initialising shared hardware and coordinating the boot
+flow with the other CPUs.
+If the platform guarantees that only a single CPU will ever be brought up then
+no arbitration is required. The notion of primary/secondary CPU itself no longer
+applies. This results in the following boot flow:
+|Reset code flow with single CPU released out of reset|
+To enable this boot flow, compile TF-A with ``COLD_BOOT_SINGLE_CPU=1``. This
+option only affects the TF-A reset image, which is BL1 by default or BL31 if
+``RESET_TO_BL31=1``.
+On both the FVP and Juno platforms, although only one core is powered up by
+default, there are platform-specific ways to release any number of cores out of
+reset. Therefore, both platform ports use ``COLD_BOOT_SINGLE_CPU=0``.
+Programmable CPU reset address, Cold boot on a single CPU
+It is obviously possible to combine both optimisations on platforms that have
+a programmable CPU reset address and which release a single CPU out of reset.
+This results in the following boot flow:
+|Reset code flow with programmable reset address and single CPU released out of reset|
+To enable this boot flow, compile TF-A with both ``COLD_BOOT_SINGLE_CPU=1``
+and ``PROGRAMMABLE_RESET_ADDRESS=1``. These options only affect the TF-A reset
+image, which is BL1 by default or BL31 if ``RESET_TO_BL31=1``.
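The effect of the two build options on the reset flow can be summarised with a small model. The real reset code is not written like this; the sketch only encodes which detection steps remain for each combination of options, as described in the sections above. The ``example_reset_steps()`` function and the ``EXAMPLE_STEP_*`` names are hypothetical.

```c
#include <stdbool.h>

/* Detection steps that the default reset flow performs. */
enum {
    EXAMPLE_STEP_BOOT_TYPE_DETECTION = 1U << 0, /* cold vs warm boot */
    EXAMPLE_STEP_CPU_ROLE_DETECTION  = 1U << 1  /* primary vs secondary */
};

/*
 * Returns a bitmask of the detection steps still required, given the
 * values of PROGRAMMABLE_RESET_ADDRESS and COLD_BOOT_SINGLE_CPU.
 */
unsigned int example_reset_steps(bool programmable_reset_address,
                                 bool cold_boot_single_cpu)
{
    unsigned int steps = 0;

    /* A fixed reset address means cold and warm boots share an entry. */
    if (!programmable_reset_address)
        steps |= EXAMPLE_STEP_BOOT_TYPE_DETECTION;

    /* Several CPUs out of reset means one must be nominated primary. */
    if (!cold_boot_single_cpu)
        steps |= EXAMPLE_STEP_CPU_ROLE_DETECTION;

    return steps;
}
```

With both options set to 1 the function returns 0, which corresponds to the flow with neither check.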
+Using BL31 entrypoint as the reset address
+On some platforms the runtime firmware (BL3x images) for the application
+processors is loaded by firmware running on a secure system processor
+on the SoC, rather than by BL1 and BL2 running on the primary application
+processor. For this type of SoC it is desirable for the application processor
+to always reset to BL31 which eliminates the need for BL1 and BL2.
+TF-A provides a build-time option ``RESET_TO_BL31`` that includes some additional
+logic in the BL31 entry point to support this use case.
+In this configuration, the platform's Trusted Boot Firmware must ensure that
+BL31 is loaded to its runtime address, which must match the CPU's ``RVBAR_EL3``
+reset vector base address, before the application processor is powered on.
+Additionally, platform software is responsible for loading the other BL3x images
+required and providing entry point information for them to BL31. Loading these
+images might be done by the Trusted Boot Firmware or by platform code in BL31.
+Although the Arm FVP platform does not support programming the reset base
+address dynamically at run-time, it is possible to set the initial value of the
+``RVBAR_EL3`` register at start-up. This feature is provided on the Base FVP only.
+It allows the Arm FVP port to support the ``RESET_TO_BL31`` configuration, in
+which case the ``bl31.bin`` image must be loaded to its run address in Trusted
+SRAM and all CPU reset vectors be changed from the default ``0x0`` to this run
+address. See the `User Guide`_ for details of running the FVP models in this way.
+Although technically it would be possible to program the reset base address with
+the right support in the SCP firmware, this is currently not implemented so the
+Juno port doesn't support the ``RESET_TO_BL31`` configuration.
+The ``RESET_TO_BL31`` configuration requires some additions and changes in the
+BL31 functionality:
+Determination of boot path
+In this configuration, BL31 uses the same reset framework and code as the one
+described for BL1 above. Therefore, it is affected by the
+``PROGRAMMABLE_RESET_ADDRESS`` and ``COLD_BOOT_SINGLE_CPU`` build options in the
+same way.
+In the default, unoptimised BL31 reset flow, on a warm boot a CPU is directed
+to the PSCI implementation via a platform defined mechanism. On a cold boot,
+the platform must place any secondary CPUs into a safe state while the primary
+CPU executes a modified BL31 initialization, as described below.
+Platform initialization
+In this configuration, when the CPU resets to BL31 there are no parameters that
+can be passed in registers by previous boot stages. Instead, the platform code
+in BL31 needs to know, or be able to determine, the location of the BL32 (if
+required) and BL33 images and provide this information in response to the
+``bl31_plat_get_next_image_ep_info()`` function.
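The idea can be sketched as follows, using simplified stand-in types and not the real TF-A definitions. The sketch assumes that the image locations are known at build time, that BL32 is absent, and that BL33 sits at a made-up address. ``example_get_next_image_ep_info()``, ``example_ep_info_t`` and the ``EXAMPLE_*`` constants are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the security state of the next image. */
typedef enum {
    EXAMPLE_SECURE,
    EXAMPLE_NON_SECURE
} example_security_state_t;

/* Stand-in for entry point information: only the entry address. */
typedef struct {
    uintptr_t pc;
} example_ep_info_t;

/* Hypothetical load addresses. Zero means the image is not present. */
#define EXAMPLE_BL32_BASE ((uintptr_t)0)
#define EXAMPLE_BL33_BASE ((uintptr_t)0x88000000UL)

static example_ep_info_t example_bl32_ep = { EXAMPLE_BL32_BASE };
static example_ep_info_t example_bl33_ep = { EXAMPLE_BL33_BASE };

/*
 * With no previous boot stage to pass parameters, the platform code
 * answers from its own knowledge of where the images were loaded.
 * Returns NULL if there is no image for the requested state.
 */
example_ep_info_t *example_get_next_image_ep_info(example_security_state_t type)
{
    example_ep_info_t *ep = (type == EXAMPLE_NON_SECURE) ? &example_bl33_ep
                                                         : &example_bl32_ep;

    return (ep->pc != 0) ? ep : NULL;
}
```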
+Additionally, platform software is responsible for carrying out any security
+initialisation, for example programming a TrustZone address space controller.
+This might be done by the Trusted Boot Firmware or by platform code in BL31.
+*Copyright (c) 2015-2018, Arm Limited and Contributors. All rights reserved.*
+.. _Firmware Design: firmware-design.rst
+.. _User Guide: ../getting_started/user-guide.rst
+.. |Default reset code flow| image:: ../diagrams/default_reset_code.png?raw=true
+.. |Reset code flow with programmable reset address| image:: ../diagrams/reset_code_no_boot_type_check.png?raw=true
+.. |Reset code flow with single CPU released out of reset| image:: ../diagrams/reset_code_no_cpu_check.png?raw=true
+.. |Reset code flow with programmable reset address and single CPU released out of reset| image:: ../diagrams/reset_code_no_checks.png?raw=true
diff --git a/docs/design/trusted-board-boot.rst b/docs/design/trusted-board-boot.rst
new file mode 100644
index 0000000..ae21bf0
--- /dev/null
+++ b/docs/design/trusted-board-boot.rst
@@ -0,0 +1,239 @@
+Trusted Board Boot Design Guide
+.. contents::
+The Trusted Board Boot (TBB) feature prevents malicious firmware from running on
+the platform by authenticating all firmware images up to and including the
+normal world bootloader. It does this by establishing a Chain of Trust using
+Public-Key-Cryptography Standards (PKCS).
+This document describes the design of Trusted Firmware-A (TF-A) TBB, which is an
+implementation of the `Trusted Board Boot Requirements (TBBR)`_ specification,
+Arm DEN0006D. It should be used in conjunction with the `Firmware Update`_
+design document, which implements a specific aspect of the TBBR.
+Chain of Trust
+A Chain of Trust (CoT) starts with a set of implicitly trusted components. On
+the Arm development platforms, these components are:
+-  A SHA-256 hash of the Root of Trust Public Key (ROTPK). It is stored in the
+   trusted root-key storage registers.
+-  The BL1 image, on the assumption that it resides in ROM so cannot be
+   tampered with.
+The remaining components in the CoT are either certificates or boot loader
+images. The certificates follow the `X.509 v3`_ standard. This standard
+enables adding custom extensions to the certificates, which are used to store
+essential information to establish the CoT.
+In the TBB CoT all certificates are self-signed. There is no need for a
+Certificate Authority (CA) because the CoT is not established by verifying the
+validity of a certificate's issuer but by the content of the certificate
+extensions. To sign the certificates, the PKCS#1 SHA-256 with RSA Encryption
+signature scheme is used with an RSA key length of 2048 bits. Future versions of
+TF-A will support additional cryptographic algorithms.
+The certificates are categorised as "Key" and "Content" certificates. Key
+certificates are used to verify public keys which have been used to sign content
+certificates. Content certificates are used to store the hash of a boot loader
+image. An image can be authenticated by calculating its hash and matching it
+with the hash extracted from the content certificate. The SHA-256 function is
+used to calculate all hashes. The public keys and hashes are included as
+non-standard extension fields in the `X.509 v3`_ certificates.
+The keys used to establish the CoT are:
+-  **Root of trust key**
+   The private part of this key is used to sign the BL2 content certificate and
+   the trusted key certificate. The public part is the ROTPK.
+-  **Trusted world key**
+   The private part is used to sign the key certificates corresponding to the
+   secure world images (SCP_BL2, BL31 and BL32). The public part is stored in
+   one of the extension fields in the trusted key certificate.
+-  **Non-trusted world key**
+   The private part is used to sign the key certificate corresponding to the
+   non secure world image (BL33). The public part is stored in one of the
+   extension fields in the trusted key certificate.
+-  **BL3-X keys**
+   For each of SCP_BL2, BL31, BL32 and BL33, the private part is used to
+   sign the content certificate for the BL3-X image. The public part is stored
+   in one of the extension fields in the corresponding key certificate.
+The following images are included in the CoT:
+-  BL1
+-  BL2
+-  SCP_BL2 (optional)
+-  BL31
+-  BL33
+-  BL32 (optional)
+The following certificates are used to authenticate the images.
+-  **BL2 content certificate**
+   It is self-signed with the private part of the ROT key. It contains a hash
+   of the BL2 image.
+-  **Trusted key certificate**
+   It is self-signed with the private part of the ROT key. It contains the
+   public part of the trusted world key and the public part of the non-trusted
+   world key.
+-  **SCP_BL2 key certificate**
+   It is self-signed with the trusted world key. It contains the public part of
+   the SCP_BL2 key.
+-  **SCP_BL2 content certificate**
+   It is self-signed with the SCP_BL2 key. It contains a hash of the SCP_BL2
+   image.
+-  **BL31 key certificate**
+   It is self-signed with the trusted world key. It contains the public part of
+   the BL31 key.
+-  **BL31 content certificate**
+   It is self-signed with the BL31 key. It contains a hash of the BL31 image.
+-  **BL32 key certificate**
+   It is self-signed with the trusted world key. It contains the public part of
+   the BL32 key.
+-  **BL32 content certificate**
+   It is self-signed with the BL32 key. It contains a hash of the BL32 image.
+-  **BL33 key certificate**
+   It is self-signed with the non-trusted world key. It contains the public
+   part of the BL33 key.
+-  **BL33 content certificate**
+   It is self-signed with the BL33 key. It contains a hash of the BL33 image.
+The SCP_BL2 and BL32 certificates are optional, but they must be present if the
+corresponding SCP_BL2 or BL32 images are present.
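The relationships between the keys and certificates listed above can be written down as a table and walked from an image back to the root of trust key. This is a sketch for illustration only, not the way TF-A describes a CoT in code. The types, the table and ``example_certs_to_rot()`` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    KEY_NONE, KEY_ROT, KEY_TRUSTED_WORLD, KEY_NON_TRUSTED_WORLD,
    KEY_SCP_BL2, KEY_BL31, KEY_BL32, KEY_BL33
} example_key_t;

typedef struct {
    const char *name;
    example_key_t signed_by;    /* key whose private part signs it */
    example_key_t carries[2];   /* public keys held in extensions */
    const char *image;          /* image whose hash it holds, or NULL */
} example_cert_t;

/* One row per certificate in the list above. */
static const example_cert_t example_cot[] = {
    { "BL2 content", KEY_ROT, { KEY_NONE, KEY_NONE }, "BL2" },
    { "Trusted key", KEY_ROT,
      { KEY_TRUSTED_WORLD, KEY_NON_TRUSTED_WORLD }, NULL },
    { "SCP_BL2 key", KEY_TRUSTED_WORLD, { KEY_SCP_BL2, KEY_NONE }, NULL },
    { "SCP_BL2 content", KEY_SCP_BL2, { KEY_NONE, KEY_NONE }, "SCP_BL2" },
    { "BL31 key", KEY_TRUSTED_WORLD, { KEY_BL31, KEY_NONE }, NULL },
    { "BL31 content", KEY_BL31, { KEY_NONE, KEY_NONE }, "BL31" },
    { "BL32 key", KEY_TRUSTED_WORLD, { KEY_BL32, KEY_NONE }, NULL },
    { "BL32 content", KEY_BL32, { KEY_NONE, KEY_NONE }, "BL32" },
    { "BL33 key", KEY_NON_TRUSTED_WORLD, { KEY_BL33, KEY_NONE }, NULL },
    { "BL33 content", KEY_BL33, { KEY_NONE, KEY_NONE }, "BL33" },
};

#define EXAMPLE_COT_LEN (sizeof(example_cot) / sizeof(example_cot[0]))

/* Finds the certificate whose extensions carry the given public key. */
static const example_cert_t *example_find_carrier(example_key_t key)
{
    for (size_t i = 0; i < EXAMPLE_COT_LEN; i++) {
        if (example_cot[i].carries[0] == key ||
            example_cot[i].carries[1] == key)
            return &example_cot[i];
    }
    return NULL;
}

/*
 * Counts the certificates between an image and the root of trust key.
 * Returns -1 if the image has no content certificate.
 */
int example_certs_to_rot(const char *image)
{
    const example_cert_t *cert = NULL;
    int count;

    for (size_t i = 0; i < EXAMPLE_COT_LEN; i++) {
        if (example_cot[i].image != NULL &&
            strcmp(example_cot[i].image, image) == 0) {
            cert = &example_cot[i];
            break;
        }
    }
    if (cert == NULL)
        return -1;

    /* Follow the signing key upwards until the ROT key is reached. */
    for (count = 1; cert->signed_by != KEY_ROT; count++) {
        cert = example_find_carrier(cert->signed_by);
        if (cert == NULL)
            return -1;
    }
    return count;
}
```

In this model BL2 is one certificate away from the ROT key, while BL31 and BL33 are three away (content certificate, key certificate, trusted key certificate). BL1 has no certificate because it is implicitly trusted.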
+Trusted Board Boot Sequence
+The CoT is verified through the following sequence of steps. The system panics
+if any of the steps fail.
+-  BL1 loads and verifies the BL2 content certificate. The issuer public key is
+   read from the verified certificate. A hash of that key is calculated and
+   compared with the hash of the ROTPK read from the trusted root-key storage
+   registers. If they match, the BL2 hash is read from the certificate.
+   Note: the matching operation is platform specific and is currently
+   unimplemented on the Arm development platforms.
+-  BL1 loads the BL2 image. Its hash is calculated and compared with the hash
+   read from the certificate. Control is transferred to the BL2 image if all
+   the comparisons succeed.
+-  BL2 loads and verifies the trusted key certificate. The issuer public key is
+   read from the verified certificate. A hash of that key is calculated and
+   compared with the hash of the ROTPK read from the trusted root-key storage
+   registers. If the comparison succeeds, BL2 reads and saves the trusted and
+   non-trusted world public keys from the verified certificate.
+The next two steps are executed for each of the SCP_BL2, BL31 & BL32 images.
+The steps for the optional SCP_BL2 and BL32 images are skipped if these images
+are not present.
+-  BL2 loads and verifies the BL3x key certificate. The certificate signature
+   is verified using the trusted world public key. If the signature
+   verification succeeds, BL2 reads and saves the BL3x public key from the
+   certificate.
+-  BL2 loads and verifies the BL3x content certificate. The signature is
+   verified using the BL3x public key. If the signature verification succeeds,
+   BL2 reads and saves the BL3x image hash from the certificate.
+The next two steps are executed only for the BL33 image.
+-  BL2 loads and verifies the BL33 key certificate. If the signature
+   verification succeeds, BL2 reads and saves the BL33 public key from the
+   certificate.
+-  BL2 loads and verifies the BL33 content certificate. If the signature
+   verification succeeds, BL2 reads and saves the BL33 image hash from the
+   certificate.
+The next step is executed for all the boot loader images.
+-  BL2 calculates the hash of each image. It compares it with the hash obtained
+   from the corresponding content certificate. The image authentication succeeds
+   if the hashes match.
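The final comparison step can be sketched as below. The sketch assumes that the SHA-256 digest of the image has already been calculated and that the expected digest has already been extracted from the verified content certificate; neither operation is shown. The function name and constant are hypothetical. The comparison avoids an early exit as a common precaution, which is a choice made for this example and not a statement about how TF-A implements it.

```c
#include <stddef.h>
#include <stdint.h>

/* A SHA-256 digest is 32 bytes long. */
#define EXAMPLE_SHA256_LEN 32U

/*
 * Compares the digest calculated over an image with the digest taken
 * from its content certificate. Returns 1 if they match, 0 otherwise.
 * Every byte is always examined, regardless of where a mismatch occurs.
 */
int example_digests_match(const uint8_t *computed, const uint8_t *expected)
{
    uint8_t diff = 0;

    for (size_t i = 0; i < EXAMPLE_SHA256_LEN; i++)
        diff |= (uint8_t)(computed[i] ^ expected[i]);

    return diff == 0;
}
```

A caller modelling the sequence above would treat a return value of 0 as an authentication failure.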
+The Trusted Board Boot implementation spans both generic and platform-specific
+BL1 and BL2 code, as well as tool code on the host build machine. The feature is
+enabled through use of specific build flags as described in the `User Guide`_.
+On the host machine, a tool generates the certificates, which are included in
+the FIP along with the boot loader images. These certificates are loaded in
+Trusted SRAM using the IO storage framework. They are then verified by an
+Authentication module included in TF-A.
+The mechanism used for generating the FIP and the Authentication module are
+described in the following sections.
+Authentication Framework
+The authentication framework included in TF-A provides support to implement
+the desired trusted boot sequence. Arm platforms use this framework to
+implement the boot requirements specified in the `TBBR-client`_ document.
+More information about the authentication framework can be found in the
+`Auth Framework`_ document.
+Certificate Generation Tool
+The ``cert_create`` tool is built and runs on the host machine as part of the
+TF-A build process when ``GENERATE_COT=1``. It takes the boot loader images
+and keys as inputs (keys must be in PEM format) and generates the
+certificates (in DER format) required to establish the CoT. New keys can be
+generated by the tool in case they are not provided. The certificates are then
+passed as inputs to the ``fiptool`` utility for creating the FIP.
+The certificates are also stored individually in the output build directory.
+The tool resides in the ``tools/cert_create`` directory. It uses the OpenSSL
+library version 1.0.1 or later to generate the X.509 certificates. Instructions
+for building and using the tool can be found in the `User Guide`_.
+*Copyright (c) 2015-2019, Arm Limited and Contributors. All rights reserved.*
+.. _Firmware Update: firmware-update.rst
+.. _X.509 v3:
+.. _User Guide: ../getting_started/user-guide.rst
+.. _Auth Framework: auth-framework.rst
+.. _TBBR-client:
+.. _Trusted Board Boot Requirements (TBBR): `TBBR-client`_