Call reset handlers upon BL3-1 entry.

This patch adds support to call the reset_handler() function in BL3-1 in the
cold and warm boot paths when another Boot ROM reset_handler() has already run.

This means the BL1 and BL3-1 versions of the CPU and platform specific reset
handlers may execute different code from each other. This enables a developer
to perform additional actions, or undo actions already performed during the
first call of the reset handlers, e.g. apply additional errata workarounds.

Typically, the reset handler will be first called from the BL1 Boot ROM. Any
additional functionality can be added to the reset handler when it is called
from BL3-1 resident in RW memory. The constant FIRST_RESET_HANDLER_CALL is used
to identify whether this is the first version of the reset handler code to be
executed or an overridden version of the code.
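
As a rough illustration of that compile-time selection, here is a minimal C
sketch. The condition mirrors the one this patch adds to bl_common.h; the
default flag values and the is_first_reset_handler() helper are assumptions
made up for the example and are not part of the patch.

```c
/*
 * Hypothetical defaults standing in for the values normally passed by
 * the build system. Here: a BL3-1 image that does not own the reset
 * vector.
 */
#ifndef IMAGE_BL1
#define IMAGE_BL1	0
#endif
#ifndef IMAGE_BL31
#define IMAGE_BL31	1
#endif
#ifndef RESET_TO_BL31
#define RESET_TO_BL31	0
#endif

/* Same condition as the one added to bl_common.h by this patch. */
#if IMAGE_BL1 || (IMAGE_BL31 && RESET_TO_BL31)
#define FIRST_RESET_HANDLER_CALL
#endif

/* Hypothetical helper: reports which variant was compiled in. */
static int is_first_reset_handler(void)
{
#ifdef FIRST_RESET_HANDLER_CALL
	return 1;	/* BL1, or BL3-1 built with RESET_TO_BL31 */
#else
	return 0;	/* BL3-1 running after a Boot ROM reset handler */
#endif
}
```

Because the macro is defined without a value, it has to be tested with #ifdef
(or defined()) rather than a plain #if.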

The Cortex-A57 errata workarounds are applied only if they have not already been
applied.
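
The check-before-write pattern used for those workarounds can be sketched in C
as follows. This is only a model under stated assumptions: the register is
simulated with a plain variable, and WA_BIT, fake_reg, reg_writes and
apply_workaround_once() are hypothetical names that do not appear in the patch.

```c
#include <stdint.h>

/* Hypothetical bit position, chosen for illustration only. */
#define WA_BIT	(UINT64_C(1) << 4)

/* Stand-in for a system register such as CPUACTLR_EL1. */
static uint64_t fake_reg;

/* Counts simulated register writes (the msr in the real code). */
static unsigned int reg_writes;

static void apply_workaround_once(void)
{
	uint64_t val = fake_reg;	/* models: mrs x1, <reg> */

	if (val & WA_BIT)		/* models: tst + b.ne skip */
		return;

	fake_reg = val | WA_BIT;	/* models: orr + msr */
	reg_writes++;
}
```

Calling apply_workaround_once() a second time leaves the register untouched,
which is the behaviour wanted when BL3-1 runs after a ROM that has already
applied the same workaround.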

Fixes ARM-software/tf-issue#275

Change-Id: Id295f106e4fda23d6736debdade2ac7f2a9a9053
diff --git a/bl31/aarch64/bl31_entrypoint.S b/bl31/aarch64/bl31_entrypoint.S
index b786b29..01d7a7f 100644
--- a/bl31/aarch64/bl31_entrypoint.S
+++ b/bl31/aarch64/bl31_entrypoint.S
@@ -61,15 +61,21 @@
 	bic	x0, x0, #SCTLR_EE_BIT
 	msr	sctlr_el3, x0
 	isb
+#endif
 
-	/* -----------------------------------------------------
-	 * Perform any processor specific actions upon reset
-	 * e.g. cache, tlb invalidations etc. Override the
-	 * Boot ROM(BL0) programming sequence
-	 * -----------------------------------------------------
+	/* ---------------------------------------------
+	 * When RESET_TO_BL31 is true, perform any
+	 * processor specific actions upon reset e.g.
+	 * cache, tlb invalidations, errata workarounds
+	 * etc.
+	 * When RESET_TO_BL31 is false, perform any
+	 * processor specific actions which undo or are
+	 * in addition to the actions performed by the
+	 * reset handler in the Boot ROM (BL1).
+	 * ---------------------------------------------
 	 */
 	bl	reset_handler
-#endif
+
 	/* ---------------------------------------------
 	 * Enable the instruction cache, stack pointer
 	 * and data access alignment checks
diff --git a/docs/firmware-design.md b/docs/firmware-design.md
index ee76d5c..96e4b4c 100644
--- a/docs/firmware-design.md
+++ b/docs/firmware-design.md
@@ -9,12 +9,13 @@
 4.  [Power State Coordination Interface](#4--power-state-coordination-interface)
 5.  [Secure-EL1 Payloads and Dispatchers](#5--secure-el1-payloads-and-dispatchers)
 6.  [Crash Reporting in BL3-1](#6--crash-reporting-in-bl3-1)
-7.  [CPU specific operations framework](#7--cpu-specific-operations-framework)
-8.  [Memory layout of BL images](#8-memory-layout-of-bl-images)
-9.  [Firmware Image Package (FIP)](#9--firmware-image-package-fip)
-10. [Use of coherent memory in Trusted Firmware](#10--use-of-coherent-memory-in-trusted-firmware)
-11. [Code Structure](#11--code-structure)
-12. [References](#12--references)
+7.  [Guidelines for Reset Handlers](#7--guidelines-for-reset-handlers)
+8.  [CPU specific operations framework](#8--cpu-specific-operations-framework)
+9.  [Memory layout of BL images](#9-memory-layout-of-bl-images)
+10. [Firmware Image Package (FIP)](#10--firmware-image-package-fip)
+11. [Use of coherent memory in Trusted Firmware](#11--use-of-coherent-memory-in-trusted-firmware)
+12. [Code Structure](#12--code-structure)
+13. [References](#13--references)
 
 
 1.  Introduction
@@ -960,8 +961,48 @@
     fpexc32_el2	:0x0000000004000700
     sp_el0	:0x0000000004010780
 
+7.  Guidelines for Reset Handlers
+---------------------------------
+
+Trusted Firmware implements a framework that allows CPU and platform ports to
+perform actions immediately after a CPU is released from reset in both the cold
+and warm boot paths. This is done by calling the `reset_handler()` function in
+both the BL1 and BL3-1 images. It in turn calls the platform and CPU specific
+reset handling functions.
+
+Details for implementing a CPU specific reset handler can be found in
+Section 8. Details for implementing a platform specific reset handler can be
+found in the [Porting Guide] (see the `plat_reset_handler()` function).
+
+When adding functionality to a reset handler, the following points should be
+kept in mind.
+
+1.   The first reset handler in the system exists either in a ROM image
+     (e.g. BL1), or BL3-1 if `RESET_TO_BL31` is true. This may be detected at
+     compile time using the constant `FIRST_RESET_HANDLER_CALL`.
+
+2.   When considering ROM images, it's important to consider non TF-based ROMs
+     and ROMs based on previous versions of the TF code.
 
-7.  CPU specific operations framework
+3.   If the functionality should be applied to a ROM and there is no possibility
+     of a ROM being used that does not apply the functionality (or equivalent),
+     then the functionality should be applied within a
+     `#ifdef FIRST_RESET_HANDLER_CALL` block.
+
+4.   If the functionality should execute in BL3-1 in order to override or
+     supplement a ROM version of the functionality, then the functionality
+     should be applied in the `#else` part of a
+     `#ifdef FIRST_RESET_HANDLER_CALL` block.
+
+5.   If the functionality should be applied to a ROM but there is a possibility
+     of ROMs being used that do not apply the functionality, then the
+     functionality should be applied outside of a `FIRST_RESET_HANDLER_CALL`
+     block, so that BL3-1 has an opportunity to apply the functionality instead.
+     In this case, additional code may be needed to cope with different ROMs
+     that do or do not apply the functionality.
+
+
+8.  CPU specific operations framework
 -----------------------------
 
 Certain aspects of the ARMv8 architecture are implementation defined,
@@ -1026,6 +1067,9 @@
 the returned `cpu_ops` is then invoked which executes the required reset
 handling for that CPU and also any errata workarounds enabled by the platform.
 
+Refer to Section "Guidelines for Reset Handlers" for general guidelines
+regarding placement of code in a reset handler.
+
 ### CPU specific power down sequence
 
 During the BL3-1 initialization sequence, the pointer to the matching `cpu_ops`
@@ -1056,7 +1100,7 @@
 expected by the crash reporting framework.
 
 
-8. Memory layout of BL images
+9. Memory layout of BL images
 -----------------------------
 
 Each bootloader image can be divided in 2 parts:
@@ -1378,7 +1422,7 @@
 images in Trusted SRAM.
 
 
-9.  Firmware Image Package (FIP)
+10.  Firmware Image Package (FIP)
 ---------------------------------
 
 Using a Firmware Image Package (FIP) allows for packing bootloader images (and
@@ -1456,7 +1500,7 @@
 platform policy can be modified to allow additional images.
 
 
-10. Use of coherent memory in Trusted Firmware
+11. Use of coherent memory in Trusted Firmware
 ----------------------------------------------
 
 There might be loss of coherency when physical memory with mismatched
@@ -1657,7 +1701,7 @@
 the [Porting Guide]). Refer to the reference platform code for examples.
 
 
-11.  Code Structure
+12.  Code Structure
 -------------------
 
 Trusted Firmware code is logically divided between the three boot loader
@@ -1702,7 +1746,7 @@
 kernel at boot time. These can be found in the `fdts` directory.
 
 
-12.  References
+13.  References
 ---------------
 
 1.  Trusted Board Boot Requirements CLIENT PDD (ARM DEN 0006B-5). Available
diff --git a/docs/porting-guide.md b/docs/porting-guide.md
index 03b5888..747cb00 100644
--- a/docs/porting-guide.md
+++ b/docs/porting-guide.md
@@ -483,7 +483,9 @@
 preserve the value in x10 register as it is used by the caller to store the
 return address.
 
-The default implementation doesn't do anything.
+The default implementation doesn't do anything. If a platform needs to override
+the default implementation, refer to the [Firmware Design Guide] for general
+guidelines regarding placement of code in a reset handler.
 
 ### Function : plat_disable_acp()
 
@@ -1476,6 +1478,7 @@
 [IMF Design Guide]:                   interrupt-framework-design.md
 [User Guide]:                         user-guide.md
 [FreeBSD]:                            http://www.freebsd.org
+[Firmware Design Guide]:              firmware-design.md
 
 [plat/common/aarch64/platform_mp_stack.S]: ../plat/common/aarch64/platform_mp_stack.S
 [plat/common/aarch64/platform_up_stack.S]: ../plat/common/aarch64/platform_up_stack.S
diff --git a/include/common/bl_common.h b/include/common/bl_common.h
index 9945e3a..0959c89 100644
--- a/include/common/bl_common.h
+++ b/include/common/bl_common.h
@@ -90,6 +90,18 @@
 	(_p)->h.attr = (uint32_t)(_attr) ; \
 	} while (0)
 
+/*******************************************************************************
+ * Constant that indicates if this is the first version of the reset handler
+ * contained in an image. This will be the case when the image is BL1 or when
+ * it is BL3-1 and RESET_TO_BL31 is true. This constant enables a subsequent
+ * version of the reset handler to perform actions that override the ones
+ * performed in the first version of the code. This will be required when the
+ * first version exists in an un-modifiable image e.g. a BootROM image.
+ ******************************************************************************/
+#if IMAGE_BL1 || (IMAGE_BL31 && RESET_TO_BL31)
+#define FIRST_RESET_HANDLER_CALL
+#endif
+
 #ifndef __ASSEMBLY__
 #include <cdefs.h> /* For __dead2 */
 #include <cassert.h>
diff --git a/include/lib/cpus/aarch64/cpu_macros.S b/include/lib/cpus/aarch64/cpu_macros.S
index 65fb82d..089f09c 100644
--- a/include/lib/cpus/aarch64/cpu_macros.S
+++ b/include/lib/cpus/aarch64/cpu_macros.S
@@ -40,7 +40,7 @@
 CPU_MIDR: /* cpu_ops midr */
 	.space  8
 /* Reset fn is needed in BL at reset vector */
-#if IMAGE_BL1 || (IMAGE_BL31 && RESET_TO_BL31)
+#if IMAGE_BL1 || IMAGE_BL31
 CPU_RESET_FUNC: /* cpu_ops reset_func */
 	.space  8
 #endif
@@ -65,7 +65,7 @@
 	.section cpu_ops, "a"; .align 3
 	.type cpu_ops_\_name, %object
 	.quad \_midr
-#if IMAGE_BL1 || (IMAGE_BL31 && RESET_TO_BL31)
+#if IMAGE_BL1 || IMAGE_BL31
 	.if \_noresetfunc
 	.quad 0
 	.else
diff --git a/lib/cpus/aarch64/cortex_a53.S b/lib/cpus/aarch64/cortex_a53.S
index ec18464..306b42e 100644
--- a/lib/cpus/aarch64/cortex_a53.S
+++ b/lib/cpus/aarch64/cortex_a53.S
@@ -29,6 +29,7 @@
  */
 #include <arch.h>
 #include <asm_macros.S>
+#include <bl_common.h>
 #include <cortex_a53.h>
 #include <cpu_macros.S>
 #include <plat_macros.S>
@@ -58,13 +59,17 @@
 
 func cortex_a53_reset_func
 	/* ---------------------------------------------
-	 * As a bare minimum enable the SMP bit.
+	 * As a bare minimum enable the SMP bit if it is
+	 * not already set.
 	 * ---------------------------------------------
 	 */
 	mrs	x0, CPUECTLR_EL1
+	tst	x0, #CPUECTLR_SMP_BIT
+	b.ne	skip_smp_setup
 	orr	x0, x0, #CPUECTLR_SMP_BIT
 	msr	CPUECTLR_EL1, x0
 	isb
+skip_smp_setup:
 	ret
 
 func cortex_a53_core_pwr_dwn
diff --git a/lib/cpus/aarch64/cortex_a57.S b/lib/cpus/aarch64/cortex_a57.S
index dab16d7..3334e68 100644
--- a/lib/cpus/aarch64/cortex_a57.S
+++ b/lib/cpus/aarch64/cortex_a57.S
@@ -30,6 +30,7 @@
 #include <arch.h>
 #include <asm_macros.S>
 #include <assert_macros.S>
+#include <bl_common.h>
 #include <cortex_a57.h>
 #include <cpu_macros.S>
 #include <plat_macros.S>
@@ -99,9 +100,17 @@
 	ret
 #endif
 apply_806969:
+	/*
+	 * Test if errata has already been applied in an earlier
+	 * invocation of the reset handler and does not need to
+	 * be applied again.
+	 */
 	mrs	x1, CPUACTLR_EL1
+	tst	x1, #CPUACTLR_NO_ALLOC_WBWA
+	b.ne	skip_806969
 	orr	x1, x1, #CPUACTLR_NO_ALLOC_WBWA
 	msr	CPUACTLR_EL1, x1
+skip_806969:
 	ret
 
 
@@ -123,9 +132,17 @@
 	ret
 #endif
 apply_813420:
+	/*
+	 * Test if errata has already been applied in an earlier
+	 * invocation of the reset handler and does not need to
+	 * be applied again.
+	 */
 	mrs	x1, CPUACTLR_EL1
+	tst	x1, #CPUACTLR_DCC_AS_DCCI
+	b.ne	skip_813420
 	orr	x1, x1, #CPUACTLR_DCC_AS_DCCI
 	msr	CPUACTLR_EL1, x1
+skip_813420:
 	ret
 
 	/* -------------------------------------------------
@@ -154,13 +171,18 @@
 	mov	x0, x20
 	bl	errata_a57_813420_wa
 #endif
+
 	/* ---------------------------------------------
-	 * As a bare minimum enable the SMP bit.
+	 * As a bare minimum enable the SMP bit if it is
+	 * not already set.
 	 * ---------------------------------------------
 	 */
 	mrs	x0, CPUECTLR_EL1
+	tst	x0, #CPUECTLR_SMP_BIT
+	b.ne	skip_smp_setup
 	orr	x0, x0, #CPUECTLR_SMP_BIT
 	msr	CPUECTLR_EL1, x0
+skip_smp_setup:
 	isb
 	ret	x19
 
diff --git a/lib/cpus/aarch64/cpu_helpers.S b/lib/cpus/aarch64/cpu_helpers.S
index 5680bce..d829f60 100644
--- a/lib/cpus/aarch64/cpu_helpers.S
+++ b/lib/cpus/aarch64/cpu_helpers.S
@@ -37,7 +37,7 @@
 #endif
 
  /* Reset fn is needed in BL at reset vector */
-#if IMAGE_BL1 || (IMAGE_BL31 && RESET_TO_BL31)
+#if IMAGE_BL1 || IMAGE_BL31
 	/*
 	 * The reset handler common to all platforms.  After a matching
 	 * cpu_ops structure entry is found, the correponding reset_handler
@@ -64,7 +64,7 @@
 1:
 	ret
 
-#endif /* IMAGE_BL1 || (IMAGE_BL31 && RESET_TO_BL31) */
+#endif /* IMAGE_BL1 || IMAGE_BL31 */
 
 #if IMAGE_BL31 /* The power down core and cluster is needed only in  BL31 */
 	/*
diff --git a/plat/juno/aarch64/plat_helpers.S b/plat/juno/aarch64/plat_helpers.S
index 028a1a5..37966a3 100644
--- a/plat/juno/aarch64/plat_helpers.S
+++ b/plat/juno/aarch64/plat_helpers.S
@@ -115,12 +115,20 @@
 	/* -----------------------------------------------------
 	 * void plat_reset_handler(void);
 	 *
+	 * Before adding code in this function, refer to the
+	 * guidelines in docs/firmware-design.md to determine
+	 * whether the code should reside within the
+	 * FIRST_RESET_HANDLER_CALL block or not.
+	 *
 	 * Implement workaround for defect id 831273 by enabling
 	 * an event stream every 65536 cycles and set the L2 RAM
-	 * latencies for Cortex-A57.
+	 * latencies for Cortex-A57. This code is included only
+	 * when FIRST_RESET_HANDLER_CALL is defined since it
+	 * should be executed only during BL1.
 	 * -----------------------------------------------------
 	 */
 func plat_reset_handler
+#ifdef FIRST_RESET_HANDLER_CALL
 	/* Read the MIDR_EL1 */
 	mrs	x0, midr_el1
 	ubfx	x1, x0, MIDR_PN_SHIFT, #12
@@ -135,11 +143,12 @@
 
 1:
 	/* ---------------------------------------------
-	* Enable the event stream every 65536 cycles
-	* ---------------------------------------------
-	*/
+	 * Enable the event stream every 65536 cycles
+	 * ---------------------------------------------
+	 */
 	mov     x0, #(0xf << EVNTI_SHIFT)
 	orr     x0, x0, #EVNTEN_BIT
 	msr     CNTKCTL_EL1, x0
 	isb
+#endif /* FIRST_RESET_HANDLER_CALL */
 	ret
diff --git a/services/std_svc/psci/psci_entry.S b/services/std_svc/psci/psci_entry.S
index 8145012..3e67d34 100644
--- a/services/std_svc/psci/psci_entry.S
+++ b/services/std_svc/psci/psci_entry.S
@@ -54,9 +54,18 @@
 psci_aff_common_finish_entry:
 #if !RESET_TO_BL31
 	/* ---------------------------------------------
+	 * Perform any processor specific actions which
+	 * undo or are in addition to the actions
+	 * performed by the reset handler in the BootROM
+	 * (BL1) e.g. cache, tlb invalidations, errata
+	 * workarounds etc.
+	 * ---------------------------------------------
+	 */
+	bl      reset_handler
+
+	/* ---------------------------------------------
 	 * Enable the instruction cache, stack pointer
-	 * and data access alignment checks. Also, set
-	 * the EL3 exception endianess to little-endian.
+	 * and data access alignment checks.
 	 * It can be assumed that BL3-1 entrypoint code
 	 * will do this when RESET_TO_BL31 is set. The
 	 * same  assumption cannot be made when another