Add PMF instrumentation points in TF

In order to quantify the overall time spent in the PSCI software
implementation, an initial collection of PMF instrumentation points
has been added.

Instrumentation has been added to the following code paths:

- Entry to PSCI SMC handler.  The timestamp is captured as early
  as possible during the runtime exception and stored in memory
  before entering the PSCI SMC handler.

- Exit from PSCI SMC handler.  The timestamp is captured after
  normal return from the PSCI SMC handler or if a low power state
  was requested it is captured in the bl31 warm boot path before
  return to normal world.

- Entry to low power state.  The timestamp is captured before entry
  to a low power state which implies either standby or power down.
  As these power states are mutually exclusive, only one timestamp
  is defined to describe both.  It is possible to differentiate between
  the two power states using the PSCI STAT interface.

- Exit from low power state.  The timestamp is captured after a standby
  or power up operation has completed.

To calculate the number of cycles spent running code in Trusted Firmware
one can perform the following calculation:

(exit_psci - enter_psci) - (exit_low_pwr - enter_low_pwr).

The resulting number of cycles can be converted to time given the
frequency of the counter.

Change-Id: Ie3b8f3d16409b6703747093b3a2d5c7429ad0166
Signed-off-by: dp-arm <dimitris.papastamos@arm.com>
diff --git a/bl31/aarch64/runtime_exceptions.S b/bl31/aarch64/runtime_exceptions.S
index 799062e..f333bf1 100644
--- a/bl31/aarch64/runtime_exceptions.S
+++ b/bl31/aarch64/runtime_exceptions.S
@@ -31,6 +31,7 @@
 #include <arch.h>
 #include <asm_macros.S>
 #include <context.h>
+#include <cpu_data.h>
 #include <interrupt_mgmt.h>
 #include <platform_def.h>
 #include <runtime_svc.h>
@@ -47,6 +48,21 @@
 	msr	daifclr, #DAIF_ABT_BIT
 
 	str	x30, [sp, #CTX_GPREGS_OFFSET + CTX_GPREG_LR]
+
+#if ENABLE_RUNTIME_INSTRUMENTATION
+
+	/*
+	 * Read the timestamp value and store it in per-cpu data.
+	 * The value will be extracted from per-cpu data by the
+	 * C level SMC handler and saved to the PMF timestamp region.
+	 */
+	mrs	x30, cntpct_el0
+	str	x29, [sp, #CTX_GPREGS_OFFSET + CTX_GPREG_X29]
+	mrs	x29, tpidr_el3
+	str	x30, [x29, #CPU_DATA_PMF_TS0_OFFSET]
+	ldr	x29, [sp, #CTX_GPREGS_OFFSET + CTX_GPREG_X29]
+#endif
+
 	mrs	x30, esr_el3
 	ubfx	x30, x30, #ESR_EC_SHIFT, #ESR_EC_LENGTH