perf(cpufeat): centralise PAuth key saving

prepare_el3_entry() is meant to be the one place that handles all the
context needed to enter EL3 proper. PAuth is the one exception: it is
handled right after. Absorb it into prepare_el3_entry(), accounting for
the BL1/BL31 difference.

This is also a good time to move the key saving into the enable
function, so that it too lives in one place. With this it becomes
apparent that saving the keys just before CPU_SUSPEND is redundant, as
they will be reinitialised when the core wakes up.

Note that the key loading, now in save_gp_pmcr_pauth_regs, does not end
in an isb. The effects of the key change are not needed until the isb
in the caller, so a separate isb here is unnecessary.

Change-Id: Idd286bea91140c106ab4c933c5c44b0bc2050ca2
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
diff --git a/lib/extensions/pauth/pauth.c b/lib/extensions/pauth/pauth.c
index c6c6e10..2dd0d28 100644
--- a/lib/extensions/pauth/pauth.c
+++ b/lib/extensions/pauth/pauth.c
@@ -6,8 +6,11 @@
 #include <arch.h>
 #include <arch_features.h>
 #include <arch_helpers.h>
+#include <lib/el3_runtime/cpu_data.h>
 #include <lib/extensions/pauth.h>
 
+extern uint64_t bl1_apiakey[2];
+
 void __no_pauth pauth_init_enable_el3(void)
 {
 	if (is_feat_pauth_supported()) {
@@ -33,6 +36,22 @@
 	/* Program instruction key A used by the Trusted Firmware */
 	write_apiakeylo_el1(key_lo);
 	write_apiakeyhi_el1(key_hi);
+
+#if IMAGE_BL31
+	set_cpu_data(apiakey[0], key_lo);
+	set_cpu_data(apiakey[1], key_hi);
+
+	/*
+	 * In the warmboot entrypoint, cpu_data may have been written before
+	 * data caching was enabled. Flush the caches so nothing stale is read.
+	 */
+#if !(HW_ASSISTED_COHERENCY || WARMBOOT_ENABLE_DCACHE_EARLY)
+	flush_cpu_data(apiakey);
+#endif
+#elif IMAGE_BL1
+	bl1_apiakey[0] = key_lo;
+	bl1_apiakey[1] = key_hi;
+#endif
 }
 
 /*