feat(psci): allow cores to wake up from powerdown

The simplistic view of a core's powerdown sequence is that power is
atomically cut upon calling `wfi`. However, it turns out that the core
has lots to do - it has to talk to the interconnect to exit coherency,
clean caches, check for RAS errors, etc. These take significant
amounts of time and are certainly not atomic. As such there is a
significant window of opportunity for external events to happen. Many
of these steps are not destructive to context, so theoretically the
core can just "give up" halfway (or roll certain actions back) and
carry on running. The point in this sequence after which rollback is
no longer possible is called the point of no return.

One of these actions is checking for RAS errors. It is possible for
one to happen during this lengthy sequence, or at least remain
undiscovered until that point. If the core were to continue powerdown
when that happens, there would be no (easy) way to inform anyone about
it. Rejecting the powerdown and letting software handle the error is the
best way to implement this.

Arm cores since at least the Cortex-A510 have included this exact
feature. So far it hasn't been deemed necessary to account for it in
firmware due to
the low likelihood of this happening. However, events like GIC wakeup
requests are much more probable. Older cores will power down and
immediately power back up when this happens. Travis and Gelas include a
feature similar to the RAS case above, called powerdown abandon. The
idea is that this will improve the latency to service the interrupt by
saving on work which the core and software need to do.

So far firmware has relied on the `wfi` being the point of no return:
if it doesn't explicitly detect a pending interrupt quite early on, it
will embark on a sequence that it expects to end with shutdown. To
accommodate the `wfi` no longer being a point of no return, we must
undo all of the system management we did, just like in the warm boot
entrypoint.

To achieve that, the pwr_domain_pwr_down_wfi hook must not be terminal.
Most recent platforms do some platform management and finish on the
standard `wfi`, followed by a panic or an endless loop as this is
expected to not return. To make this generic, any platform that wishes
to support wakeups must instead let common code call
`psci_power_down_wfi()` right after. Besides wakeups, this lets common
code handle powerdown errata better as well.
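
As a rough illustration (not part of this patch), a platform hook under
the new contract might look something like the sketch below, where
plat_request_core_powerdown() is a made-up placeholder for whatever the
platform does to tell its power controller the core may be turned off:

    static void plat_pwr_domain_pwr_down_wfi(
                    const psci_power_state_t *target_state)
    {
            /* Tell the power controller this core is about to go down. */
            plat_request_core_powerdown(target_state);

            /*
             * No wfi or endless loop here: return and let common code
             * issue psci_power_down_wfi(), which may itself return if
             * the powerdown request is denied.
             */
    }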

Then, the CPU_OFF case is simple - PSCI does not allow it to return. So
the best that can be done is to attempt the `wfi` a few times (the
choice of 32 is arbitrary) in the hope that the wakeup is transient. If
it isn't, the only choice is to panic, as the system is likely to be in
a bad state, e.g. interrupts weren't routed away. The same applies to
SYSTEM_OFF, SYSTEM_RESET, and SYSTEM_RESET2. There the panic won't
matter as the system is going offline one way or another. The RAS case
will be considered in a separate patch.

Now, the CPU_SUSPEND case is more involved. First, to power down, the
core must wipe its context, as it is not written on warm boot. But the
old context is still needed in case of a wakeup. To resolve this
catch-22, save a copy that will only be used if powerdown fails. That
is about 500 bytes on the stack, so it hopefully doesn't tip anyone
over any limits. In future that can be avoided by having a core manage
its own context.
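
In outline, the save/restore around the suspend looks like the sketch
below; the diff has the full list of sub-contexts that get copied:

    cpu_context_t *ctx = cm_get_context(NON_SECURE);
    cpu_context_t old_ctx;

    /* save the parts of the context that PSCI will overwrite */
    memcpy(&old_ctx.gpregs_ctx, &ctx->gpregs_ctx, sizeof(gp_regs_t));
    memcpy(&old_ctx.el3state_ctx, &ctx->el3state_ctx, sizeof(el3_state_t));

    psci_suspend_to_pwrdown_start(end_pwrlvl, max_off_lvl, ep, state_info);
    /* ... wfi; if the powerdown is denied, execution resumes here ... */

    /* powerdown didn't happen: put the saved copy back */
    memcpy(&ctx->gpregs_ctx, &old_ctx.gpregs_ctx, sizeof(gp_regs_t));
    memcpy(&ctx->el3state_ctx, &old_ctx.el3state_ctx, sizeof(el3_state_t));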

Second, when the core wakes up, it must undo anything it did to prepare
for poweroff, which for the cores we care about, is writing
CPUPWRCTLR_EL1.CORE_PWRDN_EN. The least intrusive way of doing this, as
far as the cpu library is concerned, is to simply call the power off
hook again and have the hook toggle the bit. If more complex sequences
are needed in the future, their direction can be decided based on the
value of this bit.
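
Conceptually, the hook becomes a toggle, roughly as sketched below (the
real hooks live in the assembly cpu library; the accessor and bit names
here are only illustrative):

    static void core_pwr_dwn(void)
    {
            /*
             * Flip CORE_PWRDN_EN: arm powerdown on the way down, disarm
             * it again when called back after a denied powerdown.
             */
            uint64_t val = read_cpupwrctlr_el1();   /* illustrative */

            val ^= CPUPWRCTLR_EL1_CORE_PWRDN_EN_BIT;
            write_cpupwrctlr_el1(val);
    }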

Third, do the actual "resume". Most of the logic is already there for
the retention suspend, so that only needs a small touch up to apply to
the powerdown case as well. The missing bit is the powerdown specific
state management. Luckily, the warmboot entrypoint does exactly that
already too, so steal that and we're done.

All of this is hidden behind a FEAT_PABANDON flag since it has a large
memory and runtime cost that we don't want to burden non-pabandon cores
with.

Finally, rename some functions to better reflect their purpose and
make the names a little more consistent.

Change-Id: I2405b59300c2e24ce02e266f91b7c51474c1145f
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
diff --git a/lib/cpus/aarch64/cortex_gelas.S b/lib/cpus/aarch64/cortex_gelas.S
index 43608e4..df73a89 100644
--- a/lib/cpus/aarch64/cortex_gelas.S
+++ b/lib/cpus/aarch64/cortex_gelas.S
@@ -21,6 +21,10 @@
 #error "Gelas supports only AArch64. Compile with CTX_INCLUDE_AARCH32_REGS=0"
 #endif
 
+#if FEAT_PABANDON == 0
+#error "Gelas must be compiled with FEAT_PABANDON enabled"
+#endif
+
 cpu_reset_func_start cortex_gelas
 	/* ----------------------------------------------------
 	 * Disable speculative loads
diff --git a/lib/cpus/aarch64/travis.S b/lib/cpus/aarch64/travis.S
index 695e7d8..3edd298 100644
--- a/lib/cpus/aarch64/travis.S
+++ b/lib/cpus/aarch64/travis.S
@@ -21,6 +21,10 @@
 #error "Travis supports only AArch64. Compile with CTX_INCLUDE_AARCH32_REGS=0"
 #endif
 
+#if FEAT_PABANDON == 0
+#error "Travis must be compiled with FEAT_PABANDON enabled"
+#endif
+
 cpu_reset_func_start travis
 	/* ----------------------------------------------------
 	 * Disable speculative loads
diff --git a/lib/psci/aarch64/psci_helpers.S b/lib/psci/aarch64/psci_helpers.S
index 088ab43..b297f9b 100644
--- a/lib/psci/aarch64/psci_helpers.S
+++ b/lib/psci/aarch64/psci_helpers.S
@@ -118,19 +118,16 @@
 endfunc psci_do_pwrup_cache_maintenance
 
 /* -----------------------------------------------------------------------
- * void psci_power_down_wfi(void);
- * This function is called to indicate to the power controller that it
- * is safe to power down this cpu. It should not exit the wfi and will
- * be released from reset upon power up.
+ * void psci_power_down_wfi(void);
+ * This function is called to indicate to the power controller that it is safe
+ * to power down this cpu. It may return if the request was denied.
  * -----------------------------------------------------------------------
  */
 func psci_power_down_wfi
 	apply_erratum cortex_a510, ERRATUM(2684597), ERRATA_A510_2684597
 
 	dsb	sy		// ensure write buffer empty
-1:
 	wfi
-	b	1b
 
 	/*
 	 * in case the WFI wasn't terminal, we have to undo errata mitigations.
@@ -139,4 +136,6 @@
 	apply_erratum cortex_a710, ERRATUM(2291219), ERRATA_A710_2291219
 	apply_erratum cortex_x3,   ERRATUM(2313909), ERRATA_X3_2313909, NO_GET_CPU_REV
 	apply_erratum neoverse_n2, ERRATUM(2326639), ERRATA_N2_2326639, NO_GET_CPU_REV
+
+	ret
 endfunc psci_power_down_wfi
diff --git a/lib/psci/psci_common.c b/lib/psci/psci_common.c
index 93d71b8..4da7a90 100644
--- a/lib/psci/psci_common.c
+++ b/lib/psci/psci_common.c
@@ -1019,8 +1019,12 @@
 	 */
 	if (psci_get_aff_info_state() == AFF_STATE_ON_PENDING)
 		psci_cpu_on_finish(cpu_idx, &state_info);
-	else
-		psci_cpu_suspend_to_powerdown_finish(cpu_idx, &state_info);
+	else {
+		unsigned int max_off_lvl = psci_find_max_off_lvl(&state_info);
+
+		assert(max_off_lvl != PSCI_INVALID_PWR_LVL);
+		psci_cpu_suspend_to_powerdown_finish(cpu_idx, max_off_lvl, &state_info);
+	}
 
 	/*
 	 * Generic management: Now we just need to retrieve the
@@ -1156,7 +1160,7 @@
  * Initiate power down sequence, by calling power down operations registered for
  * this CPU.
  ******************************************************************************/
-void psci_pwrdown_cpu(unsigned int power_level)
+void psci_pwrdown_cpu_start(unsigned int power_level)
 {
 #if ENABLE_RUNTIME_INSTRUMENTATION
 
@@ -1197,6 +1201,60 @@
 }
 
 /*******************************************************************************
+ * Finish a terminal power down sequence, ending with a wfi. In case of wakeup,
+ * retry the sleep and panic if wakeups persist.
+ ******************************************************************************/
+void __dead2 psci_pwrdown_cpu_end_terminal(void)
+{
+	/*
+	 * Execute a wfi which, in most cases, will allow the power controller
+	 * to physically power down this cpu. Under some circumstances that may
+	 * be denied. Hopefully this is transient, retrying a few times should
+	 * power down.
+	 */
+	for (int i = 0; i < 32; i++)
+		psci_power_down_wfi();
+
+	/* Wakeup wasn't transient. System is probably in a bad state. */
+	ERROR("Could not power off CPU.\n");
+	panic();
+}
+
+/*******************************************************************************
+ * Finish a non-terminal power down sequence, ending with a wfi. In case of
+ * wakeup, unwind any CPU specific actions and return.
+ ******************************************************************************/
+
+void psci_pwrdown_cpu_end_wakeup(unsigned int power_level)
+{
+	/*
+	 * Usually this will be terminal. In some circumstances the powerdown
+	 * will be denied and we'll need to unwind.
+	 */
+	psci_power_down_wfi();
+
+	/*
+	 * Waking up does not strictly require hardware-assisted coherency,
+	 * but every core that can wake up happens to have it. Untangling the
+	 * cache coherency code from powerdown is a non-trivial effort which
+	 * isn't needed for our purposes.
+	 */
+#if !FEAT_PABANDON
+	ERROR("Systems without FEAT_PABANDON shouldn't wake up.\n");
+	panic();
+#else /* FEAT_PABANDON */
+
+	/*
+	 * Begin unwinding. Everything can be shared with CPU_ON and co later,
+	 * except the CPU specific bit. Cores that have hardware-assisted
+	 * coherency don't have much to do so just calling the hook again is
+	 * the simplest way to achieve this
+	 * the simplest way to achieve this.
+	prepare_cpu_pwr_dwn(power_level);
+#endif /* FEAT_PABANDON */
+}
+
+/*******************************************************************************
  * This function invokes the callback 'stop_func()' with the 'mpidr' of each
  * online PE. Caller can pass suitable method to stop a remote core.
  *
diff --git a/lib/psci/psci_off.c b/lib/psci/psci_off.c
index d40ee3f..46b2114 100644
--- a/lib/psci/psci_off.c
+++ b/lib/psci/psci_off.c
@@ -115,7 +115,7 @@
 	/*
 	 * Arch. management. Initiate power down sequence.
 	 */
-	psci_pwrdown_cpu(psci_find_max_off_lvl(&state_info));
+	psci_pwrdown_cpu_start(psci_find_max_off_lvl(&state_info));
 
 	/*
 	 * Plat. management: Perform platform specific actions to turn this
@@ -153,7 +153,6 @@
 		psci_inv_cpu_data(psci_svc_cpu_data.aff_info_state);
 
 #if ENABLE_RUNTIME_INSTRUMENTATION
-
 		/*
 		 * Update the timestamp with cache off.  We assume this
 		 * timestamp can only be read from the current CPU and the
@@ -164,17 +163,12 @@
 		    RT_INSTR_ENTER_HW_LOW_PWR,
 		    PMF_NO_CACHE_MAINT);
 #endif
-
 		if (psci_plat_pm_ops->pwr_domain_pwr_down_wfi != NULL) {
-			/* This function must not return */
+			/* This function may not return */
 			psci_plat_pm_ops->pwr_domain_pwr_down_wfi(&state_info);
-		} else {
-			/*
-			 * Enter a wfi loop which will allow the power
-			 * controller to physically power down this cpu.
-			 */
-			psci_power_down_wfi();
 		}
+
+		psci_pwrdown_cpu_end_terminal();
 	}
 
 	return rc;
diff --git a/lib/psci/psci_private.h b/lib/psci/psci_private.h
index 6622755..49b19c9 100644
--- a/lib/psci/psci_private.h
+++ b/lib/psci/psci_private.h
@@ -349,7 +349,7 @@
 			   psci_power_state_t *state_info,
 			   unsigned int is_power_down_state);
 
-void psci_cpu_suspend_to_powerdown_finish(unsigned int cpu_idx, const psci_power_state_t *state_info);
+void psci_cpu_suspend_to_powerdown_finish(unsigned int cpu_idx, unsigned int max_off_lvl, const psci_power_state_t *state_info);
 
 /* Private exported functions from psci_helpers.S */
 void psci_do_pwrdown_cache_maintenance(unsigned int pwr_level);
diff --git a/lib/psci/psci_suspend.c b/lib/psci/psci_suspend.c
index 2aadbfd..aaf82a0 100644
--- a/lib/psci/psci_suspend.c
+++ b/lib/psci/psci_suspend.c
@@ -25,8 +25,7 @@
  * This function does generic and platform specific operations after a wake-up
  * from standby/retention states at multiple power levels.
  ******************************************************************************/
-static void psci_cpu_suspend_to_standby_finish(unsigned int cpu_idx,
-					     unsigned int end_pwrlvl,
+static void psci_cpu_suspend_to_standby_finish(unsigned int end_pwrlvl,
 					     psci_power_state_t *state_info)
 {
 	/*
@@ -44,11 +43,10 @@
  * operations.
  ******************************************************************************/
 static void psci_suspend_to_pwrdown_start(unsigned int end_pwrlvl,
+					  unsigned int max_off_lvl,
 					  const entry_point_info_t *ep,
 					  const psci_power_state_t *state_info)
 {
-	unsigned int max_off_lvl = psci_find_max_off_lvl(state_info);
-
 	PUBLISH_EVENT(psci_suspend_pwrdown_start);
 
 #if PSCI_OS_INIT_MODE
@@ -94,10 +92,8 @@
 
 	/*
 	 * Arch. management. Initiate power down sequence.
-	 * TODO : Introduce a mechanism to query the cache level to flush
-	 * and the cpu-ops power down to perform from the platform.
 	 */
-	psci_pwrdown_cpu(max_off_lvl);
+	psci_pwrdown_cpu_start(max_off_lvl);
 }
 
 /*******************************************************************************
@@ -127,6 +123,11 @@
 	int rc = PSCI_E_SUCCESS;
 	bool skip_wfi = false;
 	unsigned int parent_nodes[PLAT_MAX_PWR_LVL] = {0};
+	unsigned int max_off_lvl = 0;
+#if FEAT_PABANDON
+	cpu_context_t *ctx = cm_get_context(NON_SECURE);
+	cpu_context_t old_ctx;
+#endif
 
 	/*
 	 * This function must only be called on platforms where the
@@ -196,8 +197,38 @@
 	psci_stats_update_pwr_down(idx, end_pwrlvl, state_info);
 #endif
 
-	if (is_power_down_state != 0U)
-		psci_suspend_to_pwrdown_start(end_pwrlvl, ep, state_info);
+	if (is_power_down_state != 0U) {
+		/*
+		 * When CTX_INCLUDE_EL2_REGS is unset, we're probably running
+		 * with some SPD that assumes the core is going off so it
+		 * doesn't bother saving NS's context. Do that here until we
+		 * figure out a way to make this coherent.
+		 */
+#if FEAT_PABANDON
+#if !CTX_INCLUDE_EL2_REGS
+		cm_el1_sysregs_context_save(NON_SECURE);
+#endif
+		/*
+		 * When the core wakes it expects its context to already be in
+		 * place, so we must overwrite it before powerdown. But if
+		 * powerdown never happens we want the old context back, so
+		 * save a copy. EL2/EL1 are not touched by PSCI so don't copy.
+		 */
+		memcpy(&old_ctx.gpregs_ctx, &ctx->gpregs_ctx, sizeof(gp_regs_t));
+		memcpy(&old_ctx.el3state_ctx, &ctx->el3state_ctx, sizeof(el3_state_t));
+#if DYNAMIC_WORKAROUND_CVE_2018_3639
+		memcpy(&old_ctx.cve_2018_3639_ctx, &ctx->cve_2018_3639_ctx, sizeof(cve_2018_3639_t));
+#endif
+#if ERRATA_SPECULATIVE_AT
+		memcpy(&old_ctx.errata_speculative_at_ctx, &ctx->errata_speculative_at_ctx, sizeof(errata_speculative_at_t));
+#endif
+#if CTX_INCLUDE_PAUTH_REGS
+		memcpy(&old_ctx.pauth_ctx, &ctx->pauth_ctx, sizeof(pauth_t));
+#endif
+#endif
+		max_off_lvl = psci_find_max_off_lvl(state_info);
+		psci_suspend_to_pwrdown_start(end_pwrlvl, max_off_lvl, ep, state_info);
+	}
 
 	/*
 	 * Plat. management: Allow the platform to perform the
@@ -223,39 +254,33 @@
 		return rc;
 	}
 
-	if (is_power_down_state != 0U) {
-#if ENABLE_RUNTIME_INSTRUMENTATION
-
-		/*
-		 * Update the timestamp with cache off.  We assume this
-		 * timestamp can only be read from the current CPU and the
-		 * timestamp cache line will be flushed before return to
-		 * normal world on wakeup.
-		 */
-		PMF_CAPTURE_TIMESTAMP(rt_instr_svc,
-		    RT_INSTR_ENTER_HW_LOW_PWR,
-		    PMF_NO_CACHE_MAINT);
-#endif
-
-		/* The function calls below must not return */
-		if (psci_plat_pm_ops->pwr_domain_pwr_down_wfi != NULL)
-			psci_plat_pm_ops->pwr_domain_pwr_down_wfi(state_info);
-		else
-			psci_power_down_wfi();
-	}
-
 #if ENABLE_RUNTIME_INSTRUMENTATION
+	/*
+	 * Update the timestamp with cache off. We assume this
+	 * timestamp can only be read from the current CPU and the
+	 * timestamp cache line will be flushed before return to
+	 * normal world on wakeup.
+	 */
 	PMF_CAPTURE_TIMESTAMP(rt_instr_svc,
 	    RT_INSTR_ENTER_HW_LOW_PWR,
 	    PMF_NO_CACHE_MAINT);
 #endif
 
-	/*
-	 * We will reach here if only retention/standby states have been
-	 * requested at multiple power levels. This means that the cpu
-	 * context will be preserved.
-	 */
-	wfi();
+	if (is_power_down_state != 0U) {
+		if (psci_plat_pm_ops->pwr_domain_pwr_down_wfi != NULL) {
+			/* This function may not return */
+			psci_plat_pm_ops->pwr_domain_pwr_down_wfi(state_info);
+		}
+
+		psci_pwrdown_cpu_end_wakeup(max_off_lvl);
+	} else {
+		/*
+		 * We will reach here if only retention/standby states have been
+		 * requested at multiple power levels. This means that the cpu
+		 * context will be preserved.
+		 */
+		wfi();
+	}
 
 #if ENABLE_RUNTIME_INSTRUMENTATION
 	PMF_CAPTURE_TIMESTAMP(rt_instr_svc,
@@ -277,10 +302,32 @@
 #endif
 
 	/*
-	 * After we wake up from context retaining suspend, call the
-	 * context retaining suspend finisher.
+	 * Waking up means we've retained all context. Call the finishers to put
+	 * the system back to a usable state.
 	 */
-	psci_cpu_suspend_to_standby_finish(idx, end_pwrlvl, state_info);
+	if (is_power_down_state != 0U) {
+#if FEAT_PABANDON
+		psci_cpu_suspend_to_powerdown_finish(idx, max_off_lvl, state_info);
+
+		/* we overwrote context ourselves, put it back */
+		memcpy(&ctx->gpregs_ctx, &old_ctx.gpregs_ctx, sizeof(gp_regs_t));
+		memcpy(&ctx->el3state_ctx, &old_ctx.el3state_ctx, sizeof(el3_state_t));
+#if DYNAMIC_WORKAROUND_CVE_2018_3639
+		memcpy(&ctx->cve_2018_3639_ctx, &old_ctx.cve_2018_3639_ctx, sizeof(cve_2018_3639_t));
+#endif
+#if ERRATA_SPECULATIVE_AT
+		memcpy(&ctx->errata_speculative_at_ctx, &old_ctx.errata_speculative_at_ctx, sizeof(errata_speculative_at_t));
+#endif
+#if CTX_INCLUDE_PAUTH_REGS
+		memcpy(&ctx->pauth_ctx, &old_ctx.pauth_ctx, sizeof(pauth_t));
+#endif
+#if !CTX_INCLUDE_EL2_REGS
+		cm_el1_sysregs_context_restore(NON_SECURE);
+#endif
+#endif
+	} else {
+		psci_cpu_suspend_to_standby_finish(end_pwrlvl, state_info);
+	}
 
 	/*
 	 * Set the requested and target state of this CPU and all the higher
@@ -298,10 +345,9 @@
  * are called by the common finisher routine in psci_common.c. The `state_info`
  * is the psci_power_state from which this CPU has woken up from.
  ******************************************************************************/
-void psci_cpu_suspend_to_powerdown_finish(unsigned int cpu_idx, const psci_power_state_t *state_info)
+void psci_cpu_suspend_to_powerdown_finish(unsigned int cpu_idx, unsigned int max_off_lvl, const psci_power_state_t *state_info)
 {
 	unsigned int counter_freq;
-	unsigned int max_off_lvl;
 
 	/* Ensure we have been woken up from a suspended state */
 	assert((psci_get_aff_info_state() == AFF_STATE_ON) &&
@@ -338,8 +384,6 @@
 	 * error, it's expected to assert within
 	 */
 	if ((psci_spd_pm != NULL) && (psci_spd_pm->svc_suspend_finish != NULL)) {
-		max_off_lvl = psci_find_max_off_lvl(state_info);
-		assert(max_off_lvl != PSCI_INVALID_PWR_LVL);
 		psci_spd_pm->svc_suspend_finish(max_off_lvl);
 	}
 
diff --git a/lib/psci/psci_system_off.c b/lib/psci/psci_system_off.c
index 002392c..b9418a3 100644
--- a/lib/psci/psci_system_off.c
+++ b/lib/psci/psci_system_off.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2014-2020, ARM Limited and Contributors. All rights reserved.
+ * Copyright (c) 2014-2024, Arm Limited and Contributors. All rights reserved.
  *
  * SPDX-License-Identifier: BSD-3-Clause
  */
@@ -30,7 +30,7 @@
 	/* Call the platform specific hook */
 	psci_plat_pm_ops->system_off();
 
-	/* This function does not return. We should never get here */
+	psci_pwrdown_cpu_end_terminal();
 }
 
 void __dead2 psci_system_reset(void)
@@ -49,7 +49,7 @@
 	/* Call the platform specific hook */
 	psci_plat_pm_ops->system_reset();
 
-	/* This function does not return. We should never get here */
+	psci_pwrdown_cpu_end_terminal();
 }
 
 u_register_t psci_system_reset2(uint32_t reset_type, u_register_t cookie)
@@ -79,7 +79,10 @@
 	}
 	console_flush();
 
-	return (u_register_t)
-		psci_plat_pm_ops->system_reset2((int) is_vendor, reset_type,
-						cookie);
+	u_register_t ret =
+		(u_register_t) psci_plat_pm_ops->system_reset2((int) is_vendor, reset_type, cookie);
+	if (ret != PSCI_E_SUCCESS)
+		return ret;
+
+	psci_pwrdown_cpu_end_terminal();
 }