armv8: move low-level assembly functions into function-sections

TPL builds today don't need to call into firmware or set up the MMU
(if this changes, it should be controlled through a config option
whether to include this or not), but include the needed support code
for this anyway.  By moving these unused low-level functions into
seperate function-sections, the linker can garbage-collect the unused
sections.

Note that (if DM support is enabled), there will be a call to the
cache-flushing code from alloc_priv(...) in drivers/core/device.c.
This then add 52 bytes of binary size (an increase from 20589 to 20641
bytes) compared to completely removing this code.

Even for a feature-rich TPL (including DM support as for the RK3368),
this equates to a size difference of significantly more than 10% in
TPL binary size.

Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Reviewed-by: Simon Glass <sjg@chromium.org>
diff --git a/arch/arm/cpu/armv8/transition.S b/arch/arm/cpu/armv8/transition.S
index ca07465..7aa6935 100644
--- a/arch/arm/cpu/armv8/transition.S
+++ b/arch/arm/cpu/armv8/transition.S
@@ -10,6 +10,7 @@
 #include <linux/linkage.h>
 #include <asm/macro.h>
 
+.pushsection .text.armv8_switch_to_el2, "ax"
 ENTRY(armv8_switch_to_el2)
 	switch_el x6, 1f, 0f, 0f
 0:
@@ -30,7 +31,9 @@
 	br x4
 1:	armv8_switch_to_el2_m x4, x5, x6
 ENDPROC(armv8_switch_to_el2)
+.popsection
 
+.pushsection .text.armv8_switch_to_el1, "ax"
 ENTRY(armv8_switch_to_el1)
 	switch_el x6, 0f, 1f, 0f
 0:
@@ -40,7 +43,10 @@
 	br x4
 1:	armv8_switch_to_el1_m x4, x5, x6
 ENDPROC(armv8_switch_to_el1)
+.popsection
 
+.pushsection .text.armv8_el2_to_aarch32, "ax"
 WEAK(armv8_el2_to_aarch32)
 	ret
 ENDPROC(armv8_el2_to_aarch32)
+.popsection