armv8/cache.S: Triple with single instruction

Replace the current two-instruction, two-step tripling sequence (lsl
followed by add) with a single equivalent add, using the ARMv8-A "ADD
(shifted register)" form, which takes a second register operand with an
optional shift. This has the added benefit (albeit arguably negligible)
of reducing the final code size by one instruction (4 bytes).
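
For reference, the arithmetic can be sketched in C as below. This is
only an illustration, not code from the tree: the helper names triple()
and cache_type() are hypothetical, and the sketch assumes the register
being shifted packs one 3-bit cache type field per level, as the shift
by the tripled level and the mask with 7 suggest.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors "add x12, x0, x0, lsl #1": level + (level << 1) == 3 * level. */
static inline uint64_t triple(uint64_t level)
{
	return level + (level << 1);
}

/*
 * Mirrors the "lsr" and "and" that follow: extract the 3-bit field for a
 * zero-based level. Assumption: reg holds one 3-bit type field per level.
 */
static inline unsigned int cache_type(uint64_t reg, unsigned int level)
{
	assert(level < 21);	/* keep the shift amount below 64 */
	return (unsigned int)((reg >> triple(level)) & 7);
}
```

With a made-up register value of 0x23, cache_type() yields 3 for level 0
and 4 for level 1.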

Also fix the comment, as the tripled cache level is placed in x12, not
x0.

Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
diff --git a/arch/arm/cpu/armv8/cache.S b/arch/arm/cpu/armv8/cache.S
index eec2958..d1cee23 100644
--- a/arch/arm/cpu/armv8/cache.S
+++ b/arch/arm/cpu/armv8/cache.S
@@ -80,8 +80,7 @@
 	/* x15 <- return address */
 
 loop_level:
-	lsl	x12, x0, #1
-	add	x12, x12, x0		/* x0 <- tripled cache level */
+	add	x12, x0, x0, lsl #1	/* x12 <- tripled cache level */
 	lsr	x12, x10, x12
 	and	x12, x12, #7		/* x12 <- cache type */
 	cmp	x12, #2