Changes necessary to support SEPARATE_NOBITS_REGION feature

Since BL31 PROGBITS and BL31 NOBITS sections are going to be
in non-adjacent memory regions, potentially far from each other,
some fixes are needed to support it completely.

1. adr instruction only allows computing the effective address
of a location only within 1MB range of the PC. However, adrp
instruction together with an add permits position independent
address of any location with 4GB range of PC.

2. Since BL31 _RW_END_ marks the end of BL31 image, care must be
taken that it is aligned to page size since we map this memory
region in BL31 using xlat_v2 lib utils which mandate alignment of
image size to page granularity.

Change-Id: I3451cc030d03cb2032db3cc088f0c0e2c84bffda
Signed-off-by: Madhukar Pappireddy <madhukar.pappireddy@arm.com>
diff --git a/lib/el3_runtime/aarch64/cpu_data.S b/lib/el3_runtime/aarch64/cpu_data.S
index 2edf225..2392d6b 100644
--- a/lib/el3_runtime/aarch64/cpu_data.S
+++ b/lib/el3_runtime/aarch64/cpu_data.S
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2014-2016, ARM Limited and Contributors. All rights reserved.
+ * Copyright (c) 2014-2020, ARM Limited and Contributors. All rights reserved.
  *
  * SPDX-License-Identifier: BSD-3-Clause
  */
@@ -41,7 +41,8 @@
 func _cpu_data_by_index
 	mov_imm	x1, CPU_DATA_SIZE
 	mul	x0, x0, x1
-	adr	x1, percpu_data
+	adrp	x1, percpu_data
+	add	x1, x1, :lo12:percpu_data
 	add	x0, x0, x1
 	ret
 endfunc _cpu_data_by_index