plat: marvell: armada: a3k: improve 4GB DRAM usage from 3.375 GB to 3.75 GB

The current configuration of CPU windows on Armada 37x0 with 4 GB DRAM
can only utilize 3.375 GB of memory. This is because there are only 5
configurable CPU windows, laid out as follows (addresses in hexadecimal;
ranges not configurable by CPU windows are shown as well):

         0 - 80000000 |   2 GB | DDR  | CPU window 0
  80000000 - C0000000 |   1 GB | DDR  | CPU window 1
  C0000000 - D0000000 | 256 MB | DDR  | CPU window 2
  D0000000 - D2000000 |  32 MB |      | Internal regs
      empty space     |        |      |
  D8000000 - D8010000 |  64 KB |      | CCI regs
      empty space     |        |      |
  E0000000 - E8000000 | 128 MB | DDR  | CPU window 3
  E8000000 - F0000000 | 128 MB | PCIe | CPU window 4
      empty space     |        |      |
  FFF00000 - end      |  64 KB |      | Boot ROM

This can be improved by taking into account that:
- the CCI window can be moved (its base address is hardcoded only in
  TF-A; U-Boot and Linux do not break when this address changes)
- the PCIe window can be moved (upstream U-Boot can update the
  device-tree ranges of PCIe if the PCIe window is moved)

Change the layout after the Internal regs as follows:

  D2000000 - F2000000 | 512 MB | DDR  | CPU window 3
  F2000000 - FA000000 | 128 MB | PCIe | CPU window 4
      empty space     |        |      |
  FE000000 - FE010000 |  64 KB |      | CCI regs
      empty space     |        |      |
  FFF00000 - end      |  64 KB |      | Boot ROM

(Note that CCI regs base address is moved from D8000000 to FE000000 in
 all cases, not only for the configuration with 4 GB of DRAM. This is
 because TF-A is built with this address as a constant, so we cannot
 change this address at runtime only on some boards.)
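
For illustration, the value written to the CCI base register can be
sketched as below. The patch writes MVEBU_CCI_BASE >> 20 to
CPU_DEC_CCI_BASE_REG; the helper name is hypothetical, and the sketch
assumes MVEBU_CCI_BASE is defined as 0xFE000000 (the definition is not
part of this diff).

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: mirrors the ">> 20" shift
 * applied to the CCI base address before it is written to the register.
 * The shift drops the low 20 bits, so the base is assumed to be aligned
 * to 1 MB.
 */
static inline uint32_t cci_base_reg_value(uint32_t cci_base)
{
	return cci_base >> 20;
}
```

Under these assumptions the old base 0xD8000000 corresponds to 0xD80 and
the new base 0xFE000000 to 0xFE0.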

This yields 3.75 GB of usable RAM.

Moreover, U-Boot can theoretically reconfigure the PCIe window to DDR if
it discovers that no PCIe card is connected. This would add another
128 MB of DRAM, leaving only 128 MB of DRAM unused.
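
The totals quoted in this message can be checked with a small standalone
sketch. It is for illustration only and not part of the patch; the
struct, array, and function names are hypothetical, and the addresses
are copied from the tables in this message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: one DDR-backed CPU window as [start, end). */
struct ddr_window {
	uint64_t start;
	uint64_t end;
};

/* DDR windows before this change (CPU windows 0-3). */
static const struct ddr_window old_layout[] = {
	{ 0x00000000ULL, 0x80000000ULL },	/* 2 GB   */
	{ 0x80000000ULL, 0xC0000000ULL },	/* 1 GB   */
	{ 0xC0000000ULL, 0xD0000000ULL },	/* 256 MB */
	{ 0xE0000000ULL, 0xE8000000ULL },	/* 128 MB */
};

/* DDR windows after this change: only CPU window 3 differs. */
static const struct ddr_window new_layout[] = {
	{ 0x00000000ULL, 0x80000000ULL },	/* 2 GB   */
	{ 0x80000000ULL, 0xC0000000ULL },	/* 1 GB   */
	{ 0xC0000000ULL, 0xD0000000ULL },	/* 256 MB */
	{ 0xD2000000ULL, 0xF2000000ULL },	/* 512 MB */
};

/* Sum the sizes of all windows and return the total in MB. */
static uint64_t ddr_total_mb(const struct ddr_window *w, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += w[i].end - w[i].start;

	return total >> 20;
}
```

Calling ddr_total_mb() on old_layout gives 3456 MB (3.375 GB) and on
new_layout 3840 MB (3.75 GB). Adding the 128 MB PCIe window on top of
the new layout gives 3968 MB, which leaves 128 MB of the 4 GB unused.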

Signed-off-by: Marek Behún <marek.behun@nic.cz>
Change-Id: I4ca1999f852f90055fac8b2c4f7e80275a13ad7e
diff --git a/plat/marvell/armada/a3k/common/plat_cci.c b/plat/marvell/armada/a3k/common/plat_cci.c
new file mode 100644
index 0000000..56f091f
--- /dev/null
+++ b/plat/marvell/armada/a3k/common/plat_cci.c
@@ -0,0 +1,35 @@
+/*
+ * Copyright (C) 2021 Marek Behun <marek.behun@nic.cz>
+ *
+ * Based on plat/marvell/armada/common/marvell_cci.c
+ *
+ * SPDX-License-Identifier:     BSD-3-Clause
+ * https://spdx.org/licenses
+ */
+
+#include <drivers/arm/cci.h>
+#include <lib/mmio.h>
+
+#include <plat_marvell.h>
+
+static const int cci_map[] = {
+	PLAT_MARVELL_CCI_CLUSTER0_SL_IFACE_IX,
+	PLAT_MARVELL_CCI_CLUSTER1_SL_IFACE_IX
+};
+
+/*
+ * This redefines the weak definition in
+ * plat/marvell/armada/common/marvell_cci.c
+ */
+void plat_marvell_interconnect_init(void)
+{
+	/*
+	 * To better utilize the address space, we remap CCI base address from
+	 * the default (0xD8000000) to MVEBU_CCI_BASE.
+	 * This has to be done here, rather than in cpu_wins_init(), because
+	 * cpu_wins_init() is called later.
+	 */
+	mmio_write_32(CPU_DEC_CCI_BASE_REG, MVEBU_CCI_BASE >> 20);
+
+	cci_init(PLAT_MARVELL_CCI_BASE, cci_map, ARRAY_SIZE(cci_map));
+}