feat(rpi): add Raspberry Pi 5 support

The Raspberry Pi 5 is a single-board computer based on BCM2712 that
contains four Arm Cortex-A76 cores.

This change introduces minimal BL31 support with PSCI that has been
validated to boot Linux and a private EDK2 build.

It's a drop-in replacement for the custom TF-A armstub now included in
the EEPROM images.

Change-Id: Id72a0370f54e71ac97c3daa1bacedacb7dec148f
Signed-off-by: Mario Bălănică <mariobalanica02@gmail.com>
diff --git a/changelog.yaml b/changelog.yaml
index 35ffaa8..5135d69 100644
--- a/changelog.yaml
+++ b/changelog.yaml
@@ -541,6 +541,9 @@
           - title: Raspberry Pi 4
             scope: rpi4
 
+          - title: Raspberry Pi 5
+            scope: rpi5
+
       - title: Renesas
         scope: renesas
 
diff --git a/docs/plat/index.rst b/docs/plat/index.rst
index b1ccaa5..795fb09 100644
--- a/docs/plat/index.rst
+++ b/docs/plat/index.rst
@@ -37,6 +37,7 @@
    qti-msm8916
    rpi3
    rpi4
+   rpi5
    rcar-gen3
    rz-g2
    rockchip
diff --git a/docs/plat/rpi5.rst b/docs/plat/rpi5.rst
new file mode 100644
index 0000000..f2e1b9f
--- /dev/null
+++ b/docs/plat/rpi5.rst
@@ -0,0 +1,78 @@
+Raspberry Pi 5
+==============
+
+The `Raspberry Pi 5`_ is a single-board computer that contains four
+Arm Cortex-A76 cores.
+
+This port is a minimal BL31 implementation capable of booting 64-bit EL2
+payloads such as Linux and EDK2.
+
+**IMPORTANT NOTE**: This port isn't secure. All of the memory used is DRAM,
+which is available from both the Non-secure and Secure worlds. The SoC does
+not seem to feature a secure memory controller of any kind, so portions of
+DRAM can't be protected properly from the Non-secure world.
+
+Build
+------------------
+
+To build this platform, run:
+
+.. code:: shell
+
+    CROSS_COMPILE=aarch64-linux-gnu- make PLAT=rpi5 DEBUG=1
+
+The firmware will be generated at ``build/rpi5/debug/bl31.bin``.
+
+The following build options are supported:
+
+- ``RPI3_DIRECT_LINUX_BOOT``: Enabled by default. Allows direct boot of the Linux
+  kernel from the firmware.
+
+- ``PRELOADED_BL33_BASE``: Used to specify the fixed address of a BL33 binary
+  that has been preloaded by earlier boot stages (VPU). Useful for bundling
+  BL31 and BL33 in the same ``armstub`` image (e.g. TF-A + EDK2).
+
+- ``RPI3_PRELOADED_DTB_BASE``: This option allows to specify the fixed address of
+  a DTB in memory. Can only be used if ``device_tree_address=`` is present in
+  config.txt.
+
+- ``RPI3_RUNTIME_UART``: Indicates whether TF-A should use the debug UART for
+  runtime messages or not. ``-1`` (default) disables the option, any other value
+  enables it.
+
+Usage
+------------------
+
+Copy the firmware binary to the first FAT32 partition of a supported boot media
+(SD, USB) and append ``armstub=bl31.bin`` to config.txt, or just rename the
+file to ``armstub8-2712.bin``.
+
+No other config options or files are required by the firmware alone, this will
+depend on the payload you intend to run.
+
+For Linux, you must also place an appropriate DTB and kernel in the boot
+partition. This has been validated with a copy of Raspberry Pi OS.
+
+The VPU will preload a BL33 AArch64 image named either ``kernel_2712.img`` or
+``kernel8.img``, which can be overridden by adding a ``kernel=filename`` option
+to config.txt.
+
+Kernel and DTB load addresses are also chosen by the VPU and can be changed with
+``kernel_address=`` and ``device_tree_address=`` in config.txt. If TF-A was built
+with ``PRELOADED_BL33_BASE`` or ``RPI3_PRELOADED_DTB_BASE``, setting those config
+options may be necessary.
+
+By default, all boot stages print messages to the dedicated UART debug port.
+Configuration is ``115200 8n1``.
+
+Design
+------------------
+
+This port is largely based on the RPi 4 one.
+
+The boot process is essentially the same, the only notable difference being that
+all VPU blobs have been moved into EEPROM (former start4.elf & fixup4.dat). There's
+also a custom BL31 TF-A armstub included for PSCI, which can be replaced with this
+port.
+
+.. _Raspberry Pi 5: https://www.raspberrypi.com/products/raspberry-pi-5/
diff --git a/plat/rpi/rpi5/include/plat.ld.S b/plat/rpi/rpi5/include/plat.ld.S
new file mode 100644
index 0000000..961c630
--- /dev/null
+++ b/plat/rpi/rpi5/include/plat.ld.S
@@ -0,0 +1,23 @@
+/*
+ * Copyright (c) 2019-2024, Arm Limited and Contributors. All rights reserved.
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ *
+ * Stub linker script to provide the armstub8.bin header before the actual
+ * code. If the GPU firmware finds a magic value at offset 240 in
+ * armstub8.bin, it will put the DTB and kernel load address in subsequent
+ * words. We can then read those values to find the proper NS entry point
+ * and find our DTB more flexibly.
+ */
+
+MEMORY {
+    PRERAM (rwx): ORIGIN = 0, LENGTH = 4096
+}
+
+SECTIONS
+{
+    .armstub8 . : {
+        *armstub8_header.o(.text*)
+        KEEP(*(.armstub8))
+    } >PRERAM
+}
diff --git a/plat/rpi/rpi5/include/platform_def.h b/plat/rpi/rpi5/include/platform_def.h
new file mode 100644
index 0000000..a4c2f5b
--- /dev/null
+++ b/plat/rpi/rpi5/include/platform_def.h
@@ -0,0 +1,141 @@
+/*
+ * Copyright (c) 2015-2024, Arm Limited and Contributors. All rights reserved.
+ * Copyright (c) 2024, Mario Bălănică <mariobalanica02@gmail.com>
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+#ifndef PLATFORM_DEF_H
+#define PLATFORM_DEF_H
+
+#include <arch.h>
+#include <common/tbbr/tbbr_img_def.h>
+#include <lib/utils_def.h>
+#include <plat/common/common_def.h>
+
+#include "rpi_hw.h"
+
+/* Special value used to verify platform parameters from BL2 to BL31 */
+#define RPI3_BL31_PLAT_PARAM_VAL	ULL(0x0F1E2D3C4B5A6978)
+
+#define PLATFORM_STACK_SIZE		ULL(0x1000)
+
+#define PLATFORM_MAX_CPUS_PER_CLUSTER	U(4)
+#define PLATFORM_CLUSTER_COUNT		U(1)
+#define PLATFORM_CLUSTER0_CORE_COUNT	PLATFORM_MAX_CPUS_PER_CLUSTER
+#define PLATFORM_CORE_COUNT		PLATFORM_CLUSTER0_CORE_COUNT
+
+#define RPI_PRIMARY_CPU			U(0)
+
+#define PLAT_MAX_PWR_LVL		MPIDR_AFFLVL1
+#define PLAT_NUM_PWR_DOMAINS		(PLATFORM_CLUSTER_COUNT + \
+					 PLATFORM_CORE_COUNT)
+
+#define PLAT_MAX_RET_STATE		U(1)
+#define PLAT_MAX_OFF_STATE		U(2)
+
+/* Local power state for power domains in Run state. */
+#define PLAT_LOCAL_STATE_RUN		U(0)
+/* Local power state for retention. Valid only for CPU power domains */
+#define PLAT_LOCAL_STATE_RET		U(1)
+/*
+ * Local power state for OFF/power-down. Valid for CPU and cluster power
+ * domains.
+ */
+#define PLAT_LOCAL_STATE_OFF		U(2)
+
+/*
+ * Macros used to parse state information from State-ID if it is using the
+ * recommended encoding for State-ID.
+ */
+#define PLAT_LOCAL_PSTATE_WIDTH		U(4)
+#define PLAT_LOCAL_PSTATE_MASK		((U(1) << PLAT_LOCAL_PSTATE_WIDTH) - 1)
+
+/*
+ * Some data must be aligned on the biggest cache line size in the platform.
+ * This is known only to the platform as it might have a combination of
+ * integrated and external caches.
+ */
+#define CACHE_WRITEBACK_SHIFT		U(6)
+#define CACHE_WRITEBACK_GRANULE		(U(1) << CACHE_WRITEBACK_SHIFT)
+
+/*
+ * I/O registers.
+ */
+#define DEVICE0_BASE			RPI_IO_BASE
+#define DEVICE0_SIZE			RPI_IO_SIZE
+
+/*
+ * Mailbox to control the secondary cores. All secondary cores are held in a
+ * wait loop in cold boot. To release them perform the following steps (plus
+ * any additional barriers that may be needed):
+ *
+ *     uint64_t *entrypoint = (uint64_t *)PLAT_RPI3_TM_ENTRYPOINT;
+ *     *entrypoint = ADDRESS_TO_JUMP_TO;
+ *
+ *     uint64_t *mbox_entry = (uint64_t *)PLAT_RPI3_TM_HOLD_BASE;
+ *     mbox_entry[cpu_id] = PLAT_RPI3_TM_HOLD_STATE_GO;
+ *
+ *     sev();
+ */
+/* The secure entry point to be used on warm reset by all CPUs. */
+#define PLAT_RPI3_TM_ENTRYPOINT		0x100
+#define PLAT_RPI3_TM_ENTRYPOINT_SIZE	ULL(8)
+
+/* Hold entries for each CPU. */
+#define PLAT_RPI3_TM_HOLD_BASE		(PLAT_RPI3_TM_ENTRYPOINT + \
+					 PLAT_RPI3_TM_ENTRYPOINT_SIZE)
+#define PLAT_RPI3_TM_HOLD_ENTRY_SIZE	ULL(8)
+#define PLAT_RPI3_TM_HOLD_SIZE		(PLAT_RPI3_TM_HOLD_ENTRY_SIZE * \
+					 PLATFORM_CORE_COUNT)
+
+#define PLAT_RPI3_TRUSTED_MAILBOX_SIZE	(PLAT_RPI3_TM_ENTRYPOINT_SIZE + \
+					 PLAT_RPI3_TM_HOLD_SIZE)
+
+#define PLAT_RPI3_TM_HOLD_STATE_WAIT	ULL(0)
+#define PLAT_RPI3_TM_HOLD_STATE_GO	ULL(1)
+#define PLAT_RPI3_TM_HOLD_STATE_BSP_OFF	ULL(2)
+
+/*
+ * BL31 specific defines.
+ *
+ * Put BL31 at the top of the Trusted SRAM. BL31_BASE is calculated using the
+ * current BL31 debug size plus a little space for growth.
+ */
+#define PLAT_MAX_BL31_SIZE		ULL(0x80000)
+
+#define BL31_BASE			ULL(0x1000)
+#define BL31_LIMIT			ULL(0x80000)
+#define BL31_PROGBITS_LIMIT		ULL(0x80000)
+
+#define SEC_SRAM_ID			0
+#define SEC_DRAM_ID			1
+
+/*
+ * Other memory-related defines.
+ */
+#define PLAT_PHY_ADDR_SPACE_SIZE	(ULL(1) << 40)
+#define PLAT_VIRT_ADDR_SPACE_SIZE	(ULL(1) << 40)
+
+#define MAX_MMAP_REGIONS		8
+#define MAX_XLAT_TABLES			4
+
+#define MAX_IO_DEVICES			U(3)
+#define MAX_IO_HANDLES			U(4)
+
+#define MAX_IO_BLOCK_DEVICES		U(1)
+
+/*
+ * Serial-related constants.
+ */
+#define PLAT_RPI_PL011_UART_BASE	RPI4_PL011_UART_BASE
+#define PLAT_RPI_PL011_UART_CLOCK	RPI4_PL011_UART_CLOCK
+#define PLAT_RPI_UART_BAUDRATE		ULL(115200)
+#define PLAT_RPI_CRASH_UART_BASE	PLAT_RPI_PL011_UART_BASE
+
+/*
+ * System counter
+ */
+#define SYS_COUNTER_FREQ_IN_TICKS	ULL(54000000)
+
+#endif /* PLATFORM_DEF_H */
diff --git a/plat/rpi/rpi5/include/rpi_hw.h b/plat/rpi/rpi5/include/rpi_hw.h
new file mode 100644
index 0000000..384542e
--- /dev/null
+++ b/plat/rpi/rpi5/include/rpi_hw.h
@@ -0,0 +1,51 @@
+/*
+ * Copyright (c) 2016-2024, Arm Limited and Contributors. All rights reserved.
+ * Copyright (c) 2024, Mario Bălănică <mariobalanica02@gmail.com>
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+#ifndef RPI_HW_H
+#define RPI_HW_H
+
+#include <lib/utils_def.h>
+
+/*
+ * Peripherals
+ */
+
+#define RPI_IO_BASE			ULL(0x1000000000)
+#define RPI_IO_SIZE			ULL(0x1000000000)
+
+/*
+ * ARM <-> VideoCore mailboxes
+ */
+#define RPI3_MBOX_BASE			(RPI_IO_BASE + ULL(0x7c013880))
+
+/*
+ * Power management, reset controller, watchdog.
+ */
+#define RPI3_PM_BASE			(RPI_IO_BASE + ULL(0x7d200000))
+
+/*
+ * Hardware random number generator.
+ */
+#define RPI3_RNG_BASE			(RPI_IO_BASE + ULL(0x7d208000))
+
+/*
+ * PL011 system serial port
+ */
+#define RPI4_PL011_UART_BASE		(RPI_IO_BASE + ULL(0x7d001000))
+#define RPI4_PL011_UART_CLOCK		ULL(44000000)
+
+/*
+ * GIC interrupt controller
+ */
+#define RPI_HAVE_GIC
+#define RPI4_GIC_GICD_BASE		(RPI_IO_BASE + ULL(0x7fff9000))
+#define RPI4_GIC_GICC_BASE		(RPI_IO_BASE + ULL(0x7fffa000))
+
+#define	RPI4_LOCAL_CONTROL_BASE_ADDRESS		(RPI_IO_BASE + ULL(0x7c280000))
+#define	RPI4_LOCAL_CONTROL_PRESCALER		(RPI_IO_BASE + ULL(0x7c280008))
+
+#endif /* RPI_HW_H */
diff --git a/plat/rpi/rpi5/platform.mk b/plat/rpi/rpi5/platform.mk
new file mode 100644
index 0000000..81b7ded
--- /dev/null
+++ b/plat/rpi/rpi5/platform.mk
@@ -0,0 +1,107 @@
+#
+# Copyright (c) 2015-2024, Arm Limited and Contributors. All rights reserved.
+# Copyright (c) 2024, Mario Bălănică <mariobalanica02@gmail.com>
+#
+# SPDX-License-Identifier: BSD-3-Clause
+#
+
+include lib/xlat_tables_v2/xlat_tables.mk
+
+include drivers/arm/gic/v2/gicv2.mk
+
+PLAT_INCLUDES		:=	-Iplat/rpi/common/include			\
+				-Iplat/rpi/rpi5/include
+
+PLAT_BL_COMMON_SOURCES	:=	drivers/arm/pl011/aarch64/pl011_console.S 	\
+				plat/rpi/common/rpi3_common.c			\
+				plat/rpi/common/rpi3_console_pl011.c		\
+				${XLAT_TABLES_LIB_SRCS}
+
+BL31_SOURCES		+=	lib/cpus/aarch64/cortex_a76.S			\
+				plat/rpi/common/aarch64/plat_helpers.S		\
+				plat/rpi/common/aarch64/armstub8_header.S 	\
+				drivers/delay_timer/delay_timer.c		\
+				plat/common/plat_gicv2.c			\
+				plat/rpi/common/rpi4_bl31_setup.c		\
+				plat/rpi/rpi5/rpi5_setup.c			\
+				plat/rpi/common/rpi3_pm.c			\
+				plat/common/plat_psci_common.c			\
+				plat/rpi/common/rpi3_topology.c			\
+				${GICV2_SOURCES}
+
+# For now we only support BL31, using the kernel loaded by the GPU firmware.
+RESET_TO_BL31		:=	1
+
+# All CPUs enter armstub8.bin.
+COLD_BOOT_SINGLE_CPU	:=	0
+
+# Tune compiler for Cortex-A76
+ifeq ($(notdir $(CC)),armclang)
+    TF_CFLAGS_aarch64	+=	-mcpu=cortex-a76
+else ifneq ($(findstring clang,$(notdir $(CC))),)
+    TF_CFLAGS_aarch64	+=	-mcpu=cortex-a76
+else
+    TF_CFLAGS_aarch64	+=	-mtune=cortex-a76
+endif
+
+# Add support for platform supplied linker script for BL31 build
+$(eval $(call add_define,PLAT_EXTRA_LD_SCRIPT))
+
+# Enable all errata workarounds for Cortex-A76 r4p1
+ERRATA_A76_1946160		:= 1
+ERRATA_A76_2743102		:= 1
+
+# Add new default target when compiling this platform
+all: bl31
+
+# Build config flags
+# ------------------
+
+# Disable stack protector by default
+ENABLE_STACK_PROTECTOR	 	:= 0
+
+# Have different sections for code and rodata
+SEPARATE_CODE_AND_RODATA	:= 1
+
+# Hardware-managed coherency
+HW_ASSISTED_COHERENCY		:= 1
+USE_COHERENT_MEM		:= 0
+
+# Cortex-A76 is 64-bit only
+CTX_INCLUDE_AARCH32_REGS	:= 0
+
+# Platform build flags
+# --------------------
+
+# There is not much else than a Linux kernel to load at the moment.
+RPI3_DIRECT_LINUX_BOOT		:= 1
+
+# BL33 images can only be AArch64 on this platform.
+RPI3_BL33_IN_AARCH32		:= 0
+
+# UART to use at runtime. -1 means the runtime UART is disabled.
+# Any other value means the default UART will be used.
+RPI3_RUNTIME_UART		:= 0
+
+# Use normal memory mapping for ROM, FIP, SRAM and DRAM
+RPI3_USE_UEFI_MAP		:= 0
+
+# Process platform flags
+# ----------------------
+
+$(eval $(call add_define,RPI3_BL33_IN_AARCH32))
+$(eval $(call add_define,RPI3_DIRECT_LINUX_BOOT))
+ifdef RPI3_PRELOADED_DTB_BASE
+$(eval $(call add_define,RPI3_PRELOADED_DTB_BASE))
+endif
+$(eval $(call add_define,RPI3_RUNTIME_UART))
+$(eval $(call add_define,RPI3_USE_UEFI_MAP))
+
+ifeq (${ARCH},aarch32)
+  $(error Error: AArch32 not supported on rpi5)
+endif
+
+ifneq ($(ENABLE_STACK_PROTECTOR), 0)
+PLAT_BL_COMMON_SOURCES	+=	drivers/rpi3/rng/rpi3_rng.c		\
+				plat/rpi/common/rpi3_stack_protector.c
+endif
diff --git a/plat/rpi/rpi5/rpi5_setup.c b/plat/rpi/rpi5/rpi5_setup.c
new file mode 100644
index 0000000..de82300
--- /dev/null
+++ b/plat/rpi/rpi5/rpi5_setup.c
@@ -0,0 +1,12 @@
+/*
+ * Copyright (c) 2024, Mario Bălănică <mariobalanica02@gmail.com>
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+#include <rpi_shared.h>
+
+void plat_rpi_bl31_custom_setup(void)
+{
+	/* Nothing to do here yet. */
+}