NVIDIA Tegra
============

- .. rubric:: T194
     :name: t194

T194 has eight NVIDIA Carmel CPU cores in a coherent multi-processor
configuration. The Carmel cores support the ARM Architecture version 8.2,
executing both 64-bit AArch64 code and 32-bit AArch32 code. The Carmel
processors are organized as four dual-core clusters, where each cluster has
a dedicated 2 MiB Level-2 unified cache. A high speed coherency fabric connects
these processor complexes and allows heterogeneous multi-processing with all
eight cores if required.

- .. rubric:: T186
     :name: t186

The NVIDIA® Parker (T186) series system-on-chip (SoC) delivers a heterogeneous
multi-processing (HMP) solution designed to optimize performance and
efficiency.

T186 has dual NVIDIA Denver 2 ARM® CPU cores, plus quad ARM Cortex®-A57 cores,
in a coherent multiprocessor configuration. The Denver 2 and Cortex-A57 cores
support ARMv8, executing both 64-bit AArch64 code and 32-bit AArch32 code,
including legacy ARMv7 applications. The Denver 2 processors each have 128 KB
Instruction and 64 KB Data Level 1 caches, and have a 2 MB shared Level 2
unified cache. The Cortex-A57 processors each have 48 KB Instruction and 32 KB
Data Level 1 caches, and also have a 2 MB shared Level 2 unified cache. A
high speed coherency fabric connects these two processor complexes and allows
heterogeneous multi-processing with all six cores if required.

Denver is NVIDIA's own custom-designed, 64-bit, dual-core CPU which is
fully Armv8-A architecture compatible. Each of the two Denver cores
implements a 7-way superscalar microarchitecture (up to 7 concurrent
micro-ops can be executed per clock), and includes a 128KB 4-way L1
instruction cache, a 64KB 4-way L1 data cache, and a 2MB 16-way L2
cache, which services both cores.

Denver implements an innovative process called Dynamic Code Optimization,
which optimizes frequently used software routines at runtime into dense,
highly tuned microcode-equivalent routines. These are stored in a
dedicated, 128MB main-memory-based optimization cache. After being read
into the instruction cache, the optimized micro-ops are executed,
re-fetched and executed from the instruction cache as long as needed and
capacity allows.

Effectively, this reduces the need to re-optimize the software routines.
Instead of using hardware to extract the instruction-level parallelism
(ILP) inherent in the code, Denver extracts the ILP once via software
techniques, and then executes those routines repeatedly, thus amortizing
the cost of ILP extraction over the many execution instances.

Denver also features new low latency power-state transitions, in addition
to extensive power-gating and dynamic voltage and clock scaling based on
workloads.

- .. rubric:: T210
     :name: t210

T210 has quad Arm® Cortex®-A57 cores in a switched configuration with a
companion set of quad Arm Cortex-A53 cores. The Cortex-A57 and A53 cores
support Armv8-A, executing both 64-bit AArch64 code and 32-bit AArch32 code,
including legacy Armv7-A applications. The Cortex-A57 processors each have
48 KB Instruction and 32 KB Data Level 1 caches, and have a 2 MB shared
Level 2 unified cache. The Cortex-A53 processors each have 32 KB Instruction
and 32 KB Data Level 1 caches, and have a 512 KB shared Level 2 unified cache.

Directory structure
-------------------

- plat/nvidia/tegra/common - Common code for all Tegra SoCs
- plat/nvidia/tegra/soc/txxx - Chip specific code

Trusted OS dispatcher
---------------------

Tegra supports multiple Trusted OSes.

- Trusted Little Kernel (TLK): In order to include the 'tlkd' dispatcher in
  the image, pass 'SPD=tlkd' on the command line while preparing a bl31 image.
- Trusty: In order to include the 'trusty' dispatcher in the image, pass
  'SPD=trusty' on the command line while preparing a bl31 image.

This allows other Trusted OS vendors to use the upstream code and include
their dispatchers in the image without changing any makefiles.

The following Trusted OSes are supported on Tegra platforms.

- Tegra210: TLK and Trusty
- Tegra186: Trusty
- Tegra194: Trusty

Scatter files
-------------

Tegra platforms currently support scatter files and ld.S scripts. The scatter
files allow the ARMLINK linker to generate BL31 binaries. For now, a single
common scatter file, plat/nvidia/tegra/scat/bl31.scat, serves all Tegra SoCs.
The ``LINKER`` build variable must point to the ARMLINK binary for the scatter
file to be used. BL31 image generation with ARMCLANG (compilation) and ARMLINK
(linking) has been verified on Tegra186 platforms.

Preparing the BL31 image to run on Tegra SoCs
---------------------------------------------

.. code:: shell

    CROSS_COMPILE=<path-to-aarch64-gcc>/bin/aarch64-none-elf- make PLAT=tegra \
    TARGET_SOC=<target-soc e.g. t194|t186|t210> SPD=<dispatcher e.g. trusty|tlkd> \
    bl31

Platforms that need a different ``TZDRAM_BASE`` can add ``TZDRAM_BASE=<value>``
to the build command line.

The Tegra platform code expects BL2 to pass a pointer to the following
platform-specific structure in the 'x1' register. The
bl31\_early\_platform\_setup() handler uses it to extract the TZDRAM carveout
base and size for loading the Trusted OS, and the UART port ID to be used. The
Tegra memory controller driver programs this base/size in order to restrict
non-secure (NS) accesses.

.. code:: c

    typedef struct plat_params_from_bl2 {
        /* TZ memory size */
        uint64_t tzdram_size;
        /* TZ memory base */
        uint64_t tzdram_base;
        /* UART port ID */
        int uart_id;
        /* L2 ECC parity protection disable flag */
        int l2_ecc_parity_prot_dis;
        /* SHMEM base address for storing the boot logs */
        uint64_t boot_profiler_shmem_base;
    } plat_params_from_bl2_t;
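
As a rough illustration of the handoff described above, the sketch below shows
how an earlier boot stage might fill in this structure before passing its
address in 'x1'. The helper names ``make_bl31_params`` and
``tzdram_carveout_is_valid`` are hypothetical and not part of the Tegra
platform code, and the sanity check is only one plausible validation, not the
one the platform performs.

.. code:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Structure layout as documented above. */
    typedef struct plat_params_from_bl2 {
        uint64_t tzdram_size;              /* TZ memory size */
        uint64_t tzdram_base;              /* TZ memory base */
        int uart_id;                       /* UART port ID */
        int l2_ecc_parity_prot_dis;        /* L2 ECC parity protection disable flag */
        uint64_t boot_profiler_shmem_base; /* SHMEM base address for boot logs */
    } plat_params_from_bl2_t;

    /*
     * Hypothetical helper: build the parameter block that the previous boot
     * stage hands over. Fields that are not set explicitly stay zero.
     */
    static plat_params_from_bl2_t make_bl31_params(uint64_t tzdram_base,
                                                   uint64_t tzdram_size,
                                                   int uart_id)
    {
        plat_params_from_bl2_t params = {0};

        params.tzdram_base = tzdram_base;
        params.tzdram_size = tzdram_size;
        params.uart_id = uart_id;
        return params;
    }

    /*
     * Hypothetical sanity check: the carveout must be non-empty and must not
     * wrap around the end of the 64-bit address space.
     */
    static bool tzdram_carveout_is_valid(const plat_params_from_bl2_t *params)
    {
        assert(params != NULL);

        return (params->tzdram_size != 0U) &&
               (params->tzdram_base <= (UINT64_MAX - params->tzdram_size));
    }

A caller would build the block once, for example with an illustrative base of
``0x80000000`` and a size of ``0x400000``, check it, and then pass its address
to the next stage.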

Power Management
----------------

The PSCI implementation expects each platform to expose the 'power state'
parameter to be used during the 'SYSTEM SUSPEND' call. The state-id field
is implementation defined on Tegra SoCs and is preferably defined by
tegra\_def.h.
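
To make the role of the state-id field concrete, the sketch below packs and
unpacks a 'power state' value. It assumes the original (non-extended) PSCI
power_state layout, with the state-id in bits [15:0], the state type in bit 16
and the power level in bits [25:24]. The macro and function names are chosen
for this example only, and ``EXAMPLE_STATE_ID`` is a hypothetical value; the
real state-id values are SoC-specific.

.. code:: c

    #include <stdint.h>

    /* Field layout of the original-format PSCI power_state parameter. */
    #define PSTATE_ID_SHIFT        0U
    #define PSTATE_ID_MASK         0xFFFFU
    #define PSTATE_TYPE_SHIFT      16U
    #define PSTATE_TYPE_MASK       0x1U
    #define PSTATE_PWR_LVL_SHIFT   24U
    #define PSTATE_PWR_LVL_MASK    0x3U

    #define PSTATE_TYPE_STANDBY    0U
    #define PSTATE_TYPE_POWERDOWN  1U

    /* Hypothetical state-id used only for illustration. */
    #define EXAMPLE_STATE_ID       0xDU

    /* Pack the three fields into a single power_state value. */
    static uint32_t make_power_state(uint32_t state_id, uint32_t type,
                                     uint32_t pwr_lvl)
    {
        return ((state_id & PSTATE_ID_MASK) << PSTATE_ID_SHIFT) |
               ((type & PSTATE_TYPE_MASK) << PSTATE_TYPE_SHIFT) |
               ((pwr_lvl & PSTATE_PWR_LVL_MASK) << PSTATE_PWR_LVL_SHIFT);
    }

    /* Extract the implementation-defined state-id field. */
    static uint32_t power_state_get_id(uint32_t power_state)
    {
        return (power_state >> PSTATE_ID_SHIFT) & PSTATE_ID_MASK;
    }

    /* Extract the state type (standby or power-down). */
    static uint32_t power_state_get_type(uint32_t power_state)
    {
        return (power_state >> PSTATE_TYPE_SHIFT) & PSTATE_TYPE_MASK;
    }

Keeping the packing and unpacking in small helpers means the platform only has
to agree on the state-id values themselves, which is the part the text above
leaves to each SoC.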

Tegra configs
-------------

- 'tegra\_enable\_l2\_ecc\_parity\_prot': This flag enables the L2 ECC and Parity
  Protection bit, for Arm Cortex-A57 CPUs, during CPU boot. This flag will
  be enabled by Tegra SoCs during 'Cluster power up' or 'System Suspend' exit.