BUILD: makefile: add a few popular ARMv8 CPU targets
This adds the following CPUs to the makefile:
- armv81 : modern ARM cores (Cortex A55/A75/A76/A78/X1, Neoverse, Graviton2)
- a72 : ARM Cortex-A72 or A73 (e.g. RPi4, Odroid N2, VIM3, AWS Graviton)
- a53 : ARM Cortex-A53 or any of its successors in 64-bit mode (e.g. RPi3)
- armv8-auto: both older and newer ARMv8 cores, with a minor runtime penalty
The rationale for each of them is:
- a53 is the common denominator of all of its successors, and it supports
CRC32, which is used by gzip compression and which the generic armv8-a
target does not provide;
- a72 supports the same features but is an out-of-order core that benefits
from better-suited optimizations; it's found in a number of
high-performance multi-core CPUs mainly oriented towards I/O and network
processing (Armada 8040, NXP LX2160A, AWS Graviton), and more recently in
the Raspberry Pi 4. The A73 found in VIM3 and Odroid-N2 can use the same
optimizations;
- armv81 is for generic ARMv8.1-A and above. It automatically enables CRC32
and the LSE atomics, which scale much better. This one covers modern
ARMv8 cores such as Cortex A55/A75/A76/A77/A78/X1 and the Neoverse
family such as found in AWS's Graviton2. The LSE instructions are
essential for large numbers of cores (8 and above);
- armv8-auto dynamically enables support for LSE extensions when
detected while still being compatible with older cores. There is a
small performance penalty in doing this (~3%) but the same executable
will perform optimally on a wider range of hardware. This should be
the best option for distros. It requires gcc-10, or gcc-9.4 and above.
When no CPU is specified, GCC version 10.2 and above will automatically
implement the wrapper used to detect the LSE extensions.
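
For illustration only, the runtime detection that armv8-auto relies on can
be sketched in C as below. This is not the compiler's actual wrapper, just a
minimal approximation of the kind of check involved, assuming Linux on
arm64 where getauxval(AT_HWCAP) reports CPU features. The SKETCH_* macros
and the function names are hypothetical; the bit positions are those of the
Linux arm64 hwcap ABI (CRC32 is bit 7, LSE atomics is bit 8). On any other
platform read_hwcap() simply returns 0.

```c
#include <stdio.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#endif

/* Hypothetical names; values follow the Linux arm64 hwcap bit layout. */
#define SKETCH_HWCAP_CRC32   (1UL << 7)
#define SKETCH_HWCAP_ATOMICS (1UL << 8)

/* Returns 1 if the hwcap word advertises the LSE atomics, else 0. */
int hwcap_has_lse(unsigned long hwcap)
{
    return (hwcap & SKETCH_HWCAP_ATOMICS) != 0;
}

/* Returns 1 if the hwcap word advertises the CRC32 instructions, else 0. */
int hwcap_has_crc32(unsigned long hwcap)
{
    return (hwcap & SKETCH_HWCAP_CRC32) != 0;
}

/* Reads the hwcap word from the kernel; 0 when not on Linux/arm64. */
unsigned long read_hwcap(void)
{
#if defined(__linux__) && defined(__aarch64__)
    return getauxval(AT_HWCAP);
#else
    return 0UL;
#endif
}

/* Prints which of the two features the running CPU reports. */
void report_cpu_features(void)
{
    unsigned long hwcap = read_hwcap();

    printf("crc32: %s\n", hwcap_has_crc32(hwcap) ? "yes" : "no");
    printf("lse:   %s\n", hwcap_has_lse(hwcap) ? "yes" : "no");
}
```

Calling report_cpu_features() on a target machine gives a quick hint as to
whether armv81 is usable there or whether a53, a72 or armv8-auto is the
safer choice.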
diff --git a/Makefile b/Makefile
index 74fca35..6571dc6 100644
--- a/Makefile
+++ b/Makefile
@@ -162,7 +162,8 @@
#### TARGET CPU
# Use CPU=<cpu_name> to optimize for a particular CPU, among the following
# list :
-# generic, native, i586, i686, ultrasparc, power8, power9, custom
+# generic, native, i586, i686, ultrasparc, power8, power9, custom,
+# a53, a72, armv81, armv8-auto
CPU = generic
#### Architecture, used when not building for native architecture
@@ -274,6 +275,10 @@
CPU_CFLAGS.ultrasparc = -O6 -mcpu=v9 -mtune=ultrasparc
CPU_CFLAGS.power8 = -O2 -mcpu=power8 -mtune=power8
CPU_CFLAGS.power9 = -O2 -mcpu=power9 -mtune=power9
+CPU_CFLAGS.a53 = -O2 -mcpu=cortex-a53
+CPU_CFLAGS.a72 = -O2 -mcpu=cortex-a72
+CPU_CFLAGS.armv81 = -O2 -march=armv8.1-a
+CPU_CFLAGS.armv8-auto = -O2 -march=armv8-a+crc -moutline-atomics
CPU_CFLAGS = $(CPU_CFLAGS.$(CPU))
#### ARCH dependent flags, may be overridden by CPU flags