BUILD: makefile: add a few popular ARMv8 CPU targets

This adds the following CPUs to the makefile:
  - armv81    : modern ARM cores (Cortex A55/A75/A76/A78/X1, Neoverse, Graviton2)
  - a72       : ARM Cortex-A72 or A73 (e.g. RPi4, Odroid N2, VIM3, AWS Graviton)
  - a53       : ARM Cortex-A53 or any of its successors in 64-bit mode (e.g. RPi3)
  - armv8-auto: both older and newer ARMv8 cores, with a minor runtime penalty

The reasons for these choices are:
  - a53 is the common denominator of all of its successors, and supports
    CRC32, which is used by the gzip compression and which the generic
    armv8-a target does not provide ;

  - a72 supports the same features but is an out-of-order core that deserves
    better optimizations; it's found in a number of high-performance
    multi-core CPUs mainly oriented towards I/O and network processing
    (Armada 8040, NXP LX2160A, AWS Graviton), and more recently the
    Raspberry Pi 4. The A73 found in VIM3 and Odroid-N2 can use the same
    optimizations ;

  - armv81 is for generic ARMv8.1-A and above; it automatically enables
    CRC32 and the LSE atomics, which scale much better. This one covers
    modern ARMv8 cores such as Cortex A55/A75/A76/A77/A78/X1 and the
    Neoverse family such as found in AWS's Graviton2. The LSE instructions
    are essential for large numbers of cores (8 and above) ;

  - armv8-auto dynamically enables support for LSE extensions when
    detected while still being compatible with older cores. There is a
    small performance penalty in doing this (~3%) but the same executable
    will perform optimally on a wider range of hardware. This should be
    the best option for distros. It requires gcc-10 and above, or gcc-9.4
    and above within the 9.x series.

When no CPU is specified, GCC version 10.2 and above will automatically
implement the wrapper used to detect the LSE extensions.
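
As an illustration of what the LSE-related options change, here is a minimal
sketch (not taken from the tree; the counter and function names are
hypothetical). It assumes that armv81 maps to something like
-march=armv8.1-a and armv8-auto to -moutline-atomics, which this message
does not spell out. With gcc on aarch64, the same atomic add is expected to
become a load-exclusive/store-exclusive retry loop on plain armv8-a, a
single LSE instruction on armv8.1-a, and a call to a libgcc helper that
picks one of the two at run time with outline atomics:

```c
#include <stdint.h>

/* Hypothetical shared counter, for illustration only. */
static uint32_t hits;

/* Atomically add <n> to the counter and return its previous value.
 * The builtin is the same on every target; only the instructions the
 * compiler emits for it depend on the -march / -moutline-atomics options.
 */
uint32_t count_hits(uint32_t n)
{
	return __atomic_fetch_add(&hits, n, __ATOMIC_RELAXED);
}

/* Atomically read the current value of the counter. */
uint32_t read_hits(void)
{
	return __atomic_load_n(&hits, __ATOMIC_RELAXED);
}
```

Compiling this file to assembly (e.g. "gcc -O2 -S") with each set of options
on an aarch64 toolchain should make the difference visible in the body of
count_hits().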
diff --git a/INSTALL b/INSTALL
index e5143c6..e9d38f9 100644
--- a/INSTALL
+++ b/INSTALL
@@ -285,7 +285,10 @@
 
 Please note that SLZ will benefit from some CPU-specific instructions like the
 availability of the CRC32 extension on some ARM processors. Thus it can further
-improve its performance to build with "CPU=native" on the target system.
+improve its performance to build with "CPU=native" on the target system, or
+"CPU=armv81" (modern systems such as Graviton2 or A55/A75 and beyond),
+"CPU=a72" (e.g. for RPi4, or AWS Graviton), "CPU=a53" (e.g. for RPi3), or
+"CPU=armv8-auto" (automatic detection with minor runtime penalty).
 
 A second option involves the widely known zlib library, which is very likely
 installed on your system. In order to use zlib, simply pass "USE_ZLIB=1" to the
@@ -421,6 +424,11 @@
   - ultrasparc : Sun UltraSparc I/II/III/IV processor
   - power8 : IBM POWER8 processor
   - power9 : IBM POWER9 processor
+  - armv81 : modern ARM cores (Cortex A55/A75/A76/A78/X1, Neoverse, Graviton2)
+  - a72    : ARM Cortex-A72 or A73 (e.g. RPi4, Odroid N2, AWS Graviton)
+  - a53    : ARM Cortex-A53 or any of its successors in 64-bit mode (e.g. RPi3)
+  - armv8-auto : support both older and newer armv8 cores with a minor penalty,
+                 thanks to gcc 10's outline atomics (default with gcc 10.2).
   - native : use the build machine's specific processor optimizations. Use with
     extreme care, and never in virtualized environments (known to break).
   - generic : any other processor or no CPU-specific optimization. (default)