BUG/MINOR: xxhash: make sure armv6 uses memcpy()

There was a special case made to allow ARMv6 to use unaligned accesses via a cast in xxHash when __ARM_FEATURE_UNALIGNED is defined. But while ARMv6 (and v7) does support unaligned accesses, it's only for 32-bit pointers, not 64-bit ones, leading to bus errors when the compiler emits an ldrd instruction and the input (e.g. a pattern) is not aligned, as in issue #1035.

Note that v7 was properly using the packed approach here and was safe, however haproxy versions 2.3 and older use the old r39 xxhash code which has the same issue for armv7. A slightly different fix is required there, by using a different definition of packed for 32 and 64 bits.

The problem is really visible when running v7 code on a v8 kernel because such kernels do not implement alignment trap emulation, and the process dies when this happens. This is why in the issue above it was only detected under lxc. The emulation could have been disabled on v7 as well by writing zero to /proc/cpu/alignment though.

This commit is a backport of xxhash commit a470f2ef ("update default memory access for armv6").

Thanks to @srkunze for the report and tests, @stgraber for his help on setting up an easy reproducer outside of lxc, and @Cyan4973 for the discussion around the best way to fix this. Details and alternate patches available on https://github.com/Cyan4973/xxHash/issues/490.
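
For illustration only, the difference between the cast-based read and the memcpy()-based read described above can be sketched as follows. The helper name read64_memcpy() is hypothetical and is not taken from xxhash.h; this is a minimal sketch of the technique, not the code from the import.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from xxhash.h): read a 64-bit value from an
 * address of any alignment. The compiler knows nothing about the
 * alignment of ptr, so it must not assume an 8-byte aligned load.
 *
 * The cast-based alternative, `*(const uint64_t *)ptr`, lets the
 * compiler assume natural alignment; per the commit message this can
 * become an ldrd on ARMv6/v7 and fault on unaligned input.
 */
static uint64_t read64_memcpy(const void *ptr)
{
    uint64_t val;

    memcpy(&val, ptr, sizeof(val));
    return val;
}
```

A caller would pass any byte pointer, for example read64_memcpy(buf + 3), without caring whether the address is a multiple of 8. Compilers that inline small memcpy() calls usually turn this into plain loads where the target allows it.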

commit: 4acb99f8672232753adb36e57b45e80e5bd87783
author: Willy Tarreau <w@1wt.eu> Thu Feb 04 17:02:39 2021 +0100
committer: Willy Tarreau <w@1wt.eu> Thu Feb 04 17:14:58 2021 +0100
tree: f710c07753bb80c97921e1badb3b604a22754535
parent: a8979a9b5997122bfef697a3d8a6d284dab4a9ff
diff --git a/include/import/xxhash.h b/include/import/xxhash.h
index 826942f..24326bc 100644
--- a/include/import/xxhash.h
+++ b/include/import/xxhash.h

@@ -791,15 +791,15 @@
  * Method 1:
  *     `__attribute__((packed))` statement. It depends on compiler extensions
  *     and is therefore not portable.
- *     This method is safe if your compiler supports it, and *generally* as
- *     fast or faster than `memcpy`.
+ *     This method is safe _if_ your compiler supports it,
+ *     and *generally* as fast or faster than `memcpy`.
  * Method 2:
  *     Direct access via cast. This method doesn't depend on the compiler but
  *     violates the C standard.
  *     It can generate buggy code on targets which do not support unaligned
  *     memory accesses.
  *     But in some circumstances, it's the only known way to get the most
- *     performance (example: GCC + ARMv6)
+ *     performance.
  * Method 3:
  *     Byteshift. This can generate the best code on old compilers which don't
  *     inline small `memcpy()` calls, and it might also be faster on big-endian
@@ -808,10 +808,10 @@
  * Prefer these methods in priority order (0 > 1 > 2 > 3)
  */
 #ifndef XXH_FORCE_MEMORY_ACCESS   /* can be defined externally, on command line for example */
-#  if !defined(__clang__) && defined(__GNUC__) && defined(__ARM_FEATURE_UNALIGNED) && defined(__ARM_ARCH) && (__ARM_ARCH == 6)
-#    define XXH_FORCE_MEMORY_ACCESS 2
-#  elif !defined(__clang__) && ((defined(__INTEL_COMPILER) && !defined(_WIN32)) || \
-  (defined(__GNUC__) && (defined(__ARM_ARCH) && __ARM_ARCH >= 7)))
+   /* prefer __packed__ structures (method 1) for gcc on armv7 and armv8 */
+#  if !defined(__clang__) && ( \
+    (defined(__INTEL_COMPILER) && !defined(_WIN32)) || \
+    (defined(__GNUC__) && (defined(__ARM_ARCH) && __ARM_ARCH >= 7)) )
 #    define XXH_FORCE_MEMORY_ACCESS 1
 #  endif
 #endif
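
As a rough sketch of what "method 1" in the hunk above refers to, a packed wrapper type can be used to tell GCC-compatible compilers that a 64-bit field may sit at any address. The names demo_unalign64 and demo_read64() are hypothetical, the may_alias attribute is an addition made here for strict-aliasing safety, and the fallback branch is simply the memcpy() approach; none of this is copied from xxhash.h.

```c
#include <stdint.h>
#include <string.h>

#if defined(__GNUC__) || defined(__clang__)
/*
 * Packed wrapper (hypothetical name): the type has alignment 1, so the
 * compiler must emit code that tolerates unaligned addresses.
 * may_alias is an assumption added for this sketch so the type may
 * legally overlay a plain byte buffer.
 */
typedef struct {
    uint64_t v;
} __attribute__((packed, may_alias)) demo_unalign64;

static uint64_t demo_read64(const void *p)
{
    return ((const demo_unalign64 *)p)->v;
}
#else
/* Portable fallback: copy the bytes into an aligned local first. */
static uint64_t demo_read64(const void *p)
{
    uint64_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}
#endif
```

Since the header only sets XXH_FORCE_MEMORY_ACCESS inside an #ifndef, a build that wants the memcpy() path regardless of target can presumably define the macro externally, for example with -DXXH_FORCE_MEMORY_ACCESS=0 on the compiler command line.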