BUG/MEDIUM: threads: fix the double CAS implementation for ARMv7

Commit f61f0cb ("MINOR: threads: Introduce double-width CAS on x86_64
and arm.") introduced the double CAS, but the ARMv7 version is bogus:
it passes the <compare> and <set> pointers themselves to the asm
statement instead of the 64-bit values they point to. When lucky, it
simply doesn't build due to impossible register combinations.
Otherwise it immediately crashes at run time when facing traffic.
The <set> argument is also made const since it is only read.
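
For reference, the semantics the ARMv7 code is expected to provide can
be sketched in portable C as below. This is only an illustrative model,
not the code from hathreads.h: the name cas_dw_model() is hypothetical,
and it assumes a GCC/Clang compiler providing the __atomic builtins and
an 8-byte aligned <target>. The point is that the 64-bit values behind
<compare> and <set> are what gets compared and stored, never the
pointers themselves.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a double-width CAS (not HAProxy's implementation).
 * Returns non-zero if *target equalled *compare and was replaced by *set.
 * In all cases *compare is updated with the value previously observed in
 * *target, mirroring what the patched function does after the asm block.
 */
static int cas_dw_model(void *target, void *compare, const void *set)
{
	uint64_t expected, desired;
	int ok;

	assert(target && compare && set);

	/* load the 64-bit values the pointers refer to */
	memcpy(&expected, compare, sizeof(expected));
	memcpy(&desired, set, sizeof(desired));

	/* on failure the builtin writes the current value into <expected> */
	ok = __atomic_compare_exchange_n((uint64_t *)target, &expected, desired,
					 0, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);

	/* report the observed value back to the caller */
	memcpy(compare, &expected, sizeof(expected));
	return ok;
}
```

With target=1, compare=1 and set=2, the call succeeds and leaves 2 in
the target. A second call with compare=5 fails and leaves 2 in
<compare>.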

No backport is needed: this bug was introduced in 1.9-dev.
diff --git a/include/common/hathreads.h b/include/common/hathreads.h
index a8fdf15..143cf2c 100644
--- a/include/common/hathreads.h
+++ b/include/common/hathreads.h
@@ -780,7 +780,7 @@
 	__asm __volatile("dmb" ::: "memory");
 }
 
-static __inline int __ha_cas_dw(void *target, void *compare, void *set)
+static __inline int __ha_cas_dw(void *target, void *compare, const void *set)
 {
 	uint64_t previous;
 	int tmp;
@@ -794,7 +794,7 @@
 			 "cmpeq %1, #1;"
 			 "beq 1b;"
 			 : "=&r" (previous), "=&r" (tmp)
-			 : "r" (compare), "r" (set), "r" (target)
+			 : "r" (*(uint64_t *)compare), "r" (*(uint64_t *)set), "r" (target)
 			 : "memory", "cc");
 	tmp = (previous == *(uint64_t *)compare);
 	*(uint64_t *)compare = previous;