MINOR: threads: flatten the per-thread cpu-map

When we initially experimented with combined thread and process
support, cpu-map needed an array of threads per process. This is no
longer needed since we now support either threads or processes, not
both at once. Let's make the thread-based cpu-map per thread only,
instead of per thread and per process. Doing so reduces the global
struct from 33kB to 1.5kB.
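
The size reduction can be illustrated with a minimal sketch. The struct
and macro names below are hypothetical and are not the actual
definitions; the sketch assumes both limits equal the number of bits in
an unsigned long. With an 8-byte unsigned long, the nested thread array
takes 64 * 64 * 8 = 32768 bytes and the flat one 64 * 8 = 512 bytes,
which matches the roughly 31.5kB saved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limits (assumption): one bit per process or thread
 * in an unsigned long.
 */
#define SKETCH_MAX_PROCS   (sizeof(unsigned long) * 8)
#define SKETCH_MAX_THREADS (sizeof(unsigned long) * 8)

/* Before: one thread mask per (process, thread) pair. */
struct cpu_map_nested {
	unsigned long proc[SKETCH_MAX_PROCS];
	unsigned long thread[SKETCH_MAX_PROCS][SKETCH_MAX_THREADS];
};

/* After: one thread mask per thread only. */
struct cpu_map_flat {
	unsigned long proc[SKETCH_MAX_PROCS];
	unsigned long thread[SKETCH_MAX_THREADS];
};

/* Number of bytes saved by flattening the thread masks. */
static inline size_t cpu_map_saved_bytes(void)
{
	/* guard against unsigned underflow in the subtraction below */
	assert(sizeof(struct cpu_map_nested) > sizeof(struct cpu_map_flat));
	return sizeof(struct cpu_map_nested) - sizeof(struct cpu_map_flat);
}
```

Only the thread array changes shape here; the per-process masks are
kept as-is, which is why the code still indexes proc[] by process.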
diff --git a/src/haproxy.c b/src/haproxy.c
index 603f084..1aeae99 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -3152,17 +3152,17 @@
 		/* Now the CPU affinity for all threads */
 		for (i = 0; i < global.nbthread; i++) {
 			if (global.cpu_map.proc[relative_pid-1])
-				global.cpu_map.thread[relative_pid-1][i] &= global.cpu_map.proc[relative_pid-1];
+				global.cpu_map.thread[i] &= global.cpu_map.proc[relative_pid-1];
 
 			if (i < MAX_THREADS &&       /* only the first 32/64 threads may be pinned */
-			    global.cpu_map.thread[relative_pid-1][i]) {/* only do this if the thread has a THREAD map */
+			    global.cpu_map.thread[i]) {/* only do this if the thread has a THREAD map */
 #if defined(__FreeBSD__) || defined(__NetBSD__)
 				cpuset_t cpuset;
 #else
 				cpu_set_t cpuset;
 #endif
 				int j;
-				unsigned long cpu_map = global.cpu_map.thread[relative_pid-1][i];
+				unsigned long cpu_map = global.cpu_map.thread[i];
 
 				CPU_ZERO(&cpuset);