BUG/MEDIUM: config: fix cpu-map notation with both process and threads

The application of a cpu-map statement with both process and threads
is broken (P-Q/1 or 1/P-Q notation).

For example, before the fix, when using P-Q/1, proc_t1 would be updated.
Then it would be AND'ed with thread which is still 0 and thus does
nothing.

Another problem is when using 1/1[-Q], thread[0] is defined. But if
there is multiple processes, every processes will use this define
affinity even if it should be applied only to 1st process.

The solution to the fix is a little bit too complex for my taste and
there is maybe a simpler solution but I did not wish to break the
storage of global.cpu_map, as it is quite painful to test all the
use-cases. Besides, this code will probably be clean up when
multiprocess support removed on the future version.

Let's try to explain my logic.

* either haproxy runs in multiprocess or multithread mode. If on
  multiprocess, we should consider proc_t1 (P-Q/1 notation). If on
  multithread, we should consider thread (1/P-Q notation). However
  during parsing, the final number of processes or threads is unknown,
  thus we have to consider the two possibilities.

* there is a special case for the first thread / first process which is
  present in both execution modes. And as a matter of fact cpu-map 1 or
  1/1 notation represents the same thing. Thus, thread[0] and proc_t1[0]
  represents the same thing. To solve this problem, only thread[0] is
  used for this special case.

This fix must be backported up to 2.0.

(cherry picked from commit af02c574064873cac3fb10993293d34a4eb94d2f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
diff --git a/src/cfgparse-global.c b/src/cfgparse-global.c
index ca447bb..80b7391 100644
--- a/src/cfgparse-global.c
+++ b/src/cfgparse-global.c
@@ -1078,22 +1078,34 @@
 				}
 			}
 		} else {
-			/* Mapping at the thread level. All threads are retained
-			 * for process 1, and only thread 1 is retained for other
-			 * processes.
+			/* Mapping at the thread level.
+			 * Either proc and/or thread must be 1 and only 1. All
+			 * other combinations are silently ignored.
 			 */
 			if (thread == 0x1) {
+				int val;
+
 				/* first thread, iterate on processes. E.g. cpu-map 1-4/1 0-3 */
 				for (i = n = 0; i < MAX_PROCS; i++) {
 					/* No mapping for this process */
 					if (!(proc & (1UL << i)))
 						continue;
-					if (!autoinc)
-						global.cpu_map.proc_t1[i] = cpus;
+
+					if (!autoinc) {
+						val = cpus;
+					}
 					else {
 						n += my_ffsl(cpus >> n);
-						global.cpu_map.proc_t1[i] = (1UL << (n-1));
+						val = 1UL << (n - 1);
 					}
+
+					/* For first process, thread[0] is used.
+					 * Use proc_t1[N] for all others
+					 */
+					if (!i)
+						global.cpu_map.thread[0] = val;
+					else
+						global.cpu_map.proc_t1[i] = val;
 				}
 			}
 
diff --git a/src/haproxy.c b/src/haproxy.c
index f55a97b..2163326 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -3680,8 +3680,11 @@
 
 #ifdef USE_CPU_AFFINITY
 		/* Now the CPU affinity for all threads */
-		if (global.cpu_map.proc_t1[relative_pid-1])
-			global.cpu_map.thread[0] &= global.cpu_map.proc_t1[relative_pid-1];
+
+		/* If on multiprocess, use proc_t1 except for the first process.
+		 */
+		if ((relative_pid - 1) > 0)
+			global.cpu_map.thread[0] = global.cpu_map.proc_t1[relative_pid-1];
 
 		for (i = 0; i < global.nbthread; i++) {
 			if (global.cpu_map.proc[relative_pid-1])