BUG/MEDIUM: queue/threads: limit the number of entries dequeued at once

When testing strong queue contention on a 48-thread machine, some crashes
would frequently happen due to process_srv_queue() never leaving and
processing pending requests forever. A dump showed more than 500000
loops at once. The problem is that other threads find it working so
they don't do anything and are free to process their pending requests.
Because of this, the dequeuing thread can be kept busy forever and does
not process its own requests anymore (fortunately the watchdog stops it).

This patch adds a limit to the number of rounds, it limits it to
maxpollevents, which is reasonable because it's also an indicator of
latency and batches size. However there's a catch. If all requests
are finished when the thread ends the loop, there might not be enough
anymore to restart processing the queue. Thus we tolerate to re-enter
the loop to process one request at a time when it doesn't have any
anymore. This way we're leaving more room for another thread to take
on this task, and we're sure to eventually end this loop.

Doing this has also improved the overall dequeuing performance by ~20%
in highly contended situations with 48 threads.

It should be backported at least to 2.4, maybe even 2.2 since issues
were faced in the past on machines having many cores.
diff --git a/src/queue.c b/src/queue.c
index 6e0e130..73bd3c2 100644
--- a/src/queue.c
+++ b/src/queue.c
@@ -373,9 +373,15 @@
 
 	/* let's repeat that under the lock on each round. Threads competing
 	 * for the same server will give up, knowing that at least one of
-	 * them will check the conditions again before quitting.
+	 * them will check the conditions again before quitting. In order
+	 * to avoid the deadly situation where one thread spends its time
+	 * dequeueing for others, we limit the number of rounds it does.
+	 * However we still re-enter the loop for one pass if there's no
+	 * more served, otherwise we could end up with no other thread
+	 * trying to dequeue them.
 	 */
-	while (!stop && s->served < (maxconn = srv_dynamic_maxconn(s))) {
+	while (!stop && (done < global.tune.maxpollevents || !s->served) &&
+	       s->served < (maxconn = srv_dynamic_maxconn(s))) {
 		if (HA_SPIN_TRYLOCK(QUEUE_LOCK, &s->queue.lock) != 0)
 			break;
 
@@ -385,6 +391,8 @@
 				break;
 			_HA_ATOMIC_INC(&s->served);
 			done++;
+			if (done >= global.tune.maxpollevents)
+				break;
 		}
 		HA_SPIN_UNLOCK(QUEUE_LOCK, &s->queue.lock);
 	}