BUG/MEDIUM: task: bound the number of tasks picked from the wait queue at once

There is a theoretical problem in the wait queue, which is that with many
threads, one could spend a lot of time looping on the newly expired tasks,
causing a lot of contention on the global wq_lock and on the global
rq_lock. This initially sounds benign, but if another thread does just
a task_schedule() or task_queue(), it might end up waiting for a long
time on this lock, and this wait time will count against its execution
budget, degrading the end user's experience and possibly triggering the
watchdog if that lasts too long.
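
To illustrate the pattern, here is a minimal standalone sketch (not
HAProxy code: a plain pthread mutex stands in for the wq_lock rwlock,
and drain_expired()/queue_one() are made-up names). One thread drains
an unbounded backlog under the global lock, so another thread that
merely wants to queue a single task must wait out the whole drain:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t global_wq_lock = PTHREAD_MUTEX_INITIALIZER;
static long expired_left = 100000000; /* pretend a huge wakeup storm */

static void *drain_expired(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&global_wq_lock);
	while (expired_left > 0)  /* unbounded: the lock is held for the
	                           * whole storm, not a bounded batch */
		expired_left--;
	pthread_mutex_unlock(&global_wq_lock);
	return NULL;
}

static void *queue_one(void *arg)
{
	(void)arg;
	/* a single task_queue()-like operation must wait out the entire
	 * drain above; that wait counts against this thread's budget */
	pthread_mutex_lock(&global_wq_lock);
	puts("queued one task");
	pthread_mutex_unlock(&global_wq_lock);
	return NULL;
}

int main(void)
{
	pthread_t drainer, queuer;
	pthread_create(&drainer, NULL, drain_expired, NULL);
	pthread_create(&queuer, NULL, queue_one, NULL);
	pthread_join(drainer, NULL);
	pthread_join(queuer, NULL);
	return 0;
}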

The simplest (and backportable) solution here consists in bounding the
number of expired tasks that may be picked from the global wait queue at
once by a thread, given that all other threads will do so as well anyway.

We don't need to pick more than global.tune.runqueue_depth tasks at once
as we won't process more, so this counter is decremented by both the
local and the global queue loops: threads with more local expired tasks
will pick fewer global tasks and vice versa, keeping the load balanced
between all threads. This will guarantee a much lower latency if/when
wakeup storms happen (e.g. hundreds of thousands of synchronized health
checks), as the sketch below illustrates.
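
The following standalone sketch (hypothetical names; plain counters
stand in for the ebtree-based local and global wait queues, and locking
is omitted) shows how a single budget shared by both loops naturally
balances the picks:

#include <stdio.h>

/* Hypothetical stand-ins for the two wait queues: in the real scheduler
 * these are ebtree lookups, here they are just counts of expired tasks.
 */
static int local_expired  = 3;
static int global_expired = 100;

/* One budget serves both loops: a thread that consumed most of it on
 * its local queue will pick fewer tasks from the shared global queue,
 * and vice versa, leaving the rest to the other threads.
 */
static void wake_expired_sketch(int runqueue_depth)
{
	int max_processed = runqueue_depth; /* mirrors global.tune.runqueue_depth */

	while (max_processed > 0 && local_expired > 0) {
		local_expired--;   /* pick one expired task from the local queue */
		max_processed--;
	}

	while (max_processed > 0 && global_expired > 0) {
		global_expired--;  /* pick one from the global queue; in the real
		                    * code this happens under the wq_lock and the
		                    * remainder is drained by the other threads */
		max_processed--;
	}

	printf("budget left: %d, global tasks left: %d\n",
	       max_processed, global_expired);
}

int main(void)
{
	wake_expired_sketch(32); /* 32 is an arbitrary example depth */
	return 0;
}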

Note that some crashes have been witnessed with 1/4 of the threads in
wake_expired_tasks() and, while the issue may or may not be related, the
absence of any reasonable bound here certainly explains how so much time
can be spent there.

This patch should be backported, probably as far as 2.0 (maybe with
some adaptations).
diff --git a/src/task.c b/src/task.c
index cabfcbe..293aefe 100644
--- a/src/task.c
+++ b/src/task.c
@@ -219,11 +219,12 @@
 void wake_expired_tasks()
 {
 	struct task_per_thread * const tt = sched; // thread's tasks
+	int max_processed = global.tune.runqueue_depth;
 	struct task *task;
 	struct eb32_node *eb;
 	__decl_thread(int key);
 
-	while (1) {
+	while (max_processed-- > 0) {
   lookup_next_local:
 		eb = eb32_lookup_ge(&tt->timers, now_ms - TIMER_LOOK_BACK);
 		if (!eb) {
@@ -298,6 +299,8 @@
 	while (1) {
 		HA_RWLOCK_WRLOCK(TASK_WQ_LOCK, &wq_lock);
   lookup_next:
+		if (max_processed-- <= 0)
+			break;
 		eb = eb32_lookup_ge(&timers, now_ms - TIMER_LOOK_BACK);
 		if (!eb) {
 			/* we might have reached the end of the tree, typically because