BUG/MAJOR: threads/tasks: fix the scheduler again

My recent change in commit ce4e0aa ("MEDIUM: task: change the construction
of the loop in process_runnable_tasks()") was bogus as it used to keep the
rq_next across an unlock/lock sequence, occasionally leading to crashes
for tasks that are eligible to any thread. We must use the lookup call for
each new batch instead. The problem is easily triggered with such a
configuration :

    global
        nbthread 4

    listen check
        mode http
        bind 0.0.0.0:8080
        redirect location /
        option httpchk GET /
        server s1 127.0.0.1:8080 check inter 1
        server s2 127.0.0.1:8080 check inter 1

Thanks to Olivier for diagnosing this one. No backport is needed.
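
For readers unfamiliar with the failure mode, here is a minimal single-threaded sketch of why a cursor must not survive an unlock/lock window. It is not HAProxy code: the sorted linked list, `queue_lookup_ge()`, `other_thread_steals_head()` and `run_batches()` are hypothetical stand-ins for the eb32sc run queue, `eb32sc_lookup_ge()`, a concurrent thread, and `process_runnable_tasks()`. The "other thread" is simulated by a plain function call at the point where the lock would be released.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical run-queue node, kept in a list sorted by ascending key. */
struct node {
    unsigned key;
    struct node *next;
};

struct queue {
    struct node *head;
};

/* Insert a new node while keeping the list sorted. */
static void queue_insert(struct queue *q, unsigned key)
{
    struct node **pp = &q->head;
    struct node *n = malloc(sizeof(*n));

    assert(n != NULL);
    while (*pp && (*pp)->key <= key)
        pp = &(*pp)->next;
    n->key = key;
    n->next = *pp;
    *pp = n;
}

/* Return the first node whose key is >= min, or NULL if there is none. */
static struct node *queue_lookup_ge(struct queue *q, unsigned min)
{
    struct node *n = q->head;

    while (n && n->key < min)
        n = n->next;
    return n;
}

/* Unlink and free a node that is known to be in the queue. */
static void queue_remove(struct queue *q, struct node *n)
{
    struct node **pp = &q->head;

    while (*pp != n)
        pp = &(*pp)->next;
    *pp = n->next;
    free(n);
}

/* Simulates another thread running the head task while the lock is released. */
static void other_thread_steals_head(struct queue *q)
{
    if (q->head)
        queue_remove(q, q->head);
}

/* Runs tasks in batches and returns how many this "thread" ran itself. */
static int run_batches(struct queue *q, int batch_size)
{
    struct node *cur;
    int done = 0;

    /* the run-queue lock would be taken here */
    do {
        /* Restart the lookup for every batch: a pointer saved before the
         * unlock window below may refer to a node that was already freed.
         */
        cur = queue_lookup_ge(q, 0);

        for (int i = 0; i < batch_size && cur; i++) {
            struct node *next = cur->next;

            queue_remove(q, cur); /* "run" the task */
            done++;
            cur = next;
        }

        /* unlock ... another thread may modify the queue ... lock */
        other_thread_steals_head(q);
    } while (q->head);

    return done;
}
```

If `cur` were instead carried over from one iteration of the outer loop to the next, it would point at the very node that `other_thread_steals_head()` just freed, which is the kind of stale pointer the commit message describes.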

commit: 9e45b33f7ee78953d3ec6ad32d4d9eed3bfc897a
author: Willy Tarreau <w@1wt.eu> Wed Nov 08 14:05:19 2017 +0100
committer: Willy Tarreau <w@1wt.eu> Wed Nov 08 14:05:19 2017 +0100
tree: eacfdbb7d684ab1961c04a5f3b486f7087933169
parent: ecd2e15919f31df2c0e42b3a1ac74f1344d9a2ae
diff --git a/src/task.c b/src/task.c
index 4555f2f..9882903 100644
--- a/src/task.c
+++ b/src/task.c

@@ -252,13 +252,16 @@
 	}
 
 	HA_SPIN_LOCK(TASK_RQ_LOCK, &rq_lock);
-	rq_next = eb32sc_lookup_ge(&rqueue, rqueue_ticks - TIMER_LOOK_BACK, tid_bit);
 
 	do {
 		/* Note: this loop is one of the fastest code path in
 		 * the whole program. It should not be re-arranged
 		 * without a good reason.
 		 */
+
+		/* we have to restart looking up after every batch */
+		rq_next = eb32sc_lookup_ge(&rqueue, rqueue_ticks - TIMER_LOOK_BACK, tid_bit);
+
 		for (local_tasks_count = 0; local_tasks_count < 16; local_tasks_count++) {
 			if (unlikely(!rq_next)) {
 				/* either we just started or we reached the end