MINOR: task: limit the number of subsequent heavy tasks with flag TASK_HEAVY

While the scheduler is priority-aware and class-aware, and consistently
tries to maintain fairness between all classes, it doesn't make use of a
fine execution budget to compensate for high-latency tasks such as TLS
handshakes. This can result in several consecutive heavy calls adding
multiple milliseconds of latency between the various steps of other
tasklets that do not even depend on them.

An ideal solution would be to add a 4th queue, have all tasks announce
their estimated cost upfront and let the scheduler maintain an auto-
refilling budget to pick from the most suitable queue.

But it turns out that a very simplified version of this already provides
impressive gains with very tiny changes and could easily be backported.
The principle is to reserve a new task flag "TASK_HEAVY" that indicates
that a task is expected to take a lot of time without yielding (e.g. an
SSL handshake typically takes 700 microseconds of crypto computation).
When the scheduler sees this flag while queuing a tasklet, it will
place it into the bulk queue. And during dequeuing, only one of these
is accepted per full round. This means that the first one will be
accepted and will not prevent other lower-priority tasks from running,
but if a new one arrives, the round stops there and control goes back
to polling. This allows more important updates for other tasks to be
collected and batched before the next call of a heavy task.

Preliminary tests consisting in placing this flag on the SSL handshake
tasklet show that response times under SSL stress fell from 14 ms
before the patch to 3.0 ms with the patch, and even 1.8 ms if
tune.sched.low-latency is set to "on".

(cherry picked from commit 74dea8caeadfaf3cac262fd4f2c532788c396fb5)
[wt: tasklet_wakeup_on() is in task.h in 2.3]
Signed-off-by: Willy Tarreau <w@1wt.eu>
diff --git a/src/task.c b/src/task.c
index a2b78e8..8689da8 100644
--- a/src/task.c
+++ b/src/task.c
@@ -403,6 +403,7 @@
 	unsigned int done = 0;
 	unsigned int queue;
 	unsigned short state;
+	char heavy_calls = 0;
 	void *ctx;
 
 	for (queue = 0; queue < TL_CLASSES;) {
@@ -449,7 +450,20 @@
 
 		budgets[queue]--;
 		t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *, list);
-		state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
+		state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_HEAVY|TASK_KILLED);
+
+		if (state & TASK_HEAVY) {
+			/* This is a heavy task. We'll call no more than one
+			 * per function call. If we called one already, we'll
+			 * return and announce the max possible weight so that
+			 * the caller doesn't come back too soon.
+			 */
+			if (heavy_calls) {
+				done = INT_MAX;  // 11ms instead of 3 without this
+				break; // too many heavy tasks processed already
+			}
+			heavy_calls = 1;
+		}
 
 		ti->flags &= ~TI_FL_STUCK; // this thread is still running
 		activity[tid].ctxsw++;