BUG/MEDIUM: quic: break out of the loop in quic_lstnr_dghdlr
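quic_lstnr_dghdlr() processes the datagrams that other threads have
queued for the current thread. But if, for any reason, the other
threads enqueue faster than the current one dequeues, the loop never
finds the queue empty and the function never returns. The loop now
stops after global.tune.maxpollevents datagrams and wakes the tasklet
up again if the queue is not empty.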
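As a minimal sketch of the pattern, assuming a hypothetical demo_queue
that stands in for the per-thread datagram list (none of these names
exist in HAProxy), a bounded drain could look like this:

```c
/* Hypothetical stand-in for a queue that other threads keep feeding. */
struct demo_queue {
    int pending;  /* datagrams waiting to be processed */
    int rewoken;  /* set when the handler asks to be scheduled again */
};

/* Process at most 'budget' datagrams, then yield.
 * Returns the number of datagrams processed by this call.
 */
static int demo_drain(struct demo_queue *q, int budget)
{
    int done = 0;

    q->rewoken = 0;
    while (q->pending > 0) {
        q->pending--;  /* "process" one datagram */
        done++;

        if (--budget <= 0) {
            /* too much work done at once, come back later,
             * but only if something is left in the queue
             */
            if (q->pending > 0)
                q->rewoken = 1;
            break;
        }
    }
    return done;
}
```

With 5 pending datagrams and a budget of 3, the first call processes 3
and requests a new wakeup; the second call processes the remaining 2
and does not. The budget bounds the time spent in one call regardless
of how fast producers refill the queue.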
This may be what is happening in issue #1808, though unfortunately
this function is one of the rare ones without traces. But the number
of calls to functions like qc_lstnr_pkt_rcv() on a single thread
seems to indicate this possibility.
Thanks to Tristan for his efforts in collecting extremely precious
traces!
This likely needs to be backported to 2.6.
diff --git a/src/quic_sock.c b/src/quic_sock.c
index a2be525..3eacb39 100644
--- a/src/quic_sock.c
+++ b/src/quic_sock.c
@@ -251,6 +251,7 @@
LIST_APPEND(dgrams, &dgram->list);
MT_LIST_APPEND(&quic_dghdlrs[cid_tid].dgrams, &dgram->mt_list);
+ /* typically quic_lstnr_dghdlr() */
tasklet_wakeup(quic_dghdlrs[cid_tid].task);
return 1;
diff --git a/src/xprt_quic.c b/src/xprt_quic.c
index 367a9a5..564da9b 100644
--- a/src/xprt_quic.c
+++ b/src/xprt_quic.c
@@ -6572,6 +6572,7 @@
struct quic_dgram *dgram;
int first_pkt = 1;
struct list *tasklist_head = NULL;
+ int max_dgrams = global.tune.maxpollevents;
while ((dgram = MT_LIST_POP(&dghdlr->dgrams, typeof(dgram), mt_list))) {
pos = dgram->buf;
@@ -6606,10 +6607,18 @@
/* Mark this datagram as consumed */
HA_ATOMIC_STORE(&dgram->buf, NULL);
+
+ if (--max_dgrams <= 0)
+ goto stop_here;
}
return t;
+ stop_here:
+ /* too much work done at once, come back here later */
+ if (!MT_LIST_ISEMPTY(&dghdlr->dgrams))
+ tasklet_wakeup((struct tasklet *)t);
+
err:
return t;
}