BUG/MEDIUM: queue: deal with a rare TOCTOU in assign_server_and_queue()
After checking that a server or backend is full, it remains possible
to call pendconn_add() just after the last pending request finishes. With
a very low maxconn (typically 1), the server then has no connection left
whose completion would dequeue the new request, so the new request stays
in the queue until the timeout.
The approach depends on where the request was queued, though:
- when queued on a server, we can simply detect that we may dequeue
pending requests and wake them up; this will wake our request and
that's fine. This needs to be done in srv_redispatch_connect() when
the server is set.
- when queued on a backend, it means that all servers are done with
their requests, i.e. that all servers were full before the
check and all were empty after. In practice this will only concern
configs with fewer servers than threads. It's where the issue was
first spotted, and it's very hard to reproduce with more than one
server. In this case we need to load-balance again in order to find
a spare server (or even to fail). For this, we call the newly added
dedicated function pendconn_must_try_again() that tells whether or
not a blocked pending request was dequeued and needs to be retried.
This should be backported along with pendconn_must_try_again() to all
stable versions, but with extreme care because over time the queue's
locking evolved.
(cherry picked from commit 5541d4995d6d9e8e7956423d26c26bebe8f0eaea)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 68492650d35c8ec3d6352079b8d2b54fd43be1d6)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit ad00496b1aeb68d90d737bec212eb9ca186e3c22)
Signed-off-by: Willy Tarreau <w@1wt.eu>
1 file changed