BUG/MEDIUM: servers: Fix a race condition with idle connections.
When purging idle connections, there is a race condition: while we remove
a connection from the idle list to add it to the list of connections to
free, the thread owning the connection may try to free it at the same time.
To fix this, simply add a per-thread lock that has to be held before
removing the connection from the idle list, and when, in conn_free(), we're
about to remove the connection from every list. That way, we know for sure
the connection will stay valid while we remove it from the idle list to add
it to the list of connections to free.
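The locking rule described above can be sketched outside HAProxy as
follows. This is a minimal illustration with POSIX mutexes and a single
thread slot, not the code from this patch: struct conn, purge_idle(),
conn_free_sketch() and run_demo() are hypothetical names, and one mutex
stands in for both the per-thread spinlock and the locked list operations.

```c
#include <pthread.h>
#include <stdlib.h>

#define NB_CONNS 1000

/* Circular doubly-linked list, in the spirit of HAProxy's struct list. */
struct list { struct list *n, *p; };

/* Hypothetical, stripped-down connection. */
struct conn {
    struct list list; /* must stay first: a struct list * is cast back to struct conn * */
};

static struct list idle_conns;      /* stands in for srv->idle_orphan_conns[i] */
static struct list toremove_conns;  /* stands in for toremove_connections[i] */
static pthread_mutex_t toremove_lock = PTHREAD_MUTEX_INITIALIZER;

static void list_init(struct list *l) { l->n = l->p = l; }
static int list_empty(const struct list *l) { return l->n == l; }

static void list_addq(struct list *head, struct list *el)
{
    el->p = head->p;
    el->n = head;
    head->p->n = el;
    head->p = el;
}

static void list_del_init(struct list *el)
{
    el->n->p = el->p;
    el->p->n = el->n;
    list_init(el);
}

/* Purger side: move up to max_conn idle connections to the to-free list.
 * The lock is held for the whole move, so the owner cannot free a
 * connection between the pop and the re-insertion. */
static int purge_idle(int max_conn)
{
    int moved = 0;

    pthread_mutex_lock(&toremove_lock);
    while (moved < max_conn && !list_empty(&idle_conns)) {
        struct list *el = idle_conns.n;
        list_del_init(el);
        list_addq(&toremove_conns, el);
        moved++;
    }
    pthread_mutex_unlock(&toremove_lock);
    return moved;
}

/* Owner side: take the same lock before unlinking from whatever list
 * the connection is currently in, then free it. */
static void conn_free_sketch(struct conn *c)
{
    pthread_mutex_lock(&toremove_lock);
    list_del_init(&c->list);
    pthread_mutex_unlock(&toremove_lock);
    free(c);
}

static void *purger(void *arg)
{
    (void)arg;
    for (int i = 0; i < NB_CONNS / 10; i++)
        purge_idle(5);
    return NULL;
}

/* Drain a list after the threads are done; returns how many were freed. */
static int drain(struct list *head)
{
    int freed = 0;

    while (!list_empty(head)) {
        struct list *el = head->n;
        list_del_init(el);
        free((struct conn *)el);
        freed++;
    }
    return freed;
}

/* Returns the number of connections freed exactly once, or -1 on error. */
static int run_demo(void)
{
    static struct conn *conns[NB_CONNS];
    pthread_t thr;
    int accounted = 0;

    list_init(&idle_conns);
    list_init(&toremove_conns);
    for (int i = 0; i < NB_CONNS; i++) {
        conns[i] = calloc(1, sizeof(*conns[i]));
        if (!conns[i])
            return -1;
        list_addq(&idle_conns, &conns[i]->list);
    }
    if (pthread_create(&thr, NULL, purger, NULL) != 0)
        return -1;
    /* The owner frees every other connection while the purger runs. */
    for (int i = 0; i < NB_CONNS; i += 2) {
        conn_free_sketch(conns[i]);
        accounted++;
    }
    pthread_join(thr, NULL);
    accounted += drain(&toremove_conns);
    accounted += drain(&idle_conns);
    return accounted;
}
```

Calling run_demo() should return NB_CONNS: each connection is freed once,
either by its owner or when the two lists are drained afterwards.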
This should happen rarely enough that it shouldn't have any impact on
performance.
This has not been reported yet, but it could cause random segfaults.
This should be backported to 2.0.
(cherry picked from commit 4be7190c1024b82248a55456ea44b40c40d4f066)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
diff --git a/src/server.c b/src/server.c
index 0104984..a96f1ef 100644
--- a/src/server.c
+++ b/src/server.c
@@ -62,6 +62,7 @@
struct task *idle_conn_task = NULL;
struct task *idle_conn_cleanup[MAX_THREADS] = { NULL };
struct list toremove_connections[MAX_THREADS];
+__decl_hathreads(HA_SPINLOCK_T toremove_lock[MAX_THREADS]);
/* The server names dictionary */
struct dict server_name_dict = {
@@ -5506,6 +5507,7 @@
int j;
int did_remove = 0;
+ HA_SPIN_LOCK(OTHER_LOCK, &toremove_lock[i]);
for (j = 0; j < max_conn; j++) {
struct connection *conn = LIST_POP_LOCKED(&srv->idle_orphan_conns[i], struct connection *, list);
if (!conn)
@@ -5513,6 +5515,7 @@
did_remove = 1;
LIST_ADDQ_LOCKED(&toremove_connections[i], &conn->list);
}
+ HA_SPIN_UNLOCK(OTHER_LOCK, &toremove_lock[i]);
if (did_remove && max_conn < srv->curr_idle_thr[i])
srv_is_empty = 0;
if (did_remove)