BUG/MEDIUM: listener/threads: fix an AB/BA locking issue in delete_listener()
The delete_listener() function takes the listener's lock before taking
the proto_lock, which is contrary to what other functions do, possibly
causing an AB/BA deadlock. In practice the two only places where both
are taken are during protocol_enable_all() and delete_listener(), the
former being used during startup and the latter during stop. In practice
during reload floods, it is technically possible for a thread to be
initializing the listeners while another one is stopping. While this
is too hard to trigger on 2.0 and above due to the synchronization of
all threads during startup, it's reasonably easy to do in 1.9 by having
hundreds of listeners, starting 64 threads and flooding them with reloads
like this :
$ while usleep 50000; do killall -USR2 haproxy; done
Usually in less than a minute, all threads will be deadlocked. The fix
consists in always taking the proto_lock before the listener lock. It
seems to be the only place where these two locks were reversed. This
fix needs to be backported to 2.0, 1.9, and 1.8.
(cherry picked from commit 6ee9f8df3bfbb811526cff3313da5758b1277bc6)
Signed-off-by: Willy Tarreau <w@1wt.eu>
diff --git a/src/listener.c b/src/listener.c
index b5fe2ac..54c0996 100644
--- a/src/listener.c
+++ b/src/listener.c
@@ -595,17 +595,17 @@
*/
void delete_listener(struct listener *listener)
{
+ HA_SPIN_LOCK(PROTO_LOCK, &proto_lock);
HA_SPIN_LOCK(LISTENER_LOCK, &listener->lock);
if (listener->state == LI_ASSIGNED) {
listener->state = LI_INIT;
- HA_SPIN_LOCK(PROTO_LOCK, &proto_lock);
LIST_DEL(&listener->proto_list);
listener->proto->nb_listeners--;
- HA_SPIN_UNLOCK(PROTO_LOCK, &proto_lock);
_HA_ATOMIC_SUB(&jobs, 1);
_HA_ATOMIC_SUB(&listeners, 1);
}
HA_SPIN_UNLOCK(LISTENER_LOCK, &listener->lock);
+ HA_SPIN_UNLOCK(PROTO_LOCK, &proto_lock);
}
/* Returns a suitable value for a listener's backlog. It uses the listener's,