BUG/MAJOR: lb/threads: make sure the avoided server is not full on second pass
In fwrr_get_next_server(), we optionally pass a server to avoid. It
usually points to the current server during a redispatch operation. If
this server is usable, an "avoided" pointer is set and we continue to
look for another server. If in the end no other server is found, then
we fall back to this avoided one, which is still better than nothing.
The problem that may arise with threads is that in the mean time, this
avoided server might have received extra connections and might not be
usable anymore. This causes it to be queued a second time in the "full"
list and the loop to search for a server again, ending up on this one
again and so on.
This patch makes sure that we break out of the loop when we have to
pick the avoided server. It's probably what the code intended to do
as the current break statement causes fwrr_update_position() and
fwrr_dequeue_srv() to be called again on the avoided server.
It must be backported to 1.9 and 1.8, and seems appropriate for older
versions though it's unclear what the impact of this bug might be
there since the race doesn't exist and we're left with the double
update of the server's position.
diff --git a/src/lb_fwrr.c b/src/lb_fwrr.c
index 311698f..d419b8e 100644
--- a/src/lb_fwrr.c
+++ b/src/lb_fwrr.c
@@ -552,7 +552,7 @@
if (switched) {
if (avoided) {
srv = avoided;
- break;
+ goto take_this_one;
}
goto requeue_servers;
}
@@ -582,6 +582,7 @@
full = srv;
}
+ take_this_one:
/* OK, we got the best server, let's update it */
fwrr_queue_srv(srv);