[MEDIUM]: Prevent redispatcher from selecting the same server, version #3
When haproxy decides that session needs to be redispatched it chose a server,
but there is no guarantee for it to be a different one. So, it often
happens that selected server is exactly the same that it was previously, so
a client ends up with a 503 error anyway, especially when one sever has
much bigger weight than others.
Changes from the previous version:
- drop stupid and unnecessary SN_DIRECT changes
- assign_server(): use srvtoavoid to keep the old server and clear s->srv
so SRV_STATUS_NOSRV guarantees that t->srv == NULL (again)
and get_server_rr_with_conns has chances to work (previously
we were passing a NULL here)
- srv_redispatch_connect(): remove t->srv->cum_sess and t->srv->failed_conns
incrementing as t->srv was guaranteed to be NULL
- add avoididx to get_server_rr_with_conns. I hope I correctly understand this code.
- fix http_flush_cookie_flags() and move it to assign_server_and_queue()
directly. The code here was supposed to set CK_DOWN and clear CK_VALID,
but: (TX_CK_VALID | TX_CK_DOWN) == TX_CK_VALID == TX_CK_MASK so:
if ((txn->flags & TX_CK_MASK) == TX_CK_VALID)
txn->flags ^= (TX_CK_VALID | TX_CK_DOWN);
was really a:
if ((txn->flags & TX_CK_MASK) == TX_CK_VALID)
txn->flags &= TX_CK_VALID
Now haproxy logs "--DI" after redispatching connection.
- defer srv->redispatches++ and s->be->redispatches++ so there
are called only if a conenction was redispatched, not only
supposed to.
- don't increment lbconn if redispatcher selected the same sarver
- don't count unsuccessfully redispatched connections as redispatched
connections
- don't count redispatched connections as errors, so:
- the number of connections effectively served by a server is:
srv->cum_sess - srv->failed_conns - srv->retries - srv->redispatches
and
SUM(servers->failed_conns) == be->failed_conns
- requires the "Don't increment server connections too much + fix retries" patch
- needs little more testing and probably some discussion so reverting to the RFC state
Tests #1:
retries 4
redispatch
i) 1 server(s): b (wght=1, down)
b) sessions=5, lbtot=1, err_conn=1, retr=4, redis=0
-> request failed
ii) server(s): b (wght=1, down), u (wght=1, down)
b) sessions=4, lbtot=1, err_conn=0, retr=3, redis=1
u) sessions=1, lbtot=1, err_conn=1, retr=0, redis=0
-> request FAILED
iii) 2 server(s): b (wght=1, down), u (wght=1, up)
b) sessions=4, lbtot=1, err_conn=0, retr=3, redis=1
u) sessions=1, lbtot=1, err_conn=0, retr=0, redis=0
-> request OK
iv) 2 server(s): b (wght=100, down), u (wght=1, up)
b) sessions=4, lbtot=1, err_conn=0, retr=3, redis=1
u) sessions=1, lbtot=1, err_conn=0, retr=0, redis=0
-> request OK
v) 1 server(s): b (down for first 4 SYNS)
b) sessions=5, lbtot=1, err_conn=0, retr=4, redis=0
-> request OK
Tests #2:
retries 4
i) 1 server(s): b (down)
b) sessions=5, lbtot=1, err_conn=1, retr=4, redis=0
-> request FAILED
diff --git a/include/proto/backend.h b/include/proto/backend.h
index 444e53a..c4ffe85 100644
--- a/include/proto/backend.h
+++ b/include/proto/backend.h
@@ -50,10 +50,10 @@
* If any server is found, it will be returned and px->lbprm.map.rr_idx will be updated
* to point to the next server. If no valid server is found, NULL is returned.
*/
-static inline struct server *get_server_rr_with_conns(struct proxy *px)
+static inline struct server *get_server_rr_with_conns(struct proxy *px, struct server *srvtoavoid)
{
- int newidx;
- struct server *srv;
+ int newidx, avoididx;
+ struct server *srv, *avoided;
if (px->lbprm.tot_weight == 0)
return NULL;
@@ -65,17 +65,28 @@
px->lbprm.map.rr_idx = 0;
newidx = px->lbprm.map.rr_idx;
+ avoided = NULL;
do {
srv = px->lbprm.map.srv[newidx++];
if (!srv->maxconn || srv->cur_sess < srv_dynamic_maxconn(srv)) {
- px->lbprm.map.rr_idx = newidx;
- return srv;
+ /* make sure it is not the server we are try to exclude... */
+ if (srv != srvtoavoid) {
+ px->lbprm.map.rr_idx = newidx;
+ return srv;
+ }
+
+ avoided = srv; /* ...but remember that is was selected yet avoided */
+ avoididx = newidx;
}
if (newidx == px->lbprm.tot_weight)
newidx = 0;
} while (newidx != px->lbprm.map.rr_idx);
- return NULL;
+ if (avoided)
+ px->lbprm.map.rr_idx = avoididx;
+
+ /* return NULL or srvtoavoid if found */
+ return avoided;
}