MEDIUM: global: Add a "close-spread-time" option to spread soft-stop on time window
The new 'close-spread-time' global option can be used to spread idle and
active HTTP connction closing after a SIGUSR1 signal is received. This
allows to limit bursts of reconnections when too many idle connections
are closed at once. Indeed, without this new mechanism, in case of
soft-stop, all the idle connections would be closed at once (after the
grace period is over), and all active HTTP connections would be closed
by appending a "Connection: close" header to the next response that goes
over it (or via a GOAWAY frame in case of HTTP2).
This patch adds the support of this new option for HTTP as well as HTTP2
connections. It works differently on active and idle connections.
On active connections, instead of sending systematically the GOAWAY
frame or adding the 'Connection: close' header like before once the
soft-stop has started, a random based on the remainder of the close
window is calculated, and depending on its result we could decide to
keep the connection alive. The random will be recalculated for any
subsequent request/response on this connection so the GOAWAY will still
end up being sent, but we might wait a few more round trips. This will
ensure that goaways are distributed along a longer time window than
before.
On idle connections, a random factor is used when determining the expire
field of the connection's task, which should naturally spread connection
closings on the time window (see h2c_update_timeout).
This feature request was described in GitHub issue #1614.
This patch should be backported to 2.5. It depends on "BUG/MEDIUM:
mux-h2: make use of http-request and keep-alive timeouts" which
refactorized the timeout management of HTTP2 connections.
diff --git a/src/proxy.c b/src/proxy.c
index 3ff2035..dc02863 100644
--- a/src/proxy.c
+++ b/src/proxy.c
@@ -2046,6 +2046,34 @@
return 0;
}
+static int proxy_parse_close_spread_time(char **args, int section_type, struct proxy *curpx,
+ const struct proxy *defpx, const char *file, int line,
+ char **err)
+{
+ const char *res;
+
+ if (!*args[1]) {
+ memprintf(err, "'%s' expects <time> as argument.\n", args[0]);
+ return -1;
+ }
+ res = parse_time_err(args[1], &global.close_spread_time, TIME_UNIT_MS);
+ if (res == PARSE_TIME_OVER) {
+ memprintf(err, "timer overflow in argument '%s' to '%s' (maximum value is 2147483647 ms or ~24.8 days)",
+ args[1], args[0]);
+ return -1;
+ }
+ else if (res == PARSE_TIME_UNDER) {
+ memprintf(err, "timer underflow in argument '%s' to '%s' (minimum non-null value is 1 ms)",
+ args[1], args[0]);
+ return -1;
+ }
+ else if (res) {
+ memprintf(err, "unexpected character '%c' in argument to <%s>.\n", *res, args[0]);
+ return -1;
+ }
+ return 0;
+}
+
struct task *hard_stop(struct task *t, void *context, unsigned int state)
{
struct proxy *p;
@@ -2099,6 +2127,10 @@
/* disable busy polling to avoid cpu eating for the new process */
global.tune.options &= ~GTUNE_BUSY_POLLING;
+ if (tick_isset(global.close_spread_time)) {
+ global.close_spread_end = tick_add(now_ms, global.close_spread_time);
+ }
+
/* schedule a hard-stop after a delay if needed */
if (tick_isset(global.hard_stop_after)) {
task = task_new_anywhere();
@@ -2501,6 +2533,7 @@
static struct cfg_kw_list cfg_kws = {ILH, {
{ CFG_GLOBAL, "grace", proxy_parse_grace },
{ CFG_GLOBAL, "hard-stop-after", proxy_parse_hard_stop_after },
+ { CFG_GLOBAL, "close-spread-time", proxy_parse_close_spread_time },
{ CFG_LISTEN, "timeout", proxy_parse_timeout },
{ CFG_LISTEN, "clitimeout", proxy_parse_timeout }, /* This keyword actually fails to parse, this line remains for better error messages. */
{ CFG_LISTEN, "contimeout", proxy_parse_timeout }, /* This keyword actually fails to parse, this line remains for better error messages. */