BUG/MEDIUM: checks: unblock signals in external checks
As discussed in issue #140, processes are forked with signals blocked
resulting in haproxy's kill being ignored. This happens when the command
takes more time to complete than the configured check timeout or interval.
Just calling "sleep 30" every second makes the problem obvious.
The fix simply consists in unblocking the signals in the child after the
fork. It needs to be backported to all stable branches containing external
checks and where signals are blocked on startup. It's unclear when it
started, but the following config exhibits the issue :
global
external-check
listen www
bind :8001
timeout client 5s
timeout server 5s
timeout connect 5s
option external-check
external-check command "$PWD/sleep10.sh"
server local 127.0.0.1:80 check inter 200
$ cat sleep10.sh
#!/bin/sh
exec /bin/sleep 10
The "sleep" processes keep accumulating for 10 seconds and stabilize
around 25 when the bug is present. Just issuing "killall sleep" has no
effect on them, and stopping haproxy leaves these processes behind.
(cherry picked from commit 2df8cad0fea2d1a4ca8dd58f384df3c3c3f5d7ee)
Signed-off-by: Willy Tarreau <w@1wt.eu>
diff --git a/src/checks.c b/src/checks.c
index c175a75..e31eb17 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -1997,6 +1997,7 @@
environ = check->envp;
extchk_setenv(check, EXTCHK_HAPROXY_SERVER_CURCONN, ultoa_r(s->cur_sess, buf, sizeof(buf)));
+ haproxy_unblock_signals();
execvp(px->check_command, check->argv);
ha_alert("Failed to exec process for external health check: %s. Aborting.\n",
strerror(errno));