MINOR: checks: simplify and improve reporting of state changes when using log-health-checks

The function set_server_check_status() is very weird. It is called at the
end of a check to update the server's state before the new state is even
calculated, and possibly to log status changes, but only if the proxy has
"option log-health-checks" set.

To do so, it relies on an exhaustive list of the combinations that can
lead to a state change, while in practice almost all of them can simply
be deduced from a change of check status. Better, some changes of check
status are currently not detected even though they can be very valuable
(e.g. changes between L4/L6/TOUT/HTTP 500).

The doc was updated to reflect this.

Also, a minor change was made to consider s->uweight and not s->eweight
as meaning "DRAIN", since eweight can be zero without the DRAIN mode
(e.g. throttle, NOLB, ...).
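
For illustration only, the simplified condition and the DRAIN label can be
sketched as the standalone C fragment below. It is not the code from
src/checks.c: struct check_view, should_log_check() and state_label() are
hypothetical names, the enum is a reduced stand-in for the real CHK_RES_*
values, and the PR_O2_LOGHCHKS test on the proxy is assumed to be done by
the caller.

```c
#include <stdbool.h>

/* Reduced stand-in for the check results; only the relative order of
 * FAILED < PASSED < CONDPASS matters for this sketch. */
enum chk_result {
    CHK_RES_UNKNOWN = 0,
    CHK_RES_FAILED,
    CHK_RES_PASSED,
    CHK_RES_CONDPASS
};

/* Hypothetical view of the fields the condition looks at. */
struct check_view {
    short status;       /* check status being reported now */
    short prev_status;  /* check status saved on entry */
    int result;         /* one of enum chk_result */
    int health;         /* 0 .. rise + fall - 1 */
    int rise;
    int fall;
};

/* Returns true when the update is worth logging:
 *  - the check status itself changed, or
 *  - a check failed while the server still has some health left, or
 *  - a check passed while the server is not yet at full health. */
static bool should_log_check(const struct check_view *c)
{
    if (c->status != c->prev_status)
        return true;
    if (c->health != 0 && c->result == CHK_RES_FAILED)
        return true;
    if (c->health != c->rise + c->fall - 1 && c->result >= CHK_RES_PASSED)
        return true;
    return false;
}

/* The label is derived from the user-set weight (uweight), since the
 * effective weight may be zero for reasons other than DRAIN. */
static const char *state_label(bool running, unsigned int uweight)
{
    if (!running)
        return "DOWN";
    return uweight ? "UP" : "DRAIN";
}
```

With rise=2 and fall=3, a passing check on a server already at full health
(health == 4) is not logged unless the check status itself changed.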
diff --git a/doc/configuration.txt b/doc/configuration.txt
index 6da8b3e..a5ec732 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -4377,20 +4377,27 @@
option log-health-checks
no option log-health-checks
- Enable or disable logging of health checks
+ Enable or disable logging of health checks status updates
May be used in sections : defaults | frontend | listen | backend
yes | no | yes | yes
Arguments : none
- Enable health checks logging so it possible to check for example what
- was happening before a server crash. Failed health check are logged if
- server is UP and succeeded health checks if server is DOWN, so the amount
- of additional information is limited.
+ By default, failed health checks are logged if the server is UP and successful
+ health checks are logged if the server is DOWN, so the amount of additional
+ information is limited.
- If health check logging is enabled no health check status is printed
- when servers is set up UP/DOWN/ENABLED/DISABLED.
+ When this option is enabled, any change of the health check status or to
+ the server's health will be logged, so that it becomes possible to know
+ that a server was failing occasional checks before crashing, or exactly when
+ it failed to respond with a valid HTTP status, then when the port started to
+ reject connections, then when the server stopped responding at all.
+
+ Note that status changes not caused by health checks (eg: enable/disable on
+ the CLI) are intentionally not logged by this option.
- See also: "log" and section 8 about logging.
+ See also: "option httpchk", "option ldap-check", "option mysql-check",
+ "option pgsql-check", "option redis-check", "option smtpchk",
+ "option tcp-check", "log" and section 8 about logging.
option log-separate-errors
diff --git a/src/checks.c b/src/checks.c
index b94ff62..b4d11c1 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -207,6 +207,7 @@
static void set_server_check_status(struct check *check, short status, const char *desc)
{
struct server *s = check->server;
+ short prev_status = check->status;
if (status == HCHK_STATUS_START) {
check->result = CHK_RES_UNKNOWN; /* no result yet */
@@ -243,13 +244,9 @@
return;
if (s->proxy->options2 & PR_O2_LOGHCHKS &&
- (((check->health != 0) && (check->result == CHK_RES_FAILED)) ||
- (((check->health != check->rise + check->fall - 1) ||
- (!s->uweight && !(s->state & SRV_DRAIN)) ||
- (s->uweight && (s->state & SRV_DRAIN))) &&
- (check->result >= CHK_RES_PASSED)) ||
- ((s->state & SRV_GOINGDOWN) && (check->result != CHK_RES_CONDPASS)) ||
- (!(s->state & SRV_GOINGDOWN) && (check->result == CHK_RES_CONDPASS)))) {
+ ((status != prev_status) ||
+ ((check->health != 0) && (check->result == CHK_RES_FAILED)) ||
+ (((check->health != check->rise + check->fall - 1)) && (check->result >= CHK_RES_PASSED)))) {
int health, rise, fall, state;
@@ -305,7 +302,7 @@
chunk_appendf(&trash, ", status: %d/%d %s",
(state & SRV_RUNNING) ? (health - rise + 1) : (health),
(state & SRV_RUNNING) ? (fall) : (rise),
- (state & SRV_RUNNING)?(s->eweight?"UP":"DRAIN"):"DOWN");
+ (state & SRV_RUNNING) ? (s->uweight?"UP":"DRAIN"):"DOWN");
Warning("%s.\n", trash.str);
send_log(s->proxy, LOG_NOTICE, "%s.\n", trash.str);