MAJOR: agent: rework the response processing and support additional actions

We now extract several pieces of information from a single response line,
which is made of words delimited by spaces, tabs or commas. We sort these
words by category and report anything unusual we detect. The agent now supports :
  - "up", "down", "stopped", "fail" for the operational states
  - "ready", "drain", "maint" for the administrative states
  - any "%" number for the weight
  - an optional reason after a "#" that can be reported on the stats page

The line parser and processor should move to its own function so that
we can reuse the exact same one for http-based agent checks later.
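
As an illustration of that last point, here is a minimal sketch of what a
standalone line parser could look like. It is not the code from this patch:
the names parse_agent_line(), struct agent_words, is_delim() and word_in()
are hypothetical, and the sketch only classifies words without touching any
server state. It follows the same rules as the patch: words are split on
spaces, tabs and commas, the last word of each category wins, the first
unknown word is kept as an error, and anything after '#' is the description.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

/* Hypothetical result holder, not part of the patch. Each pointer refers
 * to a word inside the parsed buffer, or is NULL when the word is absent.
 */
struct agent_words {
	const char *hs;   /* health status : up, down, stopped, fail */
	const char *as;   /* admin status  : ready, drain, maint */
	const char *ps;   /* weight, e.g. "75%" */
	const char *err;  /* first unrecognized word */
	const char *desc; /* text following '#', or an empty string */
};

static int is_delim(char c)
{
	return c == ' ' || c == '\t' || c == ',';
}

/* returns non-zero if <w> matches one entry of the NULL-terminated <list> */
static int word_in(const char *w, const char *const *list)
{
	for (; *list; list++)
		if (strcasecmp(w, *list) == 0)
			return 1;
	return 0;
}

/* Splits <line> in place (delimiters are overwritten with '\0') and fills
 * <out>. Only the first line is considered.
 */
static void parse_agent_line(char *line, struct agent_words *out)
{
	static const char *const health[] = { "up", "down", "stopped", "fail", NULL };
	static const char *const admin[]  = { "ready", "drain", "maint", NULL };
	char *cmd = line, *p;

	assert(line != NULL && out != NULL);
	out->hs = out->as = out->ps = out->err = NULL;

	/* terminate the line at the first CR or LF */
	line[strcspn(line, "\r\n")] = 0;

	while (*cmd) {
		if (is_delim(*cmd)) {
			cmd++;
			continue;
		}
		if (*cmd == '#') {
			/* description : skip the sharp and leading blanks */
			cmd++;
			while (*cmd == ' ' || *cmd == '\t')
				cmd++;
			break;
		}
		/* null-terminate the current word */
		p = cmd + 1;
		while (*p && !is_delim(*p))
			p++;
		if (*p)
			*p++ = 0;

		if (word_in(cmd, health))
			out->hs = cmd;
		else if (word_in(cmd, admin))
			out->as = cmd;
		else if (isdigit((unsigned char)*cmd) && strchr(cmd, '%'))
			out->ps = cmd;
		else if (!out->err)
			out->err = cmd;
		cmd = p;
	}
	/* here <cmd> points to the description or to the trailing '\0' */
	out->desc = cmd;
}
```

With such a helper, a caller would pass a writable copy of the response,
e.g. "75%,drain down # disk full", then apply the admin state, the weight
and the health status in that order, as the check code does below.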
diff --git a/doc/configuration.txt b/doc/configuration.txt
index c565389..bfb5fc3 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -8469,40 +8469,56 @@
 
 agent-check
   Enable an auxiliary agent check which is run independently of a regular
-  health check. An agent health check is performed by making a TCP
-  connection to the port set by the "agent-port" parameter" and reading
-  an ASCII string. The string should have one of the following forms:
+  health check. An agent health check is performed by making a TCP connection
+  to the port set by the "agent-port" parameter and reading an ASCII string.
+  The string is made of a series of words delimited by spaces, tabs or commas
+  in any order, optionally terminated by '\r' and/or '\n', each consisting of :
 
-  * An ASCII representation of an positive integer percentage.
-    e.g. "75%"
-
+  - An ASCII representation of a positive integer percentage, e.g. "75%".
     Values in this format will set the weight proportional to the initial
     weight of a server as configured when haproxy starts.
 
-  * The string "drain".
-
-    This will cause the weight of a server to be set to 0, and thus it will
-    not accept any new connections other than those that are accepted via
-    persistence.
+  - The word "ready". This will turn the server's administrative state to the
+    READY mode, thus cancelling any DRAIN or MAINT state.
 
-  * The string "down", optionally followed by a description string.
+  - The word "drain". This will turn the server's administrative state to the
+    DRAIN mode, thus it will not accept any new connections other than those
+    that are accepted via persistence.
 
-    Mark the server as down and log the description string as the reason.
+  - The word "maint". This will turn the server's administrative state to the
+    MAINT mode, thus it will not accept any new connections at all, and health
+    checks will be stopped.
 
-  * The string "stopped", optionally followed by a description string.
+  - The words "down", "fail", or "stopped", optionally followed by a
+    description string after a sharp ('#'). All of these mark the server's
+    operating state as DOWN, but since the word itself is reported on the stats
+    page, the difference allows an administrator to know if the situation was
+    expected or not : the service may intentionally be stopped, may appear up
+    but fail some validity tests, or may be seen as down (eg: missing process,
+    or port not responding).
 
-    This currently has the same behaviour as "down".
+  - The word "up" sets the server's operating state back to UP if health
+    checks also report that the service is accessible.
 
-  * The string "fail", optionally followed by a description string.
-
-    This currently has the same behaviour as "down".
+  Parameters which are not advertised by the agent are not changed. For
+  example, an agent might be designed to monitor CPU usage and only report a
+  relative weight and never interact with the operating status. Similarly, an
+  agent could be designed as an end-user interface with 3 radio buttons
+  allowing an administrator to change only the administrative state. However,
+  it is important to consider that only the agent may revert its own actions,
+  so if a server is set to DRAIN mode or to DOWN state using the agent, the
+  agent must implement the other equivalent actions to bring the service back
+  into operation.
 
   Failure to connect to the agent is not considered an error as connectivity
   is tested by the regular health check which is enabled by the "check"
-  parameter.
+  parameter. Beware that it is not a good idea to stop an agent after it
+  reports "down", since only an agent reporting "up" will be able to turn the
+  server up again. Note that the CLI on the Unix stats socket is also able to
+  force an agent's result in order to work around a bogus agent if needed.
 
-  Requires the ""agent-port" parameter to be set.
-  See also the "agent-check" parameter.
+  Requires the "agent-port" parameter to be set. See also the "agent-inter"
+  parameter.
 
   Supported in default-server: No
 
diff --git a/src/checks.c b/src/checks.c
index fc7e776..e54e46a 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -880,13 +880,38 @@
 		break;
 
 	case PR_O2_LB_AGENT_CHK: {
-		short status = HCHK_STATUS_L7RSP;
-		const char *desc = "Unknown feedback string";
-		const char *down_cmd = NULL;
-		int disabled;
-		char *p;
+		int status = HCHK_STATUS_CHECKED;
+		const char *hs = NULL; /* health status      */
+		const char *as = NULL; /* admin status       */
+		const char *ps = NULL; /* performance status */
+		const char *err = NULL; /* first error to report */
+		const char *wrn = NULL; /* first warning to report */
+		char *cmd, *p;
 
-		/* get a complete line first */
+		/* We're getting an agent check response. The agent could
+		 * have been disabled in the meantime with a long check
+		 * still pending. It is important that we ignore the whole
+		 * response.
+		 */
+		if (!(check->server->agent.state & CHK_ST_ENABLED))
+			break;
+
+		/* The agent supports strings made of a single line ended by the
+		 * first CR ('\r') or LF ('\n'). This line is composed of words
+		 * delimited by spaces (' '), tabs ('\t'), or commas (','). The
+		 * line may optionally contain a description of a state change
+		 * after a sharp ('#'), which is only considered if a health state
+		 * is announced.
+		 *
+		 * Words may be composed of :
+		 *   - a numeric weight suffixed by the percent character ('%').
+		 *   - a health status among "up", "down", "stopped", and "fail".
+		 *   - an admin status among "ready", "drain", "maint".
+		 *
+		 * These words may appear in any order. If multiple words of the
+		 * same category appear, the last one wins.
+		 */
+
 		p = check->bi->data;
 		while (*p && *p != '\n' && *p != '\r')
 			p++;
@@ -899,57 +924,148 @@
 			set_server_check_status(check, check->status, "Ignoring incomplete line from agent");
 			break;
 		}
+
 		*p = 0;
+		cmd = check->bi->data;
 
-		/*
-		 * The agent may have been disabled after a check was
-		 * initialised.  If so, ignore weight changes and drain
-		 * settings from the agent.  Note that the setting is
-		 * always present in the state of the agent the server,
-		 * regardless of if the agent is being run as a primary or
-		 * secondary check. That is, regardless of if the check
-		 * parameter of this function is the agent or check field
-		 * of the server.
-		 */
-		disabled = !(check->server->agent.state & CHK_ST_ENABLED);
+		while (*cmd) {
+			/* look for next word */
+			if (*cmd == ' ' || *cmd == '\t' || *cmd == ',') {
+				cmd++;
+				continue;
+			}
 
-		if (strchr(check->bi->data, '%')) {
-			if (disabled)
+			if (*cmd == '#') {
+				/* this is the beginning of a health status description,
+				 * skip the sharp and blanks.
+				 */
+				cmd++;
+				while (*cmd == '\t' || *cmd == ' ')
+					cmd++;
 				break;
-			desc = server_parse_weight_change_request(s, check->bi->data);
-			if (!desc) {
-				status = HCHK_STATUS_L7OKD;
-				desc = check->bi->data;
 			}
-		} else if (!strcasecmp(check->bi->data, "drain")) {
-			if (disabled)
-				break;
-			desc = server_parse_weight_change_request(s, "0%");
-			if (!desc) {
-				desc = "drain";
+
+			/* find the end of the word so that we have a null-terminated
+			 * word between <cmd> and <p>.
+			 */
+			p = cmd + 1;
+			while (*p && *p != '\t' && *p != ' ' && *p != '\n' && *p != ',')
+				p++;
+			if (*p)
+				*p++ = 0;
+
+			/* first, health statuses */
+			if (strcasecmp(cmd, "up") == 0) {
+				check->health = check->rise + check->fall - 1;
 				status = HCHK_STATUS_L7OKD;
+				hs = cmd;
+			}
+			else if (strcasecmp(cmd, "down") == 0) {
+				check->health = 0;
+				status = HCHK_STATUS_L7STS;
+				hs = cmd;
 			}
-		} else if (!strncasecmp(check->bi->data, "down", strlen("down"))) {
-			down_cmd = "down";
-		} else if (!strncasecmp(check->bi->data, "stopped", strlen("stopped"))) {
-			down_cmd = "stopped";
-		} else if (!strncasecmp(check->bi->data, "fail", strlen("fail"))) {
-			down_cmd = "fail";
+			else if (strcasecmp(cmd, "stopped") == 0) {
+				check->health = 0;
+				status = HCHK_STATUS_L7STS;
+				hs = cmd;
+			}
+			else if (strcasecmp(cmd, "fail") == 0) {
+				check->health = 0;
+				status = HCHK_STATUS_L7STS;
+				hs = cmd;
+			}
+			/* admin statuses */
+			else if (strcasecmp(cmd, "ready") == 0) {
+				as = cmd;
+			}
+			else if (strcasecmp(cmd, "drain") == 0) {
+				as = cmd;
+			}
+			else if (strcasecmp(cmd, "maint") == 0) {
+				as = cmd;
+			}
+			/* else try to parse a weight here and keep the last one */
+			else if (isdigit((unsigned char)*cmd) && strchr(cmd, '%') != NULL) {
+				ps = cmd;
+			}
+			else {
+				/* keep a copy of the first error */
+				if (!err)
+					err = cmd;
+			}
+			/* skip to next word */
+			cmd = p;
+		}
+		/* here, cmd points either to \0 or to the beginning of a
+		 * description. Skip possible leading spaces.
+		 */
+		while (*cmd == ' ' || *cmd == '\n')
+			cmd++;
+
+		/* First, update the admin status so that we avoid sending other
+		 * possibly useless warnings and can also update the health if
+		 * present after going back up.
+		 */
+		if (as) {
+			if (strcasecmp(as, "drain") == 0)
+				srv_adm_set_drain(check->server);
+			else if (strcasecmp(as, "maint") == 0)
+				srv_adm_set_maint(check->server);
+			else
+				srv_adm_set_ready(check->server);
 		}
 
-		if (down_cmd) {
-			const char *end = check->bi->data + strlen(down_cmd);
-			/*
-			 * The command keyword must terminated the string or
-			 * be followed by a blank.
+		/* now change weights */
+		if (ps) {
+			const char *msg;
+
+			msg = server_parse_weight_change_request(s, ps);
+			if (!wrn || !*wrn)
+				wrn = msg;
+		}
+
+		/* and finally health status */
+		if (hs) {
+			/* We'll report some of the warnings and errors we have
+			 * here. Down reports are critical, we leave them untouched.
+			 * Lack of report, or report of 'UP', leaves room for
+			 * ERR first, then WARN.
 			 */
-			if (end[0] == '\0' || end[0] == ' ' || end[0] == '\t') {
-				status = HCHK_STATUS_L7STS;
-				desc = check->bi->data;
+			const char *msg = cmd;
+			struct chunk *t;
+
+			if (!*msg || status == HCHK_STATUS_L7OKD) {
+				if (err && *err)
+					msg = err;
+				else if (wrn && *wrn)
+					msg = wrn;
 			}
+
+			t = get_trash_chunk();
+			chunk_printf(t, "via agent : %s%s%s%s",
+				     hs, *msg ? " (" : "",
+				     msg, *msg ? ")" : "");
+
+			set_server_check_status(check, status, t->str);
 		}
+		else if (err && *err) {
+			/* No status change but we'd like to report something odd.
+			 * Just report the current state and copy the message.
+			 */
+			chunk_printf(&trash, "agent reports an error : %s", err);
+			set_server_check_status(check, status/*check->status*/, trash.str);
 
-		set_server_check_status(check, status, desc);
+		}
+		else if (wrn && *wrn) {
+			/* No status change but we'd like to report something odd.
+			 * Just report the current state and copy the message.
+			 */
+			chunk_printf(&trash, "agent warns : %s", wrn);
+			set_server_check_status(check, status/*check->status*/, trash.str);
+		}
+		else
+			set_server_check_status(check, status, NULL);
 		break;
 	}