BUG/MEDIUM: resolvers: Add a task on servers to check SRV resolution status
When a server relies on a SRV resolution, a task is created to clean it up
(fqdn/port and address) when the SRV resolution is considered outdated
(based on the resolvers 'timeout' value). This can only happen if the server
inherits outdated info from a state file and is no longer selected to be
attached to a SRV item. Note that most of the time, a server is attached to
a SRV item, so when the item becomes obsolete, the server is cleaned up
along with it.
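As an illustration only, the expiry logic described above can be modeled
with the standalone sketch below. It does not use HAProxy's types or tick
API: every sketch_* name is hypothetical, and the point where the timer is
armed (when stale info is loaded from a state file) is an assumption, since
this diff only shows the timer being disarmed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "timer disarmed" (plays the role of TICK_ETERNITY). */
#define SKETCH_TICK_ETERNITY 0u

/* Hypothetical, simplified stand-in for a server relying on a SRV record. */
struct sketch_server {
    const char *fqdn;   /* hostname inherited or learned, NULL once cleaned up */
    uint16_t port;      /* port inherited or learned, 0 once cleaned up */
    uint32_t expire;    /* deadline in ms, or SKETCH_TICK_ETERNITY */
};

/* Wrap-safe check: a disarmed timer never expires. */
static bool sketch_is_expired(uint32_t expire, uint32_t now)
{
    if (expire == SKETCH_TICK_ETERNITY)
        return false;
    return (int32_t)(now - expire) >= 0;
}

/* Assumption: stale info restored from a state file arms the timer. */
static void sketch_load_stale(struct sketch_server *srv, const char *fqdn,
                              uint16_t port, uint32_t now, uint32_t timeout)
{
    uint32_t deadline = now + timeout;

    srv->fqdn = fqdn;
    srv->port = port;
    /* Never store the sentinel value as a real deadline. */
    srv->expire = (deadline == SKETCH_TICK_ETERNITY) ? 1u : deadline;
}

/* Attaching the server to a live SRV item disarms the timer. */
static void sketch_attach(struct sketch_server *srv)
{
    srv->expire = SKETCH_TICK_ETERNITY;
}

/* Task body: clean the server up only once the deadline has passed.
 * Returns true when a cleanup was performed.
 */
static bool sketch_expire_task(struct sketch_server *srv, uint32_t now)
{
    if (!sketch_is_expired(srv->expire, now))
        return false;

    srv->fqdn = NULL;
    srv->port = 0;
    srv->expire = SKETCH_TICK_ETERNITY;
    return true;
}
```

In this model, a server that is attached to a SRV item before its deadline
is never touched by the task, while a server left with stale state-file
info is released once the timeout elapses.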
It is important to have such a task to be sure the server will be freed and
get a chance to be resolved again with fresh information. Of course, this
patch is a workaround for a design issue, but there is no other obvious way
to fix it without rewriting the whole resolvers part, and it must be
backportable.
This patch relies on the following commits:
* MINOR: resolvers: Clean server in a dedicated function when removing a SRV item
* MINOR: resolvers: Remove server from named_servers tree when removing a SRV item
The whole series must be backported as far as 2.2 after some observation
period. Backports to 2.0 and 1.8 must be evaluated.
(cherry picked from commit dcac41806239d4f3b2de5e4d1bacf9c03ca99efb)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 2c0f527b61837bfcd3c1a86c175778a1d5ba2e2d)
[cf: Changes applied in src/dns.c instead of src/resolvers.c]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
diff --git a/src/dns.c b/src/dns.c
index 1733fae..2394db5 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -654,6 +654,26 @@
HA_SPIN_UNLOCK(SERVER_LOCK, &srv->lock);
LIST_DEL(&srv->srv_rec_item);
LIST_ADDQ(&srv->srvrq->attached_servers, &srv->srv_rec_item);
+
+ srv->srvrq_check->expire = TICK_ETERNITY;
+}
+
+/* Takes care to cleanup a server resolution when it is outdated. This only
+ * happens for a server relying on a SRV record.
+ */
+static struct task *dns_srvrq_expire_task(struct task *t, void *context, unsigned short state)
+{
+ struct server *srv = context;
+
+ if (!tick_is_expired(t->expire, now_ms))
+ goto end;
+
+ HA_SPIN_LOCK(DNS_LOCK, &srv->srvrq->resolvers->lock);
+ dns_srvrq_cleanup_srv(srv);
+ HA_SPIN_UNLOCK(DNS_LOCK, &srv->srvrq->resolvers->lock);
+
+ end:
+ return t;
}
/* Checks for any obsolete record, also identify any SRV request, and try to
@@ -787,6 +807,7 @@
if (srv) {
/* re-enable DNS resolution for this server by default */
srv->flags &= ~SRV_F_NO_RESOLUTION;
+ srv->srvrq_check->expire = TICK_ETERNITY;
/* Check if an Additional Record is associated to this SRV record.
* Perform some sanity checks too to ensure the record can be used.
@@ -2594,17 +2615,29 @@
continue;
}
srv->resolvers = resolvers;
-
- if (srv->srvrq && !srv->srvrq->resolvers) {
- srv->srvrq->resolvers = srv->resolvers;
- if (dns_link_resolution(srv->srvrq, OBJ_TYPE_SRVRQ, 0) == -1) {
- ha_alert("config : %s '%s' : unable to set DNS resolution for server '%s'.\n",
+ srv->srvrq_check = NULL;
+ if (srv->srvrq) {
+ if (!srv->srvrq->resolvers) {
+ srv->srvrq->resolvers = srv->resolvers;
+ if (dns_link_resolution(srv->srvrq, OBJ_TYPE_SRVRQ, 0) == -1) {
+ ha_alert("config : %s '%s' : unable to set DNS resolution for server '%s'.\n",
+ proxy_type_str(px), px->id, srv->id);
+ err_code |= (ERR_ALERT|ERR_ABORT);
+ continue;
+ }
+ }
+ srv->srvrq_check = task_new(MAX_THREADS_MASK);
+ if (!srv->srvrq_check) {
+ ha_alert("config: %s '%s' : unable to create SRVRQ task for server '%s'.\n",
proxy_type_str(px), px->id, srv->id);
err_code |= (ERR_ALERT|ERR_ABORT);
- continue;
+ goto err;
}
+ srv->srvrq_check->process = dns_srvrq_expire_task;
+ srv->srvrq_check->context = srv;
+ srv->srvrq_check->expire = TICK_ETERNITY;
}
- if (!srv->srvrq && dns_link_resolution(srv, OBJ_TYPE_SERVER, 0) == -1) {
+ else if (dns_link_resolution(srv, OBJ_TYPE_SERVER, 0) == -1) {
ha_alert("config : %s '%s', unable to set DNS resolution for server '%s'.\n",
proxy_type_str(px), px->id, srv->id);
err_code |= (ERR_ALERT|ERR_ABORT);