1c759956112996245eaccbf20e2506b9c9cbceb2 - haproxy

commit	1c759956112996245eaccbf20e2506b9c9cbceb2	[log] [tgz]
author	Willy Tarreau <w@1wt.eu>	Tue Dec 10 18:38:09 2019 +0100
committer	Willy Tarreau <w@1wt.eu>	Tue Dec 10 19:09:15 2019 +0100
tree	d01236e8948cddb470fedd09f4e882cae5560e9f
parent	eaefc3c5032506e89cceb6ad5fdd1c5955c4ea66 [diff]

BUG/MAJOR: dns: add minimalist error processing on the Rx path

It was reported in bug #399 that the DNS sometimes enters endless loops
after hours working fine. The issue is caused by a lack of error
processing in the DNS's recv() path combined with an exclusive recv OR
send in the UDP layer, resulting in some errors causing CPU loops that
will never stop until the process is restarted.

The basic cause is that the FD_POLL_ERR and FD_POLL_HUP flags are sticky
on the FD, and contrary to a stream socket, receiving an error on a
datagram socket doesn't indicate that this socket cannot be used anymore.
Thus the Rx code must at least handle this situation and flush the error
otherwise it will constantly be reported. In theory this should not be a
big issue but in practise it is due to another bug in the UDP datagram
handler which prevents the send() callback from being called when Rx
readiness was reported, so the situation cannot go away. It happens way
more easily with threads enabled, so that there is no dead time between
the moment the FD is disabled and another recv() is called, such as in
the example below where the request was sent to a closed port on the
loopback provoking an ICMP unreachable to be sent back:

  [pid 20888] 18:26:57.826408 sendto(29, ";\340\1\0\0\1\0\0\0\0\0\1\0031wt\2eu\0\0\34\0\1\0\0)\2\0\0\0\0\0\0\0", 35, 0, NULL, >
  [pid 20893] 18:26:57.826566 recvfrom(29, 0x7f97c54ef2f0, 513, 0, NULL, NULL) = -1 ECONNREFUSED (Connection refused)
  [pid 20889] 18:26:57.826601 recvfrom(29, 0x7f97c76182f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20892] 18:26:57.826630 recvfrom(29, 0x7f97c5cf02f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20891] 18:26:57.826684 recvfrom(29, 0x7f97c66162f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20895] 18:26:57.826716 recvfrom(29, 0x7f97bffda2f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20894] 18:26:57.826747 recvfrom(29, 0x7f97c4cee2f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20888] 18:26:58.419838 recvfrom(29, 0x7ffcc8712c20, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20893] 18:26:58.419900 recvfrom(29, 0x7f97c54ef2f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  (... hundreds before next sendto() ...)

This situation was handled by clearing HUP and ERR when recv()
returns <0.

A second case was handled, there was a control for a missing dgram
handler, but it does nothing, causing the FD to ring again if this
situation ever happens. After looking at the rest of the code, it
doesn't seem possible to face such a situation because these handlers
are registered during startup, but at least we need to handle it
properly.

A third case was handled, that's mainly a small optimization. With
threads and massive responses, due to the large lock around the loop,
it's likely that some threads will have seen fd_recv_ready() and will
wait at the lock(). But if they wait here, chances are that other
threads will have eliminated pending data and issued fd_cant_recv().
In this case, better re-check fd_recv_ready() before performing the
recv() call to avoid the huge amounts of syscalls that happen on
massively threaded setups.

This patch must be backported as far as 1.6 (the atomic AND just
needs to be turned to a regular AND).

src/dns.c[diff]

1 file changed

tree: d01236e8948cddb470fedd09f4e882cae5560e9f