BUG/MEDIUM: ssl: check a connection's status before computing a handshake

As spotted in issue #822, we're having a problem with error detection in
the SSL layer. On an overwhelmed machine, accepted connections can start
to pile up, each of them requiring a slow handshake, and if the client
aborts in the meantime, the handshake will still be computed.

The error checks are properly placed, it's just that the SSL layer reads
records of exactly the advertised size, so it never gets a chance to
encounter a pending connection error. As such, when injecting many TLS
connections into a listener with a huge backlog, it's quite possible to
meet this situation:

  12:50:48.236056 accept4(8, {sa_family=AF_INET, sin_port=htons(62794), sin_addr=inet_addr("127.0.0.1")}, [128->16], SOCK_NONBLOCK) = 1109
  12:50:48.236071 setsockopt(1109, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  (process other connections' handshakes)

  12:50:48.257270 getsockopt(1109, SOL_SOCKET, SO_ERROR, [ECONNRESET], [4]) = 0
  (proof that the error was detectable there; this call was only added for the PoC)

  12:50:48.257297 recvfrom(1109, "\26\3\1\2\0", 5, 0, NULL, NULL) = 5
  12:50:48.257310 recvfrom(1109, "\1\0\1\3"..., 512, 0, NULL, NULL) = 512

  (handshake calculation taking 700us)

  12:50:48.258004 sendto(1109, "\26\3\3\0z"..., 1421, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = -1 EPIPE (Broken pipe)
  12:50:48.258036 close(1109)             = 0

The situation was amplified by the multi-queue accept code, as it caused
many incoming connections to be accepted long before they could be
handled. Prior to this, they would have been accepted and the handshake
immediately started, which would have resulted in most of the connections
waiting in the system's accept queue and dying there when the client
aborted, so the error would have been detected before even trying to
pass them to the handshake code.

As a result, with a listener configured with a very large backlog, it's
possible to quickly accept tens of thousands of connections and waste time
slowly running their handshakes while they get replaced by other ones.

This patch adds an SO_ERROR check on the connection's FD before starting
the handshake. This is not pretty as it requires accessing the FD directly,
but it does the job.
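
For illustration only, the kind of check described above can be sketched
as follows. This is a minimal sketch, not the code from the patch: the
helper name fd_pending_error() is hypothetical, and only the standard
getsockopt(SOL_SOCKET, SO_ERROR) call is assumed.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical helper (not the function used by the patch): returns 0 if
 * no error is pending on the socket, otherwise the pending errno value
 * (e.g. ECONNRESET). Note that reading SO_ERROR also clears the pending
 * error on the socket.
 */
static int fd_pending_error(int fd)
{
	int err = 0;
	socklen_t len = sizeof(err);

	/* if the query itself fails (e.g. EBADF), report that error */
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
		return errno;
	return err;
}
```

A caller would test the returned value before engaging in the expensive
handshake computation, and abort the connection if it is non-zero.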

Some improvements should be made over the long term so that the transport
layers can report extra information with their ->rcv_buf() call, or at the
very least, implement a ->get_conn_status() function to report various
flags such as shutr, shutw, error at various stages, allowing an upper
layer to ask whether it is worth engaging in a long operation when the
connection is known not to be usable anymore. An even simpler step would
probably be to implement this in the control layer.
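
To make the idea of a status callback more concrete, here is a rough
sketch of what it could look like. Nothing here exists in the code base:
the structure, the SK_ST_* flags and the function names are all made up
for the example, and the choice of which flags make a handshake pointless
(error or shutw here, but not shutr alone) is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status flags, named after those listed above */
#define SK_ST_SHUTR 0x1u  /* read side shut */
#define SK_ST_SHUTW 0x2u  /* write side shut */
#define SK_ST_ERROR 0x4u  /* error pending on the connection */

/* Hypothetical transport ops with an optional status callback */
struct xprt_sketch {
	unsigned int (*get_conn_status)(void *ctx);
};

/* Example callback for the sketch: ctx simply points to the flags */
static unsigned int sketch_status_from_ctx(void *ctx)
{
	return *(unsigned int *)ctx;
}

/* Returns true if starting a long operation still makes sense. When the
 * transport provides no callback, nothing is known and we proceed.
 */
static bool conn_worth_handshake(const struct xprt_sketch *x, void *ctx)
{
	if (x == NULL || x->get_conn_status == NULL)
		return true;
	return !(x->get_conn_status(ctx) & (SK_ST_ERROR | SK_ST_SHUTW));
}
```

With such a hook, the upper layer would no longer need to access the FD
itself to decide whether to start the handshake.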

This patch is simple enough to be backported as far as 2.0.

Many thanks to @ngaugler for his numerous tests with detailed feedback.

(cherry picked from commit 0630038e771d4d08ae726080e2ef240d5ddaba68)
[wt: context adjustments]
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit d287140ade12f0e5cc228929255b3d19c1f6f3cd)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 900dd4cec76daacd7a83fc423d0b8f14ff86e3e9)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit be849f6c5185c99a29bf8f225e14bc7b64bd20c6)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
1 file changed