Diff - 0630038e771d4d08ae726080e2ef240d5ddaba68^! - haproxy

commit	0630038e771d4d08ae726080e2ef240d5ddaba68
author	Willy Tarreau <w@1wt.eu>	Tue Feb 02 15:42:25 2021 +0100
committer	Willy Tarreau <w@1wt.eu>	Tue Feb 02 15:55:53 2021 +0100
tree	9d6263c3857fa8bc96e76fbf6ec482e5551b9749
parent	8695ce0bae21238eba660438c819797a245be71e

BUG/MEDIUM: ssl: check a connection's status before computing a handshake

As spotted in issue #822, we're having a problem with error detection in
the SSL layer. The problem is that on an overwhelmed machine, accepted
connections can start to pile up, each of them requiring a slow handshake,
and during all this time if the client aborts, the handshake will still be
calculated.

The error controls are properly placed, it's just that the SSL layer
reads records of exactly the advertised size, so it never gets a chance
to encounter a pending connection error. As such, when injecting many TLS
connections into a listener with a huge backlog, it's quite possible to
meet this situation:

  12:50:48.236056 accept4(8, {sa_family=AF_INET, sin_port=htons(62794), sin_addr=inet_addr("127.0.0.1")}, [128->16], SOCK_NONBLOCK) = 1109
  12:50:48.236071 setsockopt(1109, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  (process other connections' handshakes)

  12:50:48.257270 getsockopt(1109, SOL_SOCKET, SO_ERROR, [ECONNRESET], [4]) = 0
  (proof that error was detectable there but this code was added for the PoC)

  12:50:48.257297 recvfrom(1109, "\26\3\1\2\0", 5, 0, NULL, NULL) = 5
  12:50:48.257310 recvfrom(1109, "\1\0\1\3"..., 512, 0, NULL, NULL) = 512

  (handshake calculation taking 700us)

  12:50:48.258004 sendto(1109, "\26\3\3\0z"..., 1421, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = -1 EPIPE (Broken pipe)
  12:50:48.258036 close(1109)             = 0

The situation was amplified by the multi-queue accept code, as it resulted
in many incoming connections being accepted long before they could be
handled. Prior to this they would have been accepted and the handshake
immediately started, which would have resulted in most of the connections
waiting in the system's accept queue, and dying there when the client
aborted, thus the error would have been detected before even trying to
pass them to the handshake code.

As a result, with a listener running on a very large backlog, it's possible
to quickly accept tens of thousands of connections and waste time slowly
running their handshakes while they get replaced by other ones.

This patch adds an SO_ERROR check on the connection's FD before starting
the handshake. This is not pretty as it requires accessing the FD, but it
does the job.
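
For readers unfamiliar with SO_ERROR, the check can be sketched outside of
HAProxy as the small helper below. The name sock_pending_error() is
hypothetical and does not exist in the code base; the sketch only assumes
the standard getsockopt() call also used by the patch.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper, not a HAProxy function. Returns 0 when no error is
 * pending on <fd>, otherwise the pending error code (e.g. ECONNRESET). If
 * getsockopt() itself fails (e.g. EBADF), the resulting errno is returned
 * instead, so any non-zero value means "do not start expensive work".
 * Note that reading SO_ERROR also clears the pending error on the socket.
 */
static int sock_pending_error(int fd)
{
	int skerr = 0;
	socklen_t lskerr = sizeof(skerr);

	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &skerr, &lskerr) < 0)
		return errno;
	return skerr;
}
```

A caller would test the return value right before an expensive step and
bail out on anything non-zero, which is what the patch does inline with
"goto out_error".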

Some improvements should be made over the long term so that the transport
layers can report extra information with their ->rcv_buf() call, or at the
very least, implement a ->get_conn_status() function to report various
flags such as shutr, shutw, error at various stages, allowing an upper
layer to check whether it is worth engaging in a long operation when
it's known the connection is not usable anymore. An even simpler step
could probably consist in implementing this in the control layer.
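
As a rough illustration of the ->get_conn_status() idea only, the sketch
below shows how an upper layer could query such a callback before a long
operation. Every name in it (the demo_* types, flags and functions) is
made up for the example and is an assumption, not an existing or planned
HAProxy interface.

```c
#include <stddef.h>

/* All names below are hypothetical and only illustrate the idea. */
enum demo_conn_status {
	DEMO_ST_SHUTR = 1 << 0, /* read side known to be shut down */
	DEMO_ST_SHUTW = 1 << 1, /* write side known to be shut down */
	DEMO_ST_ERROR = 1 << 2, /* error known to the lower layer */
};

struct demo_conn;

/* Operations exposed by a lower (transport) layer. */
struct demo_xprt_ops {
	unsigned int (*get_conn_status)(const struct demo_conn *conn);
};

struct demo_conn {
	const struct demo_xprt_ops *xprt; /* may be NULL */
	unsigned int lower_flags;         /* what the lower layer knows */
};

/* Trivial lower layer: reports the flags it has recorded. */
static unsigned int demo_raw_get_status(const struct demo_conn *conn)
{
	return conn->lower_flags;
}

static const struct demo_xprt_ops demo_raw_ops = {
	.get_conn_status = demo_raw_get_status,
};

/* Upper-layer check: returns 1 if starting a long operation still makes
 * sense, 0 if the lower layer reports a shutdown or an error. Without a
 * callback there is no information, so the connection is assumed usable.
 */
static int demo_worth_handshaking(const struct demo_conn *conn)
{
	if (!conn->xprt || !conn->xprt->get_conn_status)
		return 1;
	return conn->xprt->get_conn_status(conn) == 0;
}
```

The point of the design is that the upper layer never touches the FD: it
only asks the layer below what it already knows.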

This patch is simple enough to be backported as far as 2.0.

Many thanks to @ngaugler for his numerous tests with detailed feedback.

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index c25f447..628a9b0 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c

@@ -5345,6 +5345,9 @@
 	struct ssl_counters *counters_px = NULL;
 	struct listener *li;
 	struct server *srv;
+	socklen_t lskerr;
+	int skerr;
+
 
 	if (!conn_ctrl_ready(conn))
 		return 0;
@@ -5372,6 +5375,21 @@
 	if (!conn->xprt_ctx)
 		goto out_error;
 
+	/* don't start calculating a handshake on a dead connection */
+	if (conn->flags & (CO_FL_ERROR | CO_FL_SOCK_RD_SH | CO_FL_SOCK_WR_SH))
+		goto out_error;
+
+	/* FIXME/WT: for now we don't have a clear way to inspect the connection
+	 * status from the lower layers, so let's check the FD directly. Ideally
+	 * the xprt layers should provide some status indicating their knowledge
+	 * of shutdowns or error.
+	 */
+	skerr = 0;
+	lskerr = sizeof(skerr);
+	if ((getsockopt(conn->handle.fd, SOL_SOCKET, SO_ERROR, &skerr, &lskerr) < 0) ||
+	    skerr != 0)
+		goto out_error;
+
 #ifdef SSL_READ_EARLY_DATA_SUCCESS
 	/*
 	 * Check if we have early data. If we do, we have to read them