BUG/MAJOR: stream-int: always detach a faulty endpoint on connect failure

During the 2.0 development cycle we experienced a long saga of issues
related to the inconsistent handling of errors during connection setup
and connection reuse, which ultimately led to the new SI_ST_RDY state
being introduced (commits 4f283fa6 & b27f54a8) and which finally ended
after a long code audit in depth resulted in commits 04400bc7 ("BUG/MAJOR:
stream-int: Don't receive data from mux until SI_ST_EST is reached") and
e8f5f5d8b ("BUG/MEDIUM: servers: Only set SF_SRV_REUSED if the connection
if fully ready"). By the way the former already mentions the design flaw
in the connection handling, which was ultimately addressed in 2.2 by
moving the connection's address to the stream and changing how
connect_server() works so that the previous connection can be released
much earlier.

Among the initial attempts at fixing the issues, commit bf66bd1b8
("MEDIUM: stream-int: always mark pending outgoing SI_ST_CON") took
care of not closing a failed reuse of a connection because that case
was known to leave the stream waiting forever.

But this copes very badly with L7 retries. It caused the crash that commit
926c4fc05 ("BUG/MEDIUM: backend: don't access a non-existing mux from a
previous connection") tried to solve but this one is not enough. The
detailed situation could be reproduced and analysed now. The issue
happens only when ALPN is enabled on a server, at least two servers are
present the backend with two of them producing an error at least once,
a non-determinist LB algorithm is used, option redispatch is present
and L7 retries are enabled. A client makes a first request which is sent
to server 1, then a second over the same connection, which is sent to
server 1 again, but this time the server fails to respond then stops
(e.g. 503 with L7 retries enabled). The request is aborted and a retry
is performed, which this time ends on error because the server doesn't
listen anymore. Due to ALPN, the mux was not yet assigned. Due to the
SF_SRV_REUSED flag, the failed connection is not closed in
sess_update_st_con_tcp(). The next attempt will try to use the same
connection again, which is still attached to the stream-interface and
had its transport layer attached, but no mux yet. Connect_server() goes
on and reaches si_connect() which dereferences conn->mux and dies. If
no redispatch is involved, no server is reassigned and connect_server()
wants to reuse the connection address (which was not set), assigns the
control layer based on protocol_by_family() on that empty address, and
si_connect() rightfully returns an internal error since it's normally
not possible to have a NULL control layer here.

An example config to reproduce this is this one:

   listen h2h1
        mode http
        bind :4445
        retry-on all-retryable-errors
        option prefer-last-server
        option redispatch
        server first 127.0.0.1:5555 ssl verify none alpn h2
        server fail 255.255.255.255:255

Just run openssl s_server -accept 5555 -cert rsa+dh2048.pem on localhost
responding 204 once, then 503, and launch two requests to see the process
die (or see the 500 without redispatch) :

  $ curl 127.0.0.1:4445/ 127.0.0.1:4445/

The most durable way to address this in 2.0 would be to backport all
the changes done in 2.1 and 2.2 to connect_server() and related parts,
but these ones are extremely tricky, and failing to adapt them or
missing the slightest change could have way more impact than the
current bug.

A better alternative was found, consisting in doing almost the same
as commits 7c30642ede ("MEDIUM: streams: Don't close the connection
in back_handle_st_con().") and 1fc5a648bf ("MEDIUM: streams: Don't
close the connection in back_handle_st_rdy()."). These ones removed
the conn_full_close() from the connection error management because
in newer versions it was expected to be done either on the stream's
error path or on the next retry in connect_server(). On 2.0 we can
change this to si_release_endpoint() (which is not automatic in
connect_server) and make it unconditional and not only for non-reused
connections thanks to the patches above which removed the immediate
transition to SI_ST_EST and the early use of SF_SRV_REUSED.

Thanks to this we don't keep an incomplete connection anymore and
have no risk of dereferencing its elements without being able to
finalize their initialization.

It's worth noting that the test above behaves slightly differently in 2.0
than in 2.1+, as a failed l7 retry over a reused connection will return
a silent close allowing the client to retry, while in 2.1+ it always
returns the last error, but this is unrelated to this fix and both
cases are valid.

In addition it was found to address a nasty side effect that could
affect server-side H2 till 2.0: if a stream fails to established on an
outgoing multiplexed connection (e.g. because a GOAWAY frame was received
in between limiting the highest stream ID), instead of detaching the
stream only, the whole connection was killed, causing the death of all
other streams that were still active on it. By using si_release_endpoint()
the appropriate method to detach is now used instead.

As explained above there is no direct mainline commit ID for this fix,
and it must not be backported as it's solely for 2.0.

Cc: Olivier Houchard <cognet@ci0.org>
1 file changed