96d9a7f89222f41350cc847dfd205d3ad41d9cec - haproxy

commit	96d9a7f89222f41350cc847dfd205d3ad41d9cec
author	Willy Tarreau <w@1wt.eu>	Fri Jan 28 18:40:06 2022 +0100
committer	Willy Tarreau <w@1wt.eu>	Thu Feb 17 06:49:15 2022 +0100
tree	67269949407c8fba34034653a0c34b5f1f9607b2
parent	9f77576196d94fee7b8e45d67c3f132f8e28b4ae

BUG/MEDIUM: mworker: close unused transferred FDs on load failure

When the master process is reloaded on a new config, it will try to
connect to the previous process' socket to retrieve all known
listening FDs to be reused by the new listeners. If listeners were
removed, their unused FDs are simply closed.

However there's a catch. In case a socket fails to bind, the master
will cancel its startup and switch to wait mode for a new operation
to happen. In this case it didn't close the possibly remaining FDs
that were left unused.

It is very hard to hit this case, but it can happen during a
troubleshooting session with fat fingers. For example, let's say
a config runs like this:

    frontend ftp
        bind 1.2.3.4:20000-29999

The admin wants to extend the port range down to 10000-29999 and
by mistake ends up with:

    frontend ftp
        bind 1.2.3.41:20000-29999

Upon restart the bind will fail if the address is not present, and the
master will then switch to wait mode without releasing the previous FDs
for 1.2.3.4:20000-29999 since they're now apparently unused. Then once
the admin fixes the config and does:

    frontend ftp
        bind 1.2.3.4:10000-29999

The service will start, but will bind new sockets, half of them
overlapping with the previous ones that were not properly closed. This
may result in a startup error (if SO_REUSEPORT is not enabled or not
available), in FD number exhaustion (if the error is repeated many
times), or in connections being randomly accepted by the process if
they sometimes land on the old FD that nobody listens on.

This patch will need to be backported as far as 1.8, and depends on
previous patch:

    MINOR: sock: move the unused socket cleaning code into its own function

Note that before 2.3 most of the code was located inside haproxy.c, so
the patch above should probably relocate the function there instead of
sock.c.

(cherry picked from commit e08acaed19c6d6a86ebaf2b5f3089ebef78bc69d)
Signed-off-by: William Lallemand <wlallemand@haproxy.org>
(cherry picked from commit 16b5687af872dd5570687271796f4356791edefb)
[wt: added to reexec_on_failure() before calling mworker_reload()]
Signed-off-by: Willy Tarreau <w@1wt.eu>
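
For illustration only, the idea behind the fix can be sketched as a
small standalone C program fragment. This is not the haproxy code: the
structure xfer_fd and the functions drop_unused_fds() and startup() are
hypothetical names invented here, and bind_ok merely stands in for
"all binds succeeded". The point it tries to show is that the cleanup
of inherited but unclaimed FDs has to run on the failure path too,
before the process goes idle, and not only after a successful load.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical record for one FD inherited from the previous process. */
struct xfer_fd {
    int fd;   /* inherited descriptor, or -1 once released */
    int used; /* non-zero if a new listener adopted this FD */
};

/* Close every inherited FD that no listener claimed.
 * Returns the number of descriptors that were closed.
 */
static int drop_unused_fds(struct xfer_fd *tab, int n)
{
    int i, closed = 0;

    assert(tab != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (tab[i].fd >= 0 && !tab[i].used) {
            close(tab[i].fd);
            tab[i].fd = -1; /* never close the same FD twice */
            closed++;
        }
    }
    return closed;
}

/* Hypothetical startup step. bind_ok is an assumption that models
 * whether every listener managed to bind. Returns 0 on success and
 * -1 when the caller should fall back to waiting for a new config.
 */
static int startup(struct xfer_fd *tab, int n, int bind_ok)
{
    if (!bind_ok) {
        /* Failure path: release the leftovers before waiting,
         * otherwise they stay open across the next reload.
         */
        drop_unused_fds(tab, n);
        return -1;
    }
    /* Success path: FDs of removed listeners are closed as usual. */
    drop_unused_fds(tab, n);
    return 0;
}
```

Marking a released entry with -1 keeps the cleanup idempotent, so in
this sketch it is harmless to call it from both paths or more than once.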