BUG/MEDIUM: systemd: let the wrapper know that haproxy has completed or failed

Pierre Cheynier found that there's a persistent issue with the systemd
wrapper. Too fast reloads can lead to certain old processes not being
signaled at all and continuing to run. The problem was tracked down as
a race between the startup and the signal processing : nothing prevents
the wrapper from starting new processes while others are still starting,
and the resulting pid file will only contain the latest pids in this
case. This can happen with large configs and/or when a lot of SSL
certificates are involved.

In order to solve this we want the wrapper to wait for the new processes
to complete their startup. But we also want to ensure it doesn't wait for
nothing in case of error.

The solution found here is to create a pipe between the wrapper and the
sub-processes. The wrapper waits on the pipe and the sub-processes are
expected to close this pipe once they completed their startup. That way
we don't queue up new processes until the previous ones have registered
their pids to the pid file. And if anything goes wrong, the wrapper is
immediately released. The only thing is that we need the sub-processes
to know the pipe's file descriptor. We pass it in an environment variable
called HAPROXY_WRAPPER_FD.

It was confirmed both by Pierre and myself that this completely solves
the "zombie" process issue so that only the new processes continue to
listen on the sockets.

It seems that in the future this stuff could be moved to the haproxy
master process, also getting rid of an environment variable.

This fix needs to be backported to 1.6 and 1.5.
(cherry picked from commit b957109727f7fed556c049d40bacf45f0df211db)
(cherry picked from commit ff81ac47267c4e0227d1e3fbc5b1dedfd81a2a2f)
2 files changed