BUG/MEDIUM: evports: do not clear returned events list on signal

Since 2.0 with commit 0ba4f483d2 ("MAJOR: polling: add event ports
support (Solaris)"), the polling system on Solaris suffers from a
signal handling problem. It turns out that this API is unusual:
reported events are automatically unregistered, and the number of
returned events is written to the same variable that was used to pass
the count on input, which makes certain error codes difficult to
handle (how should one handle ENOSYS for example?). On top of that,
the API may return both EINTR and an event when a signal is reported.

The code handles some of these cases (e.g. ETIME on timeout may also
report an event) and otherwise defaults to clearing the event counter
upon error. As a result, EINTR wipes the list of returned events, and
since the system has already removed them from the set, these events
are lost.

This is visible when using external checks, where the SIGCHLD of the
exiting child causes a wakeup that resets the event counter and leads
to endless loops, apparently because the inter-thread byte queued in
the pipe used to wake threads up never gets removed in this case.
Note that extcheck would also deserve deeper investigation because it
can immediately re-trigger a check in such a case, which is not normal.

Removing the wiping of the nevlist variable fixes the problem.
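
As a rough illustration (not the actual HAProxy code), the effect of
the wipe can be reproduced with a stand-in for the event ports call.
fake_port_getn() and collect_events() are hypothetical names; the
stand-in only assumes the behaviour described above, i.e. a failure
with EINTR while one event is reported through the in/out counter.

```c
#include <errno.h>

/* Hypothetical stand-in for the event ports call: it fails with
 * EINTR but still reports one event through the in/out counter.
 */
static int fake_port_getn(unsigned int *nget)
{
	*nget = 1;     /* one event returned (and unregistered) */
	errno = EINTR; /* ... yet a signal is reported as well */
	return -1;
}

/* Returns the number of events the caller will go on to process.
 * A non-zero wipe_on_error mimics the behaviour before the fix.
 */
static unsigned int collect_events(int wipe_on_error)
{
	unsigned int nevlist = 1; /* in: wanted count, out: returned */
	int ret = fake_port_getn(&nevlist);

	if (ret == -1 && errno != ETIME && wipe_on_error)
		nevlist = 0; /* the reported event is dropped here */

	return nevlist;
}
```

With wipe_on_error set, the caller sees no event although one was
returned; without it, the event is kept and can be processed.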

This can be backported to all versions, as the issue has been present
since 2.0.

(cherry picked from commit 36d92dcd9b62c5af0d7499c07479d6995565db9f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit b50266791d70658ab7f600c6ff53d9893e088523)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit ffdceda0b4a01e9d3e67b54f2ed94ba521513899)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit a7ee7b4385d75ad0b5cee56f196b3df48bf50f22)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
1 file changed