BUG/MAJOR: connection: always disable ready events once reported
This effectively reverts the two following commits:
6f95f6e11 ("OPTIM: connection: disable receiving on disabled events when the run queue is too high")
065a02561 ("MEDIUM: connection: don't stop receiving events in the FD handler")
The problem as reported in issue #662 is that when the events signals
the readiness of input data that has to be forwarded over a congested
stream, the mux will read data and wake the stream up to forward them,
but the buffer full condition makes this impossible immediately, then
nobody in the chain will be able to disable the event after it was
first reported. And given we don't know at the connection level whether
an event was already reported or not, we can't decide anymore to
forcefully stop it if for any reason its processing gets delayed.
The problem is magnified in issue #662 by the fact that a shutdown is
reported with pending data occupying the buffer. The shutdown will
strike in loops and cause the upper layer stream to be notified until
it's handled, but with a buffer full it's not possible to call cs_recv()
hence to purge the event.
All this can only be handled optimally by implementing a lower layer,
direct mux-to-mux forwarding that will not require any scheduling. This
was no wake up will be needed and the event will be instantly handled
or paused for a long time.
For now let's simply revert these optimizations. Running a 1 MB transfer
test over H2 using 8 connections having each 32 streams with a limited
link of 320 Mbps shows the following profile before this fix:
calls syscall (100% CPU)
------ -------
259878 epoll_wait
519759 clock_gettime
17277 sendto
17129 recvfrom
672 epoll_ctl
And the following one after the fix:
calls syscall (2-3% CPU)
------ -------
17201 sendto
17357 recvfrom
2304 epoll_wait
4609 clock_gettime
1200 epoll_ctl
Thus the behavior is much better.
No backport is needed as these patches were only in 2.2-dev.
Many thanks to William Dauchy for reporting a lot of details around this
difficult issue.
1 file changed