BUG/MINOR: epoll: do not actively poll for Rx after an error
In 2.2, commit 5d7dcc2a8 ("OPTIM: epoll: always poll for recv if neither
active nor ready") was added to compensate for the fact that our iocbs
are almost always asynchronous now and do not have the opportunity to
update the FD correctly. As such, they just perform a wakeup, the FD is
turned to inactive, the tasklet wakes up, performs the I/O, updates the
FD, most of the time this is done withing the same polling loop, and the
update cancels itself in the poller without having to switch the FD off
then on.
The issue was that when deciding to claim an FD was active for reads
if it was active for writes, we forgot one situation that unfortunately
causes excessive wakeups: dealing with errors. Indeed, errors are
reported and keep ringing as long as the FD is active for sending even
if the consumer disabled the FD for receiving. Usually this only causes
one extra wakeup for the time it takes to consider a potential write
subscriber and to call it, though with many tasks in a run queue, it
can last a bit longer and be reported more often.
The fix consists in checking that we really want to get more receive
events on this FD, that is:
- that no prevous EPOLLERR was reported
- that the FD doesn't carry a sticky error
- that the FD is not shut for reads
With this, after the last epoll_wait() reports EPOLLERR, one last recv()
is performed to flush pending data and the FD is immediately unregistered.
It's probably not needed to backport this as its effects are not much
visible, though it should not harm.
Before, EPOLLERR was seen twice:
accept4(4, {sa_family=AF_INET, sin_port=htons(22314), sin_addr=inet_addr("127.0.0.1")}, [128 => 16], SOCK_NONBLOCK) = 8
accept4(4, 0x261b160, [128], SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(8, "POST / HTTP/1.1\r\nConnection: close\r\nTransfer-encoding: chunk"..., 16320, 0, NULL, NULL) = 66
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 9
connect(9, {sa_family=AF_INET, sin_port=htons(8002), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
epoll_ctl(3, EPOLL_CTL_ADD, 8, {events=EPOLLIN|EPOLLRDHUP, data={u32=8, u64=8}}) = 0
epoll_ctl(3, EPOLL_CTL_ADD, 9, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP, data={u32=9, u64=9}}) = 0
epoll_wait(3, [{events=EPOLLOUT, data={u32=9, u64=9}}], 200, 355) = 1
recvfrom(9, 0x25cfb30, 16320, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
sendto(9, "POST / HTTP/1.1\r\ntransfer-encoding: chunked\r\n\r\n", 47, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 47
epoll_ctl(3, EPOLL_CTL_MOD, 9, {events=EPOLLIN|EPOLLRDHUP, data={u32=9, u64=9}}) = 0
epoll_wait(3, [{events=EPOLLIN|EPOLLERR|EPOLLHUP|EPOLLRDHUP, data={u32=9, u64=9}}], 200, 354) = 1
recvfrom(9, "HTTP/1.1 200 OK\r\ncontent-length: 0\r\nconnection: close\r\n\r\n", 16320, 0, NULL, NULL) = 57
sendto(8, "HTTP/1.1 200 OK\r\ncontent-length: 0\r\nconnection: close\r\n\r\n", 57, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 57
->epoll_wait(3, [{events=EPOLLIN|EPOLLERR|EPOLLHUP|EPOLLRDHUP, data={u32=9, u64=9}}], 200, 354) = 1
epoll_ctl(3, EPOLL_CTL_DEL, 9, 0x7ffe0b65fb24) = 0
epoll_wait(3, [{events=EPOLLIN, data={u32=8, u64=8}}], 200, 354) = 1
recvfrom(8, "A\n0123456789\r\n0\r\n\r\n", 16320, 0, NULL, NULL) = 19
close(9) = 0
close(8) = 0
After, EPOLLERR is only seen only once, with one less call to epoll_wait():
accept4(4, {sa_family=AF_INET, sin_port=htons(22362), sin_addr=inet_addr("127.0.0.1")}, [128 => 16], SOCK_NONBLOCK) = 8
accept4(4, 0x20d0160, [128], SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(8, "POST / HTTP/1.1\r\nConnection: close\r\nTransfer-encoding: chunk"..., 16320, 0, NULL, NULL) = 66
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 9
connect(9, {sa_family=AF_INET, sin_port=htons(8002), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
epoll_ctl(3, EPOLL_CTL_ADD, 8, {events=EPOLLIN|EPOLLRDHUP, data={u32=8, u64=8}}) = 0
epoll_ctl(3, EPOLL_CTL_ADD, 9, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP, data={u32=9, u64=9}}) = 0
epoll_wait(3, [{events=EPOLLOUT, data={u32=9, u64=9}}], 200, 411) = 1
recvfrom(9, 0x2084b30, 16320, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
sendto(9, "POST / HTTP/1.1\r\ntransfer-encoding: chunked\r\n\r\n", 47, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 47
epoll_ctl(3, EPOLL_CTL_MOD, 9, {events=EPOLLIN|EPOLLRDHUP, data={u32=9, u64=9}}) = 0
epoll_wait(3, [{events=EPOLLIN|EPOLLERR|EPOLLHUP|EPOLLRDHUP, data={u32=9, u64=9}}], 200, 411) = 1
recvfrom(9, "HTTP/1.1 200 OK\r\ncontent-length: 0\r\nconnection: close\r\n\r\n", 16320, 0, NULL, NULL) = 57
sendto(8, "HTTP/1.1 200 OK\r\ncontent-length: 0\r\nconnection: close\r\n\r\n", 57, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 57
epoll_ctl(3, EPOLL_CTL_DEL, 9, 0x7ffc95d46f04) = 0
epoll_wait(3, [{events=EPOLLIN, data={u32=8, u64=8}}], 200, 411) = 1
recvfrom(8, "A\n0123456789\r\n0\r\n\r\n", 16320, 0, NULL, NULL) = 19
close(9) = 0
close(8) = 0
1 file changed