OPTIM: stream-int: don't disable polling anymore on DONT_READ
Commit 5fddab0 ("OPTIM: stream_interface: disable reading when
CF_READ_DONTWAIT is set") improved the connection layer's efficiency
back in 1.5-dev13 by avoiding successive read attempts on an active
FD. But by disabling this on a polled FD, it causes an unpleasant
side effect which is that the FD that was subscribed to polling is
suddenly stopped and may need to be re-enabled once the kernel
starts to slow down on data eviction (eg: saturated server at the
other end, bursty traffic caused by too large maxpollevents).
This behaviour is observable with persistent connections when there
is a large enough connection count so that there's no data in the
early connection and polling is required, because there are then
up to 4 epoll_ctl() calls per request. It's important that the
server is slower than haproxy, so that some delay is experienced
when reading the response.
The current connection layer as designed in 1.6 with the FD cache
doesn't require this trick anymore, though it still benefits from
it when it saves an FD from being uselessly polled. But compared
to the increased cost of enabling and disabling polling all the
time, it's better not to disable it. In some cases it's possible to
observe a performance increase as high as 30% by avoiding this
epoll_ctl() dance.
In the end we only want to disable it when the FD is speculatively
read and not when it's polled. For this we introduce a new function
__conn_data_done_recv() which is used to indicate that we're done
with recv() and not interested in new attempts. If/when we later
support event-triggered epoll, this function will have to change
a bit to do the same even in the polled case.
A quick test with keep-alive requests run on a dual-core / dual-
thread Atom shows a significant improvement :
single process, 0 bytes :
before: Requests per second: 12243.20 [#/sec] (mean)
after: Requests per second: 13354.54 [#/sec] (mean)
single process, 4k :
before: Requests per second: 9639.81 [#/sec] (mean)
after: Requests per second: 10991.89 [#/sec] (mean)
dual process, 0 bytes (unstable) :
before: Requests per second: 16900-19800 ~ 17600 [#/sec] (mean)
after: Requests per second: 18600-21400 ~ 20500 [#/sec] (mean)
diff --git a/include/proto/connection.h b/include/proto/connection.h
index fce6025..6a24dec 100644
--- a/include/proto/connection.h
+++ b/include/proto/connection.h
@@ -296,6 +296,20 @@
c->flags &= ~CO_FL_DATA_RD_ENA;
}
+/* this one is used only to stop speculative recv(). It doesn't stop it if the
+ * fd is already polled in order to avoid expensive polling status changes.
+ * Since it might require the upper layer to re-enable reading, we return 1
+ * if we've really stopped something, otherwise zero.
+ */
+static inline int __conn_data_done_recv(struct connection *c)
+{
+ if (!conn_ctrl_ready(c) || !fd_recv_polled(c->t.sock.fd)) {
+ c->flags &= ~CO_FL_DATA_RD_ENA;
+ return 1;
+ }
+ return 0;
+}
+
static inline void __conn_data_want_send(struct connection *c)
{
c->flags |= CO_FL_DATA_WR_ENA;
diff --git a/src/stream_interface.c b/src/stream_interface.c
index 4f93a2e..e3e6cc6 100644
--- a/src/stream_interface.c
+++ b/src/stream_interface.c
@@ -1173,8 +1173,8 @@
}
if ((ic->flags & CF_READ_DONTWAIT) || --read_poll <= 0) {
- si->flags |= SI_FL_WAIT_ROOM;
- __conn_data_stop_recv(conn);
+ if (__conn_data_done_recv(conn))
+ si->flags |= SI_FL_WAIT_ROOM;
break;
}