7e9dd732e4bb1d9a324b645c925bfcd6736da9b7 - haproxy

commit	7e9dd732e4bb1d9a324b645c925bfcd6736da9b7
author	Willy Tarreau <w@1wt.eu>	Fri Jan 17 16:19:34 2020 +0100
committer	Christopher Faulet <cfaulet@haproxy.com>	Mon Jan 20 14:25:56 2020 +0100
tree	872431e43fa8cbcda1bb900038855d8c611716e7
parent	7218bc980d7f148f650ba13251aa29ae63d1b4b7

BUG/MEDIUM: connection: add a mux flag to indicate splice usability

Commit c640ef1a7d ("BUG/MINOR: stream-int: avoid calling rcv_buf() when
splicing is still possible") fixed splicing in TCP and legacy mode but
broke it badly in HTX mode.

What happens in HTX mode is that the channel's to_forward value remains
set to CHN_INFINITE_FORWARD during the whole transfer, and as such it is
not a reliable signal anymore to indicate whether more data are expected
or not. Thus, when data are spliced out of the mux using rcv_pipe(), even
when the end is reached (which only the mux knows about), the call to
rcv_buf() to get the final HTX blocks completing the message was skipped
and there was often no new event to wake this up, resulting in transfer
timeouts at the end of large objects.

All this comes down to the fact that the channel has no more information
about whether it can splice or not despite being the one having to take
the decision to call rcv_pipe() or not. And we cannot afford to call
rcv_buf() unconditionally because, as the commit above showed, this
reduces the forwarding performance by a factor of 2 to 3 in TCP and legacy
modes due to data lying in the buffer preventing splicing from being used
later.

The approach taken by this patch consists in offering the muxes the ability
to report a bit more information to the upper layers via the conn_stream.
This information could simply indicate that more data are awaited, but
since the real need is to distinguish splicing from receiving, we instead
clearly report the mux's willingness to be called for splicing or not.
Hence the flag's name, CS_FL_MAY_SPLICE.

The mux sets this flag when it knows that its buffer is empty and that
data waiting past what is currently known may be spliced, and clears it
when it knows there's no more data or that the caller must fall back to
rcv_buf() instead.
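
The rule above can be sketched as follows. This is only an illustration
and not the code from this patch: the flag value, the toy_cs and toy_mux
structures, their fields and toy_mux_update_splice() are all hypothetical
stand-ins for the real conn_stream and mux state.

```c
#include <stddef.h>

/* Illustrative value only; the real flag is defined in HAProxy's headers. */
#define CS_FL_MAY_SPLICE 0x1u

/* Hypothetical, reduced stand-in for a conn_stream. */
struct toy_cs {
    unsigned int flags;
};

/* Hypothetical mux state used only for this sketch. */
struct toy_mux {
    size_t buffered;   /* bytes still sitting in the mux's own buffer */
    size_t body_left;  /* body bytes still expected from the socket */
    int chunked;       /* non-zero if the body uses chunked encoding */
};

/* Set the flag only when the buffer is empty and raw body data is still
 * expected; clear it otherwise so that the caller falls back to rcv_buf().
 * Chunked bodies never get the flag here, assuming splicing of chunks is
 * disabled at the mux level as described in this message.
 */
static void toy_mux_update_splice(const struct toy_mux *mux, struct toy_cs *cs)
{
    if (mux->buffered == 0 && mux->body_left > 0 && !mux->chunked)
        cs->flags |= CS_FL_MAY_SPLICE;
    else
        cs->flags &= ~CS_FL_MAY_SPLICE;
}
```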

The stream-int code now uses this to determine if splicing may be used
or not instead of looking at the rcv_pipe() callbacks through the whole
chain. And after the rcv_pipe() call, it checks the flag again to decide
whether it may safely skip rcv_buf() or not.
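
The receive-side decision described above might look roughly like the
sketch below. It is a simplified model, not the actual stream-int code:
the flag value, struct demo_cs, its counters and the demo_* functions are
hypothetical and only mimic the roles of rcv_pipe() and rcv_buf().

```c
#include <stddef.h>

/* Illustrative value only; the real flag is defined in HAProxy's headers. */
#define CS_FL_MAY_SPLICE 0x1u

/* Hypothetical stand-in for a conn_stream plus some fake mux state. */
struct demo_cs {
    unsigned int flags;
    size_t spliceable;  /* bytes the fake rcv_pipe() can still deliver */
    size_t trailing;    /* final blocks that only rcv_buf() can deliver */
    int pipe_calls;
    int buf_calls;
};

/* Fake rcv_pipe(): delivers all spliceable bytes, then clears the flag
 * because this toy mux knows the end of the message has been reached.
 */
static size_t demo_rcv_pipe(struct demo_cs *cs)
{
    size_t n = cs->spliceable;

    cs->pipe_calls++;
    cs->spliceable = 0;
    cs->flags &= ~CS_FL_MAY_SPLICE;
    return n;
}

/* Fake rcv_buf(): delivers the final blocks completing the message. */
static size_t demo_rcv_buf(struct demo_cs *cs)
{
    size_t n = cs->trailing;

    cs->buf_calls++;
    cs->trailing = 0;
    return n;
}

/* Splice only if the mux allows it, then re-check the flag: rcv_buf()
 * may be skipped only while the mux still reports CS_FL_MAY_SPLICE.
 */
static size_t demo_si_recv(struct demo_cs *cs)
{
    size_t total = 0;

    if (cs->flags & CS_FL_MAY_SPLICE) {
        total += demo_rcv_pipe(cs);
        if (cs->flags & CS_FL_MAY_SPLICE)
            return total;  /* more data may be spliced later */
    }
    total += demo_rcv_buf(cs);
    return total;
}
```

In this model the end of a spliced transfer always triggers one rcv_buf()
call, which is the step that was skipped in HTX mode before this fix.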

All this bitfield dance remains a bit complex and it starts to appear
obvious that splicing vs reading should be a decision of the mux based
on permission granted by the data layer. This would however increase
the API's complexity, but it definitely needs to be thought about and
should even significantly simplify the data processing layer.

The way it was integrated in mux-h1 will also result in no more calls
to rcv_pipe() on chunk-encoded data, since such calls are currently
disabled at the mux level. However once the issue with chunks+splice
is fixed, it will be important to explicitly check for curr_len|CHNK
to set MAY_SPLICE, so that we don't call rcv_buf() after each chunk.

This fix must be backported to 2.1 and 2.0.

(cherry picked from commit 17ccd1a3560a634a17d276833ff41b8063b72206)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 7195d4b9396687e67da196cb92ef25b4bd6938d8)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

4 files changed
