[OPTIM] stream_sock: don't use splice on too small payloads

It's more expensive to call splice() on short payloads than to use
recv()+send(). One of the reasons is that doing a splice() involves
allocating a pipe. One other reason is that the kernel will have to
copy itself if we try to splice less than a page. So let's fix a
short offset of 4kB below which we don't splice.

A quick test shows that on chunked encoded data, with splice we had
6826 syscalls (1715 splice, 3461 recv, 1650 send) while with this
patch, the same transfer resulted in 5793 syscalls (3896 recv, 1897
send).
2 files changed