MINOR: mux-h2: avoid copying large blocks into full buffers

Because the blocking factor differs between H1 and H2, we regularly end
up with tails of data blocks that leave room in the mux buffer, making
it tempting to copy the pending frame into the remaining room, possibly
realigning the output buffer in the process.

Here we check if the output buffer contains data, and prefer to wait
if the current frame either doesn't fit or is at least 1/4 of the
buffer's size. This way, upon the next call, either a zero copy or a
larger, aligned copy will be performed, taking the whole chunk at once.
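
The decision above can be sketched in isolation as follows. This is
only an illustrative model, not the code from the patch: struct
mbuf_model, should_wait() and FRAME_HDR_LEN are hypothetical names, and
the explicit empty-buffer test is an assumption standing in for the
"output buffer contains data" condition, which the real code may
enforce elsewhere.

```c
#include <stddef.h>

#define FRAME_HDR_LEN 9 /* an H2 frame header is 9 bytes long */

/* Hypothetical stand-in for the mux buffer: only the two counters
 * needed to take the decision.
 */
struct mbuf_model {
	size_t size; /* total capacity in bytes */
	size_t data; /* bytes currently pending in the buffer */
};

/* Returns 1 if the sender should pretend the buffer is full and wait
 * instead of realigning and copying <max> payload bytes now, else 0.
 */
static int should_wait(const struct mbuf_model *b, size_t max)
{
	size_t room = b->size - b->data;

	if (!b->data)
		return 0; /* assumed: an empty buffer needs no realign */
	if (max + FRAME_HDR_LEN > room)
		return 1; /* header + payload do not fit */
	if (max >= b->size / 4)
		return 1; /* large block: better copied in one go later */
	return 0;         /* small block that fits: copy it now */
}
```

With a 16384-byte buffer holding 1000 pending bytes, a 4095-byte block
would still be copied while a 4096-byte one would wait.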

Doing so increases the H2 bandwidth by slightly more than 1% on large
objects.
diff --git a/src/mux_h2.c b/src/mux_h2.c
index a81e3b5..21c657a 100644
--- a/src/mux_h2.c
+++ b/src/mux_h2.c
@@ -3552,6 +3552,17 @@
 		if (outbuf.size >= 9 || !b_space_wraps(&h2c->mbuf))
 			break;
 	realign_again:
+		/* If there are pending data in the output buffer, and the frame
+		 * is smaller than 1/4 of the mbuf's size and fits entirely, we
+		 * still realign and copy. Otherwise we pretend the mbuf is full
+		 * and wait, to save some slow realign calls.
+		 */
+		if ((max + 9 > b_room(&h2c->mbuf) || max >= b_size(&h2c->mbuf) / 4)) {
+			h2c->flags |= H2_CF_MUX_MFULL;
+			h2s->flags |= H2_SF_BLK_MROOM;
+			goto end;
+		}
+
 		b_slow_realign(&h2c->mbuf, trash.area, b_data(&h2c->mbuf));
 	}