BUG/MAJOR: h2: verify early that non-http/https schemes match the valid syntax
While we do explicitly check for strict character sets in the scheme,
this is only done when extracting URL components from an assembled one,
and we have special handling for "http" and "https" schemes directly in
the H2-to-HTX conversion. Sadly, this lets all other ones pass through
if they start exactly with "http://" or "https://", allowing the
reconstructed URI to start with a different looking authority if it was
part of the scheme.
It's interesting to note that in this case the valid authority is in
the Host header and that the request will only be wrong if emitted over
H2 on the backend side, since H1 will not emit an absolute URI by
default and will drop the scheme. So in essence, this is a variant of
the scheme-based attack described below in that it only affects H2-H2
and not H2-H1 forwarding:
https://portswigger.net/research/http2
As such, a simple workaround consists in just inserting the following
rule before other ones in the frontend, which will have for effect to
renormalize the authority in the request line according to the
concatenated version (making haproxy see the same authority and host
as what the target server will see):
http-request set-uri %[url]
This patch simply adds the missing syntax checks for non-http/https
schemes before the concatenation in the H2 code. An improvement may
consist in the future in splitting these ones apart in the start
line so that only the "url" sample fetch function requires to access
them together and that all other places continue to access them
separately. This will then allow the core code to perform such checks
itself.
The patch needs to be backported as far as 2.2. Before 2.2 the full
URI was not being reconstructed so the scheme and authority part were
always dropped from H2 requests to leave only origin requests. Note
for backporters: this depends on this previous patch:
MINOR: http: add a new function http_validate_scheme() to validate a scheme
Many thanks to Tim Düsterhus for figuring that one and providing a
reproducer.
diff --git a/src/h2.c b/src/h2.c
index ec8e2fe..b62c130 100644
--- a/src/h2.c
+++ b/src/h2.c
@@ -203,6 +203,8 @@
flags |= HTX_SL_F_SCHM_HTTP;
else if (isteqi(phdr[H2_PHDR_IDX_SCHM], ist("https")))
flags |= HTX_SL_F_SCHM_HTTPS;
+ else if (!http_validate_scheme(phdr[H2_PHDR_IDX_SCHM]))
+ htx->flags |= HTX_FL_PARSING_ERROR;
meth_sl = ist("GET");
@@ -267,6 +269,8 @@
flags |= HTX_SL_F_SCHM_HTTP;
else if (isteqi(phdr[H2_PHDR_IDX_SCHM], ist("https")))
flags |= HTX_SL_F_SCHM_HTTPS;
+ else if (!http_validate_scheme(phdr[H2_PHDR_IDX_SCHM]))
+ htx->flags |= HTX_FL_PARSING_ERROR;
meth_sl = phdr[H2_PHDR_IDX_METH];
}