                         The PROXY protocol
                            Willy Tarreau
                              2011/03/20

Abstract

   The PROXY protocol provides a convenient way to safely transport connection
   information such as a client's address across multiple layers of NAT or TCP
   proxies. It is designed to require few changes to existing components and
   to limit the performance impact caused by the processing of the transported
   information.


Revision history

   2010/10/29 - first version
   2011/03/20 - update: implementation and security considerations


1. Background

Relaying TCP connections through proxies generally involves a loss of the
original TCP connection parameters such as source and destination addresses,
ports, and so on. Some protocols make it a little bit easier to transfer such
information. For SMTP, Postfix authors have proposed the XCLIENT protocol which
received broad adoption and is particularly suited to mail exchanges. In HTTP,
we have the non-standard but omnipresent X-Forwarded-For header which relays
information about the original source address, and the less common
X-Original-To which relays information about the destination address.

However, both mechanisms require knowledge of the underlying protocol to be
implemented in intermediaries.

Then comes a new class of products which we'll call "dumb proxies", not because
they don't do anything, but because they're processing protocol-agnostic data.
Stunnel is an example of such a "dumb proxy". It talks raw TCP on one side, and
raw SSL on the other one, and does that reliably.

The problem with such a proxy when it is combined with another one such as
haproxy is to adapt it to talk the higher level protocol. A patch is available
for Stunnel to make it capable of inserting an X-Forwarded-For header in the
first HTTP request of each incoming connection. Haproxy is able to refrain from
adding another one when the connection comes from Stunnel, so that it's
possible to hide it from the servers.

The typical architecture becomes the following one :


   +--------+      HTTP                     :80 +----------+
   | client | --------------------------------> |          |
   |        |                                   | haproxy, |
   +--------+            +---------+            | 1 or 2   |
    /    /     HTTPS     | stunnel |  HTTP :81  | listening|
   <________/ ---------> | (server | ---------> | ports    |
                         |  mode)  |            |          |
                         +---------+            +----------+


The problem appears when haproxy runs with keep-alive on the side towards the
client. The Stunnel patch will only add the X-Forwarded-For header to the first
request of each connection and all subsequent requests will not have it. One
solution could be to improve the patch to make it support keep-alive and parse
all forwarded data, whether they're announced with a Content-Length or with a
Transfer-Encoding, taking care of special methods such as HEAD which announce
data without transferring them, etc. In fact, it would require implementing a
full HTTP stack in Stunnel. It would then become a lot more complex, a lot less
reliable and would no longer be the "dumb proxy" that fits every purpose.

In practice, we don't need to add a header for each request because we'll emit
the exact same information every time : the information related to the client
side connection. We could then cache that information in haproxy and use it for
every other request. But that becomes dangerous and is still limited to HTTP
only.

Another approach would be to prepend each connection with a line reporting the
characteristics of the other side's connection. This method is a lot simpler to
implement, does not require any protocol-specific knowledge on either side, and
completely fits the purpose. That is what we finally did with a small patch to
Stunnel and another one to haproxy. We have called this protocol the PROXY
protocol.


2. The PROXY protocol

The PROXY protocol's goal is to fill the receiver's internal structures with
the information it could have found itself if it performed the accept from the
client. Thus right now we're supporting the following :
  - INET protocol and family (TCP over IPv4 or IPv6)
  - layer 3 source and destination addresses
  - layer 4 source and destination ports if any

Unlike the XCLIENT protocol, the PROXY protocol was designed with limited
extensibility in order to help the receiver parse it very fast, while keeping
it human-readable for better debugging possibilities. So it consists of exactly
the following block prepended before any data flowing from the dumb proxy to
the next hop :

  - a string identifying the protocol : "PROXY" ( \x50 \x52 \x4F \x58 \x59 )

  - exactly one space : " " ( \x20 )

  - a string indicating the proxied INET protocol and family. At the moment,
    only "TCP4" ( \x54 \x43 \x50 \x34 ) for TCP over IPv4, and "TCP6"
    ( \x54 \x43 \x50 \x36 ) for TCP over IPv6 are allowed. Unsupported or
    unknown protocols must be reported with the name "UNKNOWN" ( \x55 \x4E \x4B
    \x4E \x4F \x57 \x4E ). The remaining fields of the line are then optional
    and may be ignored, until the CRLF is found.

  - exactly one space : " " ( \x20 )

  - the layer 3 source address in its canonical format. IPv4 addresses must be
    indicated as a series of exactly 4 integers in the range [0..255] inclusive
    written in decimal representation separated by exactly one dot between each
    other. Leading zeroes are not permitted in front of numbers in order to
    avoid any possible confusion with octal numbers. IPv6 addresses must be
    indicated as a series of sets of 4 hexadecimal digits (upper or lower case)
    delimited by colons between each other, with the acceptance of one double
    colon sequence to replace the largest acceptable range of consecutive
    zeroes. The total number of decoded bits must be exactly 128. The
    advertised protocol family dictates what format to use.

  - exactly one space : " " ( \x20 )

  - the layer 3 destination address in its canonical format. It is the same
    format as the layer 3 source address and matches the same family.

  - exactly one space : " " ( \x20 )

  - the TCP source port represented as a decimal integer in the range
    [0..65535] inclusive. Leading zeroes are not permitted in front of numbers
    in order to avoid any possible confusion with octal numbers.

  - exactly one space : " " ( \x20 )

  - the TCP destination port represented as a decimal integer in the range
    [0..65535] inclusive. Leading zeroes are not permitted in front of numbers
    in order to avoid any possible confusion with octal numbers.

  - the CRLF sequence ( \x0D \x0A )
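
As an illustration only, an emitter could assemble the block described above
along the following lines. This is a minimal Python sketch and not the code
used in Stunnel or haproxy. The function name build_proxy_line is hypothetical,
and writing IPv6 addresses in fully expanded form is an assumption of this
sketch, made so that every group carries exactly 4 hexadecimal digits.

```python
import ipaddress


def build_proxy_line(src_ip, dst_ip, src_port, dst_port):
    """Return the PROXY line (bytes, CRLF included) for a TCP connection.

    Hypothetical helper: the name and signature are not part of the protocol.
    """
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    # Both addresses must belong to the family announced in the line.
    if src.version != dst.version:
        raise ValueError("source and destination families differ")
    for port in (src_port, dst_port):
        if not 0 <= port <= 65535:
            raise ValueError("port out of range: %r" % (port,))
    family = "TCP4" if src.version == 4 else "TCP6"
    # .exploded never emits leading zeroes for IPv4 and always emits
    # eight groups of 4 hexadecimal digits for IPv6 (assumption: the
    # expanded form is preferred here over the "::" shorthand).
    line = "PROXY %s %s %s %d %d\r\n" % (
        family, src.exploded, dst.exploded, src_port, dst_port)
    return line.encode("ascii")
```

The emitter would send the returned bytes once, before any other data, right
after the connection to the next hop is established.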

The receiver MUST be configured to only receive this protocol and MUST NOT try
to guess whether the line is prepended or not. That means that the protocol
explicitly prevents port sharing between public and private access. Otherwise
it would become a big security issue. The receiver should ensure proper access
filtering so that only trusted proxies are allowed to use this protocol. The
receiver must wait for the CRLF sequence to decode the addresses in order to
ensure they are complete. Any sequence which does not exactly match the
protocol must be discarded and cause a connection abort. It is recommended
to abort the connection as soon as possible so that the emitter notices the
anomaly.

If the announced transport protocol is "UNKNOWN", then the receiver knows that
the emitter talks the correct protocol, and may or may not decide to accept the
connection and use the real connection's parameters as if there was no such
protocol on the wire.
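
The receiver-side rules can be sketched as a strict parser. The Python code
below is only an illustration under stated assumptions and not haproxy's
implementation: the names parse_proxy_line, _check_ipv4, _check_ipv6 and
_parse_port are hypothetical, any mismatch is reported with ValueError so that
the caller can abort the connection, and an "UNKNOWN" line is returned with
empty address fields so that the caller can fall back to the real connection's
parameters.

```python
import ipaddress
import re

# Decimal integer without leading zeroes ("0", "7", "443", ...).
_DECIMAL = re.compile(r"0|[1-9][0-9]*")
# Only hexadecimal digits and colons are allowed in an IPv6 field.
_HEX_COLON = re.compile(r"[0-9A-Fa-f:]+")


def _parse_port(text):
    if not _DECIMAL.fullmatch(text) or int(text) > 65535:
        raise ValueError("bad port: %r" % text)
    return int(text)


def _check_ipv4(text):
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("bad IPv4 address: %r" % text)
    for part in parts:
        if not _DECIMAL.fullmatch(part) or int(part) > 255:
            raise ValueError("bad IPv4 address: %r" % text)
    return text


def _check_ipv6(text):
    if not _HEX_COLON.fullmatch(text):
        raise ValueError("bad IPv6 address: %r" % text)
    ipaddress.IPv6Address(text)  # raises a ValueError subclass if invalid
    return text


def parse_proxy_line(line):
    """Parse one complete PROXY line given as bytes, CRLF included.

    Returns (family, src, dst, src_port, dst_port). For "UNKNOWN" the four
    last items are None and the remaining fields are ignored.
    """
    if not line.endswith(b"\r\n"):
        raise ValueError("missing CRLF")
    # Non-ASCII bytes raise UnicodeDecodeError, itself a ValueError.
    fields = line[:-2].decode("ascii").split(" ")
    if len(fields) < 2 or fields[0] != "PROXY":
        raise ValueError("not a PROXY line")
    family = fields[1]
    if family == "UNKNOWN":
        return ("UNKNOWN", None, None, None, None)
    if family not in ("TCP4", "TCP6") or len(fields) != 6:
        raise ValueError("malformed PROXY line")
    check = _check_ipv4 if family == "TCP4" else _check_ipv6
    return (family, check(fields[2]), check(fields[3]),
            _parse_port(fields[4]), _parse_port(fields[5]))
```

Splitting on a single space makes doubled spaces show up as empty fields,
which then fail validation, so the "exactly one space" rule needs no separate
check in this sketch.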

An example of such a line before an HTTP request would look like this (CR
marked as "\r" and LF marked as "\n") :

    PROXY TCP4 192.168.0.1 192.168.0.11 56324 443\r\n
    GET / HTTP/1.1\r\n
    Host: 192.168.0.11\r\n
    \r\n

For the emitter, the line is easy to put into the output buffers once the
connection is established. For the receiver, once the line is parsed, it's
easy to skip it from the input buffers.
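
Skipping the line on the receiver side can be sketched as follows, in Python
and for illustration only. The function name split_proxy_header is
hypothetical, and the 107-byte cap is an assumption of this sketch since the
text above does not set any limit. The function returns None as long as the
CRLF has not been received, which matches the rule that the receiver must wait
for the CRLF before decoding anything.

```python
def split_proxy_header(buffer, max_len=107):
    """Split an input buffer into (proxy_line, remaining_data).

    Returns None while the CRLF has not arrived yet. Raises ValueError when
    no CRLF shows up within max_len bytes (max_len is an assumed cap, not a
    value taken from the protocol text).
    """
    # Only look for a CRLF that ends within the first max_len bytes.
    end = buffer.find(b"\r\n", 0, max_len)
    if end < 0:
        if len(buffer) >= max_len:
            raise ValueError("PROXY line too long")
        return None  # incomplete line: wait for more data
    # The line keeps its CRLF; everything after it is regular payload.
    return buffer[:end + 2], buffer[end + 2:]
```

With the example above, the first element would be the "PROXY TCP4 ..." line
and the second one would start at "GET / HTTP/1.1", ready to be handed to the
HTTP parser unchanged.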


3. Implementations

Haproxy 1.5 implements the PROXY protocol on both sides :
  - the listening sockets accept the protocol when the "accept-proxy" setting
    is passed to the "bind" keyword. Connections accepted on such listeners
    will behave just as if the source really was the one advertised in the
    protocol. This is true for logging, ACLs, content filtering, transparent
    proxying, etc.

  - the protocol may be used to connect to servers if the "send-proxy" setting
    is present on the "server" line. It is enabled on a per-server basis, so it
    is possible to have it enabled for remote servers only and still have local
    ones behave differently. If the incoming connection was accepted with the
    "accept-proxy" setting, then the relayed information is the one advertised
    in this connection's PROXY line.
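
A configuration fragment using both settings might look like the one below.
It is only a sketch: the section name, server names, addresses and ports are
hypothetical, and the other settings a complete configuration needs (mode,
timeouts, etc.) are left out.

```
listen www
    # accept the PROXY line from the trusted "dumb proxy" on port 81
    bind 192.0.2.1:81 accept-proxy
    # relay the advertised addresses to the remote server only
    server remote1 198.51.100.10:80 send-proxy
    server local1 127.0.0.1:8080
```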

We have a patch available for recent versions of Stunnel that gives it the
ability to act as an emitter. The feature is called "sendproxy" there.

The protocol is so simple that it is expected that other implementations will
appear, especially in environments such as SMTP, IMAP, FTP, RDP where the
client's address is an important piece of information for the server and some
intermediaries.

Proxy developers are encouraged to implement this protocol, because it will
make their products much more transparent in complex infrastructures, and will
get rid of a number of issues related to logging and access control.


4. Security considerations

The protocol was designed so as to be distinguishable from HTTP. It will not
parse as a valid HTTP request and an HTTP request will not parse as a valid
proxy request. That makes it easier to enforce its use on certain connections.
Implementers should be very careful about not trying to automatically detect
whether they have to decode the line or not, but rather to only rely on a
configuration parameter. Indeed, if the opportunity is left to a normal client
to use the protocol, that client will be able to hide its activities or make
them appear as coming from someone else. However, accepting the line only from
a number of known sources should be safe.
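
Restricting the protocol to known sources can be sketched as a check on the
real peer address of the connection, performed before the line is decoded.
The Python code below is an illustration only: the name is_trusted_proxy and
the networks listed in TRUSTED_PROXIES are hypothetical, and a real deployment
would take this list from its configuration.

```python
import ipaddress

# Hypothetical list of networks where trusted proxies live.
TRUSTED_PROXIES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_trusted_proxy(peer_ip, trusted=TRUSTED_PROXIES):
    """Tell whether the real peer address belongs to a trusted network."""
    addr = ipaddress.ip_address(peer_ip)
    # An address never matches a network of the other IP family.
    return any(addr in network for network in trusted)
```

A receiver built this way would abort connections from any other source on a
PROXY-enabled port instead of trying to process them as regular traffic.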


5. Future developments

It is possible that the protocol may slightly evolve to present other
information such as the incoming network interface, or the origin addresses in
case of network address translation happening before the first proxy, but this
is not identified as a requirement right now. Suggestions on improvements are
welcome.


6. Contacts

Please use w@1wt.eu to send any comments to the author.