--- Relevant portions of RFC2616 ---

OCTET = <any 8-bit sequence of data>
CHAR = <any US-ASCII character (octets 0 - 127)>
UPALPHA = <any US-ASCII uppercase letter "A".."Z">
LOALPHA = <any US-ASCII lowercase letter "a".."z">
ALPHA = UPALPHA | LOALPHA
DIGIT = <any US-ASCII digit "0".."9">
CTL = <any US-ASCII control character (octets 0 - 31) and DEL (127)>
CR = <US-ASCII CR, carriage return (13)>
LF = <US-ASCII LF, linefeed (10)>
SP = <US-ASCII SP, space (32)>
HT = <US-ASCII HT, horizontal-tab (9)>
<"> = <US-ASCII double-quote mark (34)>
CRLF = CR LF
LWS = [CRLF] 1*( SP | HT )
TEXT = <any OCTET except CTLs, but including LWS>
HEX = "A" | "B" | "C" | "D" | "E" | "F"
    | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT
separators = "(" | ")" | "<" | ">" | "@"
           | "," | ";" | ":" | "\" | <">
           | "/" | "[" | "]" | "?" | "="
           | "{" | "}" | SP | HT
token = 1*<any CHAR except CTLs or separators>

quoted-pair = "\" CHAR
ctext = <any TEXT excluding "(" and ")">
qdtext = <any TEXT except <">>
quoted-string = ( <"> *(qdtext | quoted-pair ) <"> )
comment = "(" *( ctext | quoted-pair | comment ) ")"
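
The token rule above can be checked with a small character-class test. The
following is only a sketch in Python; the names SEPARATORS, CTLS and is_token
are illustrative assumptions and do not come from RFC 2616 or from any existing
code.

```python
# Character classes taken from the basic rules quoted above.
# SEPARATORS, CTLS and is_token are hypothetical names used for illustration.
SEPARATORS = frozenset('()<>@,;:\\"/[]?={} \t')
CTLS = frozenset(chr(c) for c in range(32)) | {chr(127)}


def is_token(s: str) -> bool:
    """Return True if s matches token = 1*<any CHAR except CTLs or separators>."""
    if not s:
        return False  # 1* means at least one character is required
    return all(
        ord(ch) < 128 and ch not in CTLS and ch not in SEPARATORS
        for ch in s
    )
```

A field-name such as "Content-Length" passes this check, while a name that
contains a space or a colon does not.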

4 HTTP Message
4.1 Message Types

HTTP messages consist of requests from client to server and responses from
server to client. Request (section 5) and Response (section 6) messages use the
generic message format of RFC 822 [9] for transferring entities (the payload of
the message). Both types of message consist of:

  - a start-line
  - zero or more header fields (also known as "headers")
  - an empty line (i.e., a line with nothing preceding the CRLF) indicating the
    end of the header fields
  - and possibly a message-body.


HTTP-message = Request | Response

start-line = Request-Line | Status-Line
generic-message = start-line
                  *(message-header CRLF)
                  CRLF
                  [ message-body ]

In the interest of robustness, servers SHOULD ignore any empty line(s) received
where a Request-Line is expected. In other words, if the server is reading the
protocol stream at the beginning of a message and receives a CRLF first, it
should ignore the CRLF.
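
The robustness rule above could be sketched as follows. The function name
skip_leading_empty_lines is hypothetical, and accepting a bare LF in addition
to CRLF is an assumption made here for tolerance, not something this paragraph
of the RFC requires.

```python
def skip_leading_empty_lines(buf: bytes) -> bytes:
    """Drop empty lines found where a Request-Line is expected."""
    pos = 0
    while pos < len(buf):
        if buf.startswith(b"\r\n", pos):
            pos += 2  # a regular empty line
        elif buf.startswith(b"\n", pos):
            pos += 1  # assumption: a bare LF is tolerated as well
        else:
            break     # first octet of the start-line
    return buf[pos:]
```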


4.2 Message headers

- Each header field consists of a name followed by a colon (":") and the field
  value.
- Field names are case-insensitive.
- The field value MAY be preceded by any amount of LWS, though a single SP is
  preferred.
- Header fields can be extended over multiple lines by preceding each extra
  line with at least one SP or HT.


message-header = field-name ":" [ field-value ]
field-name = token
field-value = *( field-content | LWS )
field-content = <the OCTETs making up the field-value and consisting of
                either *TEXT or combinations of token, separators, and
                quoted-string>


The field-content does not include any leading or trailing LWS occurring before
the first non-whitespace character of the field-value or after the last
non-whitespace character of the field-value. Such leading or trailing LWS MAY
be removed without changing the semantics of the field value. Any LWS that
occurs between field-content MAY be replaced with a single SP before
interpreting the field value or forwarding the message downstream.


=> header format = 1*(CHAR & !ctl & !sep) ":" *(OCTET & (!ctl | LWS))
=> header-matching regexes apply to field-content, and may use field-value as
   a working area (but preferably after the first SP).
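
The header rules above (case-insensitive name, optional LWS, folded lines) can
be illustrated with a short Python sketch. The name parse_header is
hypothetical. The sketch does not validate the field-name as a token and does
not treat quoted-strings specially, so it is a simplification rather than a
complete parser.

```python
import re

# LWS = [CRLF] 1*( SP | HT ). Accepting a bare LF in the fold is an assumption.
_LWS = re.compile(r"(?:\r?\n)?[ \t]+")


def parse_header(raw: str) -> tuple[str, str]:
    """Split one (possibly folded) header into (lowercased name, value)."""
    name, sep, value = raw.partition(":")
    if not sep or not name:
        raise ValueError("missing field-name or colon")
    # Replace every run of LWS, line folds included, with a single SP,
    # then drop the leading and trailing whitespace around field-content.
    value = _LWS.sub(" ", value).strip(" \t\r\n")
    return name.lower(), value
```

For instance, a header folded as "Accept:  text/html," followed by CRLF, HT and
"text/plain" is returned as the pair ("accept", "text/html, text/plain").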

(19.3) The line terminator for message-header fields is the sequence CRLF.
However, we recommend that applications, when parsing such headers, recognize
a single LF as a line terminator and ignore the leading CR.
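
A tolerant line splitter following this recommendation might look like the
sketch below. The name split_lines is an assumption, and the function expects a
complete header block already held in memory.

```python
def split_lines(block: bytes) -> list[bytes]:
    """Split a header block on LF and drop a trailing CR from each line."""
    lines = block.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # the block ended with a line terminator
    return [ln[:-1] if ln.endswith(b"\r") else ln for ln in lines]
```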


message-body = entity-body
             | <entity-body encoded as per Transfer-Encoding>



5 Request

Request = Request-Line
          *(( general-header
            | request-header
            | entity-header ) CRLF)
          CRLF
          [ message-body ]



5.1 Request line

The elements are separated by SP characters. No CR or LF is allowed except in
the final CRLF sequence.

Request-Line = Method SP Request-URI SP HTTP-Version CRLF

(19.3) Clients SHOULD be tolerant in parsing the Status-Line and servers
tolerant when parsing the Request-Line. In particular, they SHOULD accept any
amount of SP or HT characters between fields, even though only a single SP is
required.
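
A tolerant Request-Line parser in the spirit of the note above could be
sketched as follows. The name parse_request_line is hypothetical, and the line
is assumed to be passed without its final CRLF.

```python
import re

# Method, Request-URI and HTTP-Version, separated by one or more SP or HT.
# CR and LF are rejected inside the elements, as required by section 5.1.
_REQUEST_LINE = re.compile(
    r"([^ \t\r\n]+)[ \t]+([^ \t\r\n]+)[ \t]+(HTTP/[0-9]+\.[0-9]+)"
)


def parse_request_line(line: str) -> tuple[str, str, str]:
    """Return (method, request_uri, http_version) or raise ValueError."""
    m = _REQUEST_LINE.fullmatch(line)
    if m is None:
        raise ValueError("malformed Request-Line")
    return m.group(1), m.group(2), m.group(3)
```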

4.5 General headers
Apply to the MESSAGE being transmitted, not to the entity.

general-header = Cache-Control
               | Connection
               | Date
               | Pragma
               | Trailer
               | Transfer-Encoding
               | Upgrade
               | Via
               | Warning

General-header field names can be extended reliably only in combination with a
change in the protocol version. However, new or experimental header fields may
be given the semantics of general header fields if all parties in the
communication recognize them to be general-header fields. Unrecognized header
fields are treated as entity-header fields.



5.3 Request Header Fields

The request-header fields allow the client to pass additional information about
the request, and about the client itself, to the server. These fields act as
request modifiers, with semantics equivalent to the parameters on a programming
language method invocation.

request-header = Accept
               | Accept-Charset
               | Accept-Encoding
               | Accept-Language
               | Authorization
               | Expect
               | From
               | Host
               | If-Match
               | If-Modified-Since
               | If-None-Match
               | If-Range
               | If-Unmodified-Since
               | Max-Forwards
               | Proxy-Authorization
               | Range
               | Referer
               | TE
               | User-Agent

Request-header field names can be extended reliably only in combination with a
change in the protocol version. However, new or experimental header fields MAY
be given the semantics of request-header fields if all parties in the
communication recognize them to be request-header fields. Unrecognized header
fields are treated as entity-header fields.



7.1 Entity header fields

Entity-header fields define metainformation about the entity-body or, if no
body is present, about the resource identified by the request. Some of this
metainformation is OPTIONAL; some might be REQUIRED by portions of this
specification.

entity-header = Allow
              | Content-Encoding
              | Content-Language
              | Content-Length
              | Content-Location
              | Content-MD5
              | Content-Range
              | Content-Type
              | Expires
              | Last-Modified
              | extension-header
extension-header = message-header

The extension-header mechanism allows additional entity-header fields to be
defined without changing the protocol, but these fields cannot be assumed to be
recognizable by the recipient. Unrecognized header fields SHOULD be ignored by
the recipient and MUST be forwarded by transparent proxies.
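
The three header families listed above lend themselves to a simple lookup. In
the sketch below, the names GENERAL_HEADERS, REQUEST_HEADERS and
classify_header are illustrative assumptions. Any name found in neither set is
reported as an entity header, which covers both the listed entity-header
fields and the extension-header fallback.

```python
# Lowercased because field names are case-insensitive.
GENERAL_HEADERS = frozenset({
    "cache-control", "connection", "date", "pragma", "trailer",
    "transfer-encoding", "upgrade", "via", "warning",
})
REQUEST_HEADERS = frozenset({
    "accept", "accept-charset", "accept-encoding", "accept-language",
    "authorization", "expect", "from", "host", "if-match",
    "if-modified-since", "if-none-match", "if-range",
    "if-unmodified-since", "max-forwards", "proxy-authorization",
    "range", "referer", "te", "user-agent",
})


def classify_header(name: str) -> str:
    """Return "general", "request" or "entity" for a request header name."""
    n = name.lower()
    if n in GENERAL_HEADERS:
        return "general"
    if n in REQUEST_HEADERS:
        return "request"
    # Listed entity headers and unrecognized fields are both treated as
    # entity-header fields.
    return "entity"
```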

---- The correct way to do it ----

- one http_session
  It is basically any transport session on which we talk HTTP. It may be TCP,
  SSL over TCP, etc... It knows a way to talk to the client, either the socket
  file descriptor or a direct access to the client-side buffer. It should hold
  information about the last accessed server so that we can guarantee that the
  same server can be used during a whole session if needed. A first version
  without optimal support for HTTP pipelining will have the client buffers tied
  to the http_session. This may not be sufficient for full pipelining, but
  this will need further study. The link from the buffers to the backend
  should be managed by the http transaction (http_txn), provided that they are
  serialized. Each http_session has 0 to N http_txn. Each http_txn belongs to
  one and only one http_session.

- each http_txn has 1 request message (http_req), and 0 or 1 response message
  (http_rtr). Each of them has 1 and only one http_txn. An http_txn holds
  information such as the HTTP method, the URI, the HTTP version, the
  transfer-encoding, the HTTP status, the authorization, the req and rtr
  content-length, the timers, logs, etc... The backend and server which process
  the request are also known from the http_txn.

- both request and response messages hold header and parsing information, such
  as the parsing state, start of headers, start of message, captures, etc...
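
The ownership relations described above (one session, 0 to N transactions, one
request and at most one response per transaction) can be modelled as follows.
This is only a sketch in Python: the class names, field names and helper
methods are assumptions chosen for illustration and do not describe an actual
implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class HttpSession:
    """One transport session carrying HTTP (TCP, SSL over TCP, ...)."""
    transport: str
    last_server: Optional[str] = None  # lets a whole session reuse one server
    txns: List["HttpTxn"] = field(default_factory=list)  # 0 to N transactions

    def new_txn(self) -> "HttpTxn":
        txn = HttpTxn(session=self)
        self.txns.append(txn)
        return txn


@dataclass(eq=False)
class HttpMsg:
    """Header and parsing data for one message (request or response)."""
    txn: "HttpTxn"  # exactly one owning transaction
    parse_state: str = "start"
    captures: List[str] = field(default_factory=list)


@dataclass(eq=False)
class HttpTxn:
    """One request and at most one response, owned by exactly one session."""
    session: HttpSession
    req: HttpMsg = field(init=False)  # always present
    rtr: Optional[HttpMsg] = None     # set only once a response starts
    method: Optional[str] = None
    uri: Optional[str] = None
    version: Optional[str] = None
    status: Optional[int] = None

    def __post_init__(self) -> None:
        self.req = HttpMsg(txn=self)

    def start_response(self) -> HttpMsg:
        self.rtr = HttpMsg(txn=self)
        return self.rtr
```

Creating transactions only through new_txn() keeps the list in the session and
the back-pointer in each transaction consistent.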