2012/02/27 - redesigning buffers for better simplicity - w@1wt.eu

1) Analysis
-----------

Buffer handling becomes complex because buffers are circular but many of their
users don't support wrapping operations (e.g. HTTP parsing). Due to this fact,
some buffer operations automatically realign buffers as soon as possible when
the buffer is empty, which makes it very hard to track buffer pointers outside
of the buffer struct itself. The buffer contains a pointer to the last processed
data (buf->lr) which is automatically realigned with such operations. But in
the end, its semantics are often unclear and whether it's safe or not to use it
isn't always obvious, as it has acquired multiple roles over time.

A "struct buffer" is declared this way :

    struct buffer {
        unsigned int flags;             /* BF_* */
        int rex;                        /* expiration date for a read, in ticks */
        int wex;                        /* expiration date for a write or connect, in ticks */
        int rto;                        /* read timeout, in ticks */
        int wto;                        /* write timeout, in ticks */
        unsigned int l;                 /* data length */
        char *r, *w, *lr;               /* read ptr, write ptr, last read */
        unsigned int size;              /* buffer size in bytes */
        unsigned int send_max;          /* number of bytes the sender can consume from this buffer, <= l */
        unsigned int to_forward;        /* number of bytes to forward after send_max without a wake-up */
        unsigned int analysers;         /* bit field indicating what to do on the buffer */
        int analyse_exp;                /* expiration date for current analysers (if set) */
        void (*hijacker)(struct session *, struct buffer *); /* alternative content producer */
        unsigned char xfer_large;       /* number of consecutive large xfers */
        unsigned char xfer_small;       /* number of consecutive small xfers */
        unsigned long long total;       /* total data read */
        struct stream_interface *prod;  /* producer attached to this buffer */
        struct stream_interface *cons;  /* consumer attached to this buffer */
        struct pipe *pipe;              /* non-NULL only when data present */
        char data[0];                   /* <size> bytes */
    };

In order to address this, a struct http_msg was created with other pointers to
the buffer. The issue is that some of these pointers are absolute and others
are relative, sometimes one to another, sometimes to the beginning of the
buffer, which doesn't help at all for the case where buffers get realigned.

A "struct http_msg" is defined this way :

    struct http_msg {
        unsigned int msg_state;
        unsigned int flags;
        unsigned int col, sov;          /* current header: colon, start of value */
        unsigned int eoh;               /* End Of Headers, relative to buffer */
        char *sol;                      /* start of line, also start of message when fully parsed */
        char *eol;                      /* end of line */
        unsigned int som;               /* Start Of Message, relative to buffer */
        int err_pos;                    /* err handling: -2=block, -1=pass, 0+=detected */
        union {                         /* useful start line pointers, relative to ->sol */
            struct {
                int l;                  /* request line length (not including CR) */
                int m_l;                /* METHOD length (method starts at ->som) */
                int u, u_l;             /* URI, length */
                int v, v_l;             /* VERSION, length */
            } rq;                       /* request line : field, length */
            struct {
                int l;                  /* status line length (not including CR) */
                int v_l;                /* VERSION length (version starts at ->som) */
                int c, c_l;             /* CODE, length */
                int r, r_l;             /* REASON, length */
            } st;                       /* status line : field, length */
        } sl;                           /* start line */
        unsigned long long chunk_len;
        unsigned long long body_len;
        char **cap;
    };


The first immediate observation is that nothing in a buffer should be relative
to the beginning of the storage area, everything should be relative to the
buffer's origin as a floating location. Right now the buffer's origin is equal
to (buf->w + buf->send_max). It is the place where the first byte of data not
yet scheduled for being forwarded is found.

  - buf->w is an absolute pointer, just like buf->data.
  - buf->send_max is a relative value which oscillates between 0 when nothing
    has to be forwarded, and buf->l when the whole buffer must be forwarded.

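As an illustration, the current origin could be computed as in the minimal
sketch below. It assumes a trimmed-down, hypothetical "struct old_buffer" (with
a plain data pointer instead of the trailing array) and a hand-made wrap, since
w + send_max may pass the end of the storage area. The names old_buffer and
old_buffer_origin() are not part of the existing code.

```c
#include <stddef.h>

/* Hypothetical, reduced view of the current buffer: only the fields needed
 * to locate the origin are kept. This is an assumption for illustration.
 */
struct old_buffer {
    char *data;             /* start of the storage area */
    unsigned int size;      /* storage size in bytes */
    char *w;                /* absolute write (send) pointer */
    unsigned int send_max;  /* bytes scheduled for forwarding, <= l */
    unsigned int l;         /* total data length */
};

/* Returns the origin, i.e. the first byte not yet scheduled for being
 * forwarded. The sum is computed on offsets and wrapped by hand so that no
 * pointer ever goes past the end of the storage area.
 */
static char *old_buffer_origin(const struct old_buffer *buf)
{
    size_t off = (size_t)(buf->w - buf->data) + buf->send_max;

    if (off >= buf->size)
        off -= buf->size;   /* hand-made modulo, off < 2 * size */
    return buf->data + off;
}
```

With a 16-byte storage area, w at offset 12 and send_max = 6, the origin wraps
to offset 2.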

2) Proposal
-----------

By having such an origin, we could have everything in http_msg relative to this
origin. This would resist buffer realigns much better than right now.

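A minimal sketch of why this helps, under simplified assumptions : the names
sketch_buf, sketch_realign() and sketch_at() are hypothetical, and the data are
assumed not to wrap. An offset kept relative to the origin designates the same
byte before and after a realign, whereas an absolute pointer would become stale.

```c
#include <string.h>

/* Hypothetical reduced buffer: storage, a floating origin and a length. */
struct sketch_buf {
    char data[32];      /* storage area */
    char *origin;       /* absolute pointer to the start of the message */
    unsigned int len;   /* bytes present from origin (assumed not wrapping) */
};

/* Moves the contents to the beginning of the storage area. Only the origin
 * changes, offsets relative to it need no fixup.
 */
static void sketch_realign(struct sketch_buf *b)
{
    memmove(b->data, b->origin, b->len);
    b->origin = b->data;
}

/* Resolves an offset relative to the origin into an absolute pointer. */
static char *sketch_at(const struct sketch_buf *b, unsigned int rel)
{
    return b->origin + rel;
}
```

For example, with "GET / HTTP/1.1" stored at data + 10, a URI offset of 4
points to '/' both before and after sketch_realign().
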
At the moment we have msg->som which is relative to buf->data and which points
to the beginning of the message. The beginning of the message should *always*
be the buffer's origin. If data are to be skipped in the message, just wait for
send_max to become zero and move the origin forwards ; this would definitely get
rid of msg->som. This is already what is done in the HTTP parser except that it
has to move both buf->lr and msg->som.

Following the same principle, we should then have a relative pointer in
http_msg to replace buf->lr. It would be relative to the buffer's origin and
would simply recall what location was last visited.

Doing all this could result in more complex operations where more time is spent
adding buf->w to buf->send_max and then to msg->anything. It would probably make
more sense to define the buffer's origin as an absolute pointer and to have
both the buf->h (head) and buf->t (tail) pointers be positive and negative
positions relative to this origin. Operating on the buffer would then look like
this :

  - no buf->l anymore. buf->l is replaced by (head + tail)
  - no buf->lr anymore. Use origin + msg->last for instance
  - recv() : head += recv(origin + head);
  - send() : tail -= send(origin - tail, tail);
    thus, tail (buf->t) effectively replaces buf->send_max.
  - forward(N) : tail += N; origin += N;
  - realign() : origin = data
  - detect risk of wrapping of input : origin + head > data + size

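The operations above could be sketched as follows. This is only an illustration
under stated assumptions, not a definitive implementation : struct new_buffer
and all nb_*() names are hypothetical, nb_put() and nb_get() stand in for
recv() and send() by copying from and to memory, and nb_forward() also
decrements head so that (head + tail) stays constant, which the list above does
not spell out.

```c
#include <stddef.h>

#define NB_SIZE 8   /* tiny on purpose so that wrapping is easy to trigger */

struct new_buffer {
    char data[NB_SIZE];
    char *origin;   /* absolute pointer to first byte not yet scheduled */
    size_t head;    /* input bytes found at and after origin */
    size_t tail;    /* output bytes found before origin, waiting to be sent */
};

static void nb_init(struct new_buffer *b)
{
    b->origin = b->data;
    b->head = b->tail = 0;
}

/* Maps an offset relative to origin, in [-NB_SIZE, NB_SIZE], to a byte. */
static char *nb_at(struct new_buffer *b, ptrdiff_t rel)
{
    ptrdiff_t off = (b->origin - b->data) + rel;

    if (off < 0)
        off += NB_SIZE;
    else if (off >= NB_SIZE)
        off -= NB_SIZE;
    return b->data + off;
}

/* Stand-in for recv() : appends up to <n> bytes after the input part. */
static size_t nb_put(struct new_buffer *b, const char *src, size_t n)
{
    size_t room = NB_SIZE - (b->head + b->tail);
    size_t i;

    if (n > room)
        n = room;
    for (i = 0; i < n; i++)
        *nb_at(b, (ptrdiff_t)(b->head + i)) = src[i];
    b->head += n;
    return n;
}

/* forward(N) : schedules the first <n> input bytes for sending. */
static void nb_forward(struct new_buffer *b, size_t n)
{
    if (n > b->head)
        n = b->head;
    b->origin = nb_at(b, (ptrdiff_t)n);
    b->tail += n;
    b->head -= n;   /* assumption: keeps head + tail constant */
}

/* Stand-in for send() : consumes up to <n> of the oldest output bytes. */
static size_t nb_get(struct new_buffer *b, char *dst, size_t n)
{
    size_t i;

    if (n > b->tail)
        n = b->tail;
    for (i = 0; i < n; i++)
        dst[i] = *nb_at(b, (ptrdiff_t)i - (ptrdiff_t)b->tail);
    b->tail -= n;
    return n;
}

/* realign() : done here only when the buffer is empty, as no copy is needed. */
static void nb_realign_if_empty(struct new_buffer *b)
{
    if (b->head + b->tail == 0)
        b->origin = b->data;
}

/* Non-zero when the input part crosses the end of the storage area. */
static int nb_input_wraps(const struct new_buffer *b)
{
    return (size_t)(b->origin - b->data) + b->head > NB_SIZE;
}
```

Only the origin is an absolute pointer here. Everything else is a count
relative to it, so a realign touches a single field.
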
In general it looks like fewer pointers are manipulated for common operations
and that maybe an additional wrapping test (hand-made modulo) will have to be
added to send() and recv() operations.

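Such a wrapping test could look like the sketch below, which computes how many
bytes a single recv() call may write contiguously after the input part. The
function name contig_room() and its offset-based arguments are assumptions made
for the example, they do not exist in the code.

```c
#include <stddef.h>

/* Returns the largest contiguous free area starting right after the input
 * part. <origin_off> is the origin's offset in the storage area, <head> and
 * <tail> are the byte counts after and before the origin, head + tail <= size.
 */
static size_t contig_room(size_t size, size_t origin_off,
                          size_t head, size_t tail)
{
    size_t pos = origin_off + head;         /* where the next byte would go */
    size_t free_bytes = size - (head + tail);
    size_t to_end;

    if (pos >= size)
        pos -= size;                        /* hand-made modulo */
    to_end = size - pos;                    /* room before end of storage */
    return free_bytes < to_end ? free_bytes : to_end;
}
```

With an 8-byte storage area, an empty buffer whose origin sits at offset 6 only
offers 2 contiguous bytes, so a second recv() is needed to fill the rest.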

3) Caveats
----------

The first caveat is that the elements to modify appear at a very large number
of places.