2012/02/27 - redesigning buffers for better simplicity - w@1wt.eu

1) Analysis
-----------

Buffer handling becomes complex because buffers are circular but many of their
users don't support wrapping operations (e.g. HTTP parsing). Because of this,
some buffer operations automatically realign buffers as soon as possible when
the buffer is empty, which makes it very hard to track buffer pointers outside
of the buffer struct itself. The buffer contains a pointer to the last processed
data (buf->lr) which is automatically realigned by such operations. But in
the end, its semantics are often unclear, and whether or not it is safe to use
isn't always obvious, as it has acquired multiple roles over time.

A "struct buffer" is declared this way :

    struct buffer {
        unsigned int flags;             /* BF_* */
        int rex;                        /* expiration date for a read, in ticks */
        int wex;                        /* expiration date for a write or connect, in ticks */
        int rto;                        /* read timeout, in ticks */
        int wto;                        /* write timeout, in ticks */
        unsigned int l;                 /* data length */
        char *r, *w, *lr;               /* read ptr, write ptr, last read */
        unsigned int size;              /* buffer size in bytes */
        unsigned int send_max;          /* number of bytes the sender can consume from this buffer, <= l */
        unsigned int to_forward;        /* number of bytes to forward after send_max without a wake-up */
        unsigned int analysers;         /* bit field indicating what to do on the buffer */
        int analyse_exp;                /* expiration date for current analysers (if set) */
        void (*hijacker)(struct session *, struct buffer *); /* alternative content producer */
        unsigned char xfer_large;       /* number of consecutive large xfers */
        unsigned char xfer_small;       /* number of consecutive small xfers */
        unsigned long long total;       /* total data read */
        struct stream_interface *prod;  /* producer attached to this buffer */
        struct stream_interface *cons;  /* consumer attached to this buffer */
        struct pipe *pipe;              /* non-NULL only when data present */
        char data[0];                   /* <size> bytes */
    };

In order to address this, a struct http_msg was created with other pointers to
the buffer. The issue is that some of these pointers are absolute and others
are relative, sometimes to one another, sometimes to the beginning of the
buffer, which doesn't help at all when buffers get realigned.

A "struct http_msg" is defined this way :

    struct http_msg {
        unsigned int msg_state;
        unsigned int flags;
        unsigned int col, sov;          /* current header: colon, start of value */
        unsigned int eoh;               /* End Of Headers, relative to buffer */
        char *sol;                      /* start of line, also start of message when fully parsed */
        char *eol;                      /* end of line */
        unsigned int som;               /* Start Of Message, relative to buffer */
        int err_pos;                    /* err handling: -2=block, -1=pass, 0+=detected */
        union {                         /* useful start line pointers, relative to ->sol */
            struct {
                int l;                  /* request line length (not including CR) */
                int m_l;                /* METHOD length (method starts at ->som) */
                int u, u_l;             /* URI, length */
                int v, v_l;             /* VERSION, length */
            } rq;                       /* request line : field, length */
            struct {
                int l;                  /* status line length (not including CR) */
                int v_l;                /* VERSION length (version starts at ->som) */
                int c, c_l;             /* CODE, length */
                int r, r_l;             /* REASON, length */
            } st;                       /* status line : field, length */
        } sl;                           /* start line */
        unsigned long long chunk_len;
        unsigned long long body_len;
        char **cap;
    };


The first immediate observation is that nothing in a buffer should be relative
to the beginning of the storage area, everything should be relative to the
buffer's origin as a floating location. Right now the buffer's origin is equal
to (buf->w + buf->send_max). It is the place where the first byte of data not
yet scheduled for being forwarded is found.

  - buf->w is an absolute pointer, just like buf->data.
  - buf->send_max is a relative value which oscillates between 0 when nothing
    has to be forwarded, and buf->l when the whole buffer must be forwarded.

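As a rough illustration only, the origin described above could be derived from
the current fields as sketched here. The cut-down struct old_buf, the helper
name old_buf_origin() and the wrap at the end of the storage area are
assumptions made for this sketch (the storage is a plain pointer instead of
the trailing data[] array to keep it short), none of them is existing code :

```c
#include <assert.h>

/* Cut-down, hypothetical copy of the fields involved in the origin. */
struct old_buf {
    char *w;                /* absolute pointer to the next byte to send */
    unsigned int send_max;  /* bytes already scheduled for forwarding, <= l */
    unsigned int l;         /* data length */
    unsigned int size;      /* storage size in bytes */
    char *data;             /* start of storage (assumption: pointer, not array) */
};

/* Returns the first byte not yet scheduled for forwarding, that is
 * buf->w + buf->send_max, wrapped back into the storage area if needed.
 */
static char *old_buf_origin(const struct old_buf *b)
{
    unsigned int off = (unsigned int)(b->w - b->data) + b->send_max;

    assert(b->send_max <= b->l && b->l <= b->size);
    if (off >= b->size)
        off -= b->size;     /* circular buffer: wrap past the end */
    return b->data + off;
}
```

With send_max at 0 the origin is simply buf->w ; with send_max equal to buf->l
it lands right after the last byte of data.
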

2) Proposal
-----------

By having such an origin, we could have everything in http_msg relative to this
origin. This would resist buffer realigns much better than the current layout.

At the moment we have msg->som which is relative to buf->data and which points
to the beginning of the message. The beginning of the message should *always*
be the buffer's origin. If data are to be skipped in the message, just wait for
send_max to become zero and move the origin forwards ; this would definitely get
rid of msg->som. This is already what is done in the HTTP parser except that it
has to move both buf->lr and msg->som.

Following the same principle, we should then have a relative pointer in
http_msg to replace buf->lr. It would be relative to the buffer's origin and
would simply recall what location was last visited.

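To make the idea more concrete, here is a minimal sketch of how such a relative
position could be turned back into a pointer. The struct rel_msg and the helper
rel_ptr() are hypothetical names used for this illustration, and the wrapping
step is an assumption about how a circular storage area would be handled :

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical replacement for buf->lr: a position relative to the origin. */
struct rel_msg {
    size_t last;            /* last visited location, counted from the origin */
};

/* Resolve <rel> bytes after <origin> inside the storage area [data, data+size),
 * wrapping to the beginning of the storage when the end is crossed.
 */
static const char *rel_ptr(const char *data, size_t size,
                           const char *origin, size_t rel)
{
    size_t off = (size_t)(origin - data) + rel;

    assert(rel < size);
    if (off >= size)
        off -= size;        /* hand-made modulo */
    return data + off;
}
```

The point is that msg->last keeps the same value when the origin moves during
a realign ; only the origin passed to rel_ptr() changes.
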
Doing all this could result in more complex operations where more time is spent
adding buf->w to buf->send_max and then to msg->anything. It would probably make
more sense to define the buffer's origin as an absolute pointer and to have
both buf->h (head) and buf->t (tail) be offsets relative to this origin, head
on the positive side (input data) and tail on the negative side (output data).
Operating on the buffer would then look like this :

  - no buf->l anymore. buf->l is replaced by (head + tail)
  - no buf->lr anymore. Use origin + msg->last for instance
  - recv() : head += recv(origin + head);
  - send() : tail -= send(origin - tail, tail);
    thus, buf->t (tail) effectively replaces buf->send_max.
  - forward(N) : tail += N; head -= N; origin += N;
  - realign()  : origin = data
  - detect risk of wrapping of input : origin + head > data + size

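The operations listed above can be sketched as follows. This is only an
illustration under several assumptions : the names struct sketch_buf and sk_*()
are invented for the example, the storage is a small fixed array, recv() and
send() are replaced by plain memory copies, forward() is assumed to also shrink
head by N so that head + tail keeps matching the old buf->l, and realign() is
only allowed on an empty buffer :

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SIZE 16      /* hypothetical storage size, small on purpose */

struct sketch_buf {
    char *origin;           /* absolute: first byte not scheduled for forwarding */
    size_t head;            /* input bytes starting at origin */
    size_t tail;            /* output bytes ending right before origin */
    char data[SKETCH_SIZE];
};

/* Hand-made modulo, valid for any offset below 2 * SKETCH_SIZE. */
static size_t sk_wrap(size_t off)
{
    return off >= SKETCH_SIZE ? off - SKETCH_SIZE : off;
}

/* Offset of the origin from the beginning of the storage area. */
static size_t sk_org(const struct sketch_buf *b)
{
    return (size_t)(b->origin - b->data);
}

static void sk_init(struct sketch_buf *b)
{
    b->origin = b->data;
    b->head = b->tail = 0;
}

/* What buf->l used to hold. */
static size_t sk_len(const struct sketch_buf *b)
{
    return b->head + b->tail;
}

/* recv(): store up to <len> bytes at origin + head, returns bytes stored. */
static size_t sk_recv(struct sketch_buf *b, const char *src, size_t len)
{
    size_t pos = sk_wrap(sk_org(b) + b->head);
    size_t i;

    if (len > SKETCH_SIZE - sk_len(b))
        len = SKETCH_SIZE - sk_len(b);
    for (i = 0; i < len; i++, pos = sk_wrap(pos + 1))
        b->data[pos] = src[i];
    b->head += len;
    return len;
}

/* send(): take up to <len> bytes from origin - tail, returns bytes taken. */
static size_t sk_send(struct sketch_buf *b, char *dst, size_t len)
{
    size_t pos = sk_wrap(sk_org(b) + SKETCH_SIZE - b->tail);
    size_t i;

    if (len > b->tail)
        len = b->tail;
    for (i = 0; i < len; i++, pos = sk_wrap(pos + 1))
        dst[i] = b->data[pos];
    b->tail -= len;
    return len;
}

/* forward(N): move N bytes from the input side to the output side. */
static void sk_forward(struct sketch_buf *b, size_t n)
{
    assert(n <= b->head);
    b->tail += n;
    b->head -= n;
    b->origin = b->data + sk_wrap(sk_org(b) + n);
}

/* realign(): origin = data (assumption: only done on an empty buffer). */
static void sk_realign(struct sketch_buf *b)
{
    assert(sk_len(b) == 0);
    b->origin = b->data;
}

/* Same test as "origin + head > data + size", written with offsets. */
static int sk_input_wraps(const struct sketch_buf *b)
{
    return sk_org(b) + b->head > SKETCH_SIZE;
}
```

Note that no function here needs a separate length field or a last-read
pointer : everything is derived from origin, head and tail.
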
In general it looks like fewer pointers are manipulated for common operations,
and that an additional wrapping test (hand-made modulo) may have to be added
to the send() and recv() operations.

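One possible shape for this wrapping test on the recv() side is sketched
below : it computes how many bytes can be written in one contiguous block
before reaching either the end of the storage area or the oldest pending
output data. The function name contig_room() and its parameters are
assumptions made for this example, positions being expressed as offsets from
the beginning of the storage :

```c
#include <assert.h>
#include <stddef.h>

/* <origin_off> is the origin's offset in the storage area (< size), <head>
 * and <tail> the input and output byte counts around it, <size> the storage
 * size. Returns the largest contiguous block a single recv() may fill.
 */
static size_t contig_room(size_t origin_off, size_t head, size_t tail,
                          size_t size)
{
    size_t pos = origin_off + head;     /* where new input would land */
    size_t room = size - (head + tail); /* total free space */

    assert(origin_off < size && head + tail <= size);
    if (pos >= size)
        pos -= size;                    /* hand-made modulo */
    return room < size - pos ? room : size - pos;
}
```

A caller would cap the length passed to recv() at this value and call it a
second time if free space remains after the wrap.
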

3) Caveats
----------

The first caveat is that the elements to modify appear in a very large number
of places.