Willy Tarreau, 067fcef, 2015-08-06 15:31:23 +0200
2015/08/06 - server connection sharing

Improvements on the connection sharing strategies
-------------------------------------------------

4 strategies are currently supported :
  - never
  - safe
  - aggressive
  - always

The "aggressive" and "always" strategies take into account whether the
connection has already been reused at least once. The principle is that
second requests can be used to safely "validate" connection reuse on newly
added connections, and that such validated connections may be used even by
first requests from other sessions. A validated connection is a connection
which has already been reused, hence proving that it definitely supports
multiple requests. Such connections are easy to verify : after processing the
response, if the txn already had the TX_NOT_FIRST flag, then it was not the
first request over that connection, and it is validated as safe for reuse.
Validated connections are put into a distinct list : server->safe_conns.
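The validation step can be sketched as follows. This is only an illustration
under simplifying assumptions : the structures, the conn_push() helper, the
server_release_conn() function and the flag value are hypothetical stand-ins,
not the real definitions. Only the names TX_NOT_FIRST, idle_conns and
safe_conns come from the text above.

```c
#include <assert.h>
#include <stddef.h>

#define TX_NOT_FIRST 0x1u    /* hypothetical value, for illustration only */

struct conn {
    struct conn *next;       /* simplified singly-linked LIFO stack */
};

struct server {
    struct conn *idle_conns; /* reusable, but not validated yet */
    struct conn *safe_conns; /* already reused at least once */
};

struct txn {
    unsigned int flags;
};

/* Push a connection on top of a stack (LIFO). */
static void conn_push(struct conn **head, struct conn *c)
{
    assert(head != NULL && c != NULL);
    c->next = *head;
    *head = c;
}

/* Hypothetical hook called once the response has been processed and the
 * connection becomes idle again. Returns 1 if the connection was
 * validated (moved to safe_conns), 0 if it went to idle_conns.
 */
static int server_release_conn(struct server *srv, struct conn *c,
                               const struct txn *txn)
{
    if (txn->flags & TX_NOT_FIRST) {
        /* not the first request over this connection: reuse is proven */
        conn_push(&srv->safe_conns, c);
        return 1;
    }
    conn_push(&srv->idle_conns, c);
    return 0;
}
```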

Incoming requests with TX_NOT_FIRST first pick from the regular idle_conns
list so that any new idle connection is validated as soon as possible.

Incoming requests without TX_NOT_FIRST only pick from the safe_conns list for
strategy "aggressive", guaranteeing that the server properly supports connection
reuse, or first from the safe_conns list, then from the idle_conns list for
strategy "always".
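The lookup order described above may be summarized by the sketch below. It
rests on several assumptions which are not stated in the text : requests
carrying TX_NOT_FIRST are assumed to fall back to safe_conns when idle_conns
is empty, the "safe" strategy is assumed never to share a connection for a
first request, and "never" is assumed to disable sharing entirely. The types
and the server_pick_conn() function are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum reuse_strategy { REUSE_NEVER, REUSE_SAFE, REUSE_AGGRESSIVE, REUSE_ALWAYS };

struct conn {
    struct conn *next;       /* simplified singly-linked LIFO stack */
};

struct server {
    struct conn *idle_conns; /* reusable, but not validated yet */
    struct conn *safe_conns; /* validated connections */
};

/* Pop the most recently stacked connection, or NULL if the stack is empty. */
static struct conn *conn_pop(struct conn **head)
{
    struct conn *c;

    assert(head != NULL);
    c = *head;
    if (c)
        *head = c->next;
    return c;
}

/* Returns a shared connection, or NULL if a new one must be opened.
 * <not_first> is non-zero when the request carries TX_NOT_FIRST.
 */
static struct conn *server_pick_conn(struct server *srv,
                                     enum reuse_strategy strategy,
                                     int not_first)
{
    struct conn *c;

    if (strategy == REUSE_NEVER)
        return NULL;

    if (not_first) {
        /* validate fresh idle connections as soon as possible */
        c = conn_pop(&srv->idle_conns);
        return c ? c : conn_pop(&srv->safe_conns);
    }

    switch (strategy) {
    case REUSE_AGGRESSIVE:
        return conn_pop(&srv->safe_conns);
    case REUSE_ALWAYS:
        c = conn_pop(&srv->safe_conns);
        return c ? c : conn_pop(&srv->idle_conns);
    default:
        /* REUSE_SAFE: a first request never shares (assumption) */
        return NULL;
    }
}
```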

Connections are always stacked into the list (LIFO) so that there are higher
chances to convert recent connections and to use them. This will first optimize
the likelihood that the connection works, and will prevent TCP metrics from
being lost due to an idle state, and/or the congestion window from dropping and
the connection going back to slow start mode.


Handling connections in pools
-----------------------------

A per-server "pool-max" setting should be added to permit keeping unused idle
connections, no longer attached to a session, available for use by future
requests. The principle will be that attached connections are queued at the
front of the list while the detached connections will be queued at the tail of
the list.

This way, most reused connections will be fairly recent and detached connections
will most often be ignored. The number of detached idle connections in the lists
should be accounted for (pool_used) and limited (pool_max).
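A possible shape for this queueing rule is sketched below. Everything except
the pool_used and pool_max names is an assumption made for the example : the
circular list layout, the owner field, the pool_init() and pool_queue()
helpers, and the choice of refusing a detached connection once pool_max is
reached (leaving it to the caller to close it).

```c
#include <assert.h>
#include <stddef.h>

struct conn {
    struct conn *prev, *next;
    void *owner;             /* owning session, NULL when detached */
};

struct pool {
    struct conn head;        /* sentinel of a circular doubly-linked list */
    unsigned int pool_used;  /* detached idle connections currently queued */
    unsigned int pool_max;   /* limit on detached idle connections */
};

static void pool_init(struct pool *p, unsigned int pool_max)
{
    p->head.prev = p->head.next = &p->head;
    p->head.owner = NULL;
    p->pool_used = 0;
    p->pool_max = pool_max;
}

/* Insert <c> right after <pos>. */
static void list_insert_after(struct conn *pos, struct conn *c)
{
    c->prev = pos;
    c->next = pos->next;
    pos->next->prev = c;
    pos->next = c;
}

/* Queue an idle connection: attached ones at the front, detached ones at
 * the tail. Returns 1 if queued, 0 if a detached connection was refused
 * because the pool is full.
 */
static int pool_queue(struct pool *p, struct conn *c)
{
    assert(p != NULL && c != NULL);

    if (c->owner) {
        list_insert_after(&p->head, c);       /* front of the list */
        return 1;
    }
    if (p->pool_used >= p->pool_max)
        return 0;
    list_insert_after(p->head.prev, c);       /* tail of the list */
    p->pool_used++;
    return 1;
}
```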

After some time, a part of these detached idle connections should be killed.
For this, the list is walked from tail to head and connections without an owner
may be evicted. It may be useful to have a per-server pool_min setting
indicating how many idle connections should remain in the pool, ready for use
by new requests. Conversely, a pool_low metric should be kept between eviction
runs, to indicate the lowest number of detached connections that were found in
the pool.

For eviction, the principle of a half-life is appealing. The principle is
simple : over a period of time, half of the connections between pool_min and
pool_low should be gone. Since pool_low indicates how many connections were
remaining unused over a period, it makes sense to kill some of them.

In order to avoid killing thousands of connections in one run, the purge
interval should be split into smaller batches. Let's call N the ratio of the
half-life interval to the effective interval.

The algorithm consists in walking over them from the end every interval and
killing ((pool_low - pool_min) + 2 * N - 1) / (2 * N) connections. It ensures
that half of the unused connections are killed over the half-life period, in N
batches of at most population/2N entries each (rounded up).
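The batch size above is the integer ceiling of (pool_low - pool_min) / (2 * N).
A minimal sketch of the computation follows. The function name is
hypothetical, and the guard against pool_low being below pool_min is an
addition made here to avoid an unsigned underflow; it is not part of the
formula itself.

```c
#include <assert.h>

/* Number of detached idle connections to kill during one eviction run.
 * <n> is the ratio of the half-life interval to the effective interval.
 */
static unsigned int pool_kill_count(unsigned int pool_low,
                                    unsigned int pool_min,
                                    unsigned int n)
{
    assert(n > 0);

    if (pool_low <= pool_min)
        return 0;            /* nothing above the configured minimum */

    /* ceil((pool_low - pool_min) / (2 * n)) using integer arithmetic */
    return ((pool_low - pool_min) + 2 * n - 1) / (2 * n);
}
```

For example, with pool_low = 1000, pool_min = 100 and N = 10, each run kills
(900 + 19) / 20 = 45 connections, that is 450 over the ten runs of a half-life
period if pool_low does not change in between, which is half of the 900
connections in excess.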

Unsafe connections should be evicted first. There should be quite few of them
since most of them are probed and become safe. Since detached connections are
quickly recycled and attached to a new session, there should not be too many
detached connections in the pool, and those present there may be killed really
quickly.

Another interesting point of pools is that when pool-max is non-zero, it
makes sense to automatically enable pretend-keep-alive on non-private connections
going to the server in order to be able to feed them back into the pool. With
the "aggressive" or "always" strategies, it can allow clients making a single
request over their connection to share persistent connections to the servers.



Willy Tarreau, c14b7d9, 2014-06-19 16:03:41 +0200
2013/10/17 - server connection management and reuse

Current state
-------------

At the moment, a connection entity is needed to carry any address
information. This means that in the following situations, we need a server
connection :

- server is elected and the server's destination address is set

- transparent mode is elected and the destination address is set from
  the incoming connection

- proxy mode is enabled, and the destination's address is set during
  the parsing of the HTTP request

- connection to the server fails and must be retried on the same
  server using the same parameters, especially the destination
  address (SN_ADDR_SET not removed)


On the accepting side, we have further requirements :

- allocate a clean connection without a stream interface

- incrementally set the accepted connection's parameters without
  clearing it, and keep track of what is set (e.g. getsockname).

- initialize a stream interface in established mode

- attach the accepted connection to a stream interface


This means several things :

- the connection has to be allocated on the fly the first time it is
  needed to store the source or destination address ;

- the connection has to be attached to the stream interface at this
  moment ;

- it must be possible to incrementally set some settings on the
  connection's addresses regardless of the connection's current state ;

- the connection must not be released across connection retries ;

- it must be possible to clear a connection's parameters for a
  redispatch without having to detach/attach the connection ;

- we need to allocate a connection without an existing stream interface

So on the accept() side, it looks like this :

    fd = accept();
    conn = new_conn();
    get_some_addr_info(&conn->addr);
    ...
    si = new_si();
    si_attach_conn(si, conn);
    si_set_state(si, SI_ST_EST);
    ...
    get_more_addr_info(&conn->addr);

On the connect() side, it looks like this :

    si = new_si();
    while (!properly_connected) {
        if (!(conn = si->end)) {
            conn = new_conn();
            conn_clear(conn);
            si_attach_conn(si, conn);
        }
        else {
            if (connected) {
                f = conn->flags & CO_FL_XPRT_TRACKED;
                conn->flags &= ~CO_FL_XPRT_TRACKED;
                conn_close(conn);
                conn->flags |= f;
            }
            if (!correct_dest)
                conn_clear(conn);
        }
        set_some_addr_info(&conn->addr);
        si_set_state(si, SI_ST_CON);
        ...
        set_more_addr_info(&conn->addr);
        conn->connect();
        if (must_retry) {
            close_conn(conn);
        }
    }

Note: we need to be able to set the control and transport protocols.
On outgoing connections, this is set once we know the destination address.
On incoming connections, this is set as early as possible (once we know
the source address).

The problem analysed below was solved on 2013/10/22

| ==> the real requirement is to know whether a connection is still valid or not
|     before deciding to close it. CO_FL_CONNECTED could be enough, though it
|     will not indicate connections that are still waiting for a connect to occur.
|     This combined with CO_FL_WAIT_L4_CONN and CO_FL_WAIT_L6_CONN should be OK.
|
| Alternatively, conn->xprt could be used for this, but needs some careful checks
| (it's used by conn_full_close at least).
|
| Right now, conn_xprt_close() checks conn->xprt and sets it to NULL.
| conn_full_close() also checks conn->xprt and sets it to NULL, except
| that the check on ctrl is performed within xprt. So conn_xprt_close()
| followed by conn_full_close() will not close the file descriptor.
| Note that conn_xprt_close() is never called, maybe we should kill it ?
|
| Note: at the moment, it's problematic to leave conn->xprt set to NULL before
| doing xprt_init() because we might end up with a pending file descriptor. Or at
| least with some transport not de-initialized. We might thus need
| conn_xprt_close() when conn_xprt_init() fails.
|
| The fd should be conditioned by ->ctrl only, and the transport layer by ->xprt.
|
|   - conn_prepare_ctrl(conn, ctrl)
|   - conn_prepare_xprt(conn, xprt)
|   - conn_prepare_data(conn, data)
|
| Note: conn_xprt_init() needs conn->xprt so it's not a problem to set it early.
|
| One problem might be with conn_xprt_close() not being able to know if xprt_init()
| was called or not. That's where it might make sense to only set ->xprt during init.
| Except that it does not fly with outgoing connections (xprt_init is called after
| connect()).
|
| => currently conn_xprt_close() is only used by ssl_sock.c and decides whether
|    to do something based on ->xprt_ctx which is set by ->init() from xprt_init().
|    So there is nothing to worry about. We just need to restore conn_xprt_close()
|    and rely on ->ctrl to close the fd instead of ->xprt.
|
| => we have the same issue with conn_ctrl_close() : when is the fd supposed to be
|    valid ? On outgoing connections, the control is set long before the fd...
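
The validity test suggested at the start of the analysis above (CO_FL_CONNECTED
combined with CO_FL_WAIT_L4_CONN and CO_FL_WAIT_L6_CONN) could look like the
sketch below. The flag values, the reduced struct and the conn_is_active()
name are hypothetical and only serve the illustration; the real definitions
differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values, for illustration only. */
#define CO_FL_CONNECTED     0x01u  /* connection is established */
#define CO_FL_WAIT_L4_CONN  0x02u  /* waiting for the L4 connect to complete */
#define CO_FL_WAIT_L6_CONN  0x04u  /* waiting for the L6 handshake to complete */

struct conn {
    unsigned int flags;
};

/* Returns non-zero if the connection is established or still being
 * established, i.e. if there is something to close before reusing it.
 */
static int conn_is_active(const struct conn *c)
{
    assert(c != NULL);
    return (c->flags & (CO_FL_CONNECTED |
                        CO_FL_WAIT_L4_CONN |
                        CO_FL_WAIT_L6_CONN)) != 0;
}
```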