HAProxy's peers v2.0 protocol 08/18/2016

Author: Emeric Brun ebrun@haproxy.com


I) Encoded Integer and Bitfield.


        0 <= X < 240        : 1 byte  (7.875 bits) [ XXXX XXXX ]
      240 <= X < 2288       : 2 bytes (11 bits)    [ 1111 XXXX ] [ 0XXX XXXX ]
     2288 <= X < 264432     : 3 bytes (18 bits)    [ 1111 XXXX ] [ 1XXX XXXX ]   [ 0XXX XXXX ]
   264432 <= X < 33818864   : 4 bytes (25 bits)    [ 1111 XXXX ] [ 1XXX XXXX ]*2 [ 0XXX XXXX ]
 33818864 <= X < 4328786160 : 5 bytes (32 bits)    [ 1111 XXXX ] [ 1XXX XXXX ]*3 [ 0XXX XXXX ]
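
The ranges above can be reproduced with a short encoder/decoder pair. The following is a minimal sketch, not a reference implementation: the function names are hypothetical, and the arithmetic (subtracting 240, then 128, before each shift) is chosen so that the byte layouts and range boundaries match the table.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer using the byte layouts of the table."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value < 240:
        return bytes([value])                 # [ XXXX XXXX ]
    out = bytearray([240 | (value & 0x0F)])   # [ 1111 XXXX ]
    value = (value - 240) >> 4
    while value >= 128:
        out.append(128 | (value & 0x7F))      # [ 1XXX XXXX ]
        value = (value - 128) >> 7
    out.append(value)                         # [ 0XXX XXXX ]
    return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple:
    """Return (value, next_pos) for the encoded integer starting at buf[pos]."""
    value = buf[pos]
    pos += 1
    if value < 240:
        return value, pos
    shift = 4
    while True:
        byte = buf[pos]
        pos += 1
        value += byte << shift
        shift += 7
        if byte < 128:                        # high bit clear: last byte
            return value, pos
```

For instance, encode_varint(239) yields a single byte while encode_varint(240) yields two, which is the first boundary of the table.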



II) Handshake

Each peer tries to connect to every other peer, and each peer listens
for incoming connections from the other peers.


     Client                          Server
               Hello Message
          ------------------------>

               Status Message
          <------------------------

1) Hello Message

Hello message is composed of 3 lines:

<protocol> <version>
<remotepeerid>
<localpeerid> <processpid> <relativepid>

protocol: current value is "HAProxyS"
version: current value is "2.0"
remotepeerid: is the name of the target peer as defined in the configuration peers section.
localpeerid: is the name of the local peer as defined on cmdline or using hostname.
processpid: is the system process id of the local process.
relativepid: is the haproxy's relative pid (0 if nbproc == 1)
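
As an illustration, the three lines can be assembled as follows. This is only a sketch: build_hello and its parameter names are hypothetical, and terminating every line with a LF is an assumption (the text above only states that the message has 3 lines).

```python
def build_hello(remote_peer: str, local_peer: str, process_pid: int,
                relative_pid: int = 0, protocol: str = "HAProxyS",
                version: str = "2.0") -> bytes:
    """Build the 3-line Hello message described above."""
    lines = [
        f"{protocol} {version}",
        remote_peer,
        f"{local_peer} {process_pid} {relative_pid}",
    ]
    # Assumption: each of the 3 lines is terminated by a LF.
    return ("\n".join(lines) + "\n").encode("ascii")
```

With made-up peer names, build_hello("peerB", "peerA", 1234) gives the bytes of "HAProxyS 2.0", "peerB" and "peerA 1234 0", one per line.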

2) Status Message

Status message is a code followed by a LF.

200: Handshake succeeded
300: Try again later
501: Protocol error
502: Bad version
503: Local peer name mismatch
504: Remote peer name mismatch
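
On the connecting side, the reply can be mapped back to the table above. A minimal sketch, with hypothetical names (STATUS_CODES, parse_status), assuming the code is sent as ASCII decimal digits:

```python
STATUS_CODES = {
    200: "Handshake succeeded",
    300: "Try again later",
    501: "Protocol error",
    502: "Bad version",
    503: "Local peer name mismatch",
    504: "Remote peer name mismatch",
}


def parse_status(line: bytes) -> tuple:
    """Return (code, meaning) for a status message such as b"200\\n"."""
    if not line.endswith(b"\n"):
        raise ValueError("status message must end with a LF")
    # Assumption: the code is written as ASCII decimal digits.
    code = int(line[:-1].decode("ascii"))
    return code, STATUS_CODES.get(code, "unknown status code")
```

Only code 200 means that the handshake succeeded and that messages can be exchanged on the connection.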


III) Messages

Every message starts with a one-byte class and a one-byte type:

  0 - - - - - - - 8 - - - - - - - 16
   Message Class|  Message Type |

If Message Type >= 128, an encoded data length and the data follow:

  0 - - - - - - - 8 - - - - - - - 16 .....
   Message Class|  Message Type | encoded data length | data
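
The framing rule can be sketched as a small parser. The helper names are hypothetical, and decode_varint is an assumed decoder chosen to match the byte layouts of section "Encoded Integer and Bitfield"; it is repeated here so that the block stays self-contained.

```python
def decode_varint(buf: bytes, pos: int = 0) -> tuple:
    """Return (value, next_pos) for an encoded integer (assumed decoder)."""
    value = buf[pos]
    pos += 1
    if value < 240:
        return value, pos
    shift = 4
    while True:
        byte = buf[pos]
        pos += 1
        value += byte << shift
        shift += 7
        if byte < 128:
            return value, pos


def parse_message(buf: bytes, pos: int = 0) -> tuple:
    """Return (msg_class, msg_type, data, next_pos) for the message at buf[pos]."""
    msg_class, msg_type = buf[pos], buf[pos + 1]
    pos += 2
    data = b""
    if msg_type >= 128:
        # Only these messages carry an encoded data length and data.
        length, pos = decode_varint(buf, pos)
        if pos + length > len(buf):
            raise ValueError("truncated message")
        data = buf[pos:pos + length]
        pos += length
    return msg_class, msg_type, data, pos
```

A message whose type is below 128 is therefore exactly 2 bytes long, and the parser can resume at next_pos for the following message.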

Message Classes:
  0: control
  1: error
  10: related to stick table updates
  255: reserved


1) Control Messages Class

Available message Types for class control:
  0: resync request
  1: resync finished
  2: resync partial
  3: resync confirm


a) Resync Request Message

This message is used to request a full resync from a peer.

b) Resync Finished Message

This message signals to the remote peer that all locally known updates have been pushed, and that the local peer considers itself up to date.

c) Resync Partial Message

This message signals to the remote peer that all locally known updates have been pushed, but that the local peer does not consider itself up to date.

d) Resync Confirm Message

This message is an ack for Resync Partial or Resync Finished Messages.

It allows the remote peer to go back to the "on the fly" update process.


2) Error Messages Class

Available message Types for this class are:
  0: protocol error
  1: size limit reached

a) Protocol Error Message

Signals that a protocol error occurred. The connection will be shut down just after sending this message.

b) Size Limit Error Message

Signals that a message is oversized and cannot be correctly handled. The connection will be broken.



3) Stick Table Updates Messages Class

Available message Types for this class are:
  0: entry update
  1: incremental entry update
  2: table definition
  3: table switch
  4: update ack


a) Update Message

  0 - - - - - - - 8 - - - - - - - 16 .....
   Message class |  Message Type | encoded data length | data


data is composed like this:

  0 - - - - - - - 32 .............................
   Local Update ID | Key value | data values ....

Update ID is a 32-bit identifier of the local update.

The Key value format depends on the table key type:

 - for keytype string

  0 .................................
   encoded string length | string value

 - for keytype integer

  0 - - - - - - - - - - 32
   encoded integer value |

 - for other key types

  The value length is announced in the table definition message.

  0 ....................
   value


b) Incremental Update Message

Same format as the Update Message, except that the Update ID is not present. The receiver should
consider that the update ID is the ID of the previously considered update message (incremental or not), incremented by 1.
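
On the receiving side, this rule amounts to remembering the last update ID seen. A minimal sketch follows; the class name is hypothetical, and wrapping the counter at 32 bits is an assumption derived from the 32-bit width of the Update ID field.

```python
class UpdateIdTracker:
    """Track the update ID implied by Update and Incremental Update Messages."""

    def __init__(self):
        self.last_id = None

    def on_update(self, update_id: int) -> int:
        # An Update Message carries its ID explicitly.
        self.last_id = update_id & 0xFFFFFFFF
        return self.last_id

    def on_incremental_update(self) -> int:
        # An Incremental Update Message carries no ID: previous ID + 1.
        if self.last_id is None:
            raise ValueError("incremental update received before any update")
        # Assumption: the ID wraps around at 32 bits.
        self.last_id = (self.last_id + 1) & 0xFFFFFFFF
        return self.last_id
```

After on_update(41), two successive calls to on_incremental_update() return 42 and then 43.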


c) Table Definition Message

This message is used by the receiver to identify the stick table concerned by the next update messages and
to know which data is pushed in these updates.

  0 - - - - - - - 8 - - - - - - - 16 .....
   Message class |  Message Type | encoded data length | data


data is composed like this:

  0 ...................................................................
   Encoded Sender Table Id | Encoded Local Table Name Length | Table Name | Encoded Table Type | Encoded Table Keylen | Encoded Table Data Types Bitfield


Encoded Sender Table Id is the numerical ID assigned to that table by the sender.
It will be used by "Update Ack Messages" and "Table Switch Messages".

Encoded Local Table Name Length gives the length of the table name that follows.

"Table Name" is the shared identifier of the table (name of the current table in the configuration).
It permits the receiver to identify the concerned table. The receiver should keep in memory the mapping
between the "Sender Table Id" and the table, to identify it directly in case of a "Table Switch Message".

Table Type gives the numeric type of the key used to store stick table entries:
 integer
  2: signed integer
  4: IPv4 address
  5: IPv6 address
  6: string
  7: binary

Table Keylen gives the key length, or the maximum length in case of strings or binary (padded with 0).

Data Types Bitfield gives the types of data pushed in the next update messages. The values are pushed linearly in the update message.
Known types are:
 bit
  0: server id
  1: gpt0
  2: gpc0
  3: gpc0 rate
  4: connections counter
  5: connection rate
  6: number of current connections
  7: sessions counter
  8: session rate
  9: http requests counter
  10: http requests rate
  11: errors counter
  12: errors rate
  13: bytes in counter
  14: bytes in rate
  15: bytes out counter
  16: bytes out rate
  17: gpc1
  18: gpc1 rate
  19: server key
  20: http fail counter
  21: http fail rate
  22: gpt array
  23: gpc array
  24: gpc rate array
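
Once decoded, the bitfield can be expanded into the list of data types announced for the table. The sketch below uses hypothetical names; it takes bit 15 to be the bytes out counter, by analogy with the other counter/rate pairs, and it returns the types in increasing bit order, which is an assumption about what "linearly" means.

```python
DATA_TYPES = [
    "server id", "gpt0", "gpc0", "gpc0 rate", "connections counter",
    "connection rate", "number of current connections", "sessions counter",
    "session rate", "http requests counter", "http requests rate",
    "errors counter", "errors rate", "bytes in counter", "bytes in rate",
    "bytes out counter", "bytes out rate", "gpc1", "gpc1 rate", "server key",
    "http fail counter", "http fail rate", "gpt array", "gpc array",
    "gpc rate array",
]


def data_types_in_bitfield(bitfield: int) -> list:
    """List the data types whose bit is set, in increasing bit order."""
    names = []
    bit = 0
    while bitfield >> bit:
        if (bitfield >> bit) & 1:
            if bit < len(DATA_TYPES):
                names.append(DATA_TYPES[bit])
            else:
                names.append(f"unknown (bit {bit})")
        bit += 1
    return names
```

For example, a bitfield with bits 0 and 2 set announces "server id" followed by "gpc0".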

d) Table Switch Message

Once a table has been announced by a Table Definition Message, this message lets the receiver identify the stick table concerned by the next update messages.

  0 - - - - - - - 8 - - - - - - - 16 .....
   Message class |  Message Type | encoded data length | data


data is composed like this:

  0 .....................
   encoded Sender Table Id

e) Update Ack Message

  0 - - - - - - - 8 - - - - - - - 16 .....
   Message class |  Message Type | encoded data length | data

data is composed like this:

  0 ....................... - - - - - - - - 32
   Encoded Remote Table Id | Update Id


Remote Table Id is the numeric identifier of the table on the remote side.
Update Id is the id of the last update locally committed.

If a reconnection occurs, the sender will know that it has to restart the push of updates from this point.

IV) Initial full resync process.


a) Resync from local old process

An old soft-stopped process will close all established sessions with remote peers and will try to connect to a new
local process to push all known data, ending with a Resync Finished Message or a Resync Partial Message (if it does not consider itself as fully updated).

A new process will wait for an incoming connection from a local process for 5 seconds. It will learn the updates from this
process until it receives a Resync Finished Message or a Resync Partial Message. If it receives a Resync Finished Message, it will consider itself
as fully updated and stop asking for resync. If it receives a Resync Partial Message, it will wait once again for 5 seconds for another incoming connection from a local process.
The same applies if the session is broken before any "Resync Partial Message" or "Resync Finished Message" is received.

If one of these 5-second timeouts expires, the process will try to request a resync from a connected remote peer (see b). The process will wait up to 5 seconds
if no remote peer is available.

If this timeout expires, the process will consider itself as fully updated.

b) Resync from remote peers

The process will randomly choose a connected remote peer and ask for a full resync using a Resync Request Message. The process will wait up to 5 seconds
if no remote peer is available.

The chosen remote peer will push all its known data, ending with a Resync Finished Message or a Resync Partial Message (if it does not consider itself as fully updated).

If the process receives a Resync Finished Message, it will consider itself as fully updated and stop asking for resync.

If it receives a Resync Partial Message, the current peer will be flagged so that it is not requested again, and another connected peer will be randomly chosen for a resync request (5s).

If the session is broken before receiving any of these messages, another connected peer will be randomly chosen for a resync request (5s).

If the timeout expires, the process will consider itself as fully updated.
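
The peer selection described in b) can be sketched as follows. This is a simplified illustration, not the actual implementation: the function names, the string values used for the replies and the set of flagged peers are all hypothetical, and the 5-second timers are left out.

```python
import random


def choose_resync_peer(connected, flagged, rng=random):
    """Randomly pick a connected peer that has not answered with Resync Partial."""
    candidates = [peer for peer in connected if peer not in flagged]
    return rng.choice(candidates) if candidates else None


def on_resync_reply(peer, reply, flagged):
    """Return True when the local process may consider itself fully updated.

    reply is "finished", "partial" or "broken" (session lost before any reply).
    """
    if reply == "finished":
        return True
    if reply == "partial":
        flagged.add(peer)   # this peer must not be requested again
    return False            # caller picks another peer with choose_resync_peer()
```

When choose_resync_peer() returns None, no remote peer is available and the caller falls back to the timeout described above.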