| 1 | ----------------------- |
| 2 | HAProxy Starter Guide |
| 3 | ----------------------- |
| 4 | version 2.5 |
| 5 | |
| 6 | |
| 7 | This document is an introduction to HAProxy for all those who don't know it, as |
| 8 | well as for those who want to re-discover it when they know older versions. Its |
| 9 | primary focus is to provide users with all the elements to decide if HAProxy is |
| 10 | the product they're looking for or not. Advanced users may find here some parts |
| 11 | of solutions to some ideas they had just because they were not aware of a given |
| 12 | new feature. Some sizing information is also provided, the product's lifecycle |
| 13 | is explained, and comparisons with partially overlapping products are provided. |
| 14 | |
| 15 | This document doesn't provide any configuration help or hints, but it explains |
| 16 | where to find the relevant documents. The summary below is meant to help you |
| 17 | search sections by name and navigate through the document. |
| 18 | |
| 19 | Note to documentation contributors : |
| 20 | This document is formatted with 80 columns per line, with even number of |
| 21 | spaces for indentation and without tabs. Please follow these rules strictly |
| 22 | so that it remains easily printable everywhere. If you add sections, please |
| 23 | update the summary below for easier searching. |
| 24 | |
| 25 | |
| 26 | Summary |
| 27 | ------- |
| 28 | |
| 29 | 1. Available documentation |
| 30 | |
| 31 | 2. Quick introduction to load balancing and load balancers |
| 32 | |
| 33 | 3. Introduction to HAProxy |
| 34 | 3.1. What HAProxy is and is not |
| 35 | 3.2. How HAProxy works |
| 36 | 3.3. Basic features |
| 37 | 3.3.1. Proxying |
| 38 | 3.3.2. SSL |
| 39 | 3.3.3. Monitoring |
| 40 | 3.3.4. High availability |
| 41 | 3.3.5. Load balancing |
| 42 | 3.3.6. Stickiness |
| 43 | 3.3.7. Sampling and converting information |
| 44 | 3.3.8. Maps |
| 45 | 3.3.9. ACLs and conditions |
| 46 | 3.3.10. Content switching |
| 47 | 3.3.11. Stick-tables |
| 48 | 3.3.12. Formatted strings |
| 49 | 3.3.13. HTTP rewriting and redirection |
| 50 | 3.3.14. Server protection |
| 51 | 3.3.15. Logging |
| 52 | 3.3.16. Statistics |
| 53 | 3.4. Advanced features |
| 54 | 3.4.1. Management |
| 55 | 3.4.2. System-specific capabilities |
| 56 | 3.4.3. Scripting |
| 57 | 3.5. Sizing |
| 58 | 3.6. How to get HAProxy |
| 59 | |
| 60 | 4. Companion products and alternatives |
| 61 | 4.1. Apache HTTP server |
| 62 | 4.2. NGINX |
| 63 | 4.3. Varnish |
| 64 | 4.4. Alternatives |
| 65 | |
| 66 | 5. Contacts |
| 67 | |
| 68 | |
| 69 | 1. Available documentation |
| 70 | -------------------------- |
| 71 | |
| 72 | The complete HAProxy documentation is contained in the following documents. |
| 73 | Please make sure to consult the relevant documentation to save time and to get |
| 74 | the most accurate response to your needs. Also please refrain from sending to |
| 75 | the mailing list questions whose answers are present in these documents. |
| 76 | |
| 77 | - intro.txt (this document) : it presents the basics of load balancing, |
| 78 | HAProxy as a product, what it does, what it doesn't do, some known traps to |
| 79 | avoid, some OS-specific limitations, how to get it, how it evolves, how to |
| 80 | ensure you're running with all known fixes, how to update it, complements |
| 81 | and alternatives. |
| 82 | |
| 83 | - management.txt : it explains how to start haproxy, how to manage it at |
| 84 | runtime, how to manage it on multiple nodes, and how to proceed with |
| 85 | seamless upgrades. |
| 86 | |
| 87 | - configuration.txt : the reference manual details all configuration keywords |
| 88 | and their options. It is used when a configuration change is needed. |
| 89 | |
| 90 | - coding-style.txt : this is for developers who want to propose some code to |
| 91 | the project. It explains the style to adopt for the code. It is not very |
| 92 | strict and not all the code base completely respects it, but contributions |
| 93 | which diverge too much from it will be rejected. |
| 94 | |
| 95 | - proxy-protocol.txt : this is the de-facto specification of the PROXY |
| 96 | protocol which is implemented by HAProxy and a number of third party |
| 97 | products. |
| 98 | |
| 99 | - README : how to build HAProxy from sources |
| 100 | |
| 101 | |
| 102 | 2. Quick introduction to load balancing and load balancers |
| 103 | ---------------------------------------------------------- |
| 104 | |
| 105 | Load balancing consists in aggregating multiple components in order to achieve |
| 106 | a total processing capacity above each component's individual capacity, without |
| 107 | any intervention from the end user and in a scalable way. This results in more |
| 108 | operations being performed simultaneously by the time it takes a component to |
| 109 | perform only one. A single operation however will still be performed on a single |
| 110 | component at a time and will not get faster than without load balancing. It |
| 111 | always requires at least as many operations as available components and an |
| 112 | efficient load balancing mechanism to make use of all components and to fully |
| 113 | benefit from the load balancing. A good example of this is the number of lanes |
| 114 | on a highway which allows as many cars to pass during the same time frame |
| 115 | without increasing their individual speed. |
| 116 | |
| 117 | Examples of load balancing : |
| 118 | |
| 119 | - Process scheduling in multi-processor systems |
| 120 | - Link load balancing (e.g. EtherChannel, Bonding) |
| 121 | - IP address load balancing (e.g. ECMP, DNS round-robin) |
| 122 | - Server load balancing (via load balancers) |
| 123 | |
| 124 | The mechanism or component which performs the load balancing operation is |
| 125 | called a load balancer. In web environments these components are called a |
| 126 | "network load balancer", and more commonly a "load balancer" given that this |
| 127 | activity is by far the best known case of load balancing. |
| 128 | |
| 129 | A load balancer may act : |
| 130 | |
| 131 | - at the link level : this is called link load balancing, and it consists in |
| 132 | choosing what network link to send a packet to; |
| 133 | |
| 134 | - at the network level : this is called network load balancing, and it |
| 135 | consists in choosing what route a series of packets will follow; |
| 136 | |
| 137 | - at the server level : this is called server load balancing and it consists |
| 138 | in deciding what server will process a connection or request. |
| 139 | |
| 140 | Two distinct technologies exist and address different needs, though with some |
| 141 | overlapping. In each case it is important to keep in mind that load balancing |
| 142 | consists in diverting the traffic from its natural flow and that doing so always |
| 143 | requires a minimum of care to maintain the required level of consistency between |
| 144 | all routing decisions. |
| 145 | |
| 146 | The first one acts at the packet level and processes packets more or less |
| 147 | individually. There is a 1-to-1 relation between input and output packets, so |
| 148 | it is possible to follow the traffic on both sides of the load balancer using a |
| 149 | regular network sniffer. This technology can be very cheap and extremely fast. |
| 150 | It is usually implemented in hardware (ASICs), which allows it to reach line |
| 151 | rate, as in switches doing ECMP. Usually stateless, it can also be stateful (it |
| 152 | then considers the session a packet belongs to and is called layer4-LB or L4), |
| 153 | may support DSR (direct server return, without passing through the LB again) if |
| 154 | the packets were not modified, but provides almost no content awareness. This |
| 155 | technology is very well suited to network-level load balancing, though it is |
| 156 | sometimes used for very basic server load balancing at high speed. |
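
The stateless packet-level approach can be sketched as a hash computed over the
fields identifying a flow. This is only an illustration: the function name, the
path names and the use of SHA-256 are assumptions made for the example, and
real hardware typically relies on much cheaper hash functions.

```python
import hashlib


def pick_path(src_ip, src_port, dst_ip, dst_port, proto, paths):
    """Map a flow's 5-tuple to one of the available paths, without state."""
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}/{proto}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 4 bytes of the digest as an integer index into the paths.
    index = int.from_bytes(digest[:4], "big") % len(paths)
    return paths[index]


# Hypothetical path names and documentation-range addresses.
paths = ["link-a", "link-b", "link-c"]
flow = ("192.0.2.10", 40000, "198.51.100.5", 443, "tcp")

# Every packet of a given flow carries the same 5-tuple, so it always follows
# the same path even though nothing is remembered between packets.
first = pick_path(*flow, paths)
second = pick_path(*flow, paths)
```

Since the decision depends only on the packet headers, all packets of a flow
stay on the same path as long as the list of paths does not change.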
| 157 | |
| 158 | The second one acts on session contents. It requires that the input stream is |
| 159 | reassembled and processed as a whole. The contents may be modified, and the |
| 160 | output stream is segmented into new packets. For this reason it is generally |
| 161 | performed by proxies and they're often called layer 7 load balancers or L7. |
| 162 | This implies that there are two distinct connections on each side, and that |
| 163 | there is no relation between input and output packet sizes or counts. Clients |
| 164 | and servers are not required to use the same protocol (for example IPv4 vs |
| 165 | IPv6, clear vs SSL). The operations are always stateful, and the return traffic |
| 166 | must pass through the load balancer. The extra processing comes with a cost so |
| 167 | it's not always possible to achieve line rate, especially with small packets. |
| 168 | On the other hand, it offers wide possibilities and is generally achieved by |
| 169 | pure software, even if embedded into hardware appliances. This technology is |
| 170 | very well suited for server load balancing. |
| 171 | |
| 172 | Packet-based load balancers are generally deployed in cut-through mode, so they |
| 173 | are installed on the normal path of the traffic and divert it according to the |
| 174 | configuration. The return traffic doesn't necessarily pass through the load |
| 175 | balancer. Some modifications may be applied to the network destination address |
| 176 | in order to direct the traffic to the proper destination. In this case, it is |
| 177 | mandatory that the return traffic passes through the load balancer. If the |
| 178 | routes don't make this possible, the load balancer may also replace the |
| 179 | packets' source address with its own in order to force the return traffic to |
| 180 | pass through it. |
| 181 | |
| 182 | Proxy-based load balancers are deployed as a server with their own IP addresses |
| 183 | and ports, without architecture changes. Sometimes this requires making some |
| 184 | adaptations to the applications so that clients are properly directed to the |
| 185 | load balancer's IP address and not directly to the server's. Some load balancers |
| 186 | may have to adjust some servers' responses to make this possible (e.g. the HTTP |
| 187 | Location header field used in HTTP redirects). Some proxy-based load balancers |
| 188 | may intercept traffic for an address they don't own, and spoof the client's |
| 189 | address when connecting to the server. This allows them to be deployed as if |
| 190 | they were a regular router or firewall, in a cut-through mode very similar to |
| 191 | the packet based load balancers. This is particularly appreciated for products |
| 192 | which combine both packet mode and proxy mode. In this case DSR is obviously |
| 193 | still not possible and the return traffic still has to be routed back to the |
| 194 | load balancer. |
| 195 | |
| 196 | A very scalable layered approach would consist in having a front router which |
| 197 | receives traffic from multiple load balanced links, and uses ECMP to distribute |
| 198 | this traffic to a first layer of multiple stateful packet-based load balancers |
| 199 | (L4). These L4 load balancers in turn pass the traffic to an even larger number |
| 200 | of proxy-based load balancers (L7), which have to parse the contents to decide |
| 201 | what server will ultimately receive the traffic. |
| 202 | |
| 203 | The number of components and possible paths for the traffic increases the risk |
| 204 | of failure; in very large environments, it is even normal to permanently have |
| 205 | a few faulty components being fixed or replaced. Load balancing done without |
| 206 | awareness of the whole stack's health significantly degrades availability. For |
| 207 | this reason, any sane load balancer will verify that the components it intends |
| 208 | to deliver the traffic to are still alive and reachable, and it will stop |
| 209 | delivering traffic to faulty ones. This can be achieved using various methods. |
| 210 | |
| 211 | The most common one consists in periodically sending probes to ensure the |
| 212 | component is still operational. These probes are called "health checks". They |
| 213 | must be representative of the type of failure to address. For example a ping- |
| 214 | based check will not detect that a web server has crashed and doesn't listen to |
| 215 | a port anymore, while a connection to the port will verify this, and a more |
| 216 | advanced request may even validate that the server still works and that the |
| 217 | database it relies on is still accessible. Health checks often involve a few |
| 218 | retries to cover for occasional measuring errors. The period between checks |
| 219 | must be small enough to ensure the faulty component is not used for too long |
| 220 | after an error occurs. |
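
As a minimal sketch of such a probe, the code below pairs a TCP connection
check with a counter which only changes the component's state after several
consecutive results contradict it. The names "tcp_check", "HealthState", "fall"
and "rise" as well as the default values are assumptions chosen for the
illustration, not a description of how any given product implements its checks.

```python
import socket


def tcp_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class HealthState:
    """Track a component's state while tolerating occasional probe errors."""

    def __init__(self, fall=3, rise=2):
        self.fall = fall    # consecutive failures needed to mark it down
        self.rise = rise    # consecutive successes needed to mark it up
        self.up = True
        self.streak = 0     # consecutive results contradicting the state

    def report(self, success):
        """Record one probe result and return the resulting state."""
        if success == self.up:
            self.streak = 0
        else:
            self.streak += 1
            needed = self.fall if self.up else self.rise
            if self.streak >= needed:
                self.up = success
                self.streak = 0
        return self.up
```

A periodic task would call state.report(tcp_check(host, port)) once per check
interval and stop sending traffic to the component while the state is down.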
| 221 | |
| 222 | Other methods consist in sampling the production traffic sent to a destination |
| 223 | to observe if it is processed correctly or not, and to evict the components |
| 224 | which return inappropriate responses. However this requires sacrificing a part |
| 225 | of the production traffic and this is not always acceptable. A combination of |
| 226 | these two mechanisms provides the best of both worlds, with both of them being |
| 227 | used to detect a fault, and only health checks to detect the end of the fault. |
| 228 | A last method involves centralized reporting : a central monitoring agent |
| 229 | periodically updates all load balancers about all components' state. This gives |
| 230 | a global view of the infrastructure to all components, though sometimes with |
| 231 | less accuracy or responsiveness. It's best suited for environments with many |
| 232 | load balancers and many servers. |
| 233 | |
| 234 | Layer 7 load balancers also face another challenge known as stickiness or |
| 235 | persistence. The principle is that they generally have to direct multiple |
| 236 | subsequent requests or connections from the same origin (such as an end user) |
| 237 | to the same target. The best known example is the shopping cart on an online |
| 238 | store. If each click leads to a new connection, the user must always be sent |
| 239 | to the server which holds his shopping cart. Content-awareness makes it easier |
| 240 | to spot some elements in the request to identify the server to deliver it to, |
| 241 | but that's not always enough. For example if the source address is used as a |
| 242 | key to pick a server, it can be decided that a hash-based algorithm will be |
| 243 | used and that a given IP address will always be sent to the same server based |
| 244 | on the address modulo the number of available servers. But if one server |
| 245 | fails, the result changes and many users are suddenly sent to a different |
| 246 | server and lose their shopping cart. The solution against this issue consists |
| 247 | in memorizing the chosen target so that each time the same visitor is seen, |
| 248 | he's directed to the same server regardless of the number of available servers. |
| 249 | The information may be stored in the load balancer's memory, in which case it |
| 250 | may have to be replicated to other load balancers if it's not alone, or it may |
| 251 | be stored in the client's memory using various methods provided that the client |
| 252 | is able to present this information back with every request (cookie insertion, |
| 253 | redirection to a sub-domain, etc). This mechanism provides the extra benefit of |
| 254 | not having to rely on unstable or unevenly distributed information (such as the |
| 255 | source IP address). This is in fact the strongest reason to adopt a layer 7 |
| 256 | load balancer instead of a layer 4 one. |
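
The difference between a pure hash and a memorized choice can be illustrated
with the sketch below. All names, addresses and the four-server farm are
assumptions made for the example; the table is a plain in-memory dictionary
standing for whatever storage the load balancer uses.

```python
import ipaddress


def pick_by_modulo(client_ip, servers):
    """Hash-based choice: the address modulo the number of servers."""
    return servers[int(ipaddress.ip_address(client_ip)) % len(servers)]


class StickyBalancer:
    """Remember the server chosen for each client."""

    def __init__(self):
        self.table = {}     # client address -> chosen server

    def pick(self, client_ip, servers):
        server = self.table.get(client_ip)
        if server not in servers:       # first visit, or server is gone
            server = pick_by_modulo(client_ip, servers)
            self.table[client_ip] = server
        return server


servers = ["s1", "s2", "s3", "s4"]
clients = [f"192.0.2.{i}" for i in range(1, 101)]
lb = StickyBalancer()
before_hash = {c: pick_by_modulo(c, servers) for c in clients}
before_sticky = {c: lb.pick(c, servers) for c in clients}

remaining = ["s1", "s2", "s3"]          # s4 has failed

# Count the clients which were NOT on the failed server but moved anyway.
moved_hash = sum(1 for c in clients
                 if before_hash[c] != "s4"
                 and pick_by_modulo(c, remaining) != before_hash[c])
moved_sticky = sum(1 for c in clients
                   if before_sticky[c] != "s4"
                   and lb.pick(c, remaining) != before_sticky[c])
print(moved_hash, moved_sticky)
```

With the pure hash, clients whose server is still alive get moved as well,
while the memorized choice only moves the clients of the failed server.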
| 257 | |
| 258 | In order to extract information such as a cookie, a host header field, a URL |
| 259 | or whatever, a load balancer may need to decrypt SSL/TLS traffic and even |
| 260 | possibly to re-encrypt it when passing it to the server. This expensive task |
| 261 | explains why in some high-traffic infrastructures, sometimes there may be a |
| 262 | lot of load balancers. |
| 263 | |
| 264 | Since a layer 7 load balancer may perform a number of complex operations on the |
| 265 | traffic (decrypt, parse, modify, match cookies, decide what server to send to, |
| 266 | etc), it can definitely cause some trouble and will very commonly be accused of |
| 267 | being responsible for a lot of trouble that it only revealed. Often it will be |
| 268 | discovered that servers are unstable and periodically go up and down, or for |
| 269 | web servers, that they deliver pages with some hard-coded links forcing the |
| 270 | clients to connect directly to one specific server without passing via the load |
| 271 | balancer, or that they take ages to respond under high load causing timeouts. |
| 272 | That's why logging is an extremely important aspect of layer 7 load balancing. |
| 273 | Once a problem is reported, it is important to figure out whether the load |
| 274 | balancer took a wrong decision and if so, why, so that it doesn't happen again. |
| 275 | |
| 276 | |
| 277 | 3. Introduction to HAProxy |
| 278 | -------------------------- |
| 279 | |
| 280 | HAProxy is written as "HAProxy" to designate the product, and as "haproxy" to |
| 281 | designate the executable program, software package or a process. However, both |
| 282 | are commonly used for both purposes, and are pronounced H-A-Proxy. Very early, |
| 283 | "haproxy" used to stand for "high availability proxy" and the name was written |
| 284 | in two separate words, though by now it means nothing other than "HAProxy". |
| 285 | |
| 286 | |
| 287 | 3.1. What HAProxy is and is not |
| 288 | ------------------------------- |
| 289 | |
| 290 | HAProxy is : |
| 291 | |
| 292 | - a TCP proxy : it can accept a TCP connection from a listening socket, |
| 293 | connect to a server and attach these sockets together allowing traffic to |
| 294 | flow in both directions; IPv4, IPv6 and even UNIX sockets are supported on |
| 295 | either side, so this can provide an easy way to translate addresses between |
| 296 | different families. |
| 297 | |
| 298 | - an HTTP reverse-proxy (called a "gateway" in HTTP terminology) : it presents |
| 299 | itself as a server, receives HTTP requests over connections accepted on a |
| 300 | listening TCP socket, and passes the requests from these connections to |
| 301 | servers using different connections. It may use any combination of HTTP/1.x |
| 302 | or HTTP/2 on any side and will even automatically detect the protocol |
| 303 | spoken on each side when ALPN is used over TLS. |
| 304 | |
| 305 | - an SSL terminator / initiator / offloader : SSL/TLS may be used on the |
| 306 | connection coming from the client, on the connection going to the server, |
| 307 | or even on both connections. A lot of settings can be applied per name |
| 308 | (SNI), and may be updated at runtime without restarting. Such setups are |
| 309 | extremely scalable and deployments involving tens to hundreds of thousands |
| 310 | of certificates were reported. |
| 311 | |
| 312 | - a TCP normalizer : since connections are locally terminated by the operating |
| 313 | system, there is no relation between both sides, so abnormal traffic such as |
| 314 | invalid packets, flag combinations, window advertisements, sequence numbers, |
| 315 | incomplete connections (SYN floods), or so will not be passed to the other |
| 316 | side. This protects fragile TCP stacks from protocol attacks, and also makes |
| 317 | it possible to optimize the connection parameters with the client without |
| 318 | having to modify the servers' TCP stack settings. |
| 319 | |
| 320 | - an HTTP normalizer : when configured to process HTTP traffic, only valid |
| 321 | complete requests are passed. This protects against a lot of protocol-based |
| 322 | attacks. Additionally, protocol deviations for which there is a tolerance |
| 323 | in the specification are fixed so that they don't cause problems on the |
| 324 | servers (e.g. multiple-line headers). |
| 325 | |
| 326 | - an HTTP fixing tool : it can modify / fix / add / remove / rewrite the URL |
| 327 | or any request or response header. This helps fix interoperability issues |
| 328 | in complex environments. |
| 329 | |
| 330 | - a content-based switch : it can consider any element from the request to |
| 331 | decide what server to pass the request or connection to. Thus it is possible |
| 332 | to handle multiple protocols over the same port (e.g. HTTP, HTTPS, SSH). |
| 333 | |
| 334 | - a server load balancer : it can load balance TCP connections and HTTP |
| 335 | requests. In TCP mode, load balancing decisions are taken for the whole |
| 336 | connection. In HTTP mode, decisions are taken per request. |
| 337 | |
| 338 | - a traffic regulator : it can apply some rate limiting at various points, |
| 339 | protect the servers against overloading, adjust traffic priorities based on |
| 340 | the contents, and even pass such information to lower layers and outer |
| 341 | network components by marking packets. |
| 342 | |
| 343 | - a protection against DDoS and service abuse : it can maintain a wide number |
| 344 | of statistics per IP address, URL, cookie, etc and detect when an abuse is |
| 345 | happening, then take action (slow down the offenders, block them, send them |
| 346 | to outdated contents, etc). |
| 347 | |
| 348 | - an observation point for network troubleshooting : due to the precision of |
| 349 | the information reported in logs, it is often used to narrow down some |
| 350 | network-related issues. |
| 351 | |
| 352 | - an HTTP compression offloader : it can compress responses which were not |
| 353 | compressed by the server, thus reducing the page load time for clients with |
| 354 | poor connectivity or using high-latency, mobile networks. |
| 355 | |
| 356 | - a caching proxy : it may cache responses in RAM so that subsequent requests |
| 357 | for the same object avoid the cost of another network transfer from the |
| 358 | server as long as the object remains present and valid. It will however not |
| 359 | store objects to any persistent storage. Please note that this caching |
| 360 | feature is designed to be maintenance free and focuses solely on saving |
| 361 | haproxy's precious resources and not on saving the server's resources. Caches |
| 362 | designed to optimize servers require much more tuning and flexibility. If |
| 363 | you instead need such an advanced cache, please use Varnish Cache, which |
| 364 | integrates perfectly with haproxy, especially when SSL/TLS is needed on any |
| 365 | side. |
| 366 | |
| 367 | - a FastCGI gateway : FastCGI can be seen as a different representation of |
| 368 | HTTP, and as such, HAProxy can directly load-balance a farm comprising any |
| 369 | combination of FastCGI application servers without requiring the insertion of |
| 370 | another level of gateway between them. This results in resource savings and |
| 371 | a reduction of maintenance costs. |
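
The content-based switching of several protocols over the same port mentioned
in the list above can be sketched as a classification of the first bytes sent
by the client. The function name and the returned labels are assumptions made
for the example, and the sketch supposes that the client speaks first, which is
not guaranteed for every SSH client.

```python
HTTP_METHODS = (b"GET ", b"POST ", b"PUT ", b"HEAD ", b"DELETE ",
                b"OPTIONS ", b"PATCH ", b"CONNECT ", b"TRACE ")


def classify(first_bytes):
    """Guess the protocol from the first bytes sent by the client."""
    # An SSH peer starts with an identification string such as "SSH-2.0-...".
    if first_bytes.startswith(b"SSH-"):
        return "ssh"
    # A TLS handshake record starts with content type 0x16 followed by a
    # protocol version whose major byte is 0x03.
    if (len(first_bytes) >= 2
            and first_bytes[0] == 0x16 and first_bytes[1] == 0x03):
        return "tls"
    # An HTTP/1.x request starts with a method followed by a space.
    if first_bytes.startswith(HTTP_METHODS):
        return "http"
    return "unknown"
```

The returned label would then be used to select the server farm which receives
the connection, for example a farm of SSH servers for "ssh" and a farm of web
servers for "http".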
| 372 | |
| 373 | HAProxy is not : |
| 374 | |
| 375 | - an explicit HTTP proxy, i.e. the proxy that browsers use to reach the |
| 376 | internet. There is excellent open-source software dedicated to this task, |
| 377 | such as Squid. However HAProxy can be installed in front of such a proxy to |
| 378 | provide load balancing and high availability. |
| 379 | |
| 380 | - a data scrubber : it will not modify the body of requests nor responses. |
| 381 | |
| 382 | - a static web server : during startup, it isolates itself inside a chroot |
| 383 | jail and drops its privileges, so that it will not perform any single file- |
| 384 | system access once started. As such it cannot be turned into a static web |
| 385 | server (dynamic servers are supported through FastCGI however). There is |
| 386 | excellent open-source software for this such as Apache or Nginx, and |
| 387 | HAProxy can be easily installed in front of them to provide load balancing, |
| 388 | high availability and acceleration. |
| 389 | |
| 390 | - a packet-based load balancer : it will not see IP packets nor UDP datagrams, |
| 391 | will not perform NAT or even less DSR. These are tasks for lower layers. |
| 392 | Some kernel-based components such as IPVS (Linux Virtual Server) already do |
| 393 | this pretty well and complement HAProxy perfectly. |
| 394 | |
| 395 | |
| 396 | 3.2. How HAProxy works |
| 397 | ---------------------- |
| 398 | |
| 399 | HAProxy is an event-driven, non-blocking engine combining a very fast I/O layer |
| 400 | with a priority-based, multi-threaded scheduler. As it is designed with a data |
| 401 | forwarding goal in mind, its architecture is optimized to move data as fast as |
| 402 | possible with the fewest possible operations. It focuses on optimizing the CPU |
| 403 | cache's efficiency by sticking connections to the same CPU as long as possible. |
| 404 | As such it implements a layered model offering bypass mechanisms at each level |
| 405 | ensuring data doesn't reach higher levels unless needed. Most of the processing |
| 406 | is performed in the kernel, and HAProxy does its best to help the kernel do the |
| 407 | work as fast as possible by giving some hints or by avoiding certain operations |
| 408 | when it guesses they could be grouped later. As a result, typical figures show |
| 409 | 15% of the processing time spent in HAProxy versus 85% in the kernel in TCP or |
| 410 | HTTP close mode, and about 30% for HAProxy versus 70% for the kernel in HTTP |
| 411 | keep-alive mode. |
| 412 | |
| 413 | A single process can run many proxy instances; configurations as large as |
| 414 | 300000 distinct proxies in a single process were reported to run fine. A single |
| 415 | core, single CPU setup is far more than enough for more than 99% of users, and |
| 416 | as such, users of containers and virtual machines are encouraged to use the |
| 417 | absolute smallest images they can get to save on operational costs and simplify |
| 418 | troubleshooting. However the machine HAProxy runs on must never ever swap, and |
| 419 | its CPU must not be artificially throttled (sub-CPU allocation in hypervisors) |
| 420 | nor be shared with compute-intensive processes which would induce a very high |
| 421 | context-switch latency. |
| 422 | |
| 423 | Threading makes it possible to exploit all available processing capacity by |
| 424 | using one thread per CPU core. This is mostly useful for SSL or when data |
| 425 | forwarding rates above 40 Gbps are needed. In such cases it is critically |
| 426 | important to avoid communications between multiple physical CPUs, which can |
| 427 | cause strong bottlenecks in the network stack and in HAProxy itself. While |
| 428 | counter-intuitive to some, the first thing to do when facing some performance |
| 429 | issues is often to reduce the number of CPUs HAProxy runs on. |
| 430 | |
| 431 | HAProxy only requires the haproxy executable and a configuration file to run. |
| 432 | For logging it is highly recommended to have a properly configured syslog daemon |
Willy Tarreau | ec8962c | 2020-05-05 17:39:16 +0200 | [diff] [blame] | 433 | and log rotations in place. Logs may also be sent to stdout/stderr, which can be |
| 434 | useful inside containers. The configuration files are parsed before starting, |
Willy Tarreau | d8e42b6 | 2015-08-18 21:51:36 +0200 | [diff] [blame] | 435 | then HAProxy tries to bind all listening sockets, and refuses to start if |
| 436 | anything fails. Past this point it cannot fail anymore. This means that there |
| 437 | are no runtime failures and that if it accepts to start, it will work until it |
| 438 | is stopped. |
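
A minimal sketch of the two logging options mentioned above could look like
this. The syslog socket path and the facility are assumptions which depend on
the local syslog setup :

```haproxy
global
    # send logs to the local syslog daemon through its UNIX socket
    log /dev/log local0
    # inside a container, raw log lines may be written to stdout instead :
    # log stdout format raw local0

defaults
    # make all proxies inherit the log targets declared in "global"
    log global
```

Since parsing errors prevent HAProxy from starting, a configuration file can be
verified beforehand with "haproxy -c -f <file>", which only checks the file and
exits.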

Once HAProxy is started, it does exactly 3 things :

  - process incoming connections;

  - periodically check the servers' status (known as health checks);

  - exchange information with other HAProxy nodes.

Processing incoming connections is by far the most complex task as it depends
on a lot of configuration possibilities, but it can be summarized as the 9 steps
below :

  - accept incoming connections from listening sockets that belong to a
    configuration entity known as a "frontend", which references one or multiple
    listening addresses;

  - apply the frontend-specific processing rules to these connections that may
    result in blocking them, modifying some headers, or intercepting them to
    execute some internal applets such as the statistics page or the CLI;

  - pass these incoming connections to another configuration entity representing
    a server farm known as a "backend", which contains the list of servers and
    the load balancing strategy for this server farm;

  - apply the backend-specific processing rules to these connections;

  - decide which server to forward the connection to according to the load
    balancing strategy;

  - apply the backend-specific processing rules to the response data;

  - apply the frontend-specific processing rules to the response data;

  - emit a log to report what happened in fine detail;

  - in HTTP, loop back to the second step to wait for a new request, otherwise
    close the connection.
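
The steps above can be mapped onto a small configuration. The fragment below is
only a sketch : proxy names, addresses, paths and header names are invented for
the example, and the timeouts any real configuration needs are left out. The
last step (waiting for a new request) needs no directive in HTTP mode :

```haproxy
frontend fe_web
    mode http
    # step 8 : emit one detailed log line per request
    log global
    option httplog
    # step 1 : accept connections on this listening address
    bind :80
    # step 2 : frontend-specific rules (here, block one path prefix)
    http-request deny if { path_beg /private }
    # step 7 : frontend-specific rules applied to the response
    http-response set-header X-Frame-Options DENY
    # step 3 : pass the connection to the server farm
    default_backend be_web

backend be_web
    mode http
    # step 4 : backend-specific rules applied to the request
    http-request set-header X-Forwarded-Proto http
    # step 5 : pick a server according to the load balancing strategy
    balance roundrobin
    server web1 192.0.2.11:8080
    server web2 192.0.2.12:8080
    # step 6 : backend-specific rules applied to the response
    http-response del-header Server
```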

Frontends and backends are sometimes considered as half-proxies, since they only
look at one side of an end-to-end connection; the frontend only cares about the
clients while the backend only cares about the servers. HAProxy also supports
full proxies which are exactly the union of a frontend and a backend. When HTTP
processing is desired, the configuration will generally be split into frontends
and backends as they open a lot of possibilities since any frontend may pass a
connection to any backend. With TCP-only proxies, using frontends and backends
rarely provides a benefit and the configuration can be more readable with full
proxies.
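
In the configuration language a full proxy is declared with a "listen" section.
A TCP-only example might look like the sketch below, where the proxy name, the
port and the server addresses are placeholders :

```haproxy
listen db
    mode tcp
    # frontend side : where clients connect
    bind :3306
    # backend side : load balancing strategy and list of servers
    balance leastconn
    server db1 192.0.2.21:3306 check
    server db2 192.0.2.22:3306 check
```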


3.3. Basic features
-------------------

This section will enumerate a number of features that HAProxy implements, some
of which are generally expected from any modern load balancer, and some of
which are a direct benefit of HAProxy's architecture. More advanced features
will be detailed in the next section.


3.3.1. Basic features : Proxying
--------------------------------

Proxying is the action of transferring data between a client and a server over
two independent connections. The following basic features are supported by
HAProxy regarding proxying and connection management :

  - Provide the servers with a clean connection to protect them against any
    client-side defect or attack;

  - Listen to multiple IP addresses and/or ports, even port ranges;

  - Transparent accept : intercept traffic targeting any arbitrary IP address
    that doesn't even belong to the local system;

  - Server port doesn't need to be related to listening port, and may even be
    translated by a fixed offset (useful with ranges);

  - Transparent connect : spoof the client's (or any) IP address if needed
    when connecting to the server;

  - Provide a reliable return IP address to the servers in multi-site LBs;

  - Offload the servers thanks to buffers and possibly short-lived connections
    to reduce their concurrent connection count and their memory footprint;

  - Optimize TCP stacks (e.g. SACK), congestion control, and reduce RTT impacts;

  - Support different protocol families on both sides (e.g. IPv4/IPv6/Unix);

  - Timeout enforcement : HAProxy supports multiple levels of timeouts depending
    on the stage the connection is in, so that a dead client or server, or an
    attacker cannot be granted resources for too long;

  - Protocol validation : HTTP, SSL, or payload are inspected and invalid
    protocol elements are rejected, unless instructed to accept them anyway;

  - Policy enforcement : ensure that only what is allowed may be forwarded;

  - Both incoming and outgoing connections may be limited to certain network
    namespaces (Linux only), making it easy to build a cross-container,
    multi-tenant load balancer;

  - PROXY protocol presents the client's IP address to the server even for
    non-HTTP traffic. This is an HAProxy extension that was adopted by a number
    of third-party products by now, at least these ones at the time of writing :
      - client : haproxy, stud, stunnel, exaproxy, ELB, squid
      - server : haproxy, stud, postfix, exim, nginx, squid, node.js, varnish
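
A few of the features listed above can be combined in a single proxy. The
sketch below uses invented names, addresses and timeout values, and assumes
that the server understands the PROXY protocol :

```haproxy
listen app_range
    mode tcp
    # listen on one single port and on a range of 11 ports
    bind 192.0.2.10:80
    bind 192.0.2.10:8000-8010
    # per-stage timeouts so that dead peers do not hold resources forever
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    # the server port is the client-side port plus a fixed offset of 1000,
    # and the client's address is passed using the PROXY protocol
    server app1 198.51.100.11:+1000 send-proxy
```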


3.3.2. Basic features : SSL
---------------------------

HAProxy's SSL stack is recognized as one of the most featureful according to
Google's engineers (http://istlsfastyet.com/). The most commonly used features
making it quite complete are :

  - SNI-based multi-hosting with no limit on sites count and focus on
    performance. At least one deployment is known for running 50000 domains
    with their respective certificates;

  - support for wildcard certificates reduces the need for many certificates;

  - certificate-based client authentication with configurable policies on
    failure to present a valid certificate. This makes it possible, for
    example, to present a different server farm to regenerate the client
    certificate;

  - authentication of the backend server ensures the backend server is the real
    one and not a man in the middle;

  - authentication with the backend server lets the backend server know it's
    really the expected HAProxy node that is connecting to it;

  - TLS NPN and ALPN extensions make it possible to reliably offload SPDY/HTTP2
    connections and pass them in clear text to backend servers;

  - OCSP stapling further reduces first page load time by delivering inline an
    OCSP response when the client sends a Certificate Status Request;

  - Dynamic record sizing provides both high performance and low latency, and
    significantly reduces page load time by letting the browser start to fetch
    new objects while packets are still in flight;

  - permanent access to all relevant SSL/TLS layer information for logging,
    access control, reporting etc. These elements can be embedded into HTTP
    headers or even passed as a PROXY protocol extension so that the offloaded
    server gets all the information it would have had if it performed the SSL
    termination itself;

  - Detect, log and block certain known attacks even on vulnerable SSL libs,
    such as the Heartbleed attack affecting certain versions of OpenSSL;

  - support for stateless session resumption (RFC 5077 TLS Ticket extension).
    TLS tickets can be updated from the CLI, which provides the means to
    implement Perfect Forward Secrecy by frequently rotating the tickets.
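
To give an idea of how several of these features are expressed, here is a
hedged sketch. All directory, file, proxy and header names are invented, and
the certificates are assumed to exist under the two base directories :

```haproxy
global
    # base directories for the relative certificate paths used below
    crt-base /etc/haproxy/certs
    ca-base  /etc/haproxy/ca

frontend fe_tls
    mode http
    # every certificate found in sites/ is loaded and selected using SNI;
    # ALPN offers HTTP/2 then HTTP/1.1; client certificates are optional
    bind :443 ssl crt sites/ alpn h2,http/1.1 ca-file users.pem verify optional
    # pass TLS layer information to the application in a request header
    http-request set-header X-SSL-Cipher %[ssl_fc_cipher]
    # visitors without a client certificate go to an enrollment farm
    use_backend be_enroll unless { ssl_c_used }
    default_backend be_app

backend be_app
    mode http
    # authenticate the server, and authenticate to it with our certificate
    server app1 192.0.2.11:8443 ssl verify required ca-file srv.pem crt lb.pem

backend be_enroll
    mode http
    server enroll1 192.0.2.21:8080
```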


3.3.3. Basic features : Monitoring
----------------------------------

HAProxy focuses a lot on availability. As such it cares about servers' state,
and about reporting its own state to other network components :

  - Servers' state is continuously monitored using per-server parameters. This
    ensures the path to the server is operational for regular traffic;

  - Health checks support hysteresis on both up and down transitions in order
    to protect against state flapping;

  - Checks can be sent to a different address/port/protocol : this makes it
    easy to check a single service that is considered representative of multiple
    ones, for example the HTTPS port for an HTTP+HTTPS server;

  - Servers can track other servers and go down simultaneously : this ensures
    that servers hosting multiple services can fail atomically and that no one
    will be sent to a partially failed server;

  - Agents may be deployed on the server to monitor load and health : a server
    may be interested in reporting its load, operational status, administrative
    status independently from what health checks can see. By running a simple
    agent on the server, it's possible to consider the server's view of its own
    health in addition to the health checks validating the whole path;

  - Various check methods are available : TCP connect, HTTP request, SMTP hello,
    SSL hello, LDAP, SQL, Redis, send/expect scripts, all with/without SSL;

  - State change is notified in the logs and stats page with the failure reason
    (e.g. the HTTP response received at the moment the failure was detected). An
    e-mail can also be sent to a configurable address upon such a change;

  - Server state is also reported on the stats interface and can be used to take
    routing decisions so that traffic may be sent to different farms depending
    on their sizes and/or health (e.g. loss of an inter-DC link);

  - HAProxy can use health check requests to pass information to the servers,
    such as their names, weight, the number of other servers in the farm etc.
    so that servers can adjust their response and decisions based on this
    knowledge (e.g. postpone backups to keep more CPU available);

  - Servers can use health checks to report more detailed state than just on/off
    (e.g. I would like to stop, please stop sending new visitors);

  - HAProxy itself can report its state to external components such as routers
    or other load balancers, making it possible to build very complete
    multi-path and multi-layer infrastructures.
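
The sketch below shows how some of these checks may be declared. Names,
addresses, ports, the "/health" URI and all intervals are assumptions made for
the example, including the presence of an agent listening on port 9999 :

```haproxy
backend be_web
    mode http
    # HTTP health check : a server is up only if /health returns a 200
    option httpchk GET /health
    http-check expect status 200
    # hysteresis : 3 successful checks to go up, 2 failed checks to go down
    default-server inter 2s rise 3 fall 2
    # web1 also runs an agent reporting its own view of its load and state
    server web1 192.0.2.11:80 check agent-check agent-port 9999 agent-inter 5s
    # web2 is checked on its HTTPS port, taken as representative of the host
    server web2 192.0.2.12:80 check port 443 check-ssl verify none

backend be_admin
    mode http
    # no check of its own : this server follows the state of web1 in be_web
    server adm1 192.0.2.11:8080 track be_web/web1
```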


3.3.4. Basic features : High availability
-----------------------------------------

Just like any serious load balancer, HAProxy cares a lot about availability to
ensure the best global service continuity :

  - Only valid servers are used; the other ones are automatically evicted from
    load balancing farms; under certain conditions it is still possible to
    force their use though;

  - Support for a graceful shutdown so that it is possible to take servers out
    of a farm without affecting any connection;

  - Backup servers are automatically used when active servers are down and
    replace them so that sessions are not lost when possible. This also makes
    it possible to build multiple paths to reach the same server (e.g. multiple
    interfaces);

  - Ability to return a global failed status for a farm when too many servers
    are down. This, combined with the monitoring capabilities, makes it possible
    for an upstream component to choose a different LB node for a given service;

  - Stateless design makes it easy to build clusters : by design, HAProxy does
    its best to ensure the highest service continuity without having to store
    information that could be lost in the event of a failure. This ensures that
    a takeover is as seamless as possible;

  - Integrates well with standard VRRP daemon keepalived : HAProxy easily tells
    keepalived about its state and copes very well with floating virtual IP
    addresses. Note: always prefer IP redundancy protocols (VRRP/CARP) over
    cluster-based solutions (Heartbeat, ...) as they're the ones offering the
    fastest, most seamless, and most reliable switchover.
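
As a sketch of the backup server and global failed status features, the
fragment below exposes a monitoring URI which starts to fail once fewer than 2
usable servers remain. The URI, the threshold, the names and the addresses are
all invented for the example :

```haproxy
frontend fe_web
    mode http
    bind :80
    # /site_alive reports a failure when the farm is considered dead, so
    # that an upstream component can choose another load balancer node
    acl site_dead nbsrv(be_web) lt 2
    monitor-uri /site_alive
    monitor fail if site_dead
    default_backend be_web

backend be_web
    mode http
    server web1 192.0.2.11:80 check
    server web2 192.0.2.12:80 check
    server web3 192.0.2.13:80 check
    # only used once all the servers above are down
    server spare 192.0.2.99:80 check backup
```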


3.3.5. Basic features : Load balancing
--------------------------------------

HAProxy offers a fairly complete set of load balancing features, most of which
are unfortunately not available in a number of other load balancing products :

  - no less than 10 load balancing algorithms are supported, some of which apply
    to input data to offer an infinite list of possibilities. The most common
    ones are round-robin (for short connections, pick each server in turn),
    leastconn (for long connections, pick the least recently used of the servers
    with the lowest connection count), source (for SSL farms or terminal server
    farms, the server directly depends on the client's source address), URI (for
    HTTP caches, the server directly depends on the HTTP URI), hdr (the server
    directly depends on the contents of a specific HTTP header field), first
    (for short-lived virtual machines, all connections are packed on the
    smallest possible subset of servers so that unused ones can be powered
    down);

  - all algorithms above support per-server weights so that it is possible to
    accommodate different server generations in a farm, or direct a small
    fraction of the traffic to specific servers (debug mode, running the next
    version of the software, etc);

  - dynamic weights are supported for round-robin, leastconn and consistent
    hashing; this allows server weights to be modified on the fly from the CLI
    or even by an agent running on the server;

  - slow-start is supported whenever a dynamic weight is supported; this allows
    a server to progressively take the traffic. This is an important feature
    for fragile application servers which need to compile classes at runtime
    as well as cold caches which need to fill up before being run at full
    throttle;

  - hashing can apply to various elements such as client's source address, URL
    components, query string element, header field values, POST parameter, RDP
    cookie;

  - consistent hashing protects server farms against massive redistribution when
    adding or removing servers in a farm. That's very important in large cache
    farms and it allows slow-start to be used to refill cold caches;

  - a number of internal metrics such as the number of connections per server,
    per backend, the number of available connection slots in a backend etc make
    it possible to build very advanced load balancing strategies.
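
Two of the strategies above are sketched here, one farm using weighted
round-robin with slow-start and one cache farm using a consistent hash of the
URI. Names, addresses, weights and the slow-start duration are assumptions :

```haproxy
backend be_web
    mode http
    # short connections : each server in turn, in proportion to its weight
    balance roundrobin
    # a server coming back up reaches its full weight after 30 seconds
    default-server slowstart 30s
    server web1 192.0.2.11:80 weight 100 check
    # newer software version receiving a small fraction of the traffic
    server web2 192.0.2.12:80 weight 10 check

backend be_cache
    mode http
    # the server depends on the URI; with consistent hashing, adding or
    # removing a cache only moves a fraction of the objects
    balance uri
    hash-type consistent
    server cache1 192.0.2.21:3128 check
    server cache2 192.0.2.22:3128 check
```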


3.3.6. Basic features : Stickiness
----------------------------------

Application load balancing would be useless without stickiness. HAProxy provides
a fairly comprehensive set of possibilities to maintain a visitor on the same
server even across various events such as server addition/removal, down/up
cycles, and some methods are designed to be resistant to the distance between
multiple load balancing nodes in that they don't require any replication :

  - stickiness information can be individually matched and learned from
    different places if desired. For example a JSESSIONID cookie may be matched
    both in a cookie and in the URL. Up to 8 parallel sources can be learned at
    the same time and each of them may point to a different stick-table;

  - stickiness information can come from anything that can be seen within a
    request or response, including source address, TCP payload offset and
    length, HTTP query string elements, header field values, cookies, and so
    on;

  - stick-tables are replicated between all nodes in a multi-master fashion;

  - commonly used elements such as SSL-ID or RDP cookies (for TSE farms) are
    directly accessible to ease manipulation;

  - all sticking rules may be dynamically conditioned by ACLs;

  - it is possible to decide not to stick to certain servers, such as backup
    servers, so that when the nominal server comes back, it automatically takes
    the load back. This is often used in multi-path environments;

  - in HTTP it is often preferred not to learn anything and instead manipulate
    a cookie dedicated to stickiness. For this, it's possible to detect,
    rewrite, insert or prefix such a cookie to let the client remember which
    server was assigned;

  - the server may decide to change or clean the stickiness cookie on logout,
    so that leaving visitors are automatically unbound from the server;

  - using ACL-based rules it is also possible to selectively ignore or enforce
    stickiness regardless of the server's state; combined with advanced health
    checks, that helps admins verify that the server they're installing is up
    and running before presenting it to the whole world;

  - an innovative mechanism to set a maximum idle time and duration on cookies
    ensures that stickiness can be smoothly stopped on devices which are never
    closed (smartphones, TVs, home appliances) without having to store them on
    persistent storage;

  - multiple server entries may share the same stickiness keys so that
    stickiness is not lost in multi-path environments when one path goes down;

  - soft-stop ensures that only users with stickiness information will continue
    to reach the server they've been assigned to but no new users will go there.
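
The two main approaches, a dedicated cookie and a learned stick-table entry,
could be sketched as follows. The cookie name, the durations, the table size
and all names and addresses are assumptions; in particular the peer names
"lb1" and "lb2" stand for the local peer names of two HAProxy nodes :

```haproxy
backend be_app
    mode http
    # nothing is learned : HAProxy inserts its own "SRV" cookie, ignored
    # after 30 minutes of inactivity or 8 hours of total lifetime
    cookie SRV insert indirect nocache maxidle 30m maxlife 8h
    server app1 192.0.2.11:80 cookie a1 check
    server app2 192.0.2.12:80 cookie a2 check

peers lb_cluster
    peer lb1 192.0.2.1:10000
    peer lb2 192.0.2.2:10000

backend be_rdp
    mode tcp
    # learned stickiness : remember the server used by each source address
    # and replicate the table to the other load balancing node
    stick-table type ip size 200k expire 30m peers lb_cluster
    stick on src
    server ts1 192.0.2.31:3389 check
    server ts2 192.0.2.32:3389 check
    # visitors sent to the backup server do not stick to it
    server ts9 192.0.2.39:3389 check backup non-stick
```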


3.3.7. Basic features : Sampling and converting information
-----------------------------------------------------------

HAProxy supports information sampling using a wide set of "sample fetch
functions". The principle is to extract pieces of information known as samples,
for immediate use. This is used for stickiness, to build conditions, to produce
information in logs or to enrich HTTP headers.

Samples can be fetched from various sources :

  - constants : integers, strings, IP addresses, binary blocks;

  - the process : date, environment variables, server/frontend/backend/process
    state, byte/connection counts/rates, queue length, random generator, ...

  - variables : per-session, per-request, per-response variables;

  - the client connection : source and destination addresses and ports, and all
    related statistics counters;

  - the SSL client session : protocol, version, algorithm, cipher, key size,
    session ID, all client and server certificate fields, certificate serial,
    SNI, ALPN, NPN, client support for certain extensions;

  - request and response buffers contents : arbitrary payload at offset/length,
    data length, RDP cookie, decoding of SSL hello type, decoding of TLS SNI;

  - HTTP (request and response) : method, URI, path, query string arguments,
    status code, header values, positional header value, cookies, captures,
    authentication, body elements.

A sample may then pass through a number of operators known as "converters" to
undergo some transformation. A converter consumes a sample and produces a
new one, possibly of a completely different type. For example, a converter may
be used to return only the integer length of the input string, or could turn a
string to upper case. Any arbitrary number of converters may be applied in
series to a sample before final use. Among all available sample converters, the
following ones are the most commonly used :

  - arithmetic and logic operators : they make it possible to perform advanced
    computation on input data, such as computing ratios, percentages or simply
    converting from one unit to another one;

  - IP address masks are useful when some addresses need to be grouped by larger
    networks;

  - data representation : URL-decode, base64, hex, JSON strings, hashing;

  - string conversion : extract substrings at fixed positions, fixed length,
    extract specific fields around certain delimiters, extract certain words,
    change case, apply regex-based substitution;

  - date conversion : convert to HTTP date format, convert local to UTC and
    conversely, add or remove offset;

  - look up an entry in a stick table to find statistics or assigned server;

  - map-based key-to-value conversion from a file (mostly used for geolocation).
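
In the configuration, a sample fetch and its converters are written as a
comma-separated chain. The rules below are a small sketch which enriches HTTP
headers this way; the header names are invented for the example :

```haproxy
frontend fe_web
    mode http
    bind :80
    # fetch the Host header, then convert it to lower case
    http-request set-header X-Host-Lower %[req.hdr(host),lower]
    # fetch the client address, then mask it to its /24 network
    http-request set-header X-Client-Net %[src,ipmask(24)]
    # two converters in series : keep what precedes ":", then its length
    http-request set-header X-Host-Len %[req.hdr(host),field(1,:),length]
```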


3.3.8. Basic features : Maps
----------------------------

Maps are a powerful type of converter which consists of loading a two-column
file into memory at boot time, then looking up each input sample in the first
column and either returning the corresponding value from the second column if
the entry was found, or returning a default value. The output information also
being a sample, it can in turn undergo other transformations including other
map lookups. Maps are most commonly used to translate the client's IP address
to an AS number or country code since they support a longest match for network
addresses but they can be used for various other purposes.

Part of their strength comes from being updatable on the fly either from the CLI
or from certain actions using other samples, making them capable of storing and
retrieving information between subsequent accesses. Another strength comes from
the binary tree based indexing which makes them extremely fast even when they
contain hundreds of thousands of entries, making geolocation very cheap and easy
to set up.
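
A minimal geolocation sketch is shown here. The map file name, its contents,
the header name and the "ZZ" default value are all assumptions :

```haproxy
# contents of the hypothetical file /etc/haproxy/geo.map, one entry per
# line, with the key in the first column and the value in the second :
#
#     192.0.2.0/24      FR
#     198.51.100.0/24   US

frontend fe_web
    mode http
    bind :80
    # longest-match lookup of the client address, "ZZ" when nothing matches
    http-request set-header X-Country %[src,map_ip(/etc/haproxy/geo.map,ZZ)]
```

An entry of such a map may later be changed at run time from the CLI with the
"set map" command, without reloading the configuration.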


3.3.9. Basic features : ACLs and conditions
-------------------------------------------

Most operations in HAProxy can be made conditional. Conditions are built by
combining multiple ACLs using logic operators (AND, OR, NOT). Each ACL is a
series of tests based on the following elements :

  - a sample fetch method to retrieve the element to test;

  - an optional series of converters to transform the element;

  - a list of patterns to match against;

  - a matching method to indicate how to compare the patterns with the sample.

For example, the sample may be taken from the HTTP "Host" header, it could then
be converted to lower case, then matched against a number of regex patterns
using the regex matching method.

Technically, ACLs are built on the same core as the maps : they share the exact
same internal structure, pattern matching methods and performance. The only real
difference is that instead of returning a sample, they only return "found" or
"not found". In terms of usage, ACL patterns may be declared inline in the
configuration file and do not require their own file. ACLs may be named for ease
of use or to make configurations understandable. A named ACL may be declared
multiple times and it will evaluate all definitions in turn until one matches.
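
The Host header example and the named ACLs described above might be written as
in this sketch, where the ACL names, the domain, the networks and the path are
invented for the example :

```haproxy
frontend fe_web
    mode http
    bind :80
    # fetch method, converter, matching method and pattern : take the Host
    # header, convert it to lower case, then match it against a regex
    acl host_ok req.hdr(host),lower -m reg ^(www|static)\.example\.com$
    # a named ACL declared twice matches as soon as one definition matches
    acl internal src 10.0.0.0/8
    acl internal src 192.168.0.0/16
    # AND is implicit between ACLs, "!" means NOT and "||" means OR
    http-request deny if !host_ok
    # an anonymous ACL is declared inline between braces
    http-request deny if { path_beg /admin } !internal
```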

About 13 different pattern matching methods are provided, among which IP address
masks, integer ranges, substrings and regexes. They work like functions, and
just like with any programming language, only what is needed is evaluated, so
when a condition involving an OR is already true, the next terms are not
evaluated, and similarly when a condition involving an AND is already false,
the rest of the condition is not evaluated.

There is no practical limit to the number of declared ACLs, and a handful of
commonly used ones are provided. However experience has shown that setups using
a lot of named ACLs are quite hard to troubleshoot and that sometimes using
anonymous ACLs inline is easier as it requires fewer references out of the scope
being analyzed.


3.3.10. Basic features : Content switching
------------------------------------------

HAProxy implements a mechanism known as content-based switching. The principle
is that a connection or request arrives on a frontend, then the information
carried with this request or connection is processed, and at this point it is
possible to write ACL-based conditions making use of this information to decide
which backend will process the request. Thus the traffic is directed to one
backend or another based on the request's contents. The most common example
consists in using the Host header and/or elements from the path (sub-directories
or file-name extensions) to decide whether an HTTP request targets a static
object or the application, and to route static object traffic to a backend made
of fast and light servers, and all the remaining traffic to a more complex
application server, thus constituting a fine-grained virtual hosting solution.
This is quite convenient to make multiple technologies coexist as a more global
solution.
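
A minimal Python sketch of such a decision is shown here. It models the routing
logic only and is not HAProxy configuration syntax; the host name, the path
prefix, the extensions and the backend names "static" and "app" are all
assumptions made for the example.

```python
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg")


def choose_backend(host, path):
    """Return a backend name from the Host header and the request path."""
    if host.lower() == "static.example.com":
        return "static"
    if path.startswith("/static/") or path.endswith(STATIC_EXTENSIONS):
        return "static"   # fast and light servers for static objects
    return "app"          # everything else goes to the application servers
```

With these rules, a request for "/static/logo.png" on "www.example.com" goes to
"static" while "/cart" on the same host goes to "app".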

Another use case of content-switching consists in using different load balancing
algorithms depending on various criteria. A cache may use a URI hash while an
application would use round-robin.

Last but not least, it allows multiple customers to use a small share of a
common resource by enforcing per-backend (thus per-customer) connection limits.

Content switching rules scale very well, though their performance may depend on
the number and complexity of the ACLs in use. But it is also possible to write
dynamic content switching rules where a sample value directly turns into a
backend name without making use of ACLs at all. Such configurations have been
reported to work fine with at least 300000 backends in production.
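
The idea of turning a sample value directly into a backend name can be
sketched in Python as a single lookup instead of a chain of conditions. The
"bk_" prefix, the backend names and the fallback to "bk_default" are
assumptions made for this example, not HAProxy behavior.

```python
BACKENDS = {"bk_shop.example.com", "bk_blog.example.com", "bk_default"}


def backend_from_host(host):
    """Build the backend name directly from the Host header value."""
    name = "bk_" + host.lower().split(":")[0]   # drop an optional port
    return name if name in BACKENDS else "bk_default"
```

The cost of this lookup does not grow with the number of rules, since there
are no rules to walk through.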


3.3.11. Basic features : Stick-tables
-------------------------------------

Stick-tables are commonly used to store stickiness information, that is, to keep
a reference to the server a certain visitor was directed to. The key is then the
identifier associated with the visitor (its source address, the SSL ID of the
connection, an HTTP or RDP cookie, the customer number extracted from the URL or
from the payload, ...) and the stored value is then the server's identifier.

Stick-tables may use 3 different types of samples for their keys : integers,
strings and addresses. Only one stick-table may be referenced in a proxy, and it
is designated everywhere with the proxy name. Up to 8 keys may be tracked in
parallel. The server identifier is committed during request or response
processing once both the key and the server are known.
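
The key to server association can be pictured with the following toy Python
model. It is only a sketch : the class and function names are hypothetical,
entry expiration is an assumption added for the example, and round-robin stands
in for whatever algorithm picks the server the first time.

```python
import time


class StickTable:
    """Toy key -> server identifier table with entry expiration."""

    def __init__(self, expire, clock=time.monotonic):
        self.expire = expire
        self.clock = clock
        self.entries = {}   # key -> (server_id, expiration date)

    def lookup(self, key):
        entry = self.entries.get(key)
        if entry is None or entry[1] <= self.clock():
            self.entries.pop(key, None)   # unknown or expired
            return None
        return entry[0]

    def store(self, key, server_id):
        self.entries[key] = (server_id, self.clock() + self.expire)


def pick_server(table, key, servers, counter):
    """Reuse the stored server if any, else round-robin and remember it."""
    server = table.lookup(key)
    if server is None:
        server = servers[counter[0] % len(servers)]
        counter[0] += 1
        table.store(key, server)   # committed once key and server are known
    return server
```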

Stick-table contents may be replicated in active-active mode with other HAProxy
nodes known as "peers" as well as with the new process during a reload operation
so that all load balancing nodes share the same information and take the same
routing decision if a client's requests are spread over multiple nodes.

Since stick-tables are indexed on what allows a client to be recognized, they
are often also used to store extra information such as per-client statistics.
The extra statistics take some extra space and need to be explicitly declared.
The type of statistics that may be stored includes the input and output
bandwidth, the number of concurrent connections, the connection rate and count
over a period, the amount and frequency of errors, some specific tags and
counters, etc. In order to support keeping such information without being forced
to stick to a given server, a special "tracking" feature is implemented and
makes it possible to track up to 3 simultaneous keys from different tables at
the same time regardless of stickiness rules. Each stored statistic may be
searched, dumped and cleared from the CLI and adds to the live troubleshooting
capabilities.
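
As a rough illustration, per-client statistics of this kind can be modelled in
Python as counters indexed by the tracked key. The class, its fields and the
thresholds used by "is_abuser" are hypothetical; the real counters and their
semantics are described in the configuration manual.

```python
from collections import defaultdict, deque


class ClientCounters:
    """Toy per-key counters: request rate over a period and error count."""

    def __init__(self, period):
        self.period = period
        self.requests = defaultdict(deque)   # key -> request timestamps
        self.errors = defaultdict(int)       # key -> cumulative error count

    def track(self, key, now, is_error=False):
        stamps = self.requests[key]
        stamps.append(now)
        while stamps[0] <= now - self.period:
            stamps.popleft()                 # forget events out of the period
        if is_error:
            self.errors[key] += 1

    def request_rate(self, key):
        """Requests seen over the period, as of the last tracked request."""
        return len(self.requests[key])


def is_abuser(counters, key, max_rate, max_errors):
    return (counters.request_rate(key) > max_rate
            or counters.errors[key] > max_errors)
```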

While this mechanism can be used to upgrade a returning visitor or to adjust
the delivered quality of service depending on good or bad behavior, it is
mostly used to fight against service abuse and more generally DDoS as it allows
complex models to be built to detect certain bad behaviors at a high processing
speed.


3.3.12. Basic features : Formatted strings
------------------------------------------

There are many places where HAProxy needs to manipulate character strings, such
as logs, redirects, header additions, and so on. In order to provide the
greatest flexibility, the notion of Formatted strings was introduced, initially
for logging purposes, which explains why it's still called "log-format". These
strings contain escape characters allowing various dynamic data including
variables and sample fetch expressions to be inserted into strings, and even the
encoding to be adjusted while the result is being turned into a string (for
example, adding quotes). This provides a powerful way to build header contents,
to build response data or even response templates, or to customize log lines.
Additionally, in order to keep the most common strings simple to build, about 50
special tags are provided as shortcuts for information commonly used in logs.
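
The principle can be illustrated with a deliberately simplified expander
written in Python. The notation is loosely modelled on log-format ("%[...]" for
an expression, short "%xx" tags as shortcuts), but the sample names, the alias
table and the handling of missing values are assumptions made for this sketch.

```python
import re

# Hypothetical per-request values; real ones come from sample fetches.
SAMPLES = {"src": "192.0.2.10", "status": "200", "path": "/index.html"}

# A few short tags acting as shortcuts for commonly logged samples.
ALIASES = {"ci": "src", "ST": "status"}

TOKEN = re.compile(r"%\[(\w+)\]|%(\w\w)")


def expand(fmt, samples):
    """Replace %[name] expressions and two-letter %xx tags with values."""
    def substitute(match):
        name = match.group(1) or ALIASES.get(match.group(2))
        return samples.get(name, "-")   # "-" stands for a missing value
    return TOKEN.sub(substitute, fmt)


line = expand("%ci %[path] -> %ST", SAMPLES)
```

Here "line" contains "192.0.2.10 /index.html -> 200".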


3.3.13. Basic features : HTTP rewriting and redirection
-------------------------------------------------------

Installing a load balancer in front of an application that was never designed
for this can be a challenging task without the proper tools. One of the most
commonly requested operations in this case is to adjust request and response
headers to make the load balancer appear as the origin server and to fix
hard-coded information. This comes with changing the path in requests (which is
strongly advised against), modifying the Host header field, modifying the
Location response header field for redirects, modifying the path and domain
attributes for cookies, and so on. It also happens that a number of servers are
somewhat verbose and tend to leak too much information in the response, making
them more vulnerable to targeted attacks. While it's theoretically not the role
of a load balancer to clean this up, in practice it's located at the best place
in the infrastructure to guarantee that everything is cleaned up.

Similarly, sometimes the load balancer will have to intercept some requests and
respond with a redirect to a new target URL. While some people tend to confuse
redirects and rewriting, these are two completely different concepts, since the
rewriting makes the client and the server see different things (and disagree on
the location of the page being visited) while redirects ask the client to visit
the new URL so that it sees the same location as the server.
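
The difference between the two concepts can be made concrete with a small
Python sketch. Requests and responses are plain dictionaries here, and both
function names are hypothetical; nothing below reflects HAProxy's internals.

```python
def rewrite(request, old_prefix, new_prefix):
    """Rewriting: the server gets a modified path, the client is not told."""
    path = request["path"]
    if path.startswith(old_prefix):
        path = new_prefix + path[len(old_prefix):]
    return {**request, "path": path}     # what gets forwarded to the server


def redirect(request, old_prefix, new_prefix, code=301):
    """Redirect: the client is asked to send a new request to the new URL."""
    path = request["path"]
    if not path.startswith(old_prefix):
        return None                      # no redirect, forward as-is
    location = new_prefix + path[len(old_prefix):]
    return {"status": code, "headers": {"Location": location}}
```

After a rewrite the client still believes it visits "/old/page" while the
server sees "/new/page"; after a redirect both agree on "/new/page".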

In order to do this, HAProxy supports various possibilities for rewriting and
redirects, among which :

  - regex-based URL and header rewriting in requests and responses. Regexes are
    the most commonly used tool to modify header values since they're easy to
    manipulate and well understood;

  - headers may also be appended, deleted or replaced based on formatted strings
    so that it is possible to pass information there (e.g. client side TLS
    algorithm and cipher);

  - HTTP redirects can use any 3xx code to a relative, absolute, or completely
    dynamic (formatted string) URI;

  - HTTP redirects also support some extra options such as setting or clearing
    a specific cookie, dropping the query string, appending a slash if missing,
    and so on;

  - a powerful "return" directive allows every part of a response to be
    customized, such as status, headers and body, using dynamic contents or even
    template files;

  - all operations support ACL-based conditions.


3.3.14. Basic features : Server protection
------------------------------------------

HAProxy does a lot to maximize service availability, and for this it makes
great efforts to protect servers against overloading and attacks. The first
and most important point is that only complete and valid requests are forwarded
to the servers. The initial reason is that HAProxy needs to find the protocol
elements it needs to stay synchronized with the byte stream, and the second
reason is that until the request is complete, there is no way to know if some
elements will change its semantics. The direct benefit from this is that servers
are not exposed to invalid or incomplete requests. This is a very effective
protection against slowloris attacks, which have almost no impact on HAProxy.

Another important point is that HAProxy contains buffers to store requests and
responses, and that by only sending a request to a server when it's complete and
by reading the whole response very quickly from the local network, the server
side connection is used for a very short time and this preserves server
resources as much as possible.

A direct extension to this is that HAProxy can artificially limit the number of
concurrent connections or outstanding requests to a server, which guarantees
that the server will never be overloaded even if it continuously runs at 100% of
its capacity during traffic spikes. All excess requests will simply be queued to
be processed when one slot is released. In the end, these huge resource savings
most often ensure so much better server response times that it ends up actually
being faster than by overloading the server. Queued requests may be redispatched
to other servers, or even aborted in queue when the client aborts, which also
protects the servers against the "reload effect", where each click on "reload"
by a visitor on a slow-loading page usually induces a new request and maintains
the server in an overloaded state.
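
The queueing principle can be sketched with a toy Python model. The class and
method names are hypothetical, requests are plain identifiers, and
redispatching to other servers is left out to keep the example short.

```python
from collections import deque


class LimitedServer:
    """Toy model: at most maxconn requests run, the others wait in a queue."""

    def __init__(self, maxconn):
        self.maxconn = maxconn
        self.running = set()
        self.queue = deque()

    def submit(self, request_id):
        if len(self.running) < self.maxconn:
            self.running.add(request_id)
        else:
            self.queue.append(request_id)    # excess requests are queued

    def finish(self, request_id):
        self.running.discard(request_id)
        while self.queue and len(self.running) < self.maxconn:
            self.running.add(self.queue.popleft())   # a slot was released

    def client_abort(self, request_id):
        if request_id in self.queue:
            self.queue.remove(request_id)    # it never reaches the server
```

A request aborted while still queued costs the server nothing, which is the
protection against the "reload effect".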

The slow-start mechanism also protects restarting servers against high traffic
levels while they're still finalizing their startup or compiling some classes.
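
A possible way to picture slow-start is a weight that ramps up after the
server comes back. The Python sketch below assumes a linear ramp over a chosen
duration and a minimum weight of 1; both are assumptions made for the example,
as this document does not specify the exact ramp.

```python
def slowstart_weight(full_weight, elapsed, duration):
    """Weight to apply `elapsed` seconds after the server came back up."""
    if duration <= 0 or elapsed >= duration:
        return full_weight
    ramp = max(elapsed, 0) / duration
    return max(1, int(full_weight * ramp))   # keep at least a trickle
```

With a full weight of 100 and a 60 second ramp, the server would receive a
weight of 50 after 30 seconds and its full weight from 60 seconds onwards.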

Regarding the protocol-level protection, it is possible to relax the HTTP parser
to accept non standard-compliant but harmless requests or responses and even to
fix them. This allows buggy applications to be accessible while a fix is being
developed. In parallel, offending messages are completely captured with a
detailed report that helps developers spot the issue in the application. The
most dangerous protocol violations are properly detected and dealt with. For
example malformed requests or responses with two Content-Length headers are
either fixed if the values are exactly the same, or rejected if they differ,
since it becomes a security problem. Protocol inspection is not limited to HTTP;
it is also available for other protocols like TLS or RDP.
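
The Content-Length rule lends itself to a short Python sketch. The function
name and the representation of headers as (name, value) pairs are assumptions
made for the example; only the fix-or-reject rule itself comes from the text.

```python
def normalize_content_length(headers):
    """Return headers with a single Content-Length, or raise ValueError.

    `headers` is a list of (name, value) pairs, possibly with duplicates.
    """
    lengths = [v.strip() for n, v in headers if n.lower() == "content-length"]
    if len(lengths) <= 1:
        return list(headers)
    if len(set(lengths)) > 1:
        raise ValueError("conflicting Content-Length values, reject message")
    kept = False
    result = []
    for name, value in headers:
        if name.lower() == "content-length":
            if kept:
                continue      # drop the redundant identical copy
            kept = True
        result.append((name, value))
    return result
```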

When a protocol violation or attack is detected, there are various options to
respond to the user, such as returning the common "HTTP 400 bad request",
closing the connection with a TCP reset, or faking an error after a long delay
("tarpit") to confuse the attacker. All of these contribute to protecting the
servers by discouraging the offending client from pursuing an attack that
becomes very expensive to maintain.

HAProxy also offers some more advanced options to protect against accidental
data leaks and session crossing. Not only can it log suspicious server responses
but it will also log and optionally block a response which might affect a given
visitor's confidentiality. One such example is a cacheable cookie appearing in a
cacheable response and which may result in an intermediary cache delivering it
to another visitor, causing an accidental session sharing.


3.3.15. Basic features : Logging
--------------------------------

Logging is an extremely important feature for a load balancer, first because a
load balancer is often wrongly accused of causing the problems it reveals, and
second because it is placed at a critical point in an infrastructure where all
normal and abnormal activity needs to be analyzed and correlated with other
components.

HAProxy provides very detailed logs, with millisecond accuracy and the exact
connection accept time that can be searched in firewall logs (e.g. for NAT
correlation). By default, TCP and HTTP logs are quite detailed and contain
everything needed for troubleshooting, such as source IP address and port,
frontend, backend, server, timers (request receipt duration, queue duration,
connection setup time, response headers time, data transfer time), global
process state, connection counts, queue status, retries count, detailed
stickiness actions and disconnect reasons, and header captures with a safe
output encoding. It is then possible to extend or replace this format to include
any sampled data, variables, captures, resulting in very detailed information.
For example it is possible to log the number of cumulative requests or number of
different URLs visited by a client.

The log level may be adjusted per request using standard ACLs, so it is possible
to automatically silence some logs considered as pollution and instead raise
warnings when some abnormal behavior happens for a small part of the traffic
(e.g. too many URLs or HTTP errors for a source address). Administrative logs
are also emitted with their own levels to inform about the loss or recovery of a
server for example.

Each frontend and backend may use multiple independent log outputs, which eases
multi-tenancy. Logs are preferably sent over UDP, possibly JSON-encoded, and are
truncated after a configurable line length in order to guarantee delivery. But
it is also possible to send them to stdout/stderr or any file descriptor, as
well as to a ring buffer that a client can subscribe to in order to retrieve
them.


3.3.16. Basic features : Statistics
-----------------------------------

HAProxy provides a web-based statistics reporting interface with authentication,
security levels and scopes. It is thus possible to provide each hosted customer
with his own page showing only his own instances. This page can be located in a
hidden URL part of the regular web site so that no new port needs to be opened.
This page may also report the availability of other HAProxy nodes so that it is
easy to spot if everything works as expected at a glance. The view is synthetic
with a lot of details accessible (such as error causes, last access and last
change duration, etc.), which are also accessible as a CSV table that other
tools may import to draw graphs. The page may self-refresh to be used as a
monitoring page on a large display. In administration mode, the page also allows
the server state to be changed to ease maintenance operations.
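
As an example of what a tool importing the CSV table might do, the Python
sketch below lists the servers reported as down. The sample is hand-written
and heavily abridged (the real output has many more columns), and both
function names are hypothetical.

```python
import csv
import io

# Abridged, hand-written sample; the header line starts with "# ".
SAMPLE = """\
# pxname,svname,scur,smax,status
www,FRONTEND,12,40,OPEN
app,srv1,5,20,UP
app,srv2,0,18,DOWN
app,BACKEND,5,38,UP
"""


def parse_stats(text):
    """Return one dict per CSV line, keyed by column name."""
    lines = text.splitlines()
    header = lines[0].lstrip("# ").split(",")
    body = io.StringIO("\n".join(lines[1:]))
    return list(csv.DictReader(body, fieldnames=header))


def servers_down(rows):
    """List (proxy, server) pairs whose status is DOWN."""
    return [(r["pxname"], r["svname"]) for r in rows
            if r["svname"] not in ("FRONTEND", "BACKEND")
            and r["status"] == "DOWN"]
```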

A Prometheus exporter is also provided so that the statistics can be consumed
in a different format depending on the deployment.


3.4. Advanced features
----------------------

3.4.1. Advanced features : Management
-------------------------------------

HAProxy is designed to remain extremely stable and safe to manage in a regular
production environment. It is provided as a single executable file which doesn't
require any installation process. Multiple versions can easily coexist, meaning
that it's possible (and recommended) to upgrade instances progressively by
order of importance instead of migrating all of them at once. Configuration
files are easily versioned. Configuration checking is done off-line so it
doesn't require restarting a service that might fail. During configuration
checks, a number of advanced mistakes may be detected (e.g. a rule hiding
another one, or stickiness that will not work) and detailed warnings and
configuration hints are proposed to fix them. Backwards configuration file
compatibility goes very far back in time, with version 1.5 still fully
supporting configurations for versions 1.1 written 13 years before, and 1.6
only dropping support for almost unused, obsolete keywords that can be done
differently. The configuration and software upgrade mechanism is smooth and
non-disruptive in that it allows old and new processes to coexist on the system,
each handling its own connections. System status, build options, and library
compatibility are reported on startup.

Some advanced features allow an application administrator to smoothly stop a
server, detect when there's no activity on it anymore, then take it off-line,
stop it, upgrade it and ensure it doesn't take any traffic while being upgraded,
then test it again through the normal path without opening it to the public, and
all of this without touching HAProxy at all. This ensures that even complicated
production operations may be done during opening hours with all technical
resources available.

The process tries to save resources as much as possible, uses memory pools to
save on allocation time and limit memory fragmentation, releases payload buffers
as soon as their contents are sent, and supports enforcing strong memory limits
above which connections have to wait for a buffer to become available instead of
allocating more memory. This system helps guarantee memory usage in certain
strict environments.

A command line interface (CLI) is available as a UNIX or TCP socket, to perform
a number of operations and to retrieve troubleshooting information. Nothing done
on this socket requires a configuration change, so it is mostly used for
temporary changes. Using this interface it is possible to change a server's
address, weight and status, to consult statistics and clear counters, dump and
clear stickiness tables, possibly selectively by key criteria, dump and kill
client-side and server-side connections, dump captured errors with a detailed
analysis of the exact cause and location of the error, dump, add and remove
entries from ACLs and maps, update TLS shared secrets, apply connection limits
and rate limits on the fly to arbitrary frontends (useful in shared hosting
environments), and disable a specific frontend to release a listening port
(useful when daytime operations are forbidden and a fix is needed nonetheless).
Updating certificates and their configuration on the fly is permitted, as well
as enabling and consulting traces of every processing step of the traffic.
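
For scripted access, a client for the UNIX socket can be as small as the
following Python sketch. The function name is hypothetical and the sketch
assumes the simplest mode of operation, where one command is sent and the
server closes the connection after its reply.

```python
import socket


def cli_command(path, command, timeout=5.0):
    """Send one command to a CLI UNIX socket and return the whole reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(path)
        sock.sendall(command.encode() + b"\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # the server closed: the reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode()
```

It could be called as cli_command("/var/run/haproxy.sock", "show info"), where
the socket path is an assumption that depends on the local configuration.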

For environments where SNMP is mandatory, at least two agents exist; one is
provided with the HAProxy sources and relies on the Net-SNMP Perl module.
Another one is provided with the commercial packages and doesn't require Perl.
Both are roughly equivalent in terms of coverage.

It is often recommended to install 4 utilities on the machine where HAProxy is
deployed :

  - socat (in order to connect to the CLI, though certain forks of netcat can
    also do it to some extent);

  - halog from the latest HAProxy version : this is the log analysis tool, it
    parses native TCP and HTTP logs extremely fast (1 to 2 GB per second) and
    extracts useful information and statistics such as requests per URL, per
    source address, URLs sorted by response time or error rate, termination
    codes etc. It was designed to be deployed on the production servers to
    help troubleshoot live issues so it has to be there ready to be used;

  - tcpdump : this is highly recommended to take the network traces needed to
    troubleshoot an issue that was made visible in the logs. There is a moment
    where the application's and HAProxy's analyses will diverge and the network
    traces are the only way to say who's right and who's wrong. It's also fairly
    common to detect bugs in network stacks and hypervisors thanks to tcpdump;

  - strace : it is tcpdump's companion. It will report what HAProxy really sees
    and will help sort out the issues the operating system is responsible for
    from the ones HAProxy is responsible for. Strace is often requested when a
    bug in HAProxy is suspected.


3.4.2. Advanced features : System-specific capabilities
-------------------------------------------------------

Depending on the operating system HAProxy is deployed on, certain extra features
may be available or needed. While it is supported on a number of platforms,
HAProxy is primarily developed on Linux, which explains why some features are
only available on this platform.

The transparent bind and connect features, the support for binding connections
to a specific network interface, as well as the ability to bind multiple
processes to the same IP address and ports are only available on Linux and BSD
systems, though only Linux performs a kernel-side load balancing of the incoming
requests between the available processes.

On Linux, there are also a number of extra features and optimizations including
support for network namespaces (also known as "containers") allowing HAProxy to
be a gateway between all containers, the ability to set the MSS, Netfilter marks
and IP TOS field on the client side connection, support for TCP FastOpen on the
listening side, TCP user timeouts to let the kernel quickly kill connections
when it detects the client has disappeared before the configured timeouts, TCP
splicing to let the kernel forward data between the two sides of a connection
thus avoiding multiple memory copies, the ability to enable the "defer-accept"
bind option to only get notified of an incoming connection once data become
available in the kernel buffers, and the ability to send the request with the
ACK confirming a connect (sometimes called "piggy-back") which is enabled with
the "tcp-smart-connect" option. On Linux, HAProxy also takes great care of
manipulating the TCP delayed ACKs to save as many packets as possible on the
network.

Some systems have an unreliable clock which jumps back and forth between the
past and the future. This used to happen with some NUMA systems where multiple
processors didn't see the exact same time of day, and recently it became more
common in virtualized environments where the virtual clock has no relation with
the real clock, resulting in huge time jumps (sometimes up to 30 seconds have
been observed). This causes a lot of trouble with respect to timeout enforcement
in general. Because of this flaw in these systems, HAProxy maintains its own
monotonic clock which is based on the system's clock but where drift is measured
and compensated for. This ensures that even with a very bad system clock, timers
remain reasonably accurate and timeouts continue to work. Note that this problem
affects all the software running on such systems and is not specific to HAProxy.
The common effects are spurious timeouts or application freezes. Thus if this
behavior is detected on a system, it must be fixed, regardless of the fact that
HAProxy protects itself against it.
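
The general idea of compensating for clock jumps can be sketched in Python as
follows. This is not HAProxy's algorithm : the class name, the "max_step"
threshold and the assumption that the clock is read at short intervals are all
choices made for the example.

```python
class CompensatedClock:
    """Toy monotonic clock built on top of an unreliable time source."""

    def __init__(self, source, max_step=2.0):
        self.source = source        # callable returning the system time
        self.max_step = max_step    # largest plausible gap between two reads
        self.offset = 0.0
        self.last_raw = source()
        self.last = self.last_raw

    def now(self):
        raw = self.source()
        step = raw - self.last_raw
        if step < 0 or step > self.max_step:
            # The system clock jumped: absorb the jump into the offset so
            # that our own time neither goes backwards nor leaps ahead.
            self.offset -= step
        self.last_raw = raw
        self.last = max(self.last, raw + self.offset)
        return self.last
```

Fed with the readings 100, 101, 131, 132, 90 and 91, this clock reports 101,
101, 102, 102 and 103 : the 30 second jump forward and the 42 second jump back
are both absorbed, at the cost of ignoring the time elapsed during a jump.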

On Linux, a newly started process may communicate with the previous one to reuse
its listening file descriptors so that the listening sockets are never
interrupted during the process's replacement.
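
The underlying operating system mechanism, passing a file descriptor over a
UNIX socket, can be demonstrated with Python 3.9 or later. The two function
names and the "listener" payload are hypothetical, and the sketch says nothing
about the protocol HAProxy itself uses between the two processes.

```python
import socket


def send_listener(channel, listener):
    """Old process: hand the listening socket's descriptor to the new one."""
    socket.send_fds(channel, [b"listener"], [listener.fileno()])


def receive_listener(channel):
    """New process: rebuild a socket object from the received descriptor."""
    _msg, fds, _flags, _addr = socket.recv_fds(channel, 1024, 1)
    return socket.socket(fileno=fds[0])
```

The received descriptor refers to the same listening socket, so connections
keep being accepted on it even after the sender has closed its own copy.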


3.4.3. Advanced features : Scripting
------------------------------------

HAProxy can be built with support for the Lua embedded language, which opens a
wide area of new possibilities related to complex manipulation of requests or
responses, routing decisions, statistics processing and so on. Using Lua it is
even possible to establish parallel connections to other servers to exchange
information. This way it becomes possible (though complex) to develop an
authentication system for example. Please refer to the documentation in the file
"doc/lua-api/index.rst" for more information on how to use Lua.


3.4.4. Advanced features : Tracing
----------------------------------

At any moment an administrator may connect over the CLI and enable tracing in
various internal subsystems. Various levels of detail are provided by default
so that in practice anything from one line per request to 500 lines per
request can be retrieved. Filters as well as an automatic capture on/off/pause
mechanism are available so that it really is possible to wait for a certain
event and watch it in detail. This is extremely convenient to diagnose protocol
violations from faulty servers and clients, or denial of service attacks.


3.5. Sizing
-----------

Typical CPU usage figures show 15% of the processing time spent in HAProxy
versus 85% in the kernel in TCP or HTTP close mode, and about 30% for HAProxy
versus 70% for the kernel in HTTP keep-alive mode. This means that the operating
system and its tuning have a strong impact on the global performance.

Usage varies a lot between users: some focus on bandwidth, others on request
rate, others on connection concurrency, others on SSL performance. This section
aims at providing a few elements to help with sizing for each of these cases.

It is important to keep in mind that every operation comes with a cost, so each
individual operation adds its overhead on top of the other ones, which may be
negligible in certain circumstances, and which may dominate in other cases.

When processing the requests from a connection, we can say that :

  - forwarding data costs less than parsing request or response headers;

  - parsing request or response headers costs less than establishing then
    closing a connection to a server;

  - establishing and closing a connection costs less than a TLS resume
    operation;

  - a TLS resume operation costs less than a full TLS handshake with a key
    computation;

  - an idle connection costs less CPU than a connection whose buffers hold data;

  - a TLS context costs even more memory than a connection with data.

So in practice, it is cheaper to process payload bytes than header bytes, thus
it is easier to achieve high network bandwidth with large objects (few requests
per volume unit) than with small objects (many requests per volume unit). This
explains why maximum bandwidth is always measured with large objects, while
request rate or connection rates are measured with small objects.

Some operations scale well on multiple processes spread over multiple CPUs,
and others don't scale as well. Network bandwidth doesn't scale very far because
the CPU is rarely the bottleneck for large objects; it's mostly the network
bandwidth and data buses to reach the network interfaces. The connection rate
doesn't scale well over multiple processors due to a few locks in the system
when dealing with the local ports table. The request rate over persistent
connections scales very well as it doesn't involve much memory or network
bandwidth and doesn't require access to locked structures. TLS key computation
scales very well as it's totally CPU-bound. TLS resume scales moderately well,
but reaches its limits around 4 processes where the overhead of accessing the
shared table offsets the small gains expected from more power.

The performance numbers one can expect from a very well tuned system are in the
following range. It is important to take them as orders of magnitude and to
expect significant variations in any direction based on the processor, IRQ
setting, memory type, network interface type, operating system tuning and so on.

The following numbers were found on a Core i7 running at 3.7 GHz equipped with
a dual-port 10 Gbps NIC, running Linux kernel 3.10, HAProxy 1.6 and OpenSSL
1.0.2. HAProxy was running as a single process on a single dedicated CPU core,
and two extra cores were dedicated to network interrupts :

  - 20 Gbps of maximum network bandwidth in clear text for objects 256 kB or
    higher, 10 Gbps for 41 kB or higher;

  - 4.6 Gbps of TLS traffic using AES256-GCM cipher with large objects;

  - 83000 TCP connections per second from client to server;

  - 82000 HTTP connections per second from client to server;

  - 97000 HTTP requests per second in server-close mode (keep-alive with the
    client, close with the server);

  - 243000 HTTP requests per second in end-to-end keep-alive mode;

  - 300000 filtered TCP connections per second (anti-DDoS);

  - 160000 HTTPS requests per second in keep-alive mode over persistent TLS
    connections;

  - 13100 HTTPS requests per second using TLS resumed connections;

  - 1300 HTTPS connections per second using TLS connections renegotiated with
    RSA2048;

  - 20000 concurrent saturated connections per GB of RAM, including the memory
    required for system buffers; it is possible to do better with careful tuning
    but this result is easy to achieve;

  - about 8000 concurrent TLS connections (client-side only) per GB of RAM,
    including the memory required for system buffers;

  - about 5000 concurrent end-to-end TLS connections (both sides) per GB of
    RAM including the memory required for system buffers.
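The per-GB figures at the end of the list above can be turned into a rough
memory estimate. The following is only a minimal sketch: the function name and
the dictionary keys are made up for illustration, and the figures remain orders
of magnitude measured on the test system described above.

```python
import math

# Concurrent connections per GB of RAM, copied from the list above.
# These are orders of magnitude, not guarantees.
CONNS_PER_GB = {
    "clear": 20000,       # saturated clear-text connections
    "tls_client": 8000,   # TLS on the client side only
    "tls_both": 5000,     # end-to-end TLS (both sides)
}


def ram_gb_needed(concurrent_conns: int, kind: str) -> int:
    """Return the number of whole GB of RAM suggested by the figures above."""
    return math.ceil(concurrent_conns / CONNS_PER_GB[kind])


if __name__ == "__main__":
    for kind in CONNS_PER_GB:
        print(kind, ram_gb_needed(100000, kind), "GB for 100000 connections")
```

For 100000 concurrent connections this suggests 5 GB in clear text, 13 GB with
client-side TLS and 20 GB with end-to-end TLS, before any tuning.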

Thus a good rule of thumb to keep in mind is that the request rate is divided
by 10 between TLS keep-alive and TLS resume, and between TLS resume and TLS
renegotiation, while it's only divided by 3 between HTTP keep-alive and HTTP
close. Another good rule of thumb is to remember that a high frequency core
with AES instructions can do around 5 Gbps of AES-GCM per core.
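These ratios can be checked against the measured figures listed earlier in this
section. The short sketch below only redoes the divisions; the dictionary keys
are illustrative names, not HAProxy terms.

```python
# Rates taken from the measurements listed earlier in this section.
RATES = {
    "https_keepalive": 160000,  # HTTPS requests/s over persistent TLS
    "tls_resume": 13100,        # HTTPS requests/s with TLS resume
    "tls_rsa2048": 1300,        # HTTPS connections/s renegotiated with RSA2048
    "http_keepalive": 243000,   # HTTP requests/s, end-to-end keep-alive
    "http_close": 82000,        # HTTP connections/s from client to server
}


def ratio(faster: str, slower: str) -> float:
    """Return how many times 'faster' exceeds 'slower', rounded to 0.1."""
    return round(RATES[faster] / RATES[slower], 1)


if __name__ == "__main__":
    print("TLS keep-alive / TLS resume :", ratio("https_keepalive", "tls_resume"))
    print("TLS resume / TLS RSA2048    :", ratio("tls_resume", "tls_rsa2048"))
    print("HTTP keep-alive / HTTP close:", ratio("http_keepalive", "http_close"))
```

The three ratios come out at about 12.2, 10.1 and 3.0, which is consistent with
the "divided by 10" and "divided by 3" rules of thumb.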

Having more cores rarely helps (except for TLS) and is even counter-productive
due to the lower frequency. In general a small number of high frequency cores
is better.

Another good rule of thumb is to consider that on the same server, HAProxy will
be able to saturate :

  - about 5-10 static file servers or caching proxies;

  - about 100 anti-virus proxies;

  - and about 100-1000 application servers depending on the technology in use.


3.6. How to get HAProxy
-----------------------

HAProxy is an open source project covered by the GPLv2 license, meaning that
everyone is allowed to redistribute it provided that access to the sources is
also provided upon request, especially if any modifications were made.

HAProxy evolves as a main development branch called "master" or "mainline", from
which new branches are derived once the code is considered stable. A lot of web
sites run some development branches in production on a voluntary basis, either
to participate in the project or because they need a bleeding edge feature, and
their feedback is highly valuable to fix bugs and judge the overall quality and
stability of the version being developed.

The new branches that are created when the code is stable enough constitute a
stable version and are generally maintained for several years, so that there is
no emergency to migrate to a newer branch even when you're not on the latest.
Once a stable branch is issued, it may only receive bug fixes, and very rarely
minor feature updates when that makes users' lives easier. All fixes that go
into a stable branch necessarily come from the master branch. This guarantees
that no fix will be lost after an upgrade. For this reason, if you fix a bug,
please make the patch against the master branch, not the stable branch. You may
even discover it was already fixed. This process also ensures that regressions
in a stable branch are extremely rare, so there is never any excuse for not
upgrading to the latest version in your current branch.

Branches are numbered with two digits delimited with a dot, such as "1.6".
Since 1.9, branches with an odd second digit are mostly focused on sensitive
technical updates and more aimed at advanced users because they are likely to
trigger more bugs than the other ones. They are maintained for about a year
only and must not be deployed where they cannot be rolled back in an emergency.
A complete version includes one or two sub-version numbers indicating the level
of fix. For example, version 1.5.14 is the 14th fix release in branch 1.5 after
version 1.5.0 was issued. It contains 126 fixes for individual bugs, 24 updates
on the documentation, and 75 other backported patches, most of which were needed
to fix the aforementioned 126 bugs. An existing feature may never be modified
nor removed in a stable branch, in order to guarantee that upgrades within the
same branch will always be harmless.
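The numbering rules above can be applied mechanically. The sketch below is only
an illustration of those rules: the function name and the returned keys are
hypothetical, and it does nothing more than split the version string and apply
the "odd second digit since 1.9" rule.

```python
import re

# <major>.<minor> is the branch, followed by one or two sub-version numbers.
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:\.(\d+))?$")


def describe(version: str) -> dict:
    """Split a complete version such as "1.5.14" into branch and fix level."""
    m = VERSION_RE.match(version)
    if m is None:
        raise ValueError(f"not a complete version number: {version!r}")
    major, minor = int(m.group(1)), int(m.group(2))
    fix = [int(g) for g in m.groups()[2:] if g is not None]
    # Since 1.9, an odd second digit denotes a branch maintained for about
    # a year only and aimed at advanced users.
    short_lived = (major, minor) >= (1, 9) and minor % 2 == 1
    return {"branch": f"{major}.{minor}", "fix": fix, "short_lived": short_lived}


if __name__ == "__main__":
    for v in ("1.5.14", "2.1.4", "2.4.0"):
        print(v, describe(v))
```

With this sketch, "1.5.14" is reported as fix 14 of branch 1.5, and "2.1.4"
is flagged as belonging to a short-lived branch while "2.4.0" is not.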

HAProxy is available from multiple sources, at different release rhythms :

  - The official community web site : http://www.haproxy.org/ : this site
    provides the sources of the latest development release, all stable releases,
    as well as nightly snapshots for each branch. The release cycle is not fast,
    with several months between stable releases, or between development
    snapshots. Very old versions are still supported there. Everything is
    provided as sources only, so whatever comes from there needs to be rebuilt
    and/or repackaged;

  - GitHub : https://github.com/haproxy/haproxy/ : this is the mirror for the
    development branch only, which provides integration with the issue tracker,
    continuous integration and code coverage tools. This is exclusively for
    contributors;

  - A number of operating systems such as Linux distributions and BSD ports.
    These systems generally provide long-term maintained versions which do not
    always contain all the fixes from the official ones, but which at least
    contain the critical fixes. It often is a good option for most users who do
    not seek advanced configurations and just want to keep updates easy;

  - Commercial versions from http://www.haproxy.com/ : these are supported
    professional packages built for various operating systems or provided as
    appliances, based on the latest stable versions and including a number of
    features backported from the next release for which there is a strong
    demand. It is the best option for users seeking the latest features with
    the reliability of a stable branch, the fastest response time to fix bugs,
    or simply support contracts on top of an open source product.

In order to ensure that the version you're using is the latest one in your
branch, you need to proceed this way :

  - verify which HAProxy executable you're running : some systems ship it by
    default and administrators install their versions somewhere else on the
    system, so it is important to verify in the startup scripts which one is
    used;

  - determine which source your HAProxy version comes from. For this, it's
    generally sufficient to type "haproxy -v". A development version will
    appear like this, with the "dev" word after the branch number :

      HAProxy version 2.4-dev18-a5357c-137 2021/05/09 - https://haproxy.org/

    A stable version will appear like this, as well as unmodified stable
    versions provided by operating system vendors :

      HAProxy version 1.5.14 2015/07/02

    And a nightly snapshot of a stable version will appear like this with a
    hexadecimal sequence after the version, and with the date of the snapshot
    instead of the date of the release :

      HAProxy version 1.5.14-e4766ba 2015/07/29

    Any other format may indicate a system-specific package with its own
    patch set. For example HAProxy Enterprise versions will appear with the
    following format (<branch>-<latest commit>-<revision>) :

      HAProxy version 1.5.0-994126-357 2015/07/02

    Please note that versions prior to 2.4 used to report the name with a
    hyphen between "HA" and "Proxy". This includes the older versions shown
    above, whose output was adjusted here to show the current format only. It
    is thus better to ignore this word or to use a relaxed match in scripts.
    Additionally, modern versions add a URL linking to the project's home.

    Finally, versions 2.1 and above will include a "Status" line indicating
    whether the version is safe for production or not, and if so, till when, as
    well as a link to the list of known bugs affecting this version.

  - for system-specific packages, you have to check with your vendor's package
    repository or update system to ensure that your system is still supported,
    and that fixes are still provided for your branch. For community versions
    coming from haproxy.org, just visit the site, verify the status of your
    branch and compare the latest version with yours to see if you're on the
    latest one. If not, you can upgrade. If your branch is not maintained
    anymore, you're definitely very late and will have to consider an upgrade
    to a more recent branch (carefully read the README when doing so).
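A script that needs to recognize the four formats shown above could use a
relaxed match like the one below. This is only a sketch built from the four
sample lines in this section: the function name and the category labels are
hypothetical, and packages from other sources may print something different.

```python
import re

# Relaxed match on the name: versions prior to 2.4 print "HA-Proxy".
LINE_RE = re.compile(r"^HA-?Proxy version (\S+) (\d{4}/\d{2}/\d{2})")
NUMBER = r"\d+\.\d+\.\d+(?:\.\d+)?"  # complete version, e.g. 1.5.14


def classify(line: str) -> str:
    """Classify the first line of "haproxy -v" using the formats shown above."""
    m = LINE_RE.match(line)
    if m is None:
        return "unrecognized"
    version = m.group(1)
    if re.match(r"\d+\.\d+-dev", version):
        return "development"   # "dev" right after the branch number
    if re.fullmatch(NUMBER, version):
        return "stable"        # plain version number
    if re.fullmatch(NUMBER + r"-[0-9a-f]+", version):
        return "snapshot"      # version followed by a hexadecimal sequence
    return "other"             # likely a system-specific package


if __name__ == "__main__":
    samples = [
        "HAProxy version 2.4-dev18-a5357c-137 2021/05/09 - https://haproxy.org/",
        "HAProxy version 1.5.14 2015/07/02",
        "HA-Proxy version 1.5.14-e4766ba 2015/07/29",
        "HAProxy version 1.5.0-994126-357 2015/07/02",
    ]
    for s in samples:
        print(classify(s), ":", s)
```

The regular expression deliberately ignores anything after the date, so the
trailing URL printed by modern versions does not affect the result.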

HAProxy will have to be updated according to the source it came from. Usually it
follows the system vendor's way of upgrading a package. If it was taken from
sources, please read the README file in the sources directory after extracting
the sources and follow the instructions for your operating system.


4. Companion products and alternatives
--------------------------------------

HAProxy integrates fairly well with certain products listed below, which is why
they are mentioned here even if not directly related to HAProxy.


4.1. Apache HTTP server
-----------------------

Apache is the de-facto standard HTTP server. It's a very complete and modular
project supporting both file serving and dynamic contents. It can serve as a
frontend for some application servers. It can even proxy requests and cache
responses. In all of these use cases, a front load balancer is commonly needed.
Apache can work in various modes, some being heavier than others. Certain
modules still require the heavier pre-forked model and will prevent Apache from
scaling well with a high number of connections. In this case HAProxy can provide
a tremendous help by enforcing the per-server connection limits to a safe value,
which will significantly speed up the server and preserve its resources that
will be better used by the application.

Apache can extract the client's address from the X-Forwarded-For header by using
the "mod_rpaf" extension. HAProxy will automatically feed this header when
"option forwardfor" is specified in its configuration. HAProxy may also offer a
nice protection to Apache when exposed to the internet, where it will better
resist a wide range of DoS attacks.
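An application placed behind HAProxy can also perform this extraction itself.
The sketch below is not what "mod_rpaf" does internally; it is a hypothetical
helper which assumes that HAProxy appends its own X-Forwarded-For value after
any value already sent by the client, so that only the last address can be
trusted.

```python
import ipaddress
from typing import Iterable, Optional


def client_address(xff_values: Iterable[str]) -> Optional[str]:
    """Return the last address found in the X-Forwarded-For header values.

    xff_values holds the header field values in the order they were received.
    Earlier entries may have been supplied by the client and are ignored.
    """
    last = None
    for value in xff_values:
        for item in value.split(","):
            item = item.strip()
            if item:
                last = item
    if last is None:
        return None
    # Raises ValueError if the last entry is not a valid IP address.
    return str(ipaddress.ip_address(last))


if __name__ == "__main__":
    # The first value is made up by the client, the second one is assumed
    # to have been appended by the load balancer.
    print(client_address(["10.0.0.1, 192.0.2.7", "203.0.113.9"]))
```

Taking the last entry rather than the first one avoids trusting an address
that the client may have forged.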


4.2. NGINX
----------

NGINX is the second de-facto standard HTTP server. Just like Apache, it covers a
wide range of features. NGINX is built on a model similar to HAProxy's so it has
no problem dealing with tens of thousands of concurrent connections. When used
as a gateway to some applications (e.g. using the included PHP FPM) it can often
be beneficial to set up some frontend connection limiting to reduce the load
on the PHP application. HAProxy will clearly be useful there both as a regular
load balancer and as the traffic regulator to speed up PHP by decongesting
it. Also since both products use very little CPU thanks to their event-driven
architecture, it's often easy to install both of them on the same system. NGINX
implements HAProxy's PROXY protocol, thus it is easy for HAProxy to pass the
client's connection information to NGINX so that the application gets all the
relevant information. Some benchmarks have also shown that for large static
file serving, implementing consistent hash on HAProxy in front of NGINX can be
beneficial by optimizing the OS' cache hit ratio, which is basically multiplied
by the number of server nodes.
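The idea behind consistent hashing can be illustrated with a small hash ring.
This is a generic textbook sketch and not HAProxy's implementation: the class
name, the server names, the number of points per node and the choice of SHA-256
are all arbitrary assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent hash ring mapping URLs to server names."""

    def __init__(self, nodes, points_per_node=100):
        ring = []
        for node in nodes:
            # Several points per node spread the load more evenly.
            for i in range(points_per_node):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._points = [h for h, _ in ring]
        self._owners = [n for _, n in ring]

    @staticmethod
    def _hash(value: str) -> int:
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, url: str) -> str:
        # First point at or after the URL's hash, wrapping around the ring.
        idx = bisect.bisect_left(self._points, self._hash(url))
        return self._owners[idx % len(self._owners)]


if __name__ == "__main__":
    urls = [f"/static/file{i}.bin" for i in range(10000)]
    before = HashRing(["srv1", "srv2", "srv3"])
    after = HashRing(["srv1", "srv2", "srv3", "srv4"])
    moved = sum(before.node_for(u) != after.node_for(u) for u in urls)
    print(f"{moved} of {len(urls)} URLs changed server after adding srv4")
```

Since a given URL always reaches the same server, each object is cached on one
node only. When a server is added, the only URLs that move are those taken
over by the new server, so the other nodes keep most of their cache contents.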


4.3. Varnish
------------

Varnish is a smart caching reverse-proxy, probably best described as a web
application accelerator. Varnish doesn't implement SSL/TLS and wants to dedicate
all of its CPU cycles to what it does best. Varnish also implements HAProxy's
PROXY protocol so that HAProxy can very easily be deployed in front of Varnish
as an SSL offloader as well as a load balancer and pass it all relevant client
information. Also, Varnish naturally supports decompression from the cache when
a server has provided a compressed object, but doesn't compress by itself.
HAProxy can then be used to compress outgoing data when backend servers do not
implement compression, though it's rarely a good idea to compress on the load
balancer unless the traffic is low.
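To give an idea of what the PROXY protocol carries, the sketch below builds and
parses the human-readable version 1 header, as understood from the protocol
specification shipped with HAProxy ("doc/proxy-protocol.txt"). The function
names are hypothetical, the binary version 2 and the "UNKNOWN" family are not
covered, and the specification remains the reference.

```python
import ipaddress


def build_proxy_v1(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Build the single text line sent before any application data."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if src.version != dst.version:
        raise ValueError("source and destination must use the same family")
    family = "TCP4" if src.version == 4 else "TCP6"
    return f"PROXY {family} {src} {dst} {src_port} {dst_port}\r\n".encode("ascii")


def parse_proxy_v1(line: bytes) -> dict:
    """Parse a version 1 header line into its five fields."""
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    parts = line[:-2].decode("ascii").split(" ")
    if len(parts) != 6 or parts[0] != "PROXY" or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("not a supported PROXY v1 header")
    return {
        "family": parts[1],
        "src": str(ipaddress.ip_address(parts[2])),
        "dst": str(ipaddress.ip_address(parts[3])),
        "src_port": int(parts[4]),
        "dst_port": int(parts[5]),
    }


if __name__ == "__main__":
    header = build_proxy_v1("192.0.2.10", "198.51.100.1", 56324, 443)
    print(header)
    print(parse_proxy_v1(header))
```

The receiving server reads this line first and then treats the rest of the
connection as if it came directly from the original client.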

When building large caching farms across multiple nodes, HAProxy can make use of
consistent URL hashing to intelligently distribute the load to the caching nodes
and avoid cache duplication, resulting in a total cache size which is the sum of
all caching nodes. In addition, caching of very small dumb objects for a short
duration on HAProxy can sometimes save network round trips and reduce the CPU
load on both the HAProxy and the Varnish nodes. This is only possible if no
processing is done on these objects on Varnish (this is often referred to as
the notion of "favicon cache", by which a sizeable percentage of useless
downstream requests can sometimes be avoided). However do not enable HAProxy
caching for a long time (more than a few seconds) in front of any other cache,
as that would significantly complicate troubleshooting without providing really
significant savings.


4.4. Alternatives
-----------------

Linux Virtual Server (LVS or IPVS) is the layer 4 load balancer included within
the Linux kernel. It works at the packet level and handles TCP and UDP. In most
cases it's more a complement than an alternative since it doesn't have layer 7
knowledge at all.

Pound is another well-known load balancer. It's much simpler and has far fewer
features than HAProxy but for many very basic setups both can be used. Its
author has always focused on code auditability first and wants to keep the
feature set small. Its thread-based architecture scales less well with high
connection counts, but it's a good product.

Pen is a quite light load balancer. It supports SSL and maintains persistence
using a fixed-size table of its clients' IP addresses. It supports a
packet-oriented mode allowing it to support direct server return and UDP to
some extent. It is meant for small loads (the persistence table only has 2048
entries).

NGINX can do some load balancing to some extent, though it's clearly not its
primary function. Production traffic is used to detect server failures, the
load balancing algorithms are more limited, and the stickiness is very limited.
But it can make sense in some simple deployment scenarios where it is already
present. The good thing is that since it integrates very well with HAProxy,
there's nothing wrong with adding HAProxy later when its limits have been
reached.

Varnish also does some load balancing of its backend servers and does support
real health checks. It doesn't implement stickiness however, so just like with
NGINX, as long as stickiness is not needed that can be enough to start with.
And similarly, since HAProxy and Varnish integrate so well together, it's easy
to add HAProxy later into the mix to complement the feature set.


5. Contacts
-----------

If you want to contact the developers or any community member about anything,
the best way to do it usually is via the mailing list by sending your message
to haproxy@formilux.org. Please note that this list is public and its archives
are public as well so you should avoid disclosing sensitive information. A
thousand users of various experience levels are present there and even the
most complex questions usually find an optimal response relatively quickly.
Suggestions are welcome too. For users having difficulties with e-mail, a
Discourse platform is available at http://discourse.haproxy.org/ . However
please keep in mind that there are fewer people reading questions there and
that most are handled by a really tiny team. In any case, please be patient and
respectful with those who devote their spare time helping others.

If you believe you've found a bug but are not sure, it's best reported on the
mailing list. If you're quite convinced you've found a bug, that your version
is up-to-date in its branch, and you already have a GitHub account, feel free
to go directly to https://github.com/haproxy/haproxy/ and file an issue with
all possibly available details. Again, this is public so be careful not to post
information you might later regret. Since the issue tracker presents itself as
a very long thread, please avoid pasting very long dumps (a few hundred lines
or more) and attach them instead.

If you've found what you're absolutely certain can be considered a critical
security issue that would put many users in serious trouble if discussed in a
public place, then you can send it with the reproducer to security@haproxy.org.
A small team of trusted developers will receive it and will be able to propose
a fix. We usually don't use embargoes and once a fix is available it gets
merged. In some rare circumstances it can happen that a release is coordinated
with software vendors. Please note that this process usually disrupts
everyone's work, and that rushed releases can sometimes introduce new bugs,
so it's best avoided unless strictly necessary; as such, there is often little
consideration for reports that needlessly cause such extra burden, and the best
way to see your work credited usually is to provide a working fix, which will
appear in changelogs.