                           -------------------
                             H A - P r o x y
                           Architecture Guide
                           -------------------
                             version 1.1.30
                              willy tarreau
                               2004/11/28


This document provides real-world examples with working configurations.
Please note that unless stated otherwise, global configuration parameters
such as logging, chrooting, limits and time-outs are not described here.

===================================================
1. Simple HTTP load-balancing with cookie insertion
===================================================

A web application often saturates the front-end server with high CPU loads,
due to the scripting language involved. It also relies on a back-end database
which is lightly loaded. User contexts are stored on the server itself, and
not in the database, so simply adding another server with plain IP/TCP
load-balancing would not work.

                +-------+
                |clients|  clients and/or reverse-proxy
                +---+---+
                    |
                   -+-----+--------+----
                          |       _|_db
                       +--+--+   (___)
                       | web |   (___)
                       +-----+   (___)
                   192.168.1.1   192.168.1.2


Replacing the web server with a bigger SMP system would cost much more than
adding low-cost pizza boxes. The solution is to buy N cheap boxes and install
the application on them. Install haproxy on the old one, which will spread the
load across the new boxes.

  192.168.1.1    192.168.1.11-192.168.1.14   192.168.1.2
 -------+-----------+-----+-----+-----+--------+----
        |           |     |     |     |       _|_db
     +--+--+      +-+-+ +-+-+ +-+-+ +-+-+    (___)
     | LB1 |      | A | | B | | C | | D |    (___)
     +-----+      +---+ +---+ +---+ +---+    (___)
     haproxy        4 cheap web servers


Config on haproxy (LB1) :
-------------------------

    listen 192.168.1.1:80
        mode http
        balance roundrobin
        cookie SERVERID insert indirect
        option httpchk HEAD /index.html HTTP/1.0
        server webA 192.168.1.11:80 cookie A check
        server webB 192.168.1.12:80 cookie B check
        server webC 192.168.1.13:80 cookie C check
        server webD 192.168.1.14:80 cookie D check


Description :
-------------
 - LB1 will receive client requests.
 - if a request does not contain a cookie, it will be forwarded to a valid
   server.
 - in return, a cookie "SERVERID" will be inserted in the response, holding
   the server name (eg: "A").
 - when the client comes back with the cookie "SERVERID=A", LB1 will know that
   the request must be forwarded to server A. The cookie will be removed so
   that the server does not see it.
 - if server "webA" dies, the requests will be sent to another valid server
   and a cookie will be reassigned.


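The decision described above can be sketched in a few lines of Python. This
is only an illustrative model of the "insert indirect" behaviour, not haproxy's
actual code : the function name "route", the "up" set and the round-robin
iterator "rr" are assumptions made for the example.

```python
from itertools import cycle

# Hypothetical model of the four servers declared in the configuration above.
SERVERS = {"A": "192.168.1.11:80", "B": "192.168.1.12:80",
           "C": "192.168.1.13:80", "D": "192.168.1.14:80"}


def route(request_cookies, up, rr):
    """Return (chosen server, cookies forwarded to it, Set-Cookie value or None)."""
    if not up:
        raise RuntimeError("no valid server")
    # "indirect" : the persistence cookie is never shown to the server.
    forwarded = {k: v for k, v in request_cookies.items() if k != "SERVERID"}
    wanted = request_cookies.get("SERVERID")
    if wanted in up:
        # The client already knows its server : stick to it, add no cookie.
        return wanted, forwarded, None
    # No cookie, or the referenced server is down : take the next valid
    # server in round-robin order and (re)assign the cookie.
    chosen = next(name for name in rr if name in up)
    return chosen, forwarded, "SERVERID=" + chosen
```

For instance, with rr = cycle(SERVERS) and all servers up, a first request
without any cookie is sent to A and receives "SERVERID=A". If A later fails,
the same client is moved to the next valid server and gets a new cookie.
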
Flows :
-------

(client)                           (haproxy)                         (server A)
  >-- GET /URI1 HTTP/1.0 ------------> |
               ( no cookie, haproxy forwards in load-balancing mode. )
                                       | >-- GET /URI1 HTTP/1.0 ---------->
                                       | <-- HTTP/1.0 200 OK -------------<
               ( the proxy now adds the server cookie in return )
  <-- HTTP/1.0 200 OK ---------------< |
      Set-Cookie: SERVERID=A           |
  >-- GET /URI2 HTTP/1.0 ------------> |
      Cookie: SERVERID=A               |
      ( the proxy sees the cookie. it forwards to server A and deletes it )
                                       | >-- GET /URI2 HTTP/1.0 ---------->
                                       | <-- HTTP/1.0 200 OK -------------<
   ( the proxy does not add the cookie in return because the client knows it )
  <-- HTTP/1.0 200 OK ---------------< |
  >-- GET /URI3 HTTP/1.0 ------------> |
      Cookie: SERVERID=A               |
                                    ( ... )


Limits :
--------
 - if clients use keep-alive (HTTP/1.1), only the first response will have
   a cookie inserted, and only the first request of each session will be
   analyzed. This does not cause trouble in insertion mode because the cookie
   is put immediately in the first response, and the session is maintained to
   the same server for all subsequent requests in the same session. However,
   the cookie will not be removed from the requests forwarded to the servers,
   so the server must not be sensitive to unknown cookies. If this causes
   trouble, you can disable keep-alive by adding the following option :

        option httpclose

 - if for some reason the clients cannot learn more than one cookie (eg: the
   clients are home-made applications or gateways), and the application
   already produces a cookie, you can use the "prefix" mode (see below).

 - LB1 becomes a very sensitive server. If LB1 dies, nothing works anymore.
   => you can back it up using keepalived.

 - if the application needs to log the original client's IP, use the
   "forwardfor" option which will add an "X-Forwarded-For" header with the
   original client's IP address. You must also use "httpclose" to ensure
   that every request is rewritten, and not only the first one of each
   session :

        option httpclose
        option forwardfor

   The web server will have to be configured to use this header instead.
   For example, on apache, you can use LogFormat for this :

        LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b " combined
        CustomLog /var/log/httpd/access_log combined


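If the application itself needs the client's address (and not only the web
server's logs), it can read the same header. The sketch below is a hedged
example : the function name "client_ip" and the plain dictionary of headers
are assumptions, and it supposes that the entry added by the proxy is the last
one in the header, earlier entries being possibly supplied by the client.

```python
def client_ip(headers, peer_addr):
    """Return the address to log for one request.

    headers   : dict of request headers (hypothetical representation)
    peer_addr : address of the TCP peer, i.e. the load-balancer
    """
    value = headers.get("X-Forwarded-For", "")
    entries = [part.strip() for part in value.split(",") if part.strip()]
    # Trust only the last entry (assumed to be the one appended by the
    # proxy) and fall back to the TCP peer when the header is absent.
    return entries[-1] if entries else peer_addr
```

With the header "X-Forwarded-For: 10.1.2.3" the function returns "10.1.2.3".
Without the header, it returns the load-balancer's address.
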
==================================================================
2. HTTP load-balancing with cookie prefixing and high availability
==================================================================

Now you don't want to add more cookies, but rather use existing ones. The
application already generates a "JSESSIONID" cookie which is enough to track
sessions, so we'll prefix this cookie with the server name when we see it.
Since the load-balancer becomes critical, it will be backed up with a second
one in VRRP mode using keepalived.

Download the latest version of keepalived from this site and install it
on each load-balancer LB1 and LB2 :

       http://www.keepalived.org/

You then have a shared IP between the two load-balancers (we will still use the
original IP). It is active only on one of them at any moment. To allow the
proxy to bind to the shared IP, you must enable it in /proc :

# echo 1 >/proc/sys/net/ipv4/ip_nonlocal_bind


    shared IP=192.168.1.1
  192.168.1.3  192.168.1.4    192.168.1.11-192.168.1.14   192.168.1.2
 -------+------------+-----------+-----+-----+-----+--------+----
        |            |           |     |     |     |       _|_db
     +--+--+      +--+--+      +-+-+ +-+-+ +-+-+ +-+-+    (___)
     | LB1 |      | LB2 |      | A | | B | | C | | D |    (___)
     +-----+      +-----+      +---+ +---+ +---+ +---+    (___)
     haproxy      haproxy        4 cheap web servers
     keepalived   keepalived


Config on both proxies (LB1 and LB2) :
--------------------------------------

    listen 192.168.1.1:80
        mode http
        balance roundrobin
        cookie JSESSIONID prefix
        option httpclose
        option forwardfor
        option httpchk HEAD /index.html HTTP/1.0
        server webA 192.168.1.11:80 cookie A check
        server webB 192.168.1.12:80 cookie B check
        server webC 192.168.1.13:80 cookie C check
        server webD 192.168.1.14:80 cookie D check


Notes: the proxy will modify EVERY cookie sent by the client and the server,
so it is important that it can access ALL cookies in ALL requests for
each session. This implies that there is no keep-alive (HTTP/1.1), thus the
"httpclose" option. You can remove this option only if you know for sure that
the client(s) will never use keep-alive.


Description :
-------------
 - LB1 is VRRP master (keepalived), LB2 is backup.
 - LB1 will receive client requests on IP 192.168.1.1.
 - both load-balancers send their checks from their native IP.
 - if a request does not contain a cookie, it will be forwarded to a valid
   server.
 - in return, if a JSESSIONID cookie is seen, the server name will be prefixed
   into it, followed by a delimiter ('~').
 - when the client comes back with the cookie "JSESSIONID=A~xxx", LB1 will
   know that the request must be forwarded to server A. The server name will
   then be removed from the cookie before it is sent to the server.
 - if server "webA" dies, the requests will be sent to another valid server
   and a cookie will be reassigned.


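The prefixing and stripping steps above amount to simple string handling.
The Python sketch below only illustrates the idea : the function names are
hypothetical and the real proxy works on raw HTTP headers, not on isolated
cookie values.

```python
DELIM = "~"  # delimiter placed between the server name and the cookie value


def prefix_response_cookie(value, server):
    """Value sent to the client for a JSESSIONID set by 'server'."""
    return server + DELIM + value


def strip_request_cookie(value, known_servers):
    """Return (server or None, value to forward to the server)."""
    server, sep, rest = value.partition(DELIM)
    if sep and server in known_servers:
        # Known prefix : route to that server and hide the prefix from it.
        return server, rest
    # No usable prefix : the request will simply be load-balanced.
    return None, value
```

Server A sets "JSESSIONID=123", the client receives "JSESSIONID=A~123", and
on the next request server A sees "JSESSIONID=123" again.
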
Flows :
-------

(client)                           (haproxy)                         (server A)
  >-- GET /URI1 HTTP/1.0 ------------> |
               ( no cookie, haproxy forwards in load-balancing mode. )
                                       | >-- GET /URI1 HTTP/1.0 ---------->
                                       |     X-Forwarded-For: 10.1.2.3
                                       | <-- HTTP/1.0 200 OK -------------<
                        ( no cookie, nothing changed )
  <-- HTTP/1.0 200 OK ---------------< |
  >-- GET /URI2 HTTP/1.0 ------------> |
    ( no cookie, haproxy forwards in lb mode, possibly to another server. )
                                       | >-- GET /URI2 HTTP/1.0 ---------->
                                       |     X-Forwarded-For: 10.1.2.3
                                       | <-- HTTP/1.0 200 OK -------------<
                                       |     Set-Cookie: JSESSIONID=123
    ( the cookie is identified, it will be prefixed with the server name )
  <-- HTTP/1.0 200 OK ---------------< |
      Set-Cookie: JSESSIONID=A~123     |
  >-- GET /URI3 HTTP/1.0 ------------> |
      Cookie: JSESSIONID=A~123         |
       ( the proxy sees the cookie, removes the server name and forwards
          to server A which sees the same cookie as it previously sent )
                                       | >-- GET /URI3 HTTP/1.0 ---------->
                                       |     Cookie: JSESSIONID=123
                                       |     X-Forwarded-For: 10.1.2.3
                                       | <-- HTTP/1.0 200 OK -------------<
                        ( no cookie, nothing changed )
  <-- HTTP/1.0 200 OK ---------------< |
                                    ( ... )



========================================================
2.1 Variations involving external layer 4 load-balancers
========================================================

Instead of using a VRRP-based active/backup solution for the proxies,
they can also be load-balanced by a layer4 load-balancer (eg: Alteon)
which will also check that the services run fine on both proxies :

              | VIP=192.168.1.1
         +----+----+
         | Alteon  |
         +----+----+
              |
  192.168.1.3 |  192.168.1.4  192.168.1.11-192.168.1.14   192.168.1.2
 -------+-----+------+-----------+-----+-----+-----+--------+----
        |            |           |     |     |     |       _|_db
     +--+--+      +--+--+      +-+-+ +-+-+ +-+-+ +-+-+    (___)
     | LB1 |      | LB2 |      | A | | B | | C | | D |    (___)
     +-----+      +-----+      +---+ +---+ +---+ +---+    (___)
     haproxy      haproxy        4 cheap web servers


Config on both proxies (LB1 and LB2) :
--------------------------------------

    listen 0.0.0.0:80
        mode http
        balance roundrobin
        cookie JSESSIONID prefix
        option httpclose
        option forwardfor
        option httplog
        option dontlognull
        option httpchk HEAD /index.html HTTP/1.0
        server webA 192.168.1.11:80 cookie A check
        server webB 192.168.1.12:80 cookie B check
        server webC 192.168.1.13:80 cookie C check
        server webD 192.168.1.14:80 cookie D check

The "dontlognull" option is used to prevent the proxy from logging the health
checks from the Alteon. If a session exchanges no data, then it will not be
logged.

Config on the Alteon :
----------------------

/c/slb/real 11
        ena
        name "LB1"
        rip 192.168.1.3
/c/slb/real 12
        ena
        name "LB2"
        rip 192.168.1.4
/c/slb/group 10
        name "LB1-2"
        metric roundrobin
        health tcp
        add 11
        add 12
/c/slb/virt 10
        ena
        vip 192.168.1.1
/c/slb/virt 10/service http
        group 10


=========================================================
3. Simple HTTP/HTTPS load-balancing with cookie insertion
=========================================================

This is the same context as in example 1 above, but the web
server uses HTTPS.

                +-------+
                |clients|  clients
                +---+---+
                    |
                   -+-----+--------+----
                          |       _|_db
                       +--+--+   (___)
                       | SSL |   (___)
                       | web |   (___)
                       +-----+
                   192.168.1.1   192.168.1.2


Since haproxy does not handle SSL, this part will have to be extracted from the
servers (freeing even more resources) and installed on the load-balancer
itself. Install haproxy and apache+mod_ssl on the old box which will spread the
load between the new boxes. Apache will work in SSL reverse-proxy-cache. If the
application is correctly developed, it might even lower its load. However,
since there now is a cache between the clients and haproxy, some security
measures must be taken to ensure that inserted cookies will not be cached.


  192.168.1.1    192.168.1.11-192.168.1.14   192.168.1.2
 -------+-----------+-----+-----+-----+--------+----
        |           |     |     |     |       _|_db
     +--+--+      +-+-+ +-+-+ +-+-+ +-+-+    (___)
     | LB1 |      | A | | B | | C | | D |    (___)
     +-----+      +---+ +---+ +---+ +---+    (___)
     apache         4 cheap web servers
     mod_ssl
     haproxy


Config on haproxy (LB1) :
-------------------------

    listen 127.0.0.1:8000
        mode http
        balance roundrobin
        cookie SERVERID insert indirect nocache
        option httpchk HEAD /index.html HTTP/1.0
        server webA 192.168.1.11:80 cookie A check
        server webB 192.168.1.12:80 cookie B check
        server webC 192.168.1.13:80 cookie C check
        server webD 192.168.1.14:80 cookie D check


364
365Description :
366-------------
367 - apache on LB1 will receive clients requests on port 443
368 - it forwards it to haproxy bound to 127.0.0.1:8000
369 - if a request does not contain a cookie, it will be forwarded to a valid
370 server
371 - in return, a cookie "SERVERID" will be inserted in the response holding the
372 server name (eg: "A"), and a "Cache-control: private" header will be added
373 so that the apache does not cache any page containing such cookie.
374 - when the client comes again with the cookie "SERVERID=A", LB1 will know that
375 it must be forwarded to server A. The cookie will be removed so that the
376 server does not see it.
377 - if server "webA" dies, the requests will be sent to another valid server
378 and a cookie will be reassigned.
379
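The effect of the "nocache" keyword on a response can be pictured as follows.
This is a simplified Python sketch under the assumption that headers are kept
as a list of (name, value) pairs. The function name is hypothetical and only
the two headers named above are modelled.

```python
def add_persistence_headers(response_headers, server, nocache=True):
    """Return a copy of the response headers with the inserted cookie."""
    headers = list(response_headers)
    headers.append(("Set-Cookie", "SERVERID=" + server))
    if nocache:
        # Tell the cache in front of haproxy (apache here) that this
        # response is specific to one client and must not be shared.
        headers.append(("Cache-control", "private"))
    return headers
```

Only responses in which the cookie is inserted are marked this way, so pages
served to clients which already hold the cookie remain cacheable.
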
Notes :
-------
 - if the cookie works in "prefix" mode, there is no need to add the "nocache"
   option because it is an application cookie which will be modified, and the
   application flags will be preserved.
 - if apache 1.3 is used as a front-end before haproxy, it always disables
   HTTP keep-alive on the back-end, so there is no need for the "httpclose"
   option on haproxy.
 - configure apache to set the X-Forwarded-For header itself, and do not do
   it on haproxy if you need the application to know about the client's IP.


Flows :
-------

(apache)                           (haproxy)                         (server A)
  >-- GET /URI1 HTTP/1.0 ------------> |
               ( no cookie, haproxy forwards in load-balancing mode. )
                                       | >-- GET /URI1 HTTP/1.0 ---------->
                                       | <-- HTTP/1.0 200 OK -------------<
               ( the proxy now adds the server cookie in return )
  <-- HTTP/1.0 200 OK ---------------< |
      Set-Cookie: SERVERID=A           |
      Cache-Control: private           |
  >-- GET /URI2 HTTP/1.0 ------------> |
      Cookie: SERVERID=A               |
      ( the proxy sees the cookie. it forwards to server A and deletes it )
                                       | >-- GET /URI2 HTTP/1.0 ---------->
                                       | <-- HTTP/1.0 200 OK -------------<
   ( the proxy does not add the cookie in return because the client knows it )
  <-- HTTP/1.0 200 OK ---------------< |
  >-- GET /URI3 HTTP/1.0 ------------> |
      Cookie: SERVERID=A               |
                                    ( ... )



========================================
4. Soft-stop for application maintenance
========================================

When an application is spread across several servers, the time to update all
instances increases, so the application seems jerky for a longer period.

HAproxy offers several solutions for this. Although it cannot be reconfigured
without being stopped, nor does it offer any external command, there are other
working solutions.


=========================================
4.1 Soft-stop using a file on the servers
=========================================

This trick is quite common and very simple: put a file on the server which will
be checked by the proxy. When you want to stop the server, first remove this
file. The proxy will see the server as failed, and will not send it any new
session, only the old ones if the "persist" option is used. Wait a bit then
stop the server when it does not receive any more connections.


    listen 192.168.1.1:80
        mode http
        balance roundrobin
        cookie SERVERID insert indirect
        option httpchk HEAD /running HTTP/1.0
        server webA 192.168.1.11:80 cookie A check inter 2000 rise 2 fall 2
        server webB 192.168.1.12:80 cookie B check inter 2000 rise 2 fall 2
        server webC 192.168.1.13:80 cookie C check inter 2000 rise 2 fall 2
        server webD 192.168.1.14:80 cookie D check inter 2000 rise 2 fall 2
        option persist
        redispatch
        contimeout 5000


Description :
-------------
 - every 2 seconds, haproxy will try to access the file "/running" on the
   servers, and declare the server as down after 2 failed attempts (4 seconds).
 - only the servers which respond with a 200 or 3XX response will be used.
 - if a request does not contain a cookie, it will be forwarded to a valid
   server.
 - if a request contains a cookie for a failed server, haproxy will insist
   on trying to reach the server anyway, to let the user finish what they
   were doing. ("persist" option)
 - if the server is totally stopped, the connection will fail and the proxy
   will rebalance the client to another server ("redispatch").

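The "inter", "rise" and "fall" parameters can be understood with a small
state machine. The Python class below is a simplified model written for this
guide, not haproxy's implementation : a server changes state only after
"fall" consecutive failed checks, or "rise" consecutive successful ones.

```python
class ServerHealth:
    """Simplified model of a server state driven by health checks."""

    def __init__(self, rise=2, fall=2, up=True):
        self.rise, self.fall, self.up = rise, fall, up
        self.streak = 0  # consecutive checks contradicting the current state

    def record(self, check_ok):
        """Feed one check result and return the resulting state (True = up)."""
        if check_ok == self.up:
            self.streak = 0
        else:
            self.streak += 1
            needed = self.fall if self.up else self.rise
            if self.streak >= needed:
                self.up = check_ok
                self.streak = 0
        return self.up


def detection_delay_ms(inter_ms, fall):
    """Approximate time needed to declare a server down."""
    return inter_ms * fall
```

With "inter 2000 fall 2", detection_delay_ms(2000, 2) gives 4000 ms, which
matches the 4 seconds mentioned above.
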
Usage on the web servers :
--------------------------
- to start the server :
    # /etc/init.d/httpd start
    # touch /home/httpd/www/running

- to soft-stop the server :
    # rm -f /home/httpd/www/running

- to completely stop the server :
    # /etc/init.d/httpd stop

Limits
------
If the server is totally powered down, the proxy will still try to reach it
for those clients who still have a cookie referencing it, and the connection
attempt will expire after 5 seconds ("contimeout"), and only after that, the
client will be redispatched to another server. So this mode is only useful
for software updates where the server will suddenly refuse the connection
because the process is stopped. The problem is the same if the server suddenly
crashes. All of its users will be noticeably disrupted.


==================================
4.2 Soft-stop using backup servers
==================================

A better solution which covers every situation is to use backup servers.
Version 1.1.30 fixed a bug which prevented a backup server from sharing
the same cookie as a standard server.


    listen 192.168.1.1:80
        mode http
        balance roundrobin
        redispatch
        cookie SERVERID insert indirect
        option httpchk HEAD / HTTP/1.0
        server webA 192.168.1.11:80 cookie A check port 81 inter 2000
        server webB 192.168.1.12:80 cookie B check port 81 inter 2000
        server webC 192.168.1.13:80 cookie C check port 81 inter 2000
        server webD 192.168.1.14:80 cookie D check port 81 inter 2000

        server bkpA 192.168.1.11:80 cookie A check port 80 inter 2000 backup
        server bkpB 192.168.1.12:80 cookie B check port 80 inter 2000 backup
        server bkpC 192.168.1.13:80 cookie C check port 80 inter 2000 backup
        server bkpD 192.168.1.14:80 cookie D check port 80 inter 2000 backup

Description
-----------
Four servers webA..D are checked on their port 81 every 2 seconds. The same
servers named bkpA..D are checked on the port 80, and share the exact same
cookies. Those servers will only be used when no other server is available
for the same cookie.

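The rule "a backup server is only used when no other server is available for
the same cookie" can be modelled as below. This Python sketch is an
illustration only. The tuple layout and the function name are assumptions.

```python
def pick_for_cookie(servers, cookie):
    """Pick the server which should receive a request carrying 'cookie'.

    servers : list of (name, cookie, is_backup, is_up) in configuration order
    Returns the server name, or None when the request must be redispatched.
    """
    candidates = [s for s in servers if s[1] == cookie and s[3]]
    for name, _, is_backup, _ in candidates:
        if not is_backup:
            return name          # a standard server always wins
    return candidates[0][0] if candidates else None
```

When webA is seen as down (port 81 closed) while bkpA is still up (port 80
open), requests carrying "SERVERID=A" keep reaching 192.168.1.11 through bkpA.
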
When the web servers are started, only the backup servers are seen as
available. On the web servers, you need to redirect port 81 to local
port 80, either with a local proxy (eg: a simple haproxy tcp instance),
or with iptables (linux) or pf (openbsd). This is because we want the
real web server to reply on this port, and not a fake one. Eg, with
iptables :

  # /etc/init.d/httpd start
  # iptables -t nat -A PREROUTING -p tcp --dport 81 -j REDIRECT --to-port 80

A few seconds later, the standard server is seen up and haproxy starts to send
it new requests on its real port 80 (only new users with no cookie, of course).

If a server completely crashes (even if it does not respond at the IP level),
both the standard and backup servers will fail, so clients associated with this
server will be redispatched to other live servers and will lose their sessions.

Now if you want to put a server into maintenance, simply stop it from
responding on port 81 so that its standard instance will be seen as failed,
but the backup will still work. Users will not notice anything since the
service is still operational :

  # iptables -t nat -D PREROUTING -p tcp --dport 81 -j REDIRECT --to-port 80

The health checks on port 81 for this server will quickly fail, and the
standard server will be seen as failed. No new session will be sent to this
server, and existing clients with a valid cookie will still reach it because
the backup server will still be up.

Now wait as long as you want for the old users to stop using the service, and
once you see that the server does not receive any traffic, simply stop it :

  # /etc/init.d/httpd stop

The associated backup server will in turn fail, and if any client still tries
to access this particular server, it will be redispatched to any other valid
server because of the "redispatch" option.

This method has an advantage : you never touch the proxy when doing server
maintenance. The people managing the servers can make them disappear smoothly.


4.2.1 Variations for operating systems without any firewall software
--------------------------------------------------------------------

The downside is that you need a redirection solution on the server just for
the health-checks. If the server OS does not support any firewall software,
this redirection can also be handled by a simple haproxy in tcp mode :

    global
        daemon
        quiet
        pidfile /var/run/haproxy-checks.pid
    listen 0.0.0.0:81
        mode tcp
        dispatch 127.0.0.1:80
        contimeout 1000
        clitimeout 10000
        srvtimeout 10000

To start the web service :

  # /etc/init.d/httpd start
  # haproxy -f /etc/haproxy/haproxy-checks.cfg

To soft-stop the service :

  # kill $(</var/run/haproxy-checks.pid)

Port 81 will stop responding and the load-balancer will notice the failure.


4.2.2 Centralizing the server management
----------------------------------------

If one finds it preferable to manage the servers from the load-balancer,
the port redirector can be installed on the load-balancer itself. See the
example with iptables below.

Make the servers appear as operational :
  # iptables -t nat -A OUTPUT -d 192.168.1.11 -p tcp --dport 81 -j DNAT --to-dest :80
  # iptables -t nat -A OUTPUT -d 192.168.1.12 -p tcp --dport 81 -j DNAT --to-dest :80
  # iptables -t nat -A OUTPUT -d 192.168.1.13 -p tcp --dport 81 -j DNAT --to-dest :80
  # iptables -t nat -A OUTPUT -d 192.168.1.14 -p tcp --dport 81 -j DNAT --to-dest :80

Soft stop one server :
  # iptables -t nat -D OUTPUT -d 192.168.1.12 -p tcp --dport 81 -j DNAT --to-dest :80

Another solution is to use the "COMAFILE" patch provided by Alexander Lazic,
which is available for download here :

    http://w.ods.org/tools/haproxy/contrib/


4.2.3 Notes :
-------------
  - Never, ever, start a fake service on port 81 for the health-checks, because
    a real web service failure will not be detected as long as the fake service
    runs. You must really forward the check port to the real application.

  - health-checks will be sent twice as often, once for each standard server,
    and once for each backup server. All this will be multiplied by the
    number of processes if you use multi-process mode. You will have to check
    that all the checks sent to the server do not load it.


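The resulting check rate is easy to estimate. The helper below is a plain
arithmetic sketch in Python. Its name and parameters are chosen for this
example and it assumes that every process checks every declared server.

```python
def checks_per_second(entries_per_server, inter_ms, processes=1):
    """Health checks received per second by one physical web server.

    entries_per_server : how many times the machine is declared
                         (2 here : one standard entry, one backup entry)
    inter_ms           : the "inter" value used on each entry
    processes          : number of haproxy processes performing checks
    """
    return entries_per_server * processes * 1000.0 / inter_ms
```

With the configuration of section 4.2 (2 entries, "inter 2000"), each web
server receives 1 check per second from a single process, and 4 checks per
second if the proxy runs 4 processes.
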
==================================================
5. Multi-site load-balancing with local preference
==================================================

5.1 Description of the problem
==============================

Consider a world-wide company with sites on several continents. There are two
production sites SITE1 and SITE2 which host identical applications. There are
many offices around the world. For speed and communication cost reasons, each
office uses the nearest site by default, but can switch to the backup site in
the event of a site or application failure. There are also users on the
production sites, who use their local site by default, but can switch to the
other site in case of a local application failure.

The main constraints are :

  - application persistence : although the application is the same on both
    sites, there is no session synchronisation between the sites. A failure
    of one server or one site can cause a user to switch to another server
    or site, but when the server or site comes back, the user must not switch
    again.

  - communication costs : inter-site communication should be reduced to the
    minimum. Specifically, in case of a local application failure, every
    office should be able to switch to the other site without continuing to
    use the default site.

5.2 Solution
============
  - Each production site will have two haproxy load-balancers in front of its
    application servers to balance the load across them and provide local HA.
    We will call them "S1L1" and "S1L2" on site 1, and "S2L1" and "S2L2" on
    site 2. These proxies will extend the application's JSESSIONID cookie to
    put the server name as a prefix.

  - Each production site will have one front-end haproxy director to provide
    the service to local users and to remote offices. It will load-balance
    across the two local load-balancers, and will use the other site's
    load-balancers as backup servers. It will insert the local site identifier
    in a SITE cookie for the local load-balancers, and the remote site
    identifier for the remote load-balancers. These front-end directors will
    be called "SD1" and "SD2" for "Site Director".

  - Each office will have one haproxy near the border gateway which will direct
    local users to their preferred site by default, or to the backup site in
    the event of a previous failure. It will also analyze the SITE cookie, and
    direct the users to the site referenced in the cookie. Thus, the preferred
    site will be declared as a normal server, and the backup site will be
    declared as a backup server only, which will only be used when the primary
    site is unreachable, or when the primary site's director has forwarded
    traffic to the second site. These proxies will be called "OP1".."OPXX"
    for "Office Proxy #XX".


5.3 Network diagram
===================

Note : offices 1 and 2 are on the same continent as site 1, while
       office 3 is on the same continent as site 2. Each production
       site can reach the second one either through the WAN or through
       a dedicated link.
690
691
692 Office1 Office2 Office3
693 users users users
694192.168 # # # 192.168 # # # # # #
695.1.0/24 | | | .2.0/24 | | | 192.168.3.0/24 | | |
696 --+----+-+-+- --+----+-+-+- ---+----+-+-+-
697 | | .1 | | .1 | | .1
698 | +-+-+ | +-+-+ | +-+-+
699 | |OP1| | |OP2| | |OP3| ...
700 ,-:-. +---+ ,-:-. +---+ ,-:-. +---+
701 ( X ) ( X ) ( X )
702 `-:-' `-:-' ,---. `-:-'
703 --+---------------+------+----~~~( X )~~~~-------+---------+-
704 | `---' |
705 | |
706 +---+ ,-:-. +---+ ,-:-.
707 |SD1| ( X ) |SD2| ( X )
708 ( SITE 1 ) +-+-+ `-:-' ( SITE 2 ) +-+-+ `-:-'
709 |.1 | |.1 |
710 10.1.1.0/24 | | ,---. 10.2.1.0/24 | |
711 -+-+-+-+-+-+-+-----+-+--( X )------+-+-+-+-+-+-+-----+-+--
712 | | | | | | | `---' | | | | | | |
713 ...# # # # # |.11 |.12 ...# # # # # |.11 |.12
714 Site 1 +-+--+ +-+--+ Site 2 +-+--+ +-+--+
715 Local |S1L1| |S1L2| Local |S2L1| |S2L2|
716 users +-+--+ +--+-+ users +-+--+ +--+-+
717 | | | |
718 10.1.2.0/24 -+-+-+--+--++-- 10.2.2.0/24 -+-+-+--+--++--
719 |.1 |.4 |.1 |.4
720 +-+-+ +-+-+ +-+-+ +-+-+
721 |W11| ~~~ |W14| |W21| ~~~ |W24|
722 +---+ +---+ +---+ +---+
723 4 application servers 4 application servers
724 on site 1 on site 2
725
726
727
5.4 Description
===============

5.4.1 Local users
-----------------
 - Office 1 users connect to OP1 = 192.168.1.1
 - Office 2 users connect to OP2 = 192.168.2.1
 - Office 3 users connect to OP3 = 192.168.3.1
 - Site 1 users connect to SD1 = 10.1.1.1
 - Site 2 users connect to SD2 = 10.2.1.1

5.4.2 Office proxies
--------------------
 - Office 1 connects to site 1 by default and uses site 2 as a backup.
 - Office 2 connects to site 1 by default and uses site 2 as a backup.
 - Office 3 connects to site 2 by default and uses site 1 as a backup.

The offices check the local site's SD proxy every 30 seconds, and the
remote one every 60 seconds.


Configuration for Office Proxy OP1
----------------------------------

    listen 192.168.1.1:80
        mode http
        balance roundrobin
        redispatch
        cookie SITE
        option httpchk HEAD / HTTP/1.0
        server SD1 10.1.1.1:80 cookie SITE1 check inter 30000
        server SD2 10.2.1.1:80 cookie SITE2 check inter 60000 backup


Configuration for Office Proxy OP2
----------------------------------

    listen 192.168.2.1:80
        mode http
        balance roundrobin
        redispatch
        cookie SITE
        option httpchk HEAD / HTTP/1.0
        server SD1 10.1.1.1:80 cookie SITE1 check inter 30000
        server SD2 10.2.1.1:80 cookie SITE2 check inter 60000 backup


Configuration for Office Proxy OP3
----------------------------------

    listen 192.168.3.1:80
        mode http
        balance roundrobin
        redispatch
        cookie SITE
        option httpchk HEAD / HTTP/1.0
        server SD2 10.2.1.1:80 cookie SITE2 check inter 30000
        server SD1 10.1.1.1:80 cookie SITE1 check inter 60000 backup


5.4.3 Site directors ( SD1 and SD2 )
------------------------------------
The site directors forward traffic to the local load-balancers, and set a
cookie to identify the site. If no local load-balancer is available, or if
the local application servers are all down, they will redirect traffic to the
remote site, and report this in the SITE cookie. In order not to uselessly
load each site's WAN link, each SD will check the other site at a lower
rate. The site directors will also insert their client's address so that
the application server knows which local user or remote site accesses it.

The SITE cookie which is set by these directors will also be understood
by the office proxies. This is important because if SD1 decides to forward
traffic to site 2, it will write "SITE2" in the "SITE" cookie, and on the next
request, the office proxy will automatically and directly talk to SITE2 if
it can reach it. If it cannot, it will still send the traffic to SITE1
where SD1 will in turn try to reach SITE2.

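The way an office proxy reacts to the SITE cookie can be summarized by the
following Python sketch. It is a simplified model with hypothetical names :
"reachable" stands for the set of site directors currently seen as up by the
office proxy's health checks.

```python
def office_target(site_cookie, primary, backup, reachable):
    """Return the site an office proxy uses for one request, or None."""
    if site_cookie in (primary, backup) and site_cookie in reachable:
        # The cookie names a site we can reach : talk to it directly.
        return site_cookie
    if primary in reachable:
        # No usable cookie : the preferred site is a normal server.
        return primary
    if backup in reachable:
        # The backup site is only used when the primary is unreachable.
        return backup
    return None
```

For OP1, a request carrying "SITE=SITE2" goes straight to SD2 when SD2 is
reachable. Otherwise it goes to SD1, which will in turn try to reach site 2.
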
The load-balancers' checks are performed on port 81. As we'll see below,
the load-balancers provide a health monitoring port 81 which reroutes to
port 80 but which allows them to tell the SD that they are going down soon
and that the SD must not use them anymore.


Configuration for SD1
---------------------

    listen 10.1.1.1:80
        mode http
        balance roundrobin
        redispatch
        cookie SITE insert indirect
        option httpchk HEAD / HTTP/1.0
        option forwardfor
        server S1L1 10.1.1.11:80 cookie SITE1 check port 81 inter 4000
        server S1L2 10.1.1.12:80 cookie SITE1 check port 81 inter 4000
        server S2L1 10.2.1.11:80 cookie SITE2 check port 81 inter 8000 backup
        server S2L2 10.2.1.12:80 cookie SITE2 check port 81 inter 8000 backup

Configuration for SD2
---------------------

    listen 10.2.1.1:80
        mode http
        balance roundrobin
        redispatch
        cookie SITE insert indirect
        option httpchk HEAD / HTTP/1.0
        option forwardfor
        server S2L1 10.2.1.11:80 cookie SITE2 check port 81 inter 4000
        server S2L2 10.2.1.12:80 cookie SITE2 check port 81 inter 4000
        server S1L1 10.1.1.11:80 cookie SITE1 check port 81 inter 8000 backup
        server S1L2 10.1.1.12:80 cookie SITE1 check port 81 inter 8000 backup


8425.4.4 Local load-balancers S1L1, S1L2, S2L1, S2L2
843-------------------------------------------------
Please first note that because SD1 and SD2 use the same cookie for both
servers on the same site, the second load-balancer of each site will only
receive load-balanced requests, that is, requests without a SITE cookie. As
soon as the SITE cookie is set, only the first LB will receive the requests
because it will be the first one to match the cookie.
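
The matching rule described above can be illustrated with a short Python
sketch. It is only a simplified model written for this guide, not haproxy's
code: the server table mirrors SD1's configuration, while pick_server() and
its 'up' argument are hypothetical names. It assumes that a request goes to
the first declared server whose cookie matches and whose checks succeed, and
that requests without a usable cookie are spread over the active servers.
How haproxy chooses among several usable backups is not modelled; the sketch
simply takes the first one.

```python
from itertools import count

# Hypothetical model of SD1's server list: (name, cookie value, is_backup).
SD1_SERVERS = [
    ("S1L1", "SITE1", False),
    ("S1L2", "SITE1", False),
    ("S2L1", "SITE2", True),
    ("S2L2", "SITE2", True),
]

_rr = count()  # round-robin counter, used only for requests without a usable cookie


def pick_server(site_cookie, up):
    """Return the name of the server a request would be sent to, or None.

    site_cookie: value of the SITE cookie, or None if the client sent none.
    up: set of server names whose health checks currently succeed.
    """
    # 1. Persistence: first declared server whose cookie matches and which is up.
    for name, cookie, _backup in SD1_SERVERS:
        if cookie == site_cookie and name in up:
            return name
    # 2. Load balancing: round-robin over the active (non-backup) servers.
    active = [n for n, _c, backup in SD1_SERVERS if not backup and n in up]
    if active:
        return active[next(_rr) % len(active)]
    # 3. Fallback: first usable backup (simplification, see the text above).
    backups = [n for n, _c, backup in SD1_SERVERS if backup and n in up]
    return backups[0] if backups else None
```

With all four load-balancers up, a request carrying SITE=SITE1 always goes to
S1L1 in this model. S1L2 is only chosen for requests without a cookie, or
when S1L1 is down.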

The load-balancers will spread the load across 4 local web servers, and
use the JSESSIONID provided by the application to provide server persistence
using the new 'prefix' method. Soft-stop will also be implemented as described
in section 4 above. Moreover, these proxies will provide their own maintenance
soft-stop. Port 80 will be used for application traffic, while port 81 will
only be used for health-checks and locally rerouted to port 80. A grace time
will be specified for the service on port 80, but not on port 81. This way, a
soft kill (kill -USR1) on the proxy will immediately stop only the health-check
forwarder, so that the site director knows it must not use this load-balancer
anymore. But the service will still work for 20 seconds and as long as there
are established sessions.
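
The timing of this soft-stop can be sketched in a few lines of Python. This
is only an illustration under stated assumptions: the 20000 ms grace time and
the 4000/8000 ms check intervals come from the configurations in this
section, the function names are hypothetical, and established sessions (which
keep working after the grace time) are not modelled.

```python
GRACE_MS = 20000  # 'grace 20000' on the port 80 listener

# 'inter' values used by the site directors for their checks on port 81
CHECK_INTERVALS_MS = {"local SD": 4000, "remote SD": 8000}


def accepts_new_connections(port, ms_since_usr1):
    """Tell whether a listener still accepts connections after kill -USR1."""
    # Port 81 has no grace time, so it stops as soon as the signal arrives.
    grace = GRACE_MS if port == 80 else 0
    return ms_since_usr1 < grace


def checks_during_grace(inter_ms):
    """Number of complete check intervals that elapse before port 80 stops."""
    return GRACE_MS // inter_ms
```

Under these assumptions, the local site director has 5 check intervals and
the remote one has 2 to notice that port 81 is closed before port 80 stops
accepting new connections.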

These proxies will also be the only ones to disable HTTP keep-alive in the
chain, because it is enough to do it in one place, and it is required with
'prefix' cookies.
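
The effect of the 'prefix' method on the cookie value can be sketched in
Python as follows. This is a minimal model, not haproxy's implementation:
the function names are hypothetical, and the "~" separator is taken from the
example request shown in section 5.5 (JSESSIONID=W14~123).

```python
SEP = "~"  # separator seen in the example cookie "JSESSIONID=W14~123"


def prefix_response_cookie(server_cookie, session_id):
    """Value sent to the client: the server's cookie is prepended
    to the value set by the application."""
    return f"{server_cookie}{SEP}{session_id}"


def split_request_cookie(value):
    """Return (server_cookie, session_id) from a client cookie value.

    A value without the separator carries no server identifier,
    so the request has to be load-balanced.
    """
    if SEP not in value:
        return None, value
    server_cookie, session_id = value.split(SEP, 1)
    return server_cookie, session_id
```

In this model the application only ever sees its own session identifier,
while the client stores the prefixed value, which is why the proxy has to
process every request and response.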

Configuration for S1L1/S1L2
---------------------------

    listen 10.1.1.11:80   # 10.1.1.12:80 for S1L2
        grace 20000       # don't kill us until 20 seconds have elapsed
        mode http
        balance roundrobin
        cookie JSESSIONID prefix
        option httpclose
        option forwardfor
        option httpchk HEAD / HTTP/1.0
        server W11 10.1.2.1:80 cookie W11 check port 81 inter 2000
        server W12 10.1.2.2:80 cookie W12 check port 81 inter 2000
        server W13 10.1.2.3:80 cookie W13 check port 81 inter 2000
        server W14 10.1.2.4:80 cookie W14 check port 81 inter 2000

        server B11 10.1.2.1:80 cookie W11 check port 80 inter 4000 backup
        server B12 10.1.2.2:80 cookie W12 check port 80 inter 4000 backup
        server B13 10.1.2.3:80 cookie W13 check port 80 inter 4000 backup
        server B14 10.1.2.4:80 cookie W14 check port 80 inter 4000 backup

    listen 10.1.1.11:81   # 10.1.1.12:81 for S1L2
        mode tcp
        dispatch 10.1.1.11:80  # 10.1.1.12:80 for S1L2


Configuration for S2L1/S2L2
---------------------------

    listen 10.2.1.11:80   # 10.2.1.12:80 for S2L2
        grace 20000       # don't kill us until 20 seconds have elapsed
        mode http
        balance roundrobin
        cookie JSESSIONID prefix
        option httpclose
        option forwardfor
        option httpchk HEAD / HTTP/1.0
        server W21 10.2.2.1:80 cookie W21 check port 81 inter 2000
        server W22 10.2.2.2:80 cookie W22 check port 81 inter 2000
        server W23 10.2.2.3:80 cookie W23 check port 81 inter 2000
        server W24 10.2.2.4:80 cookie W24 check port 81 inter 2000

        server B21 10.2.2.1:80 cookie W21 check port 80 inter 4000 backup
        server B22 10.2.2.2:80 cookie W22 check port 80 inter 4000 backup
        server B23 10.2.2.3:80 cookie W23 check port 80 inter 4000 backup
        server B24 10.2.2.4:80 cookie W24 check port 80 inter 4000 backup

    listen 10.2.1.11:81   # 10.2.1.12:81 for S2L2
        mode tcp
        dispatch 10.2.1.11:80  # 10.2.1.12:80 for S2L2


5.5 Comments
------------
Since each site director sets a cookie identifying the site, remote office
users will have their office proxies direct them to the right site and stick
to this site as long as the user still uses the application and the site is
available. Users on production sites will be directed to the right site by the
site directors depending on the SITE cookie.

If the WAN link dies on a production site, the remote office users will not
see their site anymore, so their office proxies will redirect the traffic to
the second site. If there are dedicated inter-site links as on the diagram
above, the second SD will see the cookie and still be able to reach the
original site. For example :

Office 1 user sends the following to OP1 :
    GET / HTTP/1.0
    Cookie: SITE=SITE1; JSESSIONID=W14~123;

OP1 cannot reach site 1 because site 1's external router is dead. So the SD1
server is seen as dead, and OP1 will then forward the request to SD2 on site 2,
regardless of the SITE cookie.

SD2 on site 2 receives a SITE cookie containing "SITE1". Fortunately, it
can reach Site 1's load balancers S1L1 and S1L2. So it forwards the request
to S1L1 (the first one with the same cookie).

S1L1 (on site 1) finds "W14" in the JSESSIONID cookie, so it can forward the
request to the right server, and the user session will continue to work. Once
Site 1's WAN link comes back, OP1 will see SD1 again, and will not route
through SITE 2 anymore.

However, when a new user on Office 1 connects to the application during a
site 1 failure, the request does not contain any cookie. Since OP1 does not
see SD1 because of the network failure, it will direct the request to SD2 on
site 2, which will by default direct the traffic to the local load-balancers,
S2L1 and S2L2. So only users whose sessions were already established on
site 1 will load the inter-site link, not the new ones.


===================
6. Source balancing
===================

Sometimes it may prove useful to access servers from a pool of IP addresses
instead of only one or two. Some equipment (NAT firewalls, load-balancers)
is sensitive to the source address, and often needs many sources to distribute
the load evenly amongst its internal hash buckets.

To do this, you simply have to declare the same server several times, each
time with a different source. Example :

    listen 0.0.0.0:80
        mode tcp
        balance roundrobin
        server from1to1 10.1.1.1:80 source 10.1.2.1
        server from2to1 10.1.1.1:80 source 10.1.2.2
        server from3to1 10.1.1.1:80 source 10.1.2.3
        server from4to1 10.1.1.1:80 source 10.1.2.4
        server from5to1 10.1.1.1:80 source 10.1.2.5
        server from6to1 10.1.1.1:80 source 10.1.2.6
        server from7to1 10.1.1.1:80 source 10.1.2.7
        server from8to1 10.1.1.1:80 source 10.1.2.8
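
The benefit of using several sources can be illustrated with a small Python
sketch. Everything about the downstream device here is an assumption made
for the example: the number of buckets and the bucket() hash are
hypothetical, since real equipment uses its own hashing. Only the eight
source addresses come from the configuration above.

```python
from collections import Counter
from itertools import cycle, islice

SOURCES = [f"10.1.2.{i}" for i in range(1, 9)]  # the eight 'source' addresses
BUCKETS = 4  # hypothetical number of hash buckets in the downstream device


def bucket(source_ip):
    # Hypothetical hash on the source address (sum of the octets).
    return sum(int(octet) for octet in source_ip.split(".")) % BUCKETS


def spread(sources, connections):
    """Count connections per bucket when the sources are used round-robin."""
    return Counter(bucket(ip) for ip in islice(cycle(sources), connections))
```

With this toy hash, spread(SOURCES[:1], 800) puts all 800 connections in a
single bucket, while spread(SOURCES, 800) gives 200 connections to each of
the 4 buckets.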