==================================================
5. Multi-site load-balancing with local preference
==================================================
5.1 Description of the problem
==============================
Consider a world-wide company with sites on several continents. There are two
production sites SITE1 and SITE2 which host identical applications. There are
many offices around the world. For speed and communication cost reasons, each
office uses the nearest site by default, but can switch to the backup site in
the event of a site or application failure. There are also users on the
production sites, who use their local site by default, but can switch to the
other site in case of a local application failure.
The main constraints are :

  - application persistence : although the application is the same on both
    sites, there is no session synchronisation between the sites. A failure
    of one server or one site can cause a user to switch to another server
    or site, but when the server or site comes back, the user must not switch
    again.

  - communication costs : inter-site communication should be reduced to the
    minimum. Specifically, in case of a local application failure, every
    office should be able to switch to the other site without continuing to
    use the default site.
5.2 Solution
============
  - Each production site will have two haproxy load-balancers in front of its
    application servers to balance the load across them and provide local HA.
    We will call them "S1L1" and "S1L2" on site 1, and "S2L1" and "S2L2" on
    site 2. These proxies will extend the application's JSESSIONID cookie to
    put the server name as a prefix.

  - Each production site will have one front-end haproxy director to provide
    the service to local users and to remote offices. It will load-balance
    across the two local load-balancers, and will use the other site's
    load-balancers as backup servers. It will insert the local site identifier
    in a SITE cookie for the local load-balancers, and the remote site
    identifier for the remote load-balancers. These front-end directors will
    be called "SD1" and "SD2" for "Site Director".

  - Each office will have one haproxy near the border gateway which will direct
    local users to their preferred site by default, or to the backup site in
    the event of a previous failure. It will also analyze the SITE cookie, and
    direct the users to the site referenced in the cookie. Thus, the preferred
    site will be declared as a normal server, and the backup site will be
    declared as a backup server only, which will only be used when the primary
    site is unreachable, or when the primary site's director has forwarded
    traffic to the second site. These proxies will be called "OP1".."OPXX"
    for "Office Proxy #XX".
5.3 Network diagram
===================
Note : offices 1 and 2 are on the same continent as site 1, while
       office 3 is on the same continent as site 2. Each production
       site can reach the second one either through the WAN or through
       a dedicated link.
        Office1         Office2                          Office3
         users           users                            users
192.168  # # #  192.168  # # #                            # # #
.1.0/24  | | |  .2.0/24  | | |             192.168.3.0/24 | | |
  --+----+-+-+-   --+----+-+-+-                   ---+----+-+-+-
    |      | .1     |      | .1                      |      | .1
    |    +-+-+      |    +-+-+                       |    +-+-+
    |    |OP1|      |    |OP2|                       |    |OP3|  ...
  ,-:-.  +---+    ,-:-.  +---+                     ,-:-.  +---+
 (  X  )         (  X  )                          (  X  )
  `-:-'           `-:-'             ,---.          `-:-'
  --+---------------+------+----~~~(  X  )~~~~-------+---------+-
                           |        `---'                      |
                           |                                   |
                 +---+   ,-:-.                       +---+   ,-:-.
                 |SD1|  (  X  )                      |SD2|  (  X  )
   ( SITE 1 )    +-+-+   `-:-'         ( SITE 2 )    +-+-+   `-:-'
                   |.1     |                           |.1     |
   10.1.1.0/24     |       |     ,---. 10.2.1.0/24     |       |
        -+-+-+-+-+-+-+-----+-+--(  X  )------+-+-+-+-+-+-+-----+-+--
         | | | | |   |       |   `---'       | | | | |   |       |
      ...# # # # #   |.11    |.12         ...# # # # #   |.11    |.12
          Site 1   +-+--+  +-+--+             Site 2   +-+--+  +-+--+
          Local    |S1L1|  |S1L2|             Local    |S2L1|  |S2L2|
          users    +-+--+  +--+-+             users    +-+--+  +--+-+
                     |        |                          |        |
   10.1.2.0/24    -+-+-+--+--++--      10.2.2.0/24    -+-+-+--+--++--
                   |.1       |.4                       |.1       |.4
                 +-+-+     +-+-+                     +-+-+     +-+-+
                 |W11| ~~~ |W14|                     |W21| ~~~ |W24|
                 +---+     +---+                     +---+     +---+
              4 application servers               4 application servers
                    on site 1                           on site 2
5.4 Description
===============
5.4.1 Local users
-----------------
- Office 1 users connect to OP1 = 192.168.1.1
- Office 2 users connect to OP2 = 192.168.2.1
- Office 3 users connect to OP3 = 192.168.3.1
- Site 1 users connect to SD1 = 10.1.1.1
- Site 2 users connect to SD2 = 10.2.1.1
5.4.2 Office proxies
--------------------
- Office 1 connects to site 1 by default and uses site 2 as a backup.
- Office 2 connects to site 1 by default and uses site 2 as a backup.
- Office 3 connects to site 2 by default and uses site 1 as a backup.
The offices check the local site's SD proxy every 30 seconds, and the
remote one every 60 seconds.
Configuration for Office Proxy OP1
----------------------------------
    listen 192.168.1.1:80
       mode http
       balance roundrobin
       redispatch
       cookie SITE
       option httpchk HEAD / HTTP/1.0
       server SD1 10.1.1.1:80 cookie SITE1 check inter 30000
       server SD2 10.2.1.1:80 cookie SITE2 check inter 60000 backup
Configuration for Office Proxy OP2
----------------------------------
    listen 192.168.2.1:80
       mode http
       balance roundrobin
       redispatch
       cookie SITE
       option httpchk HEAD / HTTP/1.0
       server SD1 10.1.1.1:80 cookie SITE1 check inter 30000
       server SD2 10.2.1.1:80 cookie SITE2 check inter 60000 backup
Configuration for Office Proxy OP3
----------------------------------
    listen 192.168.3.1:80
       mode http
       balance roundrobin
       redispatch
       cookie SITE
       option httpchk HEAD / HTTP/1.0
       server SD2 10.2.1.1:80 cookie SITE2 check inter 30000
       server SD1 10.1.1.1:80 cookie SITE1 check inter 60000 backup
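
The selection rule these office proxies apply can be modelled in a few lines
of Python. This is only a simplified sketch, not haproxy's actual code: the
Server class and pick_server() are hypothetical names, round-robin is reduced
to "first healthy server", and the server list mirrors the OP1 configuration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    name: str
    cookie: str           # value expected in the SITE cookie
    backup: bool = False  # declared with the "backup" keyword
    up: bool = True       # result of the last health check


def pick_server(servers: List[Server], site_cookie: Optional[str]) -> Optional[Server]:
    """Simplified model of how an office proxy chooses a site director."""
    # 1. Persistence: honour the SITE cookie if the matching server is up.
    for srv in servers:
        if srv.up and srv.cookie == site_cookie:
            return srv
    # 2. No cookie, or its server is down ("redispatch"): use a normal server.
    for srv in servers:
        if srv.up and not srv.backup:
            return srv
    # 3. Use a backup server only when no normal server is left.
    for srv in servers:
        if srv.up and srv.backup:
            return srv
    return None


# Server list corresponding to the OP1 configuration.
op1 = [Server("SD1", "SITE1"), Server("SD2", "SITE2", backup=True)]
```

With this model, a request without a cookie goes to SD1, a request carrying
SITE=SITE2 goes to SD2 even though it is a backup, and a request carrying
SITE=SITE1 falls back to SD2 only while SD1 is seen as down.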
5.4.3 Site directors ( SD1 and SD2 )
------------------------------------
The site directors forward traffic to the local load-balancers, and set a
cookie to identify the site. If no local load-balancer is available, or if
the local application servers are all down, they will redirect traffic to the
remote site, and report this in the SITE cookie. In order not to uselessly
load each site's WAN link, each SD will check the other site at a lower
rate. The site directors will also insert their client's address so that
the application server knows which local user or remote site accesses it.
The SITE cookie which is set by these directors will also be understood
by the office proxies. This is important because if SD1 decides to forward
traffic to site 2, it will write "SITE2" in the "SITE" cookie, and on next
request, the office proxy will automatically and directly talk to SITE2 if
it can reach it. If it cannot, it will still send the traffic to SITE1
where SD1 will in turn try to reach SITE2.
The load-balancer checks are performed on port 81. As we'll see below,
the load-balancers provide a health monitoring port 81 which reroutes to
port 80 but which allows them to tell the SD that they are going down soon
and that the SD must not use them anymore.
Configuration for SD1
---------------------
    listen 10.1.1.1:80
       mode http
       balance roundrobin
       redispatch
       cookie SITE insert indirect
       option httpchk HEAD / HTTP/1.0
       option forwardfor
       server S1L1 10.1.1.11:80 cookie SITE1 check port 81 inter 4000
       server S1L2 10.1.1.12:80 cookie SITE1 check port 81 inter 4000
       server S2L1 10.2.1.11:80 cookie SITE2 check port 81 inter 8000 backup
       server S2L2 10.2.1.12:80 cookie SITE2 check port 81 inter 8000 backup
Configuration for SD2
---------------------
    listen 10.2.1.1:80
       mode http
       balance roundrobin
       redispatch
       cookie SITE insert indirect
       option httpchk HEAD / HTTP/1.0
       option forwardfor
       server S2L1 10.2.1.11:80 cookie SITE2 check port 81 inter 4000
       server S2L2 10.2.1.12:80 cookie SITE2 check port 81 inter 4000
       server S1L1 10.1.1.11:80 cookie SITE1 check port 81 inter 8000 backup
       server S1L2 10.1.1.12:80 cookie SITE1 check port 81 inter 8000 backup
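
How a site director picks a load-balancer and decides which SITE value to
report back can be sketched as follows. This is an illustrative model under
stated assumptions, not haproxy's implementation: LB and direct() are
hypothetical names, round-robin is replaced by "first match wins", and a
cookie is assumed to be emitted only when the client does not already carry
the value of the chosen server.

```python
from typing import List, NamedTuple, Optional, Tuple


class LB(NamedTuple):
    name: str
    site: str     # value written to / matched against the SITE cookie
    backup: bool  # remote load-balancers are declared as backup
    up: bool      # result of the last health check on port 81


def direct(lbs: List[LB], site_cookie: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (chosen load-balancer, SITE value to set in the response or None)."""
    candidates = (
        [lb for lb in lbs if lb.up and lb.site == site_cookie]  # cookie persistence
        or [lb for lb in lbs if lb.up and not lb.backup]        # local load-balancers
        or [lb for lb in lbs if lb.up and lb.backup]            # remote site as backup
    )
    if not candidates:
        return None, None
    chosen = candidates[0]  # simplification: no round-robin
    # Emit a cookie only if the client does not already carry the right value.
    new_cookie = chosen.site if chosen.site != site_cookie else None
    return chosen.name, new_cookie


# Server list corresponding to the SD1 configuration.
sd1 = [
    LB("S1L1", "SITE1", False, True),
    LB("S1L2", "SITE1", False, True),
    LB("S2L1", "SITE2", True, True),
    LB("S2L2", "SITE2", True, True),
]
```

In this model, a client of SD1 holding SITE=SITE1 whose local load-balancers
are both down is sent to S2L1 and receives SITE=SITE2, which is the value the
office proxy will act upon for the next request.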
5.4.4 Local load-balancers S1L1, S1L2, S2L1, S2L2
-------------------------------------------------
Please first note that because SD1 and SD2 use the same cookie for both
servers on the same site, the second load-balancer of each site will only
receive load-balanced requests, i.e. those without a SITE cookie. As soon as
the SITE cookie is set, only the first LB will receive the requests because
it will be the first one to match the cookie.
The load-balancers will spread the load across 4 local web servers, and
use the JSESSIONID provided by the application to provide server persistence
using the new 'prefix' method. Soft-stop will also be implemented as described
in section 4 above. Moreover, these proxies will provide their own maintenance
soft-stop. Port 80 will be used for application traffic, while port 81 will
only be used for health-checks and locally rerouted to port 80. A grace time
will be specified for the service on port 80, but not on port 81. This way, a
soft kill (kill -USR1) on the proxy will only kill the health-check forwarder
so that the site director knows it must not use this load-balancer anymore.
But the service on port 80 will still accept new connections for 20 seconds,
and will keep running as long as there are established sessions.
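
The timing of this soft-stop can be illustrated with a small Python sketch.
It is only a model: the function names are hypothetical, the grace values are
the ones used in the load-balancer configurations of this section, and the
check intervals are the 4000 ms and 8000 ms used by the site directors. The
number of failed checks a director needs before marking a server down is not
modelled.

```python
# Grace time per listener, in milliseconds:
# port 80 has "grace 20000", port 81 has no grace time.
GRACE_MS = {80: 20000, 81: 0}


def accepts_new_connections(port: int, ms_since_soft_stop: int) -> bool:
    """True while the listener on `port` still accepts new connections."""
    return ms_since_soft_stop < GRACE_MS[port]


def approx_checks_within_grace(check_inter_ms: int) -> int:
    """Roughly how many health checks are attempted before port 80 closes."""
    return GRACE_MS[80] // check_inter_ms
```

Under these assumptions, the local site director (inter 4000) gets about five
failed checks on port 81 while port 80 still serves traffic, and the remote
one (inter 8000) about two.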
These proxies will also be the only ones to disable HTTP keep-alive in the
chain, because it is enough to do it at one place, and it's necessary to do
it with 'prefix' cookies.
Configuration for S1L1/S1L2
---------------------------
    listen 10.1.1.11:80       # 10.1.1.12:80 for S1L2
       grace 20000            # don't kill us until 20 seconds have elapsed
       mode http
       balance roundrobin
       cookie JSESSIONID prefix
       option httpclose
       option forwardfor
       option httpchk HEAD / HTTP/1.0
       server W11 10.1.2.1:80 cookie W11 check port 81 inter 2000
       server W12 10.1.2.2:80 cookie W12 check port 81 inter 2000
       server W13 10.1.2.3:80 cookie W13 check port 81 inter 2000
       server W14 10.1.2.4:80 cookie W14 check port 81 inter 2000

       server B11 10.1.2.1:80 cookie W11 check port 80 inter 4000 backup
       server B12 10.1.2.2:80 cookie W12 check port 80 inter 4000 backup
       server B13 10.1.2.3:80 cookie W13 check port 80 inter 4000 backup
       server B14 10.1.2.4:80 cookie W14 check port 80 inter 4000 backup

    listen 10.1.1.11:81       # 10.1.1.12:81 for S1L2
       mode tcp
       dispatch 10.1.1.11:80  # 10.1.1.12:80 for S1L2
Configuration for S2L1/S2L2
---------------------------
    listen 10.2.1.11:80       # 10.2.1.12:80 for S2L2
       grace 20000            # don't kill us until 20 seconds have elapsed
       mode http
       balance roundrobin
       cookie JSESSIONID prefix
       option httpclose
       option forwardfor
       option httpchk HEAD / HTTP/1.0
       server W21 10.2.2.1:80 cookie W21 check port 81 inter 2000
       server W22 10.2.2.2:80 cookie W22 check port 81 inter 2000
       server W23 10.2.2.3:80 cookie W23 check port 81 inter 2000
       server W24 10.2.2.4:80 cookie W24 check port 81 inter 2000

       server B21 10.2.2.1:80 cookie W21 check port 80 inter 4000 backup
       server B22 10.2.2.2:80 cookie W22 check port 80 inter 4000 backup
       server B23 10.2.2.3:80 cookie W23 check port 80 inter 4000 backup
       server B24 10.2.2.4:80 cookie W24 check port 80 inter 4000 backup

    listen 10.2.1.11:81       # 10.2.1.12:81 for S2L2
       mode tcp
       dispatch 10.2.1.11:80  # 10.2.1.12:80 for S2L2
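
The 'prefix' cookie mode used by these load-balancers can be pictured with
the following Python sketch. It is an illustration only: add_prefix() and
split_prefix() are hypothetical helpers, and the "~" separator is inferred
from the example cookie value "W14~123" used in this document.

```python
from typing import Optional, Set, Tuple

SEP = "~"  # separator seen in the example cookie "W14~123"


def add_prefix(server: str, session_id: str) -> str:
    """Response path: prepend the server name to the application's cookie value."""
    return f"{server}{SEP}{session_id}"


def split_prefix(cookie_value: str, known_servers: Set[str]) -> Tuple[Optional[str], str]:
    """Request path: return (server or None, value passed on to the application)."""
    server, sep, rest = cookie_value.partition(SEP)
    if sep and server in known_servers:
        return server, rest
    # No usable prefix: leave the value untouched and let load-balancing decide.
    return None, cookie_value


SITE1_SERVERS = {"W11", "W12", "W13", "W14"}
```

Here split_prefix("W14~123", SITE1_SERVERS) selects W14 and hands "123" back
to the application, so the application never sees the prefix it did not set.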
5.5 Comments
============
Since each site director sets a cookie identifying the site, remote office
users will have their office proxies direct them to the right site and stick
to this site as long as the user still uses the application and the site is
available. Users on production sites will be directed to the right site by the
site directors depending on the SITE cookie.
If the WAN link dies on a production site, the remote office proxies will not
see that site anymore, so they will redirect the traffic to the second site.
If there are dedicated inter-site links as on the diagram above, the second
SD will see the cookie and still be able to reach the original site. For
example :
Office 1 user sends the following to OP1 :

    GET / HTTP/1.0
    Cookie: SITE=SITE1; JSESSIONID=W14~123;
OP1 cannot reach site 1 because its external router is dead. So the SD1 server
is seen as dead, and OP1 will then forward the request to SD2 on site 2,
regardless of the SITE cookie.
SD2 on site 2 receives a SITE cookie containing "SITE1". Fortunately, it
can reach Site 1's load balancers S1L1 and S1L2. So it forwards the request
to S1L1 (the first one with the same cookie).
S1L1 (on site 1) finds "W14" in the JSESSIONID cookie, so it can forward the
request to the right server, and the user session will continue to work. Once
Site 1's WAN link comes back, OP1 will see SD1 again, and will not route
through SITE 2 anymore.
However, when a new user on Office 1 connects to the application during a
site 1 failure, the request does not contain any cookie. Since OP1 does not
see SD1 because of the network failure, it will direct the request to SD2 on
site 2, which will by default direct the traffic to the local load-balancers,
S2L1 and S2L2. So only users whose sessions were already established on site 1
will load the inter-site link, not the new ones.
===================