========================================
4. Soft-stop for application maintenance
========================================
When an application is spread across several servers, the time needed to update
all instances increases, so the application may appear inconsistent for a longer
period. HAProxy offers several solutions to this. Although it cannot be
reconfigured without being stopped, and it provides no external command
interface, there are other working solutions.
=========================================
4.1 Soft-stop using a file on the servers
=========================================
This trick is quite common and very simple: put a file on the server which will
be checked by the proxy. When you want to stop the server, first remove this
file. The proxy will then see the server as failed and will not send it any new
sessions, only existing ones if the "persist" option is used. Wait a bit, then
stop the server once it no longer receives any connections.
    listen 192.168.1.1:80
        mode http
        balance roundrobin
        cookie SERVERID insert indirect
        option httpchk HEAD /running HTTP/1.0
        server webA 192.168.1.11:80 cookie A check inter 2000 rise 2 fall 2
        server webB 192.168.1.12:80 cookie B check inter 2000 rise 2 fall 2
        server webC 192.168.1.13:80 cookie C check inter 2000 rise 2 fall 2
        server webD 192.168.1.14:80 cookie D check inter 2000 rise 2 fall 2
        option persist
        redispatch
        contimeout 5000
Description :
-------------
- every 2 seconds, haproxy will try to access the file "/running" on the
  servers, and will declare a server as down after 2 failed attempts
  (4 seconds).
- only the servers which respond with a 200 or 3XX status will be used.
- if a request does not contain a cookie, it will be forwarded to a valid
  server.
- if a request contains a cookie for a failed server, haproxy will insist
  on trying to reach that server anyway, to let the user finish what he was
  doing ("persist" option).
- if the server is completely stopped, the connection will fail and the proxy
  will rebalance the client to another server ("redispatch").
Usage on the web servers :
--------------------------
- to start the server :
    # /etc/init.d/httpd start
    # touch /home/httpd/www/running
- to soft-stop the server :
    # rm -f /home/httpd/www/running
- to completely stop the server :
    # /etc/init.d/httpd stop
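The soft-stop sequence can be wrapped in a small script. This is only a sketch:
the flag file and init script paths are taken from the example above, the
count_conns/soft_stop helpers are hypothetical names, and the connection count
relies on netstat, which may need adjusting on your system.

```shell
#!/bin/sh
# Sketch of an automated soft-stop for the setup above. Paths come from the
# example configuration; adjust them to your servers.
FLAG=/home/httpd/www/running

# Count established connections to the given local port (netstat assumed).
count_conns() {
    netstat -tn 2>/dev/null | grep -c ":$1 .*ESTABLISHED"
}

soft_stop() {
    rm -f "$FLAG"                   # health checks fail -> no new sessions
    while [ "$(count_conns 80)" -gt 0 ]; do
        sleep 5                     # let persistent clients finish
    done
    /etc/init.d/httpd stop          # now safe to stop completely
}
```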
Limits
------
If the server is totally powered down, the proxy will still try to reach it
for those clients who still have a cookie referencing it. The connection
attempt will only expire after 5 seconds ("contimeout"), and only then will
the client be redispatched to another server. So this mode is only useful for
software updates, where the server will immediately refuse the connection
because the process has been stopped. The problem is the same if the server
suddenly crashes: all of its users will be noticeably disturbed.
==================================
4.2 Soft-stop using backup servers
==================================
A better solution which covers every situation is to use backup servers.
Version 1.1.30 fixed a bug which prevented a backup server from sharing
the same cookie as a standard server.
    listen 192.168.1.1:80
        mode http
        balance roundrobin
        redispatch
        cookie SERVERID insert indirect
        option httpchk HEAD / HTTP/1.0
        server webA 192.168.1.11:80 cookie A check port 81 inter 2000
        server webB 192.168.1.12:80 cookie B check port 81 inter 2000
        server webC 192.168.1.13:80 cookie C check port 81 inter 2000
        server webD 192.168.1.14:80 cookie D check port 81 inter 2000
        server bkpA 192.168.1.11:80 cookie A check port 80 inter 2000 backup
        server bkpB 192.168.1.12:80 cookie B check port 80 inter 2000 backup
        server bkpC 192.168.1.13:80 cookie C check port 80 inter 2000 backup
        server bkpD 192.168.1.14:80 cookie D check port 80 inter 2000 backup
Description
-----------
Four servers webA..D are checked on their port 81 every 2 seconds. The same
servers, named bkpA..D, are checked on port 80 and share the exact same
cookies. Those backup servers will only be used when no other server is
available for the same cookie.
When the web servers are started, only the backup servers are seen as
available. On the web servers, you need to redirect port 81 to local
port 80, either with a local proxy (eg: a simple haproxy tcp instance),
or with iptables (linux) or pf (openbsd). This is because we want the
real web server to reply on this port, and not a fake one. Eg, with
iptables :

    # /etc/init.d/httpd start
    # iptables -t nat -A PREROUTING -p tcp --dport 81 -j REDIRECT --to-port 80
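For OpenBSD, a rough equivalent can be written with pf. This is only a sketch
using the classic rdr syntax of that era; the $int_if interface macro is an
assumption to adapt to your setup:

```
# /etc/pf.conf fragment (sketch): forward the check port to the real service
rdr pass on $int_if proto tcp from any to any port 81 -> 127.0.0.1 port 80
```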
A few seconds later, the standard server is seen up and haproxy starts to send
it new requests on its real port 80 (only new users with no cookie, of course).
If a server completely crashes (even if it does not respond at the IP level),
both the standard and backup servers will fail, so clients associated to this
server will be redispatched to other live servers and will lose their sessions.
Now if you want to put a server into maintenance, simply stop it from
responding on port 81 so that its standard instance will be seen as failed,
but the backup will still work. Users will not notice anything since the
service is still operational :

    # iptables -t nat -D PREROUTING -p tcp --dport 81 -j REDIRECT --to-port 80
The health checks on port 81 for this server will quickly fail, and the
standard server will be seen as failed. No new session will be sent to this
server, and existing clients with a valid cookie will still reach it because
the backup server will still be up.
Now wait as long as you want for the old users to stop using the service, and
once you see that the server does not receive any traffic, simply stop it :

    # /etc/init.d/httpd stop
The associated backup server will in turn fail, and if any client still tries
to access this particular server, he will be redispatched to any other valid
server because of the "redispatch" option.
This method has an advantage : you never touch the proxy when doing server
maintenance. The people managing the servers can make them disappear smoothly.
4.2.1 Variations for operating systems without any firewall software
--------------------------------------------------------------------
The downside is that you need a redirection solution on the server just for
the health-checks. If the server OS does not support any firewall software,
this redirection can also be handled by a simple haproxy in tcp mode :
    global
        daemon
        quiet
        pidfile /var/run/haproxy-checks.pid

    listen 0.0.0.0:81
        mode tcp
        dispatch 127.0.0.1:80
        contimeout 1000
        clitimeout 10000
        srvtimeout 10000
To start the web service :

    # /etc/init.d/httpd start
    # haproxy -f /etc/haproxy/haproxy-checks.cfg

To soft-stop the service :

    # kill $(</var/run/haproxy-checks.pid)

Port 81 will stop responding and the load-balancer will notice the failure.
4.2.2 Centralizing the server management
----------------------------------------
If one finds it preferable to manage the servers from the load-balancer itself,
the port redirector can be installed on the load-balancer itself. See the
example with iptables below.
Make the servers appear as operational :

    # iptables -t nat -A OUTPUT -d 192.168.1.11 -p tcp --dport 81 -j DNAT --to-dest :80
    # iptables -t nat -A OUTPUT -d 192.168.1.12 -p tcp --dport 81 -j DNAT --to-dest :80
    # iptables -t nat -A OUTPUT -d 192.168.1.13 -p tcp --dport 81 -j DNAT --to-dest :80
    # iptables -t nat -A OUTPUT -d 192.168.1.14 -p tcp --dport 81 -j DNAT --to-dest :80

Soft-stop one server :

    # iptables -t nat -D OUTPUT -d 192.168.1.12 -p tcp --dport 81 -j DNAT --to-dest :80
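The add and delete commands differ only in -A/-D and the target address, so
they are easy to wrap. A sketch, where rule, server_up and server_down are
hypothetical helpers, and setting RUN=echo previews a rule without touching
the firewall:

```shell
#!/bin/sh
# Sketch: wrap the iptables rules above in helpers. RUN=echo prints the
# command instead of executing it; the default is the real iptables binary.
RUN=${RUN:-iptables}

# Build the rule arguments for one server; $1 is -A (add) or -D (delete).
rule() {
    echo "-t nat $1 OUTPUT -d $2 -p tcp --dport 81 -j DNAT --to-dest :80"
}

server_up()   { $RUN $(rule -A "$1"); }   # make the server appear operational
server_down() { $RUN $(rule -D "$1"); }   # soft-stop it

# Example: preview the soft-stop rule for one server
# RUN=echo server_down 192.168.1.12
```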
Another solution is to use the "COMAFILE" patch provided by Alexander Lazic,
which is available for download here :
http://w.ods.org/tools/haproxy/contrib/
4.2.3 Notes :
-------------
- Never, ever, start a fake service on port 81 for the health-checks, because
a real web service failure will not be detected as long as the fake service
runs. You must really forward the check port to the real application.
- health-checks will be sent twice as often, once for each standard server
  and once for each backup server. All this will be multiplied by the
  number of processes if you use multi-process mode. You will have to ensure
  that all the checks sent to the server do not overload it.
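As a rough order of magnitude for that last point, with the configuration
above (one standard plus one backup entry per physical server, inter 2000)
and, as an assumed example, 4 haproxy processes:

```shell
# Check-rate estimate; NBPROC=4 is an assumption for the example.
ENTRIES=2       # one standard + one backup entry per physical server
NBPROC=4        # number of haproxy processes (assumption)
INTER_MS=2000   # "inter 2000" from the configuration above
# checks per second reaching each physical server:
echo $((ENTRIES * NBPROC * 1000 / INTER_MS))    # prints 4
```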
=======================
4.3 Hot reconfiguration
=======================
There are two types of haproxy users :
- those who can never do anything in production outside of maintenance
  periods ;
- those who can do anything at any time provided that the consequences are
  limited.

The first ones have no problem stopping the server to change the configuration
because they get maintenance periods during which they can break anything. They
will even prefer doing a clean stop/start sequence to ensure that everything
will work fine upon the next reload. Since these have represented the majority
of haproxy users, there has been little effort to improve this.
However, the second category is a bit different. They like to be able to fix an
error in a configuration file without anyone noticing. This can sometimes also
be the case for the first category because humans are not failsafe.
For this reason, a new hot reconfiguration mechanism has been introduced in
version 1.1.34. Its usage is very simple and works even in chrooted
environments with lowered privileges. The principle is very simple : upon
reception of a SIGTTOU signal, the proxy will stop listening to all the ports.
This will release the ports so that a new instance can be started. Existing
connections will not be broken at all. If the new instance fails to start,
then sending a SIGTTIN signal back to the original processes will restore
the listening ports. This is possible without any special privileges because
the sockets will not have been closed, so the bind() is still valid. Otherwise,
if the new process starts successfully, then sending a SIGUSR1 signal to the
old one ensures that it will exit as soon as its last session ends.
A hot reconfiguration script would look like this :
    # save previous state
    mv /etc/haproxy/config /etc/haproxy/config.old
    mv /var/run/haproxy.pid /var/run/haproxy.pid.old

    mv /etc/haproxy/config.new /etc/haproxy/config
    kill -TTOU $(cat /var/run/haproxy.pid.old)

    if haproxy -p /var/run/haproxy.pid -f /etc/haproxy/config; then
        echo "New instance successfully loaded, stopping previous one."
        kill -USR1 $(cat /var/run/haproxy.pid.old)
        rm -f /var/run/haproxy.pid.old
        exit 0
    else
        echo "New instance failed to start, resuming previous one."
        kill -TTIN $(cat /var/run/haproxy.pid.old)
        rm -f /var/run/haproxy.pid
        mv /var/run/haproxy.pid.old /var/run/haproxy.pid
        mv /etc/haproxy/config /etc/haproxy/config.new
        mv /etc/haproxy/config.old /etc/haproxy/config
        exit 1
    fi
After this, you can still force old connections to end by sending
a SIGTERM to the old process if it still exists :

    # kill $(cat /var/run/haproxy.pid.old)
    # rm -f /var/run/haproxy.pid.old

Be careful with this: in multi-process mode, some pids might already
have been reallocated to completely different processes.
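To mitigate that pid-reuse risk, the cleanup can verify that each pid still
belongs to a running haproxy before signalling it. A sketch; stop_old is a
hypothetical helper and the ps comm check is Linux-flavoured:

```shell
#!/bin/sh
# Sketch: signal only pids that still belong to a running haproxy process,
# to avoid hitting a reused pid in multi-process mode.
stop_old() {
    pidfile=$1
    [ -f "$pidfile" ] || return 0
    for pid in $(cat "$pidfile"); do
        # kill -0 tests existence; the comm check guards against pid reuse
        if kill -0 "$pid" 2>/dev/null && \
           [ "$(ps -o comm= -p "$pid" 2>/dev/null)" = "haproxy" ]; then
            kill "$pid"
        fi
    done
    rm -f "$pidfile"
}

# stop_old /var/run/haproxy.pid.old
```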
==================================================