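1. Introduction
Load balancing (LB) and high availability (HA) go hand in hand; this section covers a dual-machine hot-standby setup for high availability. Keepalived is the hot-standby solution chosen here, and the sections below walk through its configuration.
2. Download and install Keepalived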
wget http://www.keepalived.org/software/keepalived-1.4.0.tar.gz
Documentation: http://www.keepalived.org/doc
References: https://www.cnblogs.com/abclife/p/7909818.html
https://www.cnblogs.com/kevingrace/p/6138185.html
System: Debian 8
./configure --prefix=/opt/keepalive    # this step may require installing some extra dependencies
make
make install
mkdir /etc/keepalived
cp ./etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf
cp ./sbin/keepalived /usr/sbin/
vim /etc/init.d/keepalived
chmod a+x /etc/init.d/keepalived
service keepalived start
Add the following content to /etc/init.d/keepalived:
#!/bin/sh
#
# keepalived   High Availability monitor built upon LVS and VRRP
#
# chkconfig:   - 86 14
# description: Robust keepalive facility to the Linux Virtual Server project \
#              with multilayer TCP/IP stack checks.

### BEGIN INIT INFO
# Provides: keepalived
# Required-Start: $local_fs $network $named $syslog
# Required-Stop: $local_fs $network $named $syslog
# Should-Start: smtpdaemon httpd
# Should-Stop: smtpdaemon httpd
# Default-Start:
# Default-Stop: 0 1 2 3 4 5 6
# Short-Description: High Availability monitor built upon LVS and VRRP
# Description:       Robust keepalive facility to the Linux Virtual Server
#                    project with multilayer TCP/IP stack checks.
### END INIT INFO

# Source function library.
. /etc/rc.d/init.d/functions

exec="/usr/sbin/keepalived"
prog="keepalived"
config="/etc/keepalived/keepalived.conf"

[ -e /etc/sysconfig/$prog ] && . /etc/sysconfig/$prog

lockfile=/var/lock/subsys/keepalived

start() {
    [ -x $exec ] || exit 5
    [ -e $config ] || exit 6
    echo -n $"Starting $prog: "
    daemon $exec $KEEPALIVED_OPTIONS
    retval=$?
    echo
    [ $retval -eq 0 ] && touch $lockfile
    return $retval
}

stop() {
    echo -n $"Stopping $prog: "
    killproc $prog
    retval=$?
    echo
    [ $retval -eq 0 ] && rm -f $lockfile
    return $retval
}

restart() {
    stop
    start
}

reload() {
    echo -n $"Reloading $prog: "
    killproc $prog -1
    retval=$?
    echo
    return $retval
}

force_reload() {
    restart
}

rh_status() {
    status $prog
}

rh_status_q() {
    # POSIX redirection; "&>" is a bash extension and misbehaves under plain sh
    rh_status >/dev/null 2>&1
}


case "$1" in
    start)
        rh_status_q && exit 0
        $1
        ;;
    stop)
        rh_status_q || exit 0
        $1
        ;;
    restart)
        $1
        ;;
    reload)
        rh_status_q || exit 7
        $1
        ;;
    force-reload)
        force_reload
        ;;
    status)
        rh_status
        ;;
    condrestart|try-restart)
        rh_status_q || exit 0
        restart
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart|condrestart|try-restart|reload|force-reload}"
        exit 2
esac
exit $?
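Before starting the service, it can help to confirm that the install steps left every file where the init script expects it. The following is a minimal sketch, not part of the original procedure: the function name check_install and its optional root-prefix argument are hypothetical, while the three paths are the ones used in the steps above.

```shell
#!/bin/sh
# Sketch: verify that the files from the install steps are in place.
# The optional first argument is a root prefix (hypothetical, useful for
# dry runs against a staging directory); it defaults to the real root.
check_install() {
    root="${1:-}"
    status=0
    # The binary and the init script must both be executable.
    for f in /usr/sbin/keepalived /etc/init.d/keepalived; do
        if [ ! -x "$root$f" ]; then
            echo "missing or not executable: $f"
            status=1
        fi
    done
    # The configuration file only needs to exist.
    if [ ! -e "$root/etc/keepalived/keepalived.conf" ]; then
        echo "missing: /etc/keepalived/keepalived.conf"
        status=1
    fi
    return $status
}
```

Calling check_install with no argument checks the live system and returns a non-zero status if anything is missing, which mirrors the [ -x $exec ] and [ -e $config ] guards in the init script's start function.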
3. Dual-machine hot standby (master-backup mode)
Edit the configuration file keepalived.conf:
vim /etc/keepalived/keepalived.conf
global_defs {
    notification_email {                  # addresses Keepalived emails when an event occurs, one per line
        [email protected]
    }
    notification_email_from [email protected]    # sender address
    smtp_server 192.168.8.208             # SMTP server used to send the email
    smtp_connect_timeout 30               # connection timeout
    router_id nginx_dev                   # identifier of the machine running Keepalived; may or may not be shared across nodes
}

vrrp_instance nginx_dev {
    state MASTER
    interface eth1
    #mcast_src_ip 172.16.23.203
    virtual_router_id 100
    priority 200
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        172.16.23.222
    }
}
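Then run service keepalived restart.
The current host can now be reached at 172.16.23.222. If it cannot, check that the client and the host are on the same network, or at least that the address is reachable through routing. Keepalived makes this work by broadcasting gratuitous ARP packets that tell the machines on the local network which host currently owns the address.
Apply the same configuration on the second machine, changing state MASTER to state BACKUP and giving it a lower priority than the MASTER.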
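To see which of the two machines currently holds the virtual IP, the output of ip addr can be searched for it. The helper below is a small sketch under that assumption; the name holds_vip is hypothetical, and it reads the listing on standard input so it works on live or saved output.

```shell
#!/bin/sh
# Sketch: succeed if the given VIP appears in `ip addr` output read from stdin.
holds_vip() {
    vip="$1"
    # Fixed-string match on "inet <vip>/" so that 172.16.23.22 does not
    # match 172.16.23.222 and dots are not treated as wildcards.
    grep -qF "inet $vip/"
}
```

On either node, ip addr show dev eth1 | holds_vip 172.16.23.222 && echo "VIP is here" reports whether that node is the current owner; in master-backup mode exactly one node should answer yes.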
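4. Dual-machine hot standby (master-master mode)
In master-backup mode failures are rare, so one machine sits idle most of the time, which wastes hardware. For that reason master-master mode is generally the recommended setup. Master-master mode simply means defining two instances so that each machine is master for one and backup for the other.
Configuration on 172.16.23.203: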
global_defs {
    notification_email {                  # addresses Keepalived emails when an event occurs, one per line
        [email protected]
    }
    notification_email_from [email protected]    # sender address
    smtp_server 192.168.8.208             # SMTP server used to send the email
    smtp_connect_timeout 30               # connection timeout
    router_id nginx_dev_1                 # identifier of the machine running Keepalived (one router_id per node)
}

vrrp_instance nginx_dev_1 {
    state MASTER
    interface eth1
    mcast_src_ip 172.16.23.203
    virtual_router_id 100
    priority 100
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        172.16.23.222
    }
}

vrrp_instance nginx_dev_2 {
    state BACKUP
    interface eth1
    mcast_src_ip 172.16.23.203
    virtual_router_id 101
    priority 200
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        172.16.23.223
    }
}
Configuration on 172.16.23.204:
global_defs {
    notification_email {                  # addresses Keepalived emails when an event occurs, one per line
        [email protected]
    }
    notification_email_from [email protected]    # sender address
    smtp_server 192.168.8.208             # SMTP server used to send the email
    smtp_connect_timeout 30               # connection timeout
    router_id nginx_dev_2                 # identifier of the machine running Keepalived (one router_id per node)
}

vrrp_instance nginx_dev_2 {
    state MASTER
    interface eth1
    mcast_src_ip 172.16.23.204
    virtual_router_id 101
    priority 100
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        172.16.23.223
    }
}

vrrp_instance nginx_dev_1 {
    state BACKUP
    interface eth1
    mcast_src_ip 172.16.23.204
    virtual_router_id 100
    priority 200
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        172.16.23.222
    }
}
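The two configurations are almost identical, with each node acting as backup for the other. The setting that matters most is virtual_router_id: an instance must use the same ID on both nodes, and the two instances must use different IDs. Note that in both listings the node declared BACKUP for an instance carries the higher priority (200 versus 100). Because the higher priority wins the VRRP election, that node ends up holding the instance's VIP while both are running; swap the two priority values if the node declared MASTER should hold it.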
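Since the four instance blocks differ only in a handful of values, the symmetry is easier to see when they are generated from parameters. The sketch below is illustrative only: the function name render_instance and its argument order are hypothetical, while the interface, advert_int and password are copied from the listings above.

```shell
#!/bin/sh
# Sketch: print one vrrp_instance block.
# Arguments: name, state, local IP, virtual_router_id, priority, VIP.
render_instance() {
    name="$1"; state="$2"; src_ip="$3"; vrid="$4"; priority="$5"; vip="$6"
    cat <<EOF
vrrp_instance $name {
    state $state
    interface eth1
    mcast_src_ip $src_ip
    virtual_router_id $vrid
    priority $priority
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        $vip
    }
}
EOF
}

# The two instances for 172.16.23.203, using the values from its listing.
render_instance nginx_dev_1 MASTER 172.16.23.203 100 100 172.16.23.222
render_instance nginx_dev_2 BACKUP 172.16.23.203 101 200 172.16.23.223
```

Generating the 172.16.23.204 side means changing only the state, the local IP and the priority of each call; the instance name, virtual_router_id and VIP stay paired, which is exactly the invariant the two files must keep.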
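The intended result is that requests to 172.16.23.222 go to host 172.16.23.203 first, and requests to 172.16.23.223 go to host 172.16.23.204. When one of the hosts goes down, its VIP automatically moves to the healthy host, and the switch takes only a few seconds.
This setup needs two IP addresses, and the client has to choose between them to spread the load. That step can be handled by distributing the two addresses through DNS.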
root@debian-t6:/usr/local/nginx/html# ip addr
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth1: mtu 1500 qdisc pfifo_fast state UP group default
    link/ether fc:aa:14:9d:d9:5a brd ff:ff:ff:ff:ff:ff
    inet 172.16.23.204/24 brd 172.16.23.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet 172.16.23.223/32 scope global eth1
       valid_lft forever preferred_lft forever
    inet 172.16.23.222/32 scope global eth1
       valid_lft forever preferred_lft forever
The log is written to /var/log/messages:
Jan 3 21:02:33 debian-t6 Keepalived_vrrp[31784]: VRRP_Instance(nginx_dev_1) Sending/queueing gratuitous ARPs on eth1 for 172.16.23.222
Jan 3 21:02:33 debian-t6 Keepalived_vrrp[31784]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan 3 21:02:33 debian-t6 Keepalived_vrrp[31784]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan 3 21:02:33 debian-t6 Keepalived_vrrp[31784]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan 3 21:02:33 debian-t6 Keepalived_vrrp[31784]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan 3 21:03:02 debian-t6 Keepalived_vrrp[31784]: VRRP_Instance(nginx_dev_1) sent 0 priority
Jan 3 21:03:02 debian-t6 Keepalived_healthcheckers[31783]: Stopped
Jan 3 21:03:02 debian-t6 Keepalived_vrrp[31784]: VRRP_Instance(nginx_dev_1) removing protocol VIPs.
Jan 3 21:03:04 debian-t6 Keepalived_vrrp[31784]: Stopped
Jan 3 21:03:16 debian-t6 Keepalived_healthcheckers[31845]: Opening file '/etc/keepalived/keepalived.conf'.
Jan 3 21:03:16 debian-t6 Keepalived_vrrp[31846]: Registering Kernel netlink reflector
Jan 3 21:03:16 debian-t6 Keepalived_vrrp[31846]: Registering Kernel netlink command channel
Jan 3 21:03:16 debian-t6 Keepalived_vrrp[31846]: Registering gratuitous ARP shared channel
Jan 3 21:03:16 debian-t6 Keepalived_vrrp[31846]: Opening file '/etc/keepalived/keepalived.conf'.
Jan 3 21:03:16 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_2) removing protocol VIPs.
Jan 3 21:03:16 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_1) removing protocol VIPs.
Jan 3 21:03:16 debian-t6 Keepalived_vrrp[31846]: Using LinkWatch kernel netlink reflector...
Jan 3 21:03:16 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_1) Entering BACKUP STATE
Jan 3 21:03:16 debian-t6 Keepalived_vrrp[31846]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Jan 3 21:03:18 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_1) forcing a new MASTER election
Jan 3 21:03:20 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_2) Transition to MASTER STATE
Jan 3 21:03:20 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_2) Received advert with higher priority 200, ours 100
Jan 3 21:03:20 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_2) Entering BACKUP STATE
Jan 3 21:03:23 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_2) Transition to MASTER STATE
Jan 3 21:03:23 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_1) Transition to MASTER STATE
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_2) Entering MASTER STATE
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_2) setting protocol VIPs.
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: Sending gratuitous ARP on eth1 for 172.16.23.223
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_2) Sending/queueing gratuitous ARPs on eth1 for 172.16.23.223
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: Sending gratuitous ARP on eth1 for 172.16.23.223
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: Sending gratuitous ARP on eth1 for 172.16.23.223
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: Sending gratuitous ARP on eth1 for 172.16.23.223
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: Sending gratuitous ARP on eth1 for 172.16.23.223
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_1) Entering MASTER STATE
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_1) setting protocol VIPs.
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: VRRP_Instance(nginx_dev_1) Sending/queueing gratuitous ARPs on eth1 for 172.16.23.222
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan 3 21:03:28 debian-t6 Keepalived_vrrp[31846]: Sending gratuitous ARP on eth1 for 172.16.23.222
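Most of these lines are gratuitous ARP noise; the state changes are what matter when reading a failover. The filter below is a rough sketch that keeps only the "Entering ... STATE" lines and reduces each to time, instance and state. The function name vrrp_transitions is hypothetical, and the pattern assumes the syslog line format shown above.

```shell
#!/bin/sh
# Sketch: reduce Keepalived VRRP log lines (stdin) to "time instance state".
# Only "Entering <STATE> STATE" lines are printed; everything else is dropped.
vrrp_transitions() {
    sed -n 's/^[A-Za-z]* *[0-9]* \([0-9:]*\) .*VRRP_Instance(\([^)]*\)) Entering \([A-Z]*\) STATE.*$/\1 \2 \3/p'
}
```

Fed the excerpt above, it prints four lines: 21:03:16 nginx_dev_1 BACKUP, 21:03:20 nginx_dev_2 BACKUP, 21:03:28 nginx_dev_2 MASTER and 21:03:28 nginx_dev_1 MASTER. On a live system, redirect the log file mentioned above into the function.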
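5. Other Keepalived configuration options
This section explains the remaining options and the ones used in the advanced setup.
(1) Email notification: not configured here. A dedicated third-party monitoring service such as Nagios or Zabbix is recommended instead.
(2) router_id: identifies the name of this node, usually the hostname.
(3) vrrp_instance: defines a VRRP instance.
(4) state: the instance state. Only MASTER and BACKUP are valid, and both must be written in upper case. In preemptive mode MASTER is the working state and BACKUP the standby state. When the server holding MASTER fails, the BACKUP server automatically switches from BACKUP to MASTER. When the failed MASTER recovers, the server that took over returns from MASTER to BACKUP.
(5) interface: the network interface that serves external traffic, that is, the interface the VIP is bound to. Servers often have two or more interfaces, so make sure to pick the right one.
(6) mcast_src_ip: the local IP address, used as the source address of the VRRP multicast packets.
(7) virtual_router_id: the ID of the virtual router (VRID). All nodes of one instance must use the same value, since nodes with the same VRID form one group; the VRID also determines the virtual router's MAC address. Different instances must use different values.
(8) priority: the node priority, from 0 to 254. The MASTER must have a higher value than the BACKUP.
(9) advert_int: the interval, in seconds, between the synchronization checks (advertisements) exchanged by MASTER and BACKUP.
(10) authentication: the authentication type and password. The PASS password mode is the one mainly used: {auth_type PASS auth_pass 123456}. The MASTER and BACKUP of the same VRRP instance must use the same password to communicate.
(11) nopreempt: disables preemption. By default, when the MASTER goes down the BACKUP promotes itself to MASTER and takes over; when the original MASTER recovers, the promoted BACKUP demotes itself again and hands the work back. With nopreempt configured, a MASTER that went down and came back no longer takes the service back. In Keepalived this option takes effect only on a node whose initial state is BACKUP.
(12) virtual_ipaddress: the pool of virtual IP addresses (the VRRP HA virtual addresses). It may list several IPs, one per line, and a subnet mask is not required. The addresses here must match the VIPs you intend to serve.
(13) notify_master: the script to run when the node switches to the master state.
(14) notify_backup: the script to run when the node switches to the backup state.
(15) notify_fault: the script to run when the node enters the fault state (this is also a place to send email and the like).
(16) track_script: the monitoring scripts the instance runs.
(17) vrrp_script: defines a VRRP check script.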
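The advert_int and priority options together bound how long a hard failure goes unnoticed. As a back-of-the-envelope sketch, assuming Keepalived follows the VRRPv2 timer formulas from RFC 3768 (Skew_Time = (256 - priority) / 256 seconds, Master_Down_Interval = 3 * advert_int + Skew_Time), the wait on the backup side can be computed as follows. The function name master_down_interval is hypothetical.

```shell
#!/bin/sh
# Sketch: seconds a backup waits without adverts before taking over,
# per the VRRPv2 (RFC 3768) formulas. Arguments: advert_int, priority.
master_down_interval() {
    awk -v adv="$1" -v prio="$2" 'BEGIN {
        skew = (256 - prio) / 256          # Skew_Time in seconds
        printf "%.2f\n", 3 * adv + skew    # Master_Down_Interval
    }'
}

master_down_interval 5 200   # prints 15.22
master_down_interval 5 100   # prints 15.61
```

With advert_int 5 as used throughout this article, an abrupt crash is detected after roughly 15 seconds. A clean shutdown is faster, because the master announces priority 0 on exit (the "sent 0 priority" line in the log) and the backup does not have to wait out the full interval. Lowering advert_int shortens the crash case at the cost of more advert traffic.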
6. Advanced Keepalived usage
172.16.23.203
/etc/keepalived/keepalived.conf
global_defs {
    router_id nginx_dev_1
}

# Monitors whether applications such as Nginx or Redis are running
vrrp_script chk_nginx_port {
    script "/etc/keepalived/check.sh"    # check via a script; the result is judged by its return value
    interval 2                           # run the script every 2 seconds
    weight -5                            # priority adjustment when the check fails
    fall 2                               # two consecutive failures count as a failure
    rise 1                               # one success counts as success, but the priority is not modified
}

vrrp_instance nginx_dev_1 {
    state MASTER
    interface eth1
    mcast_src_ip 172.16.23.203
    virtual_router_id 100
    priority 100
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {                  # VRRP HA virtual addresses; several may be listed
        172.16.23.222
    }

    notify_master "/etc/keepalived/run.sh master1"
    notify_backup "/etc/keepalived/run.sh backup1"
    notify_fault "/etc/keepalived/run.sh fault1"

    track_script {
        chk_nginx_port                   # run the corresponding check script
    }
}

vrrp_instance nginx_dev_2 {
    state BACKUP
    interface eth1
    mcast_src_ip 172.16.23.203
    virtual_router_id 101
    priority 200
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        172.16.23.223                    # VRRP HA virtual address
    }

    notify_master "/etc/keepalived/run.sh master2"
    notify_backup "/etc/keepalived/run.sh backup2"
    notify_fault "/etc/keepalived/run.sh fault2"
}
/etc/keepalived/check.sh
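#!/bin/bash
# Count running nginx processes
counter=$(ps -C nginx --no-heading | wc -l)
if [ "${counter}" = "0" ]; then
    # nginx is down: try to start it once
    service nginx start
    #sleep 2 # this caused problems during execution
    counter=$(ps -C nginx --no-heading | wc -l)
    if [ "${counter}" = "0" ]; then
        # still down: stop Keepalived so the VIP moves to the other node
        service keepalived stop
    fi
fi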
/etc/keepalived/run.sh
#!/bin/sh
# Append the current time and the state tag passed by Keepalived to time.txt
echo "$(date +%H:%M:%S) $1" >> /etc/keepalived/time.txt
172.16.23.204
/etc/keepalived/keepalived.conf
global_defs {
    router_id nginx_dev_2    # identifier of the machine running Keepalived; may or may not be shared across nodes
}

# Monitors whether applications such as Nginx or Redis are running
vrrp_script chk_nginx_port {
    script "/etc/keepalived/check.sh"
    interval 2
    weight -5
    fall 2
    rise 1
}

vrrp_instance nginx_dev_2 {
    state MASTER
    interface eth1
    mcast_src_ip 172.16.23.204
    virtual_router_id 101
    priority 100
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        172.16.23.223
    }

    notify_master "/etc/keepalived/run.sh master1"
    notify_backup "/etc/keepalived/run.sh backup1"
    notify_fault "/etc/keepalived/run.sh fault1"

    track_script {
        chk_nginx_port
    }
}

vrrp_instance nginx_dev_1 {
    state BACKUP
    interface eth1
    mcast_src_ip 172.16.23.204
    virtual_router_id 100
    priority 200
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        172.16.23.222
    }

    notify_master "/etc/keepalived/run.sh master2"
    notify_backup "/etc/keepalived/run.sh backup2"
    notify_fault "/etc/keepalived/run.sh fault2"

    track_script {
        chk_nginx_port
    }
}
/etc/keepalived/check.sh
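#!/bin/bash
# Count running nginx processes
counter=$(ps -C nginx --no-heading | wc -l)
if [ "${counter}" = "0" ]; then
    # nginx is down: try to start it once
    service nginx start
    counter=$(ps -C nginx --no-heading | wc -l)
    if [ "${counter}" = "0" ]; then
        # still down: stop Keepalived so the VIP moves to the other node
        service keepalived stop
    fi
fi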
/etc/keepalived/run.sh
#!/bin/sh
# Append the current time and the state tag passed by Keepalived to time.txt
echo "$(date +%H:%M:%S) $1" >> /etc/keepalived/time.txt
The .sh files above must be made executable with chmod a+x *.sh, and it is best to refer to them by absolute path in keepalived.conf.
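check.sh above takes matters into its own hands by stopping Keepalived. A different design, sketched here as an assumption rather than as the author's method, is a check that only reports through its exit status and leaves the reaction to Keepalived, which treats a zero status from a vrrp_script as success and anything else as failure. The function name check_process is hypothetical.

```shell
#!/bin/sh
# Sketch of an exit-status-only health check (hypothetical alternative to
# check.sh): it never restarts or stops anything, it only reports.
# Argument: process name to look for (defaults to nginx).
check_process() {
    name="${1:-nginx}"
    # Count matching processes; errors from ps are treated as "none found".
    count=$(ps -C "$name" --no-headers 2>/dev/null | wc -l)
    [ "$count" -gt 0 ]   # status 0 = healthy, non-zero = failed
}
```

To use it as a check script, the file would end with a call such as check_process nginx so that the function's status becomes the script's exit status. Keep in mind that with the priorities used in this article (100 and 200) a weight of -5 is too small to change which node wins the election, so this variant only causes a failover if the weight is larger than the priority gap.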
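Test procedure:
First start Keepalived and Nginx on 203.
Then start Keepalived and Nginx on 204.
Opening 172.16.23.223 and 172.16.23.222 in a browser both work, and the requests are spread across the two hosts.
Scenario 1
On 204, run service nginx stop && service nginx status to simulate an abnormal exit and check the status.
A few seconds later, service nginx status on 204 shows that nginx has been restarted automatically.
Scenario 2
Edit check.sh and comment out the service nginx start line.
On 204, run service nginx stop && service nginx status again to simulate an abnormal exit and check the status.
A few seconds later, service nginx status and service keepalived status on 204 show that nginx is stopped and that Keepalived on 204 has been stopped as well.
At this point, tail -f time.txt on 203 shows a new master entry, which means the VIP has moved over.
172.16.23.223 and 172.16.23.222 still load in the browser, but both now point to the same machine.
Uncomment the line in check.sh again.
Start Keepalived on 204 with service keepalived start. Because the line is active again, nginx is started automatically as well.
tail -f time.txt on 203 now shows a backup entry, which means 203 has switched to the BACKUP state for that instance.
Browser requests to the two IPs are once again spread across both hosts.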
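Watching time.txt with tail works during a test, but after several switches it is easier to read a summary. The sketch below assumes the file format written by run.sh (a time followed by a tag such as master1 or backup2) and prints the most recent state recorded for each tag suffix. The function name latest_states is hypothetical.

```shell
#!/bin/sh
# Sketch: print the last recorded state per tag suffix from a run.sh log.
# Input lines look like "21:03:28 master1"; output is "1 21:03:28 master".
latest_states() {
    awk '{
        id = $2;    sub(/^[a-z]+/, "", id)       # numeric suffix, e.g. "1"
        state = $2; sub(/[0-9]+$/, "", state)    # state word, e.g. "master"
        last[id] = $1 " " state                  # later lines overwrite earlier ones
    } END {
        for (id in last) print id, last[id]
    }' "$1" | sort
}
```

Running latest_states /etc/keepalived/time.txt on each node after scenario 2 gives a quick view of which instances that node currently considers itself master for. Note that the suffix follows the notify_* lines of each node's own keepalived.conf, so the same suffix can refer to different instances on 203 and 204.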