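In the previous article we briefly covered installing keepalived and analyzed its startup. This time we look at keepalived failover and failback, and at how the vrrp_script module is used to monitor cluster resources. The overall architecture is the same as last time, so it is not described again here.
1. Failover analysis in keepalived
First, stop the httpd service on the keepalived master node, then watch how keepalived performs the failover.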
[root@centos01 keepalived]# /etc/init.d/httpd stop
Stopping httpd: [ OK ]
Watch the log on the backup node: the VIP has moved over to it.
[root@centos02 keepalived]# tail -f /var/log/messages|grep -v PYTHON
Jul 28 14:07:03 centos02 Keepalived_vrrp[4364]: VRRP_Instance(HA_1) Transition to MASTER STATE
Jul 28 14:07:05 centos02 Keepalived_vrrp[4364]: VRRP_Instance(HA_1) Entering MASTER STATE
Jul 28 14:07:05 centos02 Keepalived_vrrp[4364]: VRRP_Instance(HA_1) setting protocol VIPs.
Jul 28 14:07:05 centos02 Keepalived_vrrp[4364]: VRRP_Instance(HA_1) Sending gratuitous ARPs on eth0 for 172.16.80.100
Jul 28 14:07:05 centos02 Keepalived_healthcheckers[4363]: Netlink reflector reports IP 172.16.80.100 added
Jul 28 14:07:10 centos02 Keepalived_vrrp[4364]: VRRP_Instance(HA_1) Sending gratuitous ARPs on eth0 for 172.16.80.100
At the same time, capture packets on the master node so that we can briefly analyze them.
[root@centos01 keepalived]# tcpdump -i eth0 -n -vvv -s0 -w httpd.cap
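The source IP is the master node's address, 172.16.80.116, and the destination is the VRRP multicast address 224.0.0.18. As soon as httpd is stopped on the master, the priority it advertises drops to 0. In VRRP, a priority of 0 tells the backups that the current master is giving up the role, so they can take over without waiting for the full timeout. The capture also shows that authentication is sent in plaintext, with the password 1111. Now look at the next packet.
In this packet the source IP has changed to the backup node's address, 172.16.80.117, while the destination is still 224.0.0.18.
The priority of 85 matches the value configured on the backup node, which confirms that this advertisement comes from the backup.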
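To make the fields discussed above easier to spot in a raw capture, here is a minimal sketch that decodes the fixed part of a VRRPv2 advertisement from a hex string. It assumes the RFC 3768 layout (version/type, VRID, priority, address count, auth type, advertisement interval, checksum, addresses, 8 bytes of auth data). The function names `hex_byte` and `decode_vrrp2` are made up for this example, and the sample payload is hand-built for illustration rather than copied from httpd.cap; in particular the VRID of 51 is a placeholder.

```shell
# hex_byte HEX INDEX: print byte number INDEX (0-based) of HEX as a decimal value.
hex_byte() {
    printf '%d' "0x$(printf '%s' "$1" | cut -c "$(( $2 * 2 + 1 ))-$(( $2 * 2 + 2 ))")"
}

# decode_vrrp2 HEX: print the main fields of a VRRPv2 advertisement payload.
decode_vrrp2() {
    hex=$(printf '%s' "$1" | tr -d ' \t\n')   # allow spaces in the input
    first=$(hex_byte "$hex" 0)                # high nibble: version, low nibble: type
    count=$(hex_byte "$hex" 3)                # number of virtual IP addresses
    auth=""
    i=$(( 8 + 4 * count ))                    # auth data follows the address list
    end=$(( i + 8 ))
    while [ "$i" -lt "$end" ]; do
        b=$(hex_byte "$hex" "$i")
        [ "$b" -eq 0 ] && break               # the password is NUL-padded
        auth="$auth$(printf "\\$(printf '%03o' "$b")")"
        i=$(( i + 1 ))
    done
    echo "version=$(( first / 16 )) type=$(( first % 16 )) vrid=$(hex_byte "$hex" 1) priority=$(hex_byte "$hex" 2) addrs=$count auth_type=$(hex_byte "$hex" 4) adver_int=$(hex_byte "$hex" 5) auth=$auth"
}

# Hand-built sample: VRID 51 (placeholder), priority 85, VIP 172.16.80.100, password "1111".
decode_vrrp2 "21335501 01010000 ac105064 31313131 00000000"
```

Changing the third byte of the sample from 55 to 00 gives the priority-0 advertisement that the master sends when it releases the VIP. The checksum bytes are left as zeros here because the sketch does not verify them.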
Check where the VIP now lives.
On the master node, the VIP is no longer present:
[root@centos01 keepalived]# ip addr
1: lo:
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0:
link/ether 00:0c:29:4c:62:c9 brd ff:ff:ff:ff:ff:ff
inet 172.16.80.116/24 brd 172.16.80.255 scope global eth0
inet6 fe80::20c:29ff:fe4c:62c9/64 scope link
valid_lft forever preferred_lft forever
On the backup node, the VIP has moved over:
[root@centos02 keepalived]# ip addr
1: lo:
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0:
link/ether 00:0c:29:45:fe:30 brd ff:ff:ff:ff:ff:ff
inet 172.16.80.117/24 brd 172.16.80.255 scope global eth0
inet 172.16.80.100/32 scope global eth0
inet6 fe80::20c:29ff:fe45:fe30/64 scope link
valid_lft forever preferred_lft forever
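Reading the full `ip addr` output by eye works for two nodes, but the same check is easy to script. The sketch below is one possible way to do it; `has_vip` is a name chosen for this example, and it assumes the one-line-per-address format printed by `ip -o -4 addr show`, where the fourth field is the address with its prefix length.

```shell
# has_vip ADDRESS: succeed if ADDRESS appears in `ip -o -4 addr show` output on stdin.
has_vip() {
    awk -v vip="$1" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }   # $4 looks like 172.16.80.100/32
        END { exit (found ? 0 : 1) }
    '
}

# Usage on a live node (eth0 and the VIP are the values used in this article):
#   ip -o -4 addr show dev eth0 | has_vip 172.16.80.100 && echo "VIP is held by this node"
```

The address is compared exactly rather than by substring, so 172.16.80.10 does not match 172.16.80.100.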
2. Failback analysis
Start httpd on the master node and watch the logs.
On the master node:
[root@centos01 keepalived]# tail -f /var/log/messages|grep -v "PYTHON"
Jul 28 14:39:38 centos01 Keepalived_vrrp[65334]: VRRP_Script(check_httpd) succeeded
Jul 28 14:39:38 centos01 Keepalived_vrrp[65334]: VRRP_Instance(HA_1) prio is higher than received advert
Jul 28 14:39:38 centos01 Keepalived_vrrp[65334]: VRRP_Instance(HA_1) Transition to MASTER STATE
Jul 28 14:39:38 centos01 Keepalived_vrrp[65334]: VRRP_Instance(HA_1) Received lower prio advert, forcing new election
Jul 28 14:39:40 centos01 Keepalived_vrrp[65334]: VRRP_Instance(HA_1) Entering MASTER STATE
Jul 28 14:39:40 centos01 Keepalived_vrrp[65334]: VRRP_Instance(HA_1) setting protocol VIPs.
Jul 28 14:39:40 centos01 Keepalived_healthcheckers[65333]: Netlink reflector reports IP 172.16.80.100 added
Jul 28 14:39:40 centos01 Keepalived_vrrp[65334]: VRRP_Instance(HA_1) Sending gratuitous ARPs on eth0 for 172.16.80.100
Jul 28 14:39:46 centos01 Keepalived_vrrp[65334]: VRRP_Instance(HA_1) Sending gratuitous ARPs on eth0 for 172.16.80.100
The VIP has moved back to the master node.
Backup node log:
[root@centos02 keepalived]# tail -f /var/log/messages|grep -v "PYTHON"
Jul 28 14:39:38 centos02 Keepalived_vrrp[4364]: VRRP_Instance(HA_1) Received higher prio advert
Jul 28 14:39:38 centos02 Keepalived_vrrp[4364]: VRRP_Instance(HA_1) Entering BACKUP STATE
Jul 28 14:39:38 centos02 Keepalived_vrrp[4364]: VRRP_Instance(HA_1) removing protocol VIPs.
Jul 28 14:39:38 centos02 Keepalived_healthcheckers[4363]: Netlink reflector reports IP 172.16.80.100 removed
The VIP has been removed from the backup node.
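When comparing the two nodes' logs, it helps to reduce them to just the state changes. The following is a small sketch, not part of keepalived itself: `vrrp_transitions` is a hypothetical helper that assumes the syslog line format shown in the excerpts above and prints the time, host, and new state for every "Entering ... STATE" message.

```shell
# vrrp_transitions: read syslog lines on stdin, print "time host state" per state change.
vrrp_transitions() {
    awk '/Keepalived_vrrp/ && /Entering [A-Z]+ STATE/ {
        for (i = 1; i <= NF; i++)
            if ($i == "Entering") print $3, $4, $(i + 1)   # $3 = time, $4 = hostname
    }'
}

# Usage on a live node:
#   vrrp_transitions < /var/log/messages
```

Lines such as "Transition to MASTER STATE" are deliberately skipped, since the state only changes once the matching "Entering" message is logged.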
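Now look at the actual VIP assignment on each node.
[Figure: `ip addr` output on the master node]
[Figure: `ip addr` output on the backup node]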
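Taken as a whole, keepalived's operation and switchover behavior looks reasonable, but in practice it is not ideal. In a heavily loaded, highly concurrent business system where stability matters, every master/backup switchover has a significant impact on the service, so switchovers should be avoided unless absolutely necessary. In other words, when the master fails we must switch to the backup, but once the master recovers we do not want to switch back to it. A switch should happen again only when the backup itself fails. This is the non-preemption feature, which keepalived implements through the nopreempt option.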
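As a rough illustration of the nopreempt option, the snippet below prints a sample vrrp_instance fragment for the higher-priority node. The instance name, interface, VIP, and password come from the logs and capture in this article; `virtual_router_id 51`, `priority 100`, and `advert_int 1` are placeholders, since the real values were given in the previous article and are not repeated here. The one point that is not optional is `state BACKUP`: per the keepalived.conf documentation, nopreempt only works when the instance's initial state is BACKUP. `print_nopreempt_fragment` is just a name chosen for this example.

```shell
# Print a sample vrrp_instance fragment with non-preemption enabled.
print_nopreempt_fragment() {
    cat <<'EOF'
vrrp_instance HA_1 {
    state BACKUP            # nopreempt requires the initial state to be BACKUP
    nopreempt               # do not take the VIP back from a lower-priority master
    interface eth0
    virtual_router_id 51    # placeholder, use the value from your existing config
    priority 100            # placeholder, must be higher than the other node's 85
    advert_int 1            # placeholder
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.16.80.100
    }
}
EOF
}

# Usage: review the fragment before merging it into /etc/keepalived/keepalived.conf
#   print_nopreempt_fragment > /tmp/keepalived-nopreempt.sample
```

With this in place, the recovered node rejoins as a backup and leaves the VIP where it is until the current holder fails.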
The vrrp_script module has quite a lot to cover, so we will leave it for the next article.