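I installed keepalived with yum from the Aliyun mirror.
Problem:
I first noticed this while testing VIP failover. With keepalived running on both the master and the backup node, everything worked and the VIP on the master was reachable. After I stopped keepalived on the master, however, the VIP did not move to the backup, and the process list showed that keepalived processes were still running.
Solution:
1. I suspected the systemd unit that starts the service, so I opened the keepalived service file.
Path: vim /usr/lib/systemd/system/keepalived.service
2. Comment out the KillMode=process line
KillMode=process means, roughly, that stopping keepalived kills only the main process; the child processes it spawned are left alive. The default KillMode is control-group, which kills every process in the unit's control group. I chose to comment the line out so that the default applies.
Then reload the configuration
systemctl daemon-reload
With this change, stopping keepalived no longer leaves both nodes holding the VIP.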
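The edit described above can be scripted. The following is a minimal sketch, assuming the unit file contains an active KillMode=process line; the function name comment_out_killmode is made up for illustration, and a .bak copy of the original file is kept next to it.

```shell
#!/bin/sh
# Comment out an active "KillMode=process" line in the given unit file.
# The untouched original is saved as <file>.bak.
comment_out_killmode() {
    unit_file=$1
    sed -i.bak 's/^[[:space:]]*KillMode=process[[:space:]]*$/#KillMode=process/' "$unit_file"
}

# Example (needs root), followed by the reload from the text:
#   comment_out_killmode /usr/lib/systemd/system/keepalived.service
#   systemctl daemon-reload
```

Files under /usr/lib/systemd/system may be overwritten by a package update, so the change may have to be reapplied after upgrading keepalived.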
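I. Fault description
I was deploying an HAProxy + Keepalived high-availability load-balancing setup on two virtual machines provided by a partner in Taiwan. After configuring Keepalived and restarting it, Keepalived did not bind the VIP.
The Keepalived binary is /data/app_platform/keepalived/sbin/keepalived
The configuration file is /data/app_platform/keepalived/conf/keepalived.conf
The Keepalived init script is /etc/init.d/keepalived
Contents of keepalived.conf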
LB1 Master
! Configuration File for keepalived
global_defs {
notification_email {
[email protected]
}
notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id LB1_MASTER
}
vrrp_script chk_haproxy {
script "killall -0 haproxy"
interval 2
weight 2
}
vrrp_instance VI_1 {
state MASTER
interface eth1
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.1.1.200/24 brd 10.1.1.255 dev eth1 label eth1:vip
}
track_script {
chk_haproxy
}
}
Restart Keepalived and check the log
Mar 3 18:09:00 cv00300005248-1 Keepalived[20138]: Stopping Keepalived v1.2.15 (02/28,2015)
Mar 3 18:09:00 cv00300005248-1 Keepalived[20259]: Starting Keepalived v1.2.15 (02/28,2015)
Mar 3 18:09:00 cv00300005248-1 Keepalived[20260]: Starting Healthcheck child process, pid=20261
Mar 3 18:09:00 cv00300005248-1 Keepalived[20260]: Starting VRRP child process, pid=20262
Mar 3 18:09:00 cv00300005248-1 Keepalived_vrrp[20262]: Registering Kernel netlink reflector
Mar 3 18:09:00 cv00300005248-1 Keepalived_vrrp[20262]: Registering Kernel netlink command channel
Mar 3 18:09:00 cv00300005248-1 Keepalived_vrrp[20262]: Registering gratuitous ARP shared channel
Mar 3 18:09:00 cv00300005248-1 Keepalived_healthcheckers[20261]: Registering Kernel netlink reflector
Mar 3 18:09:00 cv00300005248-1 Keepalived_healthcheckers[20261]: Registering Kernel netlink command channel
Mar 3 18:09:00 cv00300005248-1 Keepalived_healthcheckers[20261]: Configuration is using : 3924 Bytes
Mar 3 18:09:00 cv00300005248-1 Keepalived_healthcheckers[20261]: Using LinkWatch kernel netlink reflector...
Mar 3 18:09:00 cv00300005248-1 Keepalived_vrrp[20262]: Configuration is using : 55712 Bytes
Mar 3 18:09:00 cv00300005248-1 Keepalived_vrrp[20262]: Using LinkWatch kernel netlink reflector...
Mar 3 18:09:18 cv00300005248-1 kernel: __ratelimit: 1964 callbacks suppressed
Mar 3 18:09:18 cv00300005248-1 kernel: Neighbour table overflow.
Mar 3 18:09:18 cv00300005248-1 kernel: Neighbour table overflow.
Mar 3 18:09:18 cv00300005248-1 kernel: Neighbour table overflow.
Mar 3 18:09:18 cv00300005248-1 kernel: Neighbour table overflow.
Mar 3 18:09:18 cv00300005248-1 kernel: Neighbour table overflow.
Mar 3 18:09:18 cv00300005248-1 kernel: Neighbour table overflow.
Mar 3 18:09:18 cv00300005248-1 kernel: Neighbour table overflow.
Mar 3 18:09:18 cv00300005248-1 kernel: Neighbour table overflow.
Mar 3 18:09:18 cv00300005248-1 kernel: Neighbour table overflow.
Mar 3 18:09:18 cv00300005248-1 kernel: Neighbour table overflow.
Check whether the VIP is bound
$ ifconfig eth1:vip
eth1:vip Link encap:Ethernet HWaddr 00:16:3E:F2:37:6B
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:13
No VIP is bound.
II. Troubleshooting
1) Check the VIP configuration
Confirm the details of the VIP with the partner
IPADDR 10.1.1.200
NETMASK 255.255.255.0
GATEWAY 10.1.1.1
Broadcast 10.1.1.255
The value configured here is
10.1.1.200/24 brd 10.1.1.255 dev eth1 label eth1:vip
2) Check the iptables and SELinux settings
$ sudo service iptables stop
$ sudo setenforce 0
setenforce: SELinux is disabled
If iptables has to stay enabled, add a rule such as
iptables -I INPUT -i eth1 -d 224.0.0.0/8 -j ACCEPT
service iptables save
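keepalived sends its VRRP advertisements, which Master and Backup use to check on each other, to the multicast address 224.0.0.18
3) Check the relevant kernel parameters
Kernel parameters to watch in an HAProxy+Keepalived architecture: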
# Controls IP packet forwarding
net.ipv4.ip_forward = 1
Enables IP forwarding
net.ipv4.ip_nonlocal_bind = 1
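Allows binding to IP addresses that are not local to the host
When LVS in DR or TUN mode is combined with Keepalived, two ARP-related parameters have to be set on the back-end real servers. They are set here as well.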
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
4) Check the VRRP settings
LB1 Master
state MASTER
interface eth1
virtual_router_id 51
priority 100
LB2 Backup
state BACKUP
interface eth1
virtual_router_id 51
priority 99
Master and Backup must use the same virtual_router_id and different priority values; the larger the number, the higher the priority.
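The two rules can be checked mechanically before starting the pair. This is a small sketch only: check_vrrp_pair is a hypothetical helper, its arguments are numbers copied by hand from each node's keepalived.conf, and it additionally assumes that the node labelled MASTER is meant to have the higher priority.

```shell
#!/bin/sh
# Check one master/backup pair against the rules from the text.
# Arguments: master_vrid master_priority backup_vrid backup_priority
check_vrrp_pair() {
    if [ "$1" -ne "$3" ]; then
        echo "virtual_router_id differs"
        return 1
    fi
    if [ "$2" -le "$4" ]; then
        echo "master priority must be higher than backup priority"
        return 1
    fi
    echo "ok"
}

# Values of LB1 and LB2 above
check_vrrp_pair 51 100 51 99
```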
5) Suspect the compiled Keepalived build
Download and compile version 1.2.13 instead, then restart keepalived: the VIP is still not bound.
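One of our production platforms runs a yum-installed keepalived, so I decided to install keepalived with yum, copy the configuration file over, and see whether the VIP could be bound that way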
rpm -ivh http://ftp.linux.ncsu.edu/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
yum -y install keepalived
cp /data/app_platform/keepalived/conf/keepalived.conf /etc/keepalived/keepalived.conf
Restart keepalived
Then check the log
Mar 4 16:42:46 xxxxx Keepalived_healthcheckers[17332]: Registering Kernel netlink reflector
Mar 4 16:42:46 xxxxx Keepalived_healthcheckers[17332]: Registering Kernel netlink command channel
Mar 4 16:42:46 xxxxx Keepalived_vrrp[17333]: Opening file '/etc/keepalived/keepalived.conf'.
Mar 4 16:42:46 xxxxx Keepalived_vrrp[17333]: Configuration is using : 65250 Bytes
Mar 4 16:42:46 xxxxx Keepalived_vrrp[17333]: Using LinkWatch kernel netlink reflector...
Mar 4 16:42:46 xxxxx Keepalived_vrrp[17333]: VRRP sockpool: [ifindex(3), proto(112), unicast(0), fd(10,11)]
Mar 4 16:42:46 xxxxx Keepalived_healthcheckers[17332]: Opening file '/etc/keepalived/keepalived.conf'.
Mar 4 16:42:46 xxxxx Keepalived_healthcheckers[17332]: Configuration is using : 7557 Bytes
Mar 4 16:42:46 xxxxx Keepalived_healthcheckers[17332]: Using LinkWatch kernel netlink reflector...
Mar 4 16:42:46 xxxxx Keepalived_vrrp[17333]: VRRP_Script(chk_haproxy) succeeded
Mar 4 16:42:47 xxxxx Keepalived_vrrp[17333]: VRRP_Instance(VI_1) Transition to MASTER STATE
Mar 4 16:42:48 xxxxx Keepalived_vrrp[17333]: VRRP_Instance(VI_1) Entering MASTER STATE
Mar 4 16:42:48 xxxxx Keepalived_vrrp[17333]: VRRP_Instance(VI_1) setting protocol VIPs.
Mar 4 16:42:48 xxxxx Keepalived_vrrp[17333]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth1 for 10.1.1.200
Mar 4 16:42:48 xxxxx Keepalived_healthcheckers[17332]: Netlink reflector reports IP 10.1.1.200 added
Mar 4 16:42:53 xxxxx Keepalived_vrrp[17333]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth1 for 10.1.1.200
Check the IP binding again
$ ifconfig eth1:vip
eth1:vip Link encap:Ethernet HWaddr 00:16:3E:F2:37:6B
inet addr:10.1.1.200 Bcast:10.1.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:13
Then remove the yum-installed keepalived
yum remove keepalived
Restore the original init script /etc/init.d/keepalived
After restarting keepalived the VIP still cannot be bound
So the init script /etc/init.d/keepalived is the suspect
Inspect /etc/init.d/keepalived
# Source function library.
. /etc/rc.d/init.d/functions
exec="/data/app_platform/keepalived/sbin/keepalived"
prog="keepalived"
config="/data/app_platform/keepalived/conf/keepalived.conf"
[ -e /etc/sysconfig/$prog ] && . /etc/sysconfig/$prog
lockfile=/var/lock/subsys/keepalived
start() {
[ -x $exec ] || exit 5
[ -e $config ] || exit 6
echo -n $"Starting $prog: "
daemon $exec $KEEPALIVED_OPTIONS
retval=$?
echo
[ $retval -eq 0 ] && touch $lockfile
return $retval
}
The key line is
daemon $exec $KEEPALIVED_OPTIONS
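Because /etc/sysconfig/keepalived was never copied over, $KEEPALIVED_OPTIONS is empty and the script simply runs daemon /data/app_platform/keepalived/sbin/keepalived
keepalived reads /etc/keepalived/keepalived.conf by default, but this installation keeps its configuration elsewhere, so the line has to be changed to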
daemon $exec -D -f $config
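The rule behind this fix can be written as a small helper. The sketch below is illustrative only: keepalived_start_args is a made-up function name, and it assumes the default path quoted above. -D and -f are the same options used in the corrected line.

```shell
#!/bin/sh
DEFAULT_CONF=/etc/keepalived/keepalived.conf

# Print the options to pass to keepalived for a given configuration file:
# -D alone for the default path, -D -f <file> for any other path.
keepalived_start_args() {
    if [ "$1" = "$DEFAULT_CONF" ]; then
        echo "-D"
    else
        echo "-D -f $1"
    fi
}

# The non-default path used in this installation
keepalived_start_args /data/app_platform/keepalived/conf/keepalived.conf
```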
Restart keepalived, then check the log and the VIP binding
$ ifconfig eth1:vip
eth1:vip Link encap:Ethernet HWaddr 00:16:3E:F2:37:6B
inet addr:10.1.1.200 Bcast:10.1.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:13
6) Apply the same init-script change on LB2 Backup and observe the VIP takeover
On LB1 Master
$ ifconfig eth1:vip
eth1:vip Link encap:Ethernet HWaddr 00:16:3E:F2:37:6B
inet addr:10.1.1.200 Bcast:10.1.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:13
On LB2 Backup
$ ifconfig eth1:vip
eth1:vip Link encap:Ethernet HWaddr 00:16:3E:F2:37:6B
inet addr:10.1.1.200 Bcast:10.1.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:13
Here is the problem: LB1 Master and LB2 Backup both hold VIP 10.1.1.200, which is not normal!
SSH to 10.1.1.200 from LB1 and from LB2
[lb1 ~]$ ssh 10.1.1.200
Last login: Wed Mar 4 17:31:33 2015 from 10.1.1.200
[lb1 ~]$
[lb2 ~]$ ssh 10.1.1.200
Last login: Wed Mar 4 17:54:57 2015 from 101.95.153.246
[lb2 ~]$
Stop keepalived on LB1 and ping 10.1.1.200: no reply
Stop keepalived on LB2 and ping 10.1.1.200: no reply either
Start keepalived on LB1 again: LB1 can ping 10.1.1.200, LB2 cannot
Start keepalived on LB2: LB2 can ping 10.1.1.200
Conclusion: LB1 and LB2 each bind VIP 10.1.1.200 to their own eth1. The two hosts are not exchanging VRRP messages, so no priority comparison takes place.
7) Find out what is blocking VRRP communication
Restart Keepalived on LB1 Master and check the log
Mar 5 15:45:36 gintama-taiwan-lb1 Keepalived_vrrp[32303]: Configuration is using : 65410 Bytes
Mar 5 15:45:36 gintama-taiwan-lb1 Keepalived_vrrp[32303]: Using LinkWatch kernel netlink reflector...
Mar 5 15:45:36 gintama-taiwan-lb1 Keepalived_vrrp[32303]: VRRP sockpool: [ifindex(3), proto(112), unicast(0), fd(10,11)]
Mar 5 15:45:36 gintama-taiwan-lb1 Keepalived_vrrp[32303]: VRRP_Script(chk_haproxy) succeeded
Mar 5 15:45:37 gintama-taiwan-lb1 Keepalived_vrrp[32303]: VRRP_Instance(VI_1) Transition to MASTER STATE
Mar 5 15:45:38 gintama-taiwan-lb1 Keepalived_vrrp[32303]: VRRP_Instance(VI_1) Entering MASTER STATE
Mar 5 15:45:38 gintama-taiwan-lb1 Keepalived_vrrp[32303]: VRRP_Instance(VI_1) setting protocol VIPs.
Mar 5 15:45:38 gintama-taiwan-lb1 Keepalived_vrrp[32303]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth1 for 10.1.1.200
Mar 5 15:45:38 gintama-taiwan-lb1 Keepalived_healthcheckers[32302]: Netlink reflector reports IP 10.1.1.200 added
Mar 5 15:45:43 gintama-taiwan-lb1 Keepalived_vrrp[32303]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth1 for 10.1.1.200
Keepalived on LB1 Master goes straight to MASTER state and takes over the VIP
Now restart Keepalived on LB2 Backup and check the log
Mar 5 15:47:42 gintama-taiwan-lb2 Keepalived_vrrp[30619]: Configuration is using : 65408 Bytes
Mar 5 15:47:42 gintama-taiwan-lb2 Keepalived_vrrp[30619]: Using LinkWatch kernel netlink reflector...
Mar 5 15:47:42 gintama-taiwan-lb2 Keepalived_vrrp[30619]: VRRP_Instance(VI_1) Entering BACKUP STATE
Mar 5 15:47:42 gintama-taiwan-lb2 Keepalived_vrrp[30619]: VRRP sockpool: [ifindex(3), proto(112), unicast(0), fd(10,11)]
Mar 5 15:47:46 gintama-taiwan-lb2 Keepalived_vrrp[30619]: VRRP_Instance(VI_1) Transition to MASTER STATE
Mar 5 15:47:47 gintama-taiwan-lb2 Keepalived_vrrp[30619]: VRRP_Instance(VI_1) Entering MASTER STATE
Mar 5 15:47:47 gintama-taiwan-lb2 Keepalived_vrrp[30619]: VRRP_Instance(VI_1) setting protocol VIPs.
Mar 5 15:47:47 gintama-taiwan-lb2 Keepalived_vrrp[30619]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth1 for 10.1.1.200
Mar 5 15:47:47 gintama-taiwan-lb2 Keepalived_healthcheckers[30618]: Netlink reflector reports IP 10.1.1.200 added
Mar 5 15:47:52 gintama-taiwan-lb2 Keepalived_vrrp[30619]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth1 for 10.1.1.200
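Keepalived on LB2 first enters BACKUP state, then switches to MASTER and takes over the VIP
So LB2 never sees LB1's advertisements, which points to a problem with VRRP multicast.
Since multicast is the problem, try sending the VRRP packets by unicast instead. Change the configuration on LB1 and LB2
LB1
Add the following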
unicast_src_ip 10.1.1.12
unicast_peer {
10.1.1.17
}
LB2
Add the following
unicast_src_ip 10.1.1.17
unicast_peer {
10.1.1.12
}
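unicast_src_ip is the source IP address used for sending VRRP unicast packets
unicast_peer is the IP address of the peer that receives the VRRP unicast packets
Then reload keepalived on each node and watch the log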
LB1
Mar 5 16:13:35 gintama-taiwan-lb1 Keepalived_vrrp[2551]: VRRP_Instance(VI_1) setting protocol VIPs.
Mar 5 16:13:35 gintama-taiwan-lb1 Keepalived_vrrp[2551]: VRRP_Script(chk_haproxy) considered successful on reload
Mar 5 16:13:35 gintama-taiwan-lb1 Keepalived_vrrp[2551]: Configuration is using : 65579 Bytes
Mar 5 16:13:35 gintama-taiwan-lb1 Keepalived_vrrp[2551]: Using LinkWatch kernel netlink reflector...
Mar 5 16:13:35 gintama-taiwan-lb1 Keepalived_vrrp[2551]: VRRP sockpool: [ifindex(3), proto(112), unicast(1), fd(10,11)]
Mar 5 16:13:36 gintama-taiwan-lb1 Keepalived_vrrp[2551]: VRRP_Instance(VI_1) Transition to MASTER STATE
Mar 5 16:13:48 gintama-taiwan-lb1 Keepalived_vrrp[2551]: VRRP_Instance(VI_1) Received lower prio advert, forcing new election
Mar 5 16:13:48 gintama-taiwan-lb1 Keepalived_vrrp[2551]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth1 for 10.1.1.200
Mar 5 16:13:48 gintama-taiwan-lb1 Keepalived_vrrp[2551]: VRRP_Instance(VI_1) Received lower prio advert, forcing new election
Mar 5 16:13:48 gintama-taiwan-lb1 Keepalived_vrrp[2551]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth1 for 10.1.1.200
LB2
Mar 5 16:13:48 gintama-taiwan-lb2 Keepalived_vrrp[453]: VRRP_Instance(VI_1) Received higher prio advert
Mar 5 16:13:48 gintama-taiwan-lb2 Keepalived_vrrp[453]: VRRP_Instance(VI_1) Entering BACKUP STATE
Mar 5 16:13:48 gintama-taiwan-lb2 Keepalived_vrrp[453]: VRRP_Instance(VI_1) removing protocol VIPs.
Mar 5 16:13:48 gintama-taiwan-lb2 Keepalived_healthcheckers[452]: Netlink reflector reports IP 10.1.1.200 removed
Check the VIP binding: the VIP has been removed from LB2
Ping the VIP 10.1.1.200 from LB1 and from LB2
[lb1 ~]$ ping -c 5 10.1.1.200
PING 10.1.1.200 (10.1.1.200) 56(84) bytes of data.
64 bytes from 10.1.1.200: icmp_seq=1 ttl=64 time=0.028 ms
64 bytes from 10.1.1.200: icmp_seq=2 ttl=64 time=0.020 ms
64 bytes from 10.1.1.200: icmp_seq=3 ttl=64 time=0.020 ms
64 bytes from 10.1.1.200: icmp_seq=4 ttl=64 time=0.021 ms
64 bytes from 10.1.1.200: icmp_seq=5 ttl=64 time=0.027 ms
--- 10.1.1.200 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 3999ms
rtt min/avg/max/mdev = 0.020/0.023/0.028/0.004 ms
[lb2 ~]$ ping -c 5 10.1.1.200
PING 10.1.1.200 (10.1.1.200) 56(84) bytes of data.
--- 10.1.1.200 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 14000ms
While LB1 holds the VIP, LB2 cannot ping it. Likewise, after stopping Keepalived on LB1, LB2 takes over the VIP, but LB1 cannot ping it
Capture packets on LB1 and LB2
[lb1 ~]$ sudo tcpdump -vvv -i eth1 host 10.1.1.17
tcpdump: listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
16:46:04.827357 IP (tos 0xc0, ttl 255, id 328, offset 0, flags [none], proto VRRP (112), length 40)
10.1.1.12 > 10.1.1.17: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20, addrs: 10.1.1.200 auth "1111^@^@^@^@"
16:46:05.827459 IP (tos 0xc0, ttl 255, id 329, offset 0, flags [none], proto VRRP (112), length 40)
10.1.1.12 > 10.1.1.17: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20, addrs: 10.1.1.200 auth "1111^@^@^@^@"
16:46:06.828234 IP (tos 0xc0, ttl 255, id 330, offset 0, flags [none], proto VRRP (112), length 40)
10.1.1.12 > 10.1.1.17: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20, addrs: 10.1.1.200 auth "1111^@^@^@^@"
16:46:07.828338 IP (tos 0xc0, ttl 255, id 331, offset 0, flags [none], proto VRRP (112), length 40)
10.1.1.12 > 10.1.1.17: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20, addrs: 10.1.1.200 auth "1111^@^@^@^@"
[lb2 ~]$ sudo tcpdump -vvv -i eth1 host 10.1.1.12
tcpdump: listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
16:48:07.000029 IP (tos 0xc0, ttl 255, id 450, offset 0, flags [none], proto VRRP (112), length 40)
10.1.1.12 > 10.1.1.17: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20, addrs: 10.1.1.200 auth "1111^@^@^@^@"
16:48:07.999539 IP (tos 0xc0, ttl 255, id 451, offset 0, flags [none], proto VRRP (112), length 40)
10.1.1.12 > 10.1.1.17: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20, addrs: 10.1.1.200 auth "1111^@^@^@^@"
16:48:08.999252 IP (tos 0xc0, ttl 255, id 452, offset 0, flags [none], proto VRRP (112), length 40)
10.1.1.12 > 10.1.1.17: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20, addrs: 10.1.1.200 auth "1111^@^@^@^@"
16:48:09.999560 IP (tos 0xc0, ttl 255, id 453, offset 0, flags [none], proto VRRP (112), length 40)
10.1.1.12 > 10.1.1.17: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20, addrs: 10.1.1.200 auth "1111^@^@^@^@"
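Captures like these are easier to read when reduced to the list of hosts that are sending advertisements. The following sketch assumes tcpdump text output in the format shown above; vrrp_senders is a hypothetical helper name. In both captures here it would print only 10.1.1.12, i.e. only LB1 is advertising. Two different senders for the same vrid would suggest that both nodes consider themselves master.

```shell
#!/bin/sh
# Read tcpdump text output on stdin and print each distinct source address
# that sent a VRRP advertisement, one per line.
vrrp_senders() {
    awk '/VRRPv[23], Advertisement/ {
             for (i = 2; i <= NF; i++) if ($i == ">") print $(i - 1)
         }' | sort -u
}

# Example (needs root):
#   sudo tcpdump -nn -c 20 -i eth1 ip proto 112 | vrrp_senders
```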
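Ping tests from other virtual machines on the physical hosts of LB1 and LB2 gave the same picture: when LB1 holds the VIP, only virtual machines on LB1's physical host can ping it and those on LB2's physical host cannot, and the reverse is also true
A colleague suggested that DHCP might interfere with VRRP. The partner's VMs did turn out to get their IP addresses from DHCP, but testing showed that DHCP had nothing to do with the VRRP problem. Even so, production hosts should not rely on DHCP for their addresses.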
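8) Ask the partner's engineers to help investigate why the VIP cannot be pinged
It finally turned out that the partner's internal network is a virtual network in which the gateway does nothing. Some virtual machines therefore could not reach the VIP through the gateway 10.1.1.1.
The VM provider's engineers logged in to the servers to debug HAProxy+Keepalived and, through network changes on their side, made VIP 10.1.1.200 reachable over the internal network. When I tested again, though, Keepalived did not move the VIP when HAProxy went down.
9) Fix the VIP not moving when HAProxy goes down.
Repeated tests showed that the VIP moves when Keepalived itself dies, but not when HAProxy dies.
After going through the configuration file and the documentation, the cause turned out to be the relative sizes of Keepalived's weight and priority parameters.
In my original configuration LB1 had weight 2 and priority 100, and LB2 had weight 2 and priority 99
While debugging, the partner had changed LB1's priority to 160. With that value, however often HAProxy on LB1 was killed, the VIP never moved to LB2. Setting LB1's priority back to 100 fixed it.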
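Points to note:
The master's priority minus the vrrp_script weight must be smaller than the backup's priority.
A vrrp_script whose script returns 0 is treated as a successful check; any other return value counts as a failure
* When weight is positive, it is added to priority while the check succeeds and is not added when the check fails.
Master check fails:
a switch happens when master priority < backup priority + weight.
Master check succeeds:
the master stays master when master priority + weight > backup priority + weight
* When weight is negative, a successful check leaves priority unchanged and a failed check lowers it to priority - abs(weight)
Master check fails:
a switch happens when master priority - abs(weight) < backup priority
Master check succeeds:
the master stays master when master priority > backup priority.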
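The arithmetic can be replayed with the numbers from this case. This is a sketch of the rules as stated above, not of keepalived's internals; the function names are invented for illustration, and the case of equal effective priorities is not handled.

```shell
#!/bin/sh
# Effective priority of a node for a positive or negative vrrp_script weight.
# Arguments: priority weight check_ok   (check_ok: 1 = script succeeded, 0 = failed)
effective_priority() {
    prio=$1; weight=$2; ok=$3
    if [ "$weight" -gt 0 ] && [ "$ok" -eq 1 ]; then
        echo $(( prio + weight ))      # positive weight is added on success
    elif [ "$weight" -lt 0 ] && [ "$ok" -eq 0 ]; then
        echo $(( prio + weight ))      # negative weight is subtracted on failure
    else
        echo "$prio"
    fi
}

# Does the VIP move when the master's check fails and the backup's succeeds?
# Arguments: master_priority backup_priority weight
vip_moves_on_master_failure() {
    m=$(effective_priority "$1" "$3" 0)
    b=$(effective_priority "$2" "$3" 1)
    if [ "$m" -lt "$b" ]; then echo yes; else echo no; fi
}

vip_moves_on_master_failure 100 99 2   # original setup: 100 < 99 + 2
vip_moves_on_master_failure 160 99 2   # LB1 raised to 160: 160 is not below 101
```

The first call prints yes and the second prints no, which matches what was observed: with priority 160 on LB1 a weight of 2 can never lift LB2 above it.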
The final configuration file:
! Configuration File for keepalived
global_defs {
notification_email {
[email protected]
}
notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id LB1_MASTER
}
vrrp_script chk_haproxy {
script "killall -0 haproxy"
interval 2
weight 2
}
# VIP on the public network
vrrp_instance eth0_VIP {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
unicast_src_ip 8.8.8.6 # use VRRP unicast
unicast_peer {
8.8.8.7
}
virtual_ipaddress {
8.8.8.8/25 brd 8.8.8.255 dev eth0 label eth0:vip
}
track_script {
chk_haproxy
}
}
# VIP on the internal network
vrrp_instance eth1_VIP {
state MASTER
interface eth1
virtual_router_id 52
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
unicast_src_ip 10.1.1.12
unicast_peer {
10.1.1.17
}
virtual_ipaddress {
10.1.1.200/24 brd 10.1.1.255 dev eth1 label eth1:vip
}
track_script {
chk_haproxy
}
}
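III. Summary
When configuring Keepalived, keep the following in mind:
A. Enable IP forwarding and non-local IP binding in the kernel; with LVS in DR mode, also set the two ARP-related parameters.
B. If the network Keepalived runs on does not allow multicast, use VRRP unicast.
C. Watch the weight and priority values on master and backup; badly chosen values can prevent the VIP from moving.
D. If the configuration file is not at the default location, pass it to Keepalived with -f at startup.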
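Protocol involved:
VRRP
Originally used to make enterprise routers highly available:
high availability
Manages LVS and makes LVS highly available
How it works:
[Figure: how keepalived works (keepalived原理.png)]
Hands-on configuration
Step 1: install Keepalived on both load balancers, lb01 and lb02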
[root@lb01 ~]# yum install -y keepalived
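Step 2: the keepalived configuration file explained
It has three parts: GLOBAL CONFIGURATION (global definitions)
VRRPD CONFIGURATION (VRRP instances, comparable to rsync modules)
LVS CONFIGURATION (controls LVS from the keepalived configuration file)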
[root@lb01 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
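global_defs { ---global definitions
router_id lb01 ---identifier/name of this keepalived instance
}
vrrp_instance VI_1 { ---vrrp_instance: the VRRP instance section
---the instance name must be the same on a master/backup pair
state MASTER ---state: MASTER or BACKUP
interface eth0 ---which network interface
virtual_router_id 51 ---virtual router ID, must be the same on a master/backup pair
priority 150 ---priority; keep a gap of 50 between the pair (master 150, backup 100)
advert_int 1 ---advertisement (heartbeat) interval, 1s
authentication { ---simple authentication
auth_type PASS
auth_pass 1111
}
virtual_ipaddress { ---virtual IP
10.0.0.3/24 dev eth0 label eth0:1 ---dev: interface; label: an alias for the interface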
}
}
Step 3: what to change in the configuration file
Items to change: router_id, state, priority
Master configuration
[root@lb01 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id lb01
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 150
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.0.0.3/24 dev eth0 label eth0:1
}
}
Backup configuration
[root@lb02 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id lb02
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.0.0.3/24 dev eth0 label eth0:1
}
}
Step 4: test VIP failover
1. If nginx dies, stop keepalived as well so that the VIP moves to the other load balancer
Write a script
[root@lb01 /server/scripts]# vim chk_ngx.sh
#!/bin/sh
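# Count running nginx processes (the grep itself is excluded)
count=$(ps -ef | grep nginx | grep -v grep | wc -l)
# No nginx left: stop keepalived so the VIP moves to the other node
if [ "$count" -eq 0 ]; then
systemctl stop keepalived
fi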
2. Let keepalived monitor the state of nginx
First make the script executable
[root@lb01 /server/scripts]# chmod +x /server/scripts/chk_ngx.sh
Then edit the configuration file
[root@lb01 /server/scripts]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id lb01
}
Add the following 5 lines
vrrp_script chk_ngx {
script "/server/scripts/chk_ngx.sh"
interval 2
weight 1
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 150
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.0.0.3/24 dev eth0 label eth0:1
}
Add the following 3 lines
track_script {
chk_ngx
}
}
3. Test: stop nginx and the VIP moves to the other load balancer
On lb01 run
[root@lb01 /server/scripts]# systemctl stop nginx
[root@lb01 /server/scripts]# ip a |grep 0.3
The VIP shows up on lb02
[root@lb02 ~]# ip a|grep 0.3
inet 10.0.0.3/24 scope global secondary eth0:1
Purpose: spread the load across both load balancers
1. How to configure dual-master
On lb01
! Configuration File for keepalived
global_defs {
router_id lb01
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 150
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.0.0.3/24 dev eth0 label eth0:1
}
}
vrrp_instance VI_2 {
state BACKUP
interface eth0
virtual_router_id 52
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.0.0.4/24 dev eth0 label eth0:2
}
}
After editing the configuration file, check the IP
[Screenshot: 10.0.0.3.png]
On lb02
[root@lb02 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id lb02
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.0.0.3/24 dev eth0 label eth0:1
}
}
vrrp_instance VI_2 {
state MASTER
interface eth0
virtual_router_id 52
priority 150
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.0.0.4/24 dev eth0 label eth0:2
}
}
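After editing the configuration file, check the IP
[Screenshot: 10.0.0.4.png]
2. Then edit the nginx configuration file; keep it identical on both load balancers.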
[root@root]# vim /etc/nginx/nginx.conf
user nginx;
worker_processes 1;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
events {
worker_connections 1024;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
access_log /var/log/nginx/access.log main;
sendfile on;
#tcp_nopush on;
keepalive_timeout 65;
#gzip on;
upstream web_pools {
# ip_hash;
server 10.0.0.7:80 weight=2 max_fails=3 fail_timeout=10s;
server 10.0.0.8:80 weight=1 max_fails=3 fail_timeout=10s;
}
#include /etc/nginx/conf.d/*.conf;
server {
listen 80;
server_name www.oldboy.com;
location / {
if ( $remote_addr ~ "^192.168.22.") {
return 403 "biedaoluan\n";
}
proxy_pass http://web_pools;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $remote_addr;
}
}
server {
listen 80;
server_name blog.oldboy.com;
location / {
proxy_pass http://web_pools;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $remote_addr;
}
}
}
Add entries to the local hosts file
10.0.0.3 www.oldboy.com
10.0.0.4 status.oldboy.com blog.oldboy.com
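Stop one load balancer; the sites should still load in the browser
Test blog.oldboy.com and www.oldboy.com in the browser
How to serve a site only on a specific IP
Specify the IP in the nginx configuration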
#include /etc/nginx/conf.d/*.conf;
server {
listen 10.0.0.3:80; # bind to this IP
server_name www.oldboy.com;
location / {
proxy_pass http://web_pools;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $remote_addr;
}
}
server {
listen 10.0.0.4:80; # bind to this IP
server_name blog.oldboy.com;
location / {
proxy_pass http://web_pools;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $remote_addr;
}
}
The syntax check reports an error:
[root@lb01 /etc/nginx]# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: [emerg] bind() to 10.0.0.4:80 failed (99: Cannot assign requested address)
nginx: configuration file /etc/nginx/nginx.conf test failed
nginx cannot bind to an IP address that is not present on the host
Fix: change a kernel parameter, on both load balancers
[root@lb01 ] vim /etc/sysctl.conf (append as the last line)
net.ipv4.ip_nonlocal_bind = 1
# apply the change
sysctl -p
Now the syntax check passes
[root@lb01 ~]# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
Then test
Where kernel parameters are stored
[root@lb01 ~]# #net.ipv4.ip_nonlocal_bind
[root@lb01 ~]# # /proc/sys/
[root@lb01 ~]# #net.ipv4.ip_nonlocal_bind
[root@lb01 ~]# cat /proc/sys/net/ipv4/ip_nonlocal_bind
1
[root@lb01 ~]# echo 0 >/proc/sys/net/ipv4/ip_nonlocal_bind
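The mapping shown in this session is mechanical: the dots in the parameter name become directory separators under /proc/sys. A one-function sketch (sysctl_to_proc_path is a made-up name; it assumes no component of the name itself contains a dot):

```shell
#!/bin/sh
# Convert a sysctl name to the file under /proc/sys that holds its value.
sysctl_to_proc_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

sysctl_to_proc_path net.ipv4.ip_nonlocal_bind
```

A value written directly to the /proc file takes effect immediately but is lost on reboot, whereas a line in /etc/sysctl.conf is reapplied at boot and by sysctl -p. Note that the echo 0 in the session above switches non-local binding off again.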
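How to prevent split brain
Split brain: problem and solutions (fixing and preventing keepalived split brain)
I. keepalived split brain
II. What is split brain?
III. Causes of keepalived split brain
IV. Common solutions
V. Solving the keepalived split-brain problem
VI. A keepalived split-brain problem I ran into
VII. Preventing keepalived split brain
VIII. Recommended: write your own script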
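I. keepalived split brain
Keepalived monitors the state of servers. If a web server goes down or malfunctions, Keepalived detects it, removes the faulty server from the pool, and lets the other servers take over its work. Once the server is healthy again, Keepalived adds it back to the pool automatically. All of this happens without human intervention; the only manual work is repairing the failed server.
II. What is split brain?
When two high-availability servers cannot detect each other's heartbeat within the configured time, each starts failover and takes ownership of the resources and services, although both are in fact still alive and running normally. The same service then starts on both sides and conflicts. The most serious case is both hosts holding the same VIP (comparable to importing at both ends): user writes may land on either side, leaving the two servers with inconsistent data or losing data. This situation is called split brain; some people also call it a partitioned cluster or a vertically divided brain.
With split brain the nodes compete for the same IP, just like an ordinary IP address conflict on a LAN: one or both machines misbehave and users cannot reach the service reliably. For critical high-availability services such as databases or storage, user data ends up written alternately to the two servers, and recovering it afterwards is extremely difficult or impossible.
Split brain: in a high-availability (HA) system, when the link between two connected nodes is broken, a system that used to act as one splits into two independent nodes. Both start to fight over the shared resources, which leads to chaos and data corruption.
For HA of stateless services split brain does not matter much; for HA of stateful services (MySQL, for example) it must be strictly prevented.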
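III. Causes of keepalived split brain
Split brain generally has the following causes.
Look at the heartbeat link first, then at the heartbeat service and other software-level problems.
1) The heartbeat link between the HA pair fails, so the nodes cannot communicate. For example:
1. the heartbeat cable itself is faulty (broken or aged);
2. a NIC or its driver is faulty, or there is an IP configuration problem or conflict;
3. a device on the heartbeat path fails (a switch or a NIC);
4. the arbitration server has a problem.
2) A firewall on the HA pair blocks the heartbeat messages;
3) The address or other settings of the heartbeat NIC on the HA pair are wrong, so heartbeats cannot be sent;
4) Other misconfiguration, such as mismatched heartbeat methods, heartbeat broadcast conflicts, or software bugs.
5) In Keepalived, a virtual_router_id that differs between the two ends of the same VRRP instance also causes split brain.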
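IV. Common solutions
In production, split brain can be prevented in several ways:
Connect the nodes with both a serial cable and an Ethernet cable, i.e. use two heartbeat links, so that if one fails the other still carries the heartbeat.
When split brain is detected, forcibly shut down one node (this needs special equipment such as STONITH or fence devices). In effect, when the backup stops receiving heartbeats it sends a power-off command to the master over a separate line.
Monitor and alert on split brain (e-mail, SMS, or an on-call rota) so that a person can step in and arbitrate as soon as it happens, limiting the damage. Baidu's alerting SMS, for example, works in both directions: the alert goes to the administrator's phone, and the administrator replies with a number or a short string that is passed back to the server, which then handles the fault automatically according to that instruction. This shortens the time to resolution.
Of course, when designing an HA setup you have to decide from the business requirements whether such a loss is tolerable. For ordinary website traffic it usually is.
In a multi-node cluster, add an arbitration mechanism that decides who gets the resources. Some ideas:
1. Add an arbiter such as a reference IP. When the heartbeat is completely lost, both nodes ping the reference IP. A node that cannot reach it concludes that the break is on its own side and gives up the contest, leaving the node that can reach the reference IP to take over the service.
2. Let third-party software arbitrate who gets the resources; Alibaba uses software of this kind.
Use a disk lock. The serving node locks the shared disk, so that during split brain the other side cannot take it. Disk locks have a significant drawback, though: if the node holding the shared disk does not release the lock, the other node never gets the disk. If the serving node suddenly hangs or crashes, it can never run the unlock command, and the standby cannot take over the shared resources and services. This is why some HA designs use a "smart" lock: the serving node takes the disk lock only when it sees that all heartbeat links are down, and leaves the disk unlocked otherwise.
Alert before the takeover to leave people time to react: alert within 1 minute, for example, but delay the takeover to 5 minutes. The takeover takes longer and users cannot write during that window, but no data is lost. Alternatively, after the alert, let a person perform the takeover instead of the servers doing it automatically.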
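The reference-IP arbitration described in this section boils down to a three-way decision. The sketch below only encodes that decision; arbitrate is a hypothetical function, and how the two inputs are measured (for example with ping) is left to the caller.

```shell
#!/bin/sh
# Decide what a node should do from two observations.
# Arguments: peer_heartbeat_ok reference_ip_ok   (1 = reachable, 0 = not)
arbitrate() {
    if [ "$1" -eq 1 ]; then
        echo "normal"          # heartbeat is fine, nothing to arbitrate
    elif [ "$2" -eq 1 ]; then
        echo "take-over"       # peer lost, but the reference IP still answers
    else
        echo "give-up"         # the break is on this side: release the resources
    fi
}

arbitrate 0 0
```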
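V. Solving the keepalived split-brain problem
Detection idea: normally the keepalived VIP lives on the master. If the VIP shows up on the backup, raise an alert. The script (run on the backup) is: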
vim split-brainc_check.sh
#!/bin/bash
# Split-brain check script, deployed on the backup node
LB01_VIP=192.168.1.229
LB01_IP=192.168.1.129
LB02_IP=192.168.1.130
while true
do
ping -c 2 -W 3 $LB01_VIP &>/dev/null
if [ $? -eq 0 -a `ip add|grep "$LB01_VIP"|wc -l` -eq 1 ];then
echo "ha is brain."
else
echo "ha is ok"
fi
sleep 5
done
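VI. A keepalived split-brain problem I ran into
(With iptables enabled and no rule that lets the system accept VRRP, split brain occurs.)
While building a keepalived+Nginx master/backup environment, I rebooted the backup machine and found that both machines held the VIP, i.e. keepalived was in split brain. The network connectivity between the two hosts checked out fine. I then captured packets on the backup: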
# tcpdump -i eth0|grep VRRP
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
22:10:17.146322 IP 192.168.1.54 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 51, prio 160, authtype simple, intvl 1s, length 20
22:10:17.146577 IP 192.168.1.96 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 51, prio 50, authtype simple, intvl 1s, length 20
22:10:17.146972 IP 192.168.1.54 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 51, prio 160, authtype simple, intvl 1s, length 20
22:10:18.147136 IP 192.168.1.96 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 51, prio 50, authtype simple, intvl 1s, length 20
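The capture shows that the backup does receive the master's VRRP multicast, so why is there still split brain?
Then I noticed that iptables was running and checked the firewall configuration: the system was not accepting VRRP. I changed iptables to allow it: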
-A INPUT -i lo -j ACCEPT
-----------------------------------------------------------------------------------------
I added the following iptables rules:
-A INPUT -s 192.168.1.0/24 -d 224.0.0.18 -j ACCEPT # allow traffic to the VRRP multicast address
-A INPUT -s 192.168.1.0/24 -p vrrp -j ACCEPT # allow VRRP (Virtual Router Redundancy Protocol) traffic
----------------------------------------------------------------------------------------
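Finally I restarted iptables, and the VIP disappeared from the backup.
The problem was solved, but it is worth noting why the backup could capture the master's VRRP packets and still not change state: tcpdump sees packets as they arrive at the NIC, before iptables processes them.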
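VII. Preventing keepalived split brain
(1) Use third-party arbitration. In a keepalived setup the state of each node depends on the other. If communication between master and backup breaks, split brain occurs: both nodes become master and compete for the resources.
(2) Arbitration usually solves this, i.e. each node has to judge its own state. The simplest approach is to add a check to the keepalived configuration on both nodes that pings the gateway periodically; a node that cannot reach the gateway assumes the fault is on its own side.
(3) The easiest way is keepalived's own vrrp_script and track_script, as shown below: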
#vim /etc/keepalived/keepalived.conf
......
vrrp_script check_local {
script "/root/check_gateway.sh"
interval 5
}
......
track_script {
check_local
}
Script contents:
# cat /root/check_gateway.sh
#!/bin/sh
VIP=$1
GATEWAY=192.168.1.1
/sbin/arping -I em1 -c 5 -s $VIP $GATEWAY >/dev/null 2>&1
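check_gateway.sh is the arbitration logic: when the gateway cannot be reached the script returns non-zero, keepalived treats the check as failed, and the node stops acting as master.
VIII. Recommended: write your own script
Write a while loop that pings the gateway in every round and counts consecutive failures; once the count reaches a threshold, run service keepalived stop to shut keepalived down.
When the gateway answers again, start keepalived again. Finally, add a check at the top of the script for whether it is already running, and add the script to crontab.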
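A minimal sketch of such a loop follows. The gateway address is taken from the example above; the threshold, the interval, the function names and the use of pidof to detect a running keepalived are assumptions made for illustration. The "already running" guard and the crontab entry are left out.

```shell
#!/bin/sh
GATEWAY=${GATEWAY:-192.168.1.1}   # gateway from the example above
MAX_FAILS=${MAX_FAILS:-3}         # assumed threshold of consecutive failures
INTERVAL=${INTERVAL:-5}           # assumed seconds between rounds

# New consecutive-failure count: reset on success, increment on failure.
# Arguments: previous_count ping_exit_status
next_fail_count() {
    if [ "$2" -eq 0 ]; then echo 0; else echo $(( $1 + 1 )); fi
}

# Action for this round: stop, start or none.
# Arguments: fail_count keepalived_running (1 or 0)
decide_action() {
    if [ "$1" -ge "$MAX_FAILS" ] && [ "$2" -eq 1 ]; then
        echo stop
    elif [ "$1" -eq 0 ] && [ "$2" -eq 0 ]; then
        echo start
    else
        echo none
    fi
}

monitor_gateway() {
    fails=0
    while true; do
        ping -c 1 -W 2 "$GATEWAY" >/dev/null 2>&1
        rc=$?
        fails=$(next_fail_count "$fails" "$rc")
        if pidof keepalived >/dev/null 2>&1; then running=1; else running=0; fi
        case $(decide_action "$fails" "$running") in
            stop)  service keepalived stop ;;
            start) service keepalived start ;;
        esac
        sleep "$INTERVAL"
    done
}

# Call monitor_gateway (as root) to start the loop; it is not started here.
```

One consequence of this design: the loop starts keepalived whenever the gateway answers and keepalived is not running, so it would also undo a deliberate manual stop. Pause the script before doing maintenance.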