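With internet companies opening their recruiting season, I have been busy job hunting lately. I sent a résumé or an online application to pretty much every company with an operations (ops) opening, and with some of them I am still interviewing or waiting to hear back. In this post I dig into one LVS question from my first-round interview at YY; I will write up the rest of my interviews and written tests once everything is over.
Why pick a question from the YY first round? Because that interview left a deep impression on me. Competition for YY's ops positions is fierce this year: 10 hires nationwide, and according to the interviewer only 4 of them in Guangzhou. My school is neither a 985 nor a 211 university, just an ordinary second-tier (2A) undergraduate college. No joke: judging by the sign-in sheet posted outside the written-test room, my school was the lowest-ranked in the room apart from one student from Hanshan Normal University. Never mind South China University of Technology and Sun Yat-sen University, there were also candidates from Harbin Institute of Technology and even the University of Hong Kong. That put real pressure on me, but I gave it a try anyway. The first round felt every bit as hard as a second-round interview at Tencent. The interviewer held the initiative the whole time and kept a tight grip on it. He asked about essentially everything on my résumé, item by item and in depth. The LVS topic analyzed below is one of the things he asked about. So far I have been through three rounds at YY (first technical interview, second technical interview, HR interview) and I am waiting for the result.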
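Enough preamble. Here is the question from the first round: Are you familiar with LVS? Have you ever captured and analyzed the packets yourself?
Question 1: How do the MASTER and the BACKUP know which of them has the higher priority, that is, who becomes MASTER and who becomes BACKUP?
Question 2: When the MASTER goes down, how does the BACKUP take over the service?
Question 3: If both servers are configured as MASTER, which one becomes the real MASTER?
Aside: to be honest, I have worked with LVS for quite a while and used it in a project during my internship, but my knowledge was limited to theory (scheduling algorithms, principles and so on) and basic usage. I had never looked at it at the packet level, so what I took home from the first round was mostly a sense of how shallow my understanding was.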
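Here is the experiment.
I won't repeat the lvs+keepalived installation here and only show the configuration; if you need it, see my earlier post 《lvs+keepalive 实现高可用集群》 (building a high-availability cluster with lvs+keepalive).
The lab topology is as follows:
IP assignment: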
node1 192.168.30.111
node2 192.168.30.112
web1 192.168.30.113
web2 192.168.30.114
VIP 192.168.30.254
I. Configuration
1. The keepalived configuration is as follows:
[root@node1 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
[email protected]
}
notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id LVS_1
}
vrrp_instance VI_1 {
state MASTER # change to BACKUP on the standby server
interface eth0
virtual_router_id 51 # must be identical on both nodes
priority 100 # the standby's priority should be lower than the primary's (a larger number means a higher priority)
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.30.254
}
}
virtual_server 192.168.30.254 80 {
delay_loop 6
lb_algo rr
lb_kind DR
#nat_mask 255.255.255.0
persistence_timeout 50
protocol TCP
real_server 192.168.30.113 80 {
weight 1
TCP_CHECK {
connect_timeout 3
nb_get_retry 3
connect_port 80
}
}
real_server 192.168.30.114 80 {
weight 1
TCP_CHECK {
connect_timeout 3
nb_get_retry 3
connect_port 80
}
}
}
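The TCP_CHECK blocks above tell keepalived to probe each real server by opening a TCP connection to connect_port within connect_timeout seconds. As a rough illustration of that kind of probe, here is a minimal Python sketch. It is not keepalived's implementation, and the function name tcp_check is made up for this example.

```python
import socket


def tcp_check(host: str, port: int, connect_timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    Hypothetical helper that mimics the idea behind TCP_CHECK; keepalived's
    real checker also handles retries and removes failed servers from the pool.
    """
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Real server addresses taken from the configuration above.
    for rs in ("192.168.30.113", "192.168.30.114"):
        print(rs, "up" if tcp_check(rs, 80) else "down")
```

A real server that fails the probe would be taken out of rotation, and put back once the probe succeeds again.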
2. RealServer configuration
Create and run the following script on both real servers, RS1 and RS2 (web1 and web2):
#!/bin/bash
# LVS-DR real server setup: hold the VIP locally and keep it out of ARP.
VIP=192.168.30.254
case "$1" in
start)
    echo "Start LVS of DR"
    /sbin/ifdown eth1
    # Put the VIP on a loopback alias so this host accepts traffic for it
    # without answering ARP requests for it on the LAN.
    /sbin/ifconfig lo:0 "$VIP" netmask 255.255.255.255 broadcast "$VIP"
    /sbin/route add -host "$VIP" dev lo:0
    #route add default gw 192.168.30.200
    # arp_ignore=1: reply to ARP only for addresses on the receiving interface.
    # arp_announce=2: use the best local source address in ARP requests.
    echo "1" > /proc/sys/net/ipv4/conf/lo/arp_ignore
    echo "2" > /proc/sys/net/ipv4/conf/lo/arp_announce
    echo "1" > /proc/sys/net/ipv4/conf/all/arp_ignore
    echo "2" > /proc/sys/net/ipv4/conf/all/arp_announce
    sysctl -p > /dev/null 2>&1
    ;;
stop)
    echo "Stop LVS of DR"
    /sbin/ifconfig lo:0 down
    # Restore the kernel defaults for ARP handling.
    echo "0" > /proc/sys/net/ipv4/conf/lo/arp_ignore
    echo "0" > /proc/sys/net/ipv4/conf/lo/arp_announce
    echo "0" > /proc/sys/net/ipv4/conf/all/arp_ignore
    echo "0" > /proc/sys/net/ipv4/conf/all/arp_announce
    sysctl -p > /dev/null 2>&1
    ;;
*)
    echo "Usage: $0 {start|stop}"
    exit 1
    ;;
esac
Once the script has run, install Apache on each real server:
[root@web1 ~]# yum -y install httpd
[root@web1 ~]# service httpd start
[root@web1 ~]# echo "<h1>Web1</h1>" > /var/www/html/index.html
3. Observing with tcpdump
Ping the VIP (192.168.30.254) from a test client and capture the traffic with tcpdump on the LVS MASTER:
[root@node1 ~]# tcpdump -p icmp -i eth0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:12:36.745383 IP 192.168.30.1 > 192.168.30.254: ICMP echo request, id 1024, seq 7424, length 40
19:12:36.745419 IP 192.168.30.254 > 192.168.30.1: ICMP echo reply, id 1024, seq 7424, length 40
19:12:37.745694 IP 192.168.30.1 > 192.168.30.254: ICMP echo request, id 1024, seq 7680, length 40
19:12:37.745748 IP 192.168.30.254 > 192.168.30.1: ICMP echo reply, id 1024, seq 7680, length 40
19:12:38.746653 IP 192.168.30.1 > 192.168.30.254: ICMP echo request, id 1024, seq 7936, length 40
19:12:38.746692 IP 192.168.30.254 > 192.168.30.1: ICMP echo reply, id 1024, seq 7936, length 40
19:12:39.746569 IP 192.168.30.1 > 192.168.30.254: ICMP echo request, id 1024, seq 8192, length 40
19:12:39.746602 IP 192.168.30.254 > 192.168.30.1: ICMP echo reply, id 1024, seq 8192, length 40
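This shows that the VIP currently sits on the MASTER.
Question 1: How do the MASTER and the BACKUP know which of them has the higher priority, that is, who becomes MASTER and who becomes BACKUP?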
[root@node1 ~]# tcpdump vrrp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:21:14.929139 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
19:21:15.930098 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
19:21:16.931275 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
19:21:17.932976 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
19:21:18.933830 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
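So the keepalived MASTER and BACKUP talk to each other over VRRPv2 to settle their states, the VIP and related information. The MASTER sends multicast (not broadcast) advertisements to the VRRP group address 224.0.0.18.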
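Each advertisement line in the capture above already carries the two values that decide the election, vrid and prio. As a small sketch, the following Python snippet pulls those fields out of a tcpdump summary line. The regular expression is written against the output format shown in this post and is an assumption about that format, not a general tcpdump parser; parse_advert is a name made up for this example.

```python
import re

# Pattern matched against tcpdump's one-line VRRP summary as shown in this post.
ADVERT_RE = re.compile(
    r"IP (?P<src>\S+) > (?P<dst>\S+): VRRPv(?P<version>\d+), Advertisement, "
    r"vrid (?P<vrid>\d+), prio (?P<prio>\d+), authtype (?P<authtype>\w+), "
    r"intvl (?P<intvl>\d+)s"
)


def parse_advert(line: str):
    """Return the advertisement fields as a dict, or None if the line does not match."""
    match = ADVERT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("version", "vrid", "prio", "intvl"):
        fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    sample = ("19:21:14.929139 IP node1 > 224.0.0.18: VRRPv2, Advertisement, "
              "vrid 51, prio 100, authtype simple, intvl 1s, length 20")
    print(parse_advert(sample))
```

Feeding a whole capture through parse_advert makes it easy to see which node is advertising and with what priority at any moment.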
[root@node1 ~]# tcpdump -X -n -vvv 'dst 224.0.0.18'
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:22:57.102636 IP (tos 0xc0, ttl 255, id 973, offset 0, flags [none], proto VRRP (112), length 40)
192.168.30.111 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20, addrs: 192.168.30.254 auth "1111^@^@^@^@"
0x0000: 45c0 0028 03cd 0000 ff70 f7ae c0a8 1e6f E..(.....p.....o
0x0010: e000 0012 2133 6401 0101 37c1 c0a8 1efe ....!3d...7.....
0x0020: 3131 3131 0000 0000 1111....
19:22:58.103593 IP (tos 0xc0, ttl 255, id 974, offset 0, flags [none], proto VRRP (112), length 40)
192.168.30.111 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20, addrs: 192.168.30.254 auth "1111^@^@^@^@"
0x0000: 45c0 0028 03ce 0000 ff70 f7ad c0a8 1e6f E..(.....p.....o
0x0010: e000 0012 2133 6401 0101 37c1 c0a8 1efe ....!3d...7.....
0x0020: 3131 3131 0000 0000 1111....
Take one multicast advertisement sent by 192.168.30.111 as an example:
192.168.30.111 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20, addrs: 192.168.30.254 auth "1111^@^@^@^@"
0x0000: 45c0 0028 03ce 0000 ff70 f7ad c0a8 1e6f E..(.....p.....o
0x0010: e000 0012 2133 6401 0101 37c1 c0a8 1efe ....!3d...7.....
0x0020: 3131 3131 0000 0000 1111....
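The VRRPv2 message begins right after the 20-byte IP header, at the bytes 2133 (marked by the line break in the dump below):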
192.168.30.111 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20, addrs: 192.168.30.254 auth "1111^@^@^@^@"
0x0000: 45c0 0028 03ce 0000 ff70 f7ad c0a8 1e6f E..(.....p.....o
0x0010: e000 0012
2133 6401 0101 37c1 c0a8 1efe ....!3d...7.....
0x0020: 3131 3131 0000 0000 1111....
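Field by field:
version: 4 bits; the RFC defines the version as 2, so the value here is 2.
type: 4 bits; only one type is defined, Advertisement, with the value 1, so the value here is 1.
Virtual ID: virtual router ID, 8 bits; virtual_router_id is 51 in node1's keepalived configuration, which is 33 in hexadecimal.
Priority: 8 bits; priority is 100 in node1's keepalived configuration, which is 64 in hexadecimal.
count ip addrs: number of IP addresses carried in the VRRP packet, 8 bits; there is only one address here, so the value is 01.
auth type: authentication type, 8 bits; 01 means a simple text password, matching auth_type PASS in the configuration. RFC 3768 removed authentication and keeps the field only for backward compatibility with older implementations; without authentication the value is 00.
adver int: advertisement interval in seconds, 8 bits; the default is 1 second and our configuration (advert_int 1) also uses 1 second, so the value is 01.
checksum: 16 bits; it covers only the VRRP message and does not include the IP header. Here it is 37c1.
ip address: the VIP, 32 bits per address; our VIP is 192.168.30.254, which is c0a8 1efe in hexadecimal.
auth data: the password; it is at most 8 characters (64 bits) long and shorter passwords are padded with zeros, so "1111" becomes 3131 3131 0000 0000.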
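To check the field breakdown above against the actual bytes, here is a small Python sketch that decodes the 20-byte VRRPv2 payload copied from the capture and verifies its checksum. The function names are made up for this example, and the layout assumed is the VRRPv2 one described above (one or more IPv4 addresses followed by 8 bytes of authentication data).

```python
import ipaddress
import struct

# VRRP payload from the capture above: the 20 bytes that follow the IP header.
VRRP_HEX = "2133 6401 0101 37c1 c0a8 1efe 3131 3131 0000 0000"


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum (the same algorithm as the IP header checksum)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_vrrpv2(payload: bytes) -> dict:
    """Decode a VRRPv2 advertisement into its header fields."""
    ver_type, vrid, prio, count, auth_type, adver_int, checksum = struct.unpack(
        "!BBBBBBH", payload[:8]
    )
    addr_end = 8 + 4 * count
    addresses = [
        str(ipaddress.IPv4Address(payload[i:i + 4])) for i in range(8, addr_end, 4)
    ]
    auth_data = payload[addr_end:addr_end + 8]
    return {
        "version": ver_type >> 4,      # high nibble
        "type": ver_type & 0x0F,       # low nibble, 1 = Advertisement
        "vrid": vrid,
        "priority": prio,
        "count_ip_addrs": count,
        "auth_type": auth_type,
        "adver_int": adver_int,
        "checksum": checksum,
        "addresses": addresses,
        "auth_data": auth_data.rstrip(b"\x00").decode("ascii"),
        # Summing the whole message, checksum field included, must give 0.
        "checksum_ok": internet_checksum(payload) == 0,
    }


if __name__ == "__main__":
    for name, value in parse_vrrpv2(bytes.fromhex(VRRP_HEX)).items():
        print(f"{name}: {value}")
```

The checksum verifies over the 20 VRRP bytes alone, which confirms that the IP header is not part of the checksummed data.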
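Question 2: When the MASTER goes down, how does the BACKUP take over the service?
While it is running, the MASTER keeps sending VRRPv2 multicast advertisements onto the local network segment, like this: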
192.168.30.111 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20, addrs: 192.168.30.254 auth "1111^@^@^@^@"
0x0000: 45c0 0028 03cd 0000 ff70 f7ae c0a8 1e6f E..(.....p.....o
0x0010: e000 0012 2133 6401 0101 37c1 c0a8 1efe ....!3d...7.....
0x0020: 3131 3131 0000 0000 1111....
Note: the BACKUP does not send multicast advertisements. As shown below, a capture on the BACKUP server only sees the MASTER's VRRP packets:
[root@node2 ~]# tcpdump dst 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
20:36:18.975572 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
20:36:19.976512 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
20:36:20.977505 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
20:36:21.978394 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
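But if the MASTER goes down, the BACKUP, once it is sure that the MASTER's advertisements have stopped, starts sending advertisements of its own to announce its keepalived state, then brings up the VIP and formally takes over from the failed MASTER. In the capture below, node1's last advertisement carries priority 0, which a MASTER sends on a clean shutdown to signal that it is giving up the role; node2 (priority 80) starts advertising less than a second later.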
[root@node2 ~]# tcpdump dst 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
20:37:52.100343 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
20:37:53.090777 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 0, authtype simple, intvl 1s, length 20
20:37:53.779780 IP node2 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 80, authtype simple, intvl 1s, length 20
20:37:54.794518 IP node2 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 80, authtype simple, intvl 1s, length 20
20:37:55.795738 IP node2 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 80, authtype simple, intvl 1s, length 20
The log confirms that the BACKUP has taken over the VIP:
[root@node2 ~]# tailf /var/log/messages
Oct 18 20:36:18 node2 kernel: device eth0 entered promiscuous mode
Oct 18 20:36:22 node2 kernel: device eth0 left promiscuous mode
Oct 18 20:37:51 node2 kernel: device eth0 entered promiscuous mode
Oct 18 20:37:53 node2 Keepalived_vrrp[40622]: VRRP_Instance(VI_1) Transition to MASTER STATE
Oct 18 20:37:54 node2 Keepalived_vrrp[40622]: VRRP_Instance(VI_1) Entering MASTER STATE
Oct 18 20:37:54 node2 Keepalived_vrrp[40622]: VRRP_Instance(VI_1) setting protocol VIPs.
Oct 18 20:37:54 node2 Keepalived_healthcheckers[40621]: Netlink reflector reports IP 192.168.30.254 added
Oct 18 20:37:54 node2 Keepalived_vrrp[40622]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.30.254
Oct 18 20:37:56 node2 kernel: device eth0 left promiscuous mode
Oct 18 20:37:59 node2 Keepalived_vrrp[40622]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.30.254
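The timing in the capture above can be compared with the VRRP timers. RFC 3768 defines Skew_Time = (256 - Priority) / 256 seconds and Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time, and says that a BACKUP which receives a priority-0 advertisement shortens its timer to Skew_Time. The sketch below assumes keepalived follows these formulas and uses the two timestamps and node2's priority of 80 from the capture; the function names are made up for this example.

```python
from datetime import datetime


def skew_time(priority: int) -> float:
    """Skew_Time in seconds as defined by RFC 3768."""
    return (256 - priority) / 256


def master_down_interval(priority: int, advert_int: float = 1.0) -> float:
    """How long a BACKUP waits without advertisements before it takes over."""
    return 3 * advert_int + skew_time(priority)


def seconds_between(first: str, second: str) -> float:
    """Difference between two tcpdump timestamps taken on the same day."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(second, fmt) - datetime.strptime(first, fmt)).total_seconds()


if __name__ == "__main__":
    backup_prio = 80  # node2's priority as seen in the capture
    observed = seconds_between("20:37:53.090777", "20:37:53.779780")
    print("Skew_Time:", skew_time(backup_prio))
    print("Master_Down_Interval:", master_down_interval(backup_prio))
    print("observed gap after the prio 0 advert:", observed)
```

The observed gap between node1's priority-0 advertisement and node2's first advertisement is within a couple of milliseconds of Skew_Time for priority 80, which matches the fast takeover path. Had node1 crashed without sending priority 0, node2 would have waited for the full Master_Down_Interval instead.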
Question 3: If both servers are configured as MASTER, which one becomes the real MASTER?
Change the state on the BACKUP (node2) to MASTER as well:
[root@node2 ~]# vim /etc/keepalived/keepalived.conf
state MASTER
Restart keepalived:
[root@node2 ~]# service keepalived restart
Stopping keepalived: [ OK ]
Starting keepalived: [ OK ]
Watch the log:
[root@node2 ~]# tailf /var/log/messages
Oct 18 19:35:03 node2 Keepalived_vrrp[40602]: VRRP_Instance(VI_1) Transition to MASTER STATE
Oct 18 19:35:03 node2 Keepalived_vrrp[40602]: VRRP_Instance(VI_1) Received higher prio advert
Oct 18 19:35:03 node2 Keepalived_vrrp[40602]: VRRP_Instance(VI_1) Entering BACKUP STATE
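As the log shows, node2 does start up as MASTER, but it then receives an advertisement with a higher priority and falls back to BACKUP.
What happens if the priorities are equal as well?
By the VRRP specification (RFC 3768) a tie is broken by the sender's primary IP address: the node with the higher address stays MASTER and the other one steps down. The two MASTERs only fight over the VIP and cause an IP conflict when they cannot hear each other's advertisements, for example when a firewall drops the VRRP multicast traffic.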
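The decision a node in MASTER state makes when it hears another node's advertisement can be written down in a few lines. The sketch below follows the rule in RFC 3768 (a higher priority wins, and on equal priority the higher primary IP address wins). It is a simplified model and not keepalived's code; should_yield is a name made up for this example.

```python
import ipaddress


def should_yield(local_prio: int, local_ip: str, advert_prio: int, advert_ip: str) -> bool:
    """Return True if a node in MASTER state should step down to BACKUP.

    Simplified from the MASTER-state rules in RFC 3768: yield to a higher
    priority, or to an equal priority sent from a higher primary IP address.
    """
    if advert_prio > local_prio:
        return True
    if advert_prio == local_prio:
        return ipaddress.IPv4Address(advert_ip) > ipaddress.IPv4Address(local_ip)
    return False


if __name__ == "__main__":
    # node2 (priority 80) starts as MASTER and hears node1 (priority 100).
    print(should_yield(80, "192.168.30.112", 100, "192.168.30.111"))
    # node1 hears node2's lower-priority advertisement and keeps the role.
    print(should_yield(100, "192.168.30.111", 80, "192.168.30.112"))
    # Equal priorities: the node with the lower address yields.
    print(should_yield(100, "192.168.30.111", 100, "192.168.30.112"))
```

The first case corresponds to the "Received higher prio advert" line in node2's log above.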
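Question 4: If the failed MASTER comes back, does it take over again?
Yes. Once the MASTER is back it takes the work over from the BACKUP, and the BACKUP drops from MASTER back to BACKUP.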
[root@node1 ~]# tailf /var/log/messages
Oct 18 20:41:04 node1 Keepalived_vrrp[46564]: VRRP_Instance(VI_1) Transition to MASTER STATE
Oct 18 20:41:05 node1 Keepalived_vrrp[46564]: VRRP_Instance(VI_1) Entering MASTER STATE
Oct 18 20:41:05 node1 Keepalived_vrrp[46564]: VRRP_Instance(VI_1) setting protocol VIPs.
Oct 18 20:41:05 node1 Keepalived_vrrp[46564]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.30.254
How can we stop the MASTER from grabbing the VIP back when it recovers?
On node1, change state to BACKUP:
[root@node1 ~]# vim /etc/keepalived/keepalived.conf
vrrp_instance VI_1 {
state BACKUP
interface eth0
priority 100
virtual_router_id 51
authentication {
auth_type PASS
auth_pass 1111
}
On node2, add the nopreempt option and change the priority to 150 (anything higher than node1's will do):
[root@node2 ~]# vim /etc/keepalived/keepalived.conf
vrrp_instance VI_1 {
state BACKUP
interface eth0
priority 150
nopreempt
virtual_router_id 51
authentication {
auth_type PASS
auth_pass 1111
}
Restart keepalived on node1 and on node2:
[root@node1 ~]# service keepalived restart
[root@node2 ~]# service keepalived restart
First stop keepalived on node1:
[root@node1 ~]# service keepalived stop
Then start keepalived on node1 again:
[root@node1 ~]# service keepalived start
[root@node1 ~]# tailf /var/log/messages
Oct 18 20:59:50 node1 Keepalived_vrrp[46653]: VRRP_Instance(VI_1) Entering BACKUP STATE
Oct 18 20:59:50 node1 Keepalived_healthcheckers[46651]: Using LinkWatch kernel netlink reflector...
Oct 18 20:59:50 node1 Keepalived_vrrp[46653]: VRRP sockpool: [ifindex(2), proto(112), fd(10,11)]
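As you can see, node1 stays in the BACKUP state after starting and does not grab the MASTER role.
Both nodes are configured with state BACKUP so that neither preempts the other.
Why should one node still have a higher priority than the other? Because nopreempt is set on the higher-priority server, even the higher priority does not preempt the lower one. In other words, a node takes over only when keepalived on the other node has failed.
Now stop keepalived on node2: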
[root@node2 ~]# service keepalived stop
node1 now becomes MASTER:
[root@node1 ~]# tailf /var/log/messages
Oct 18 21:15:34 node1 Keepalived_vrrp[46653]: VRRP_Instance(VI_1) Transition to MASTER STATE
Oct 18 21:15:35 node1 Keepalived_vrrp[46653]: VRRP_Instance(VI_1) Entering MASTER STATE
Oct 18 21:15:35 node1 Keepalived_vrrp[46653]: VRRP_Instance(VI_1) setting protocol VIPs.
Oct 18 21:15:35 node1 Keepalived_vrrp[46653]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.30.254
Oct 18 21:15:35 node1 Keepalived_healthcheckers[46651]: Netlink reflector reports IP 192.168.30.254 added
Oct 18 21:15:40 node1 Keepalived_vrrp[46653]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.30.254
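The effect of nopreempt in the experiment above can be summed up as a small decision function. The sketch below is a simplified model of the behaviour described in this post and in RFC 3768 (Preempt_Mode), not keepalived's actual state machine; next_state and its parameters are names made up for this example.

```python
from typing import Optional


def next_state(local_prio: int, nopreempt: bool, heard_master_prio: Optional[int]) -> str:
    """State a node in BACKUP settles into.

    heard_master_prio is the priority of the MASTER whose advertisements the
    node currently receives, or None if no advertisements arrive at all.
    """
    if heard_master_prio is None:
        # Nobody is advertising: take over regardless of nopreempt.
        return "MASTER"
    if not nopreempt and local_prio > heard_master_prio:
        # Preemption allowed and we outrank the current MASTER.
        return "MASTER"
    return "BACKUP"


if __name__ == "__main__":
    # node1 (priority 100) restarts while node2 (priority 150) is MASTER.
    print(next_state(100, False, 150))
    # node2 is stopped, so node1 hears nothing and takes over.
    print(next_state(100, False, None))
    # node2 (priority 150, nopreempt) comes back while node1 is MASTER.
    print(next_state(150, True, 100))
```

The last case is the one nopreempt exists for: without it, node2 would outrank node1 and take the VIP back as soon as it returned.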