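Keepalived is a solid high-availability tool, originally designed to provide high availability and health checking for LVS. As background, here is how availability is calculated.
HA: High Availability.
MTBF (mean time between failures): the average time the system runs without failing, that is, the working time between two consecutive failures, not the time until the whole system is scrapped.
MTTR (mean time to repair): the average service outage time, that is, the time from when a failure occurs until it is repaired.
A: availability, which can also be read as the degree of high availability. It is computed from MTBF and MTTR:
A = MTBF / (MTBF + MTTR), i.e. availability = mean time between failures / (mean time between failures + mean service outage time)
A lies in (0, 1) and is usually quoted as a percentage: 90%, 95%, 99%, 99.5%,
99.9% (525.6 minutes, about 8.76 hours, of downtime per year)
99.99% (52.56 minutes of downtime per year)
99.999% (5.256 minutes of downtime per year)
99.9999% (0.5256 minutes, about 30 seconds, of downtime per year)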
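The downtime figures above follow directly from the formula. As a minimal sketch, assuming a 365-day year (525600 minutes), the two helper functions below reproduce them; `downtime_minutes` and `availability` are names made up for this illustration and are not part of keepalived.

```shell
#!/bin/bash
# Yearly downtime budget (in minutes) implied by an availability percentage.
# Assumption: a 365-day year, i.e. 365 * 24 * 60 = 525600 minutes.
downtime_minutes() {
    # $1: availability in percent, e.g. 99.9
    LC_ALL=C awk -v a="$1" 'BEGIN { printf "%.4f\n", (1 - a / 100) * 365 * 24 * 60 }'
}

# Availability from MTBF and MTTR (both in the same unit): A = MTBF / (MTBF + MTTR).
availability() {
    # $1: MTBF, $2: MTTR
    LC_ALL=C awk -v mtbf="$1" -v mttr="$2" 'BEGIN { printf "%.4f\n", mtbf / (mtbf + mttr) }'
}

# Print the downtime budget for the usual "number of nines" levels.
for a in 99.9 99.99 99.999 99.9999; do
    echo "$a% -> $(downtime_minutes "$a") minutes/year"
done
```

For example, `availability 999 1` describes a system that runs 999 hours between failures and needs 1 hour to repair, which works out to three nines.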
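Keepalived achieves redundancy by running a daemon on each of the HA machines that exchanges heartbeats with its peer. This can be called the heartbeat layer, and it is built on the VRRP (Virtual Router Redundancy Protocol) network protocol.
Keepalived is a software implementation of VRRP, originally designed to make ipvs services highly available.
In short, keepalived can:
- Float a virtual IP address (VIP) between nodes using VRRP;
- Generate ipvs rules on the node that currently holds the VIP (predefined in the configuration file);
- Health-check each RS (real server) of the ipvs cluster (LVS + Keepalived, dual-master);
- Run user-defined scripts through its script hooks, whose results in turn influence cluster behavior (Nginx + Keepalived, dual-master).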
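Host requirements before configuring an HA cluster:
(1) Time must be synchronized across all nodes;
use ntp or chrony
(2) Make sure iptables, firewalld and SELinux do not get in the way;
(3) Nodes can reach each other by hostname (not strictly required for keepalived);
/etc/hosts is the recommended way to do this;
(4) The interface each node uses for cluster traffic must support MULTICAST;
multicast uses class D addresses: 224.0.0.0 to 239.255.255.255;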
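Requirement (4) can be checked from the interface flags that `ip link` prints. The snippet below is a rough sketch: `has_multicast` is a helper name invented here, and the default interface name is simply the one used in this article's configs.

```shell
#!/bin/bash
# Succeeds when the interface flags read from stdin include MULTICAST.
# Expects the one-line output of `ip -o link show dev <iface>`,
# where the flags appear as <FLAG1,FLAG2,...>.
has_multicast() {
    grep -q '[<,]MULTICAST[,>]'
}

iface="${1:-eno16777736}"   # assumption: interface name taken from this article
if ip -o link show dev "$iface" 2>/dev/null | has_multicast; then
    echo "$iface: MULTICAST enabled"
else
    echo "$iface: MULTICAST not reported (or interface not found)"
fi
```

If the flag is missing, `ip link set dev <iface> multicast on` turns it on.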
A simple implementation of the first item: floating a virtual IP address with VRRP
[root@node1 keepalived]# cat keepalived.conf
! Configuration File for keepalived
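global_defs { ## Email alerts: use the local mailbox for now. The built-in mail alerting is of limited use; later we build alerting on keepalived's script hooks or hand it to a dedicated monitoring tool such as zabbix
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1 # set to the hostname; must be unique
vrrp_mcast_group4 224.0.0.112 ## multicast group address
}
vrrp_instance VI_1 {
state MASTER # initial state: MASTER | BACKUP
interface eno16777736 ## physical interface the floating IP is bound to
virtual_router_id 31 ## virtual router ID; must match on the peer node
priority 100 ## priority
advert_int 1 ## advertisement (heartbeat) interval, default 1s
nopreempt ## non-preemptive mode; keepalived only honors this when the initial state is BACKUP
authentication {
auth_type PASS
auth_pass f1GDsVH6 ## VRRP authentication password; must match on every node of the same virtual router
}
virtual_ipaddress {
192.168.233.199/24 dev eno16777736 label eno16777736:1 ## the VIP address
}
}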
vrrp_instance VI_2 {
state BACKUP
interface eno16777736
virtual_router_id 32
priority 98
advert_int 1
authentication {
auth_type PASS
auth_pass f1GDsV78
}
virtual_ipaddress {
192.168.233.198/24 dev eno16777736 label eno16777736:2
}
}
Inspect the multicast traffic once both nodes are up
[root@node1 keepalived]# tcpdump -i eno16777736 -nn host 224.0.0.112
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eno16777736, link-type EN10MB (Ethernet), capture size 262144 bytes
06:36:08.762574 IP 192.168.233.129 > 224.0.0.112: VRRPv2, Advertisement, vrid 31, prio 100, authtype simple, intvl 1s, length 20
06:36:08.762750 IP 192.168.233.128 > 224.0.0.112: VRRPv2, Advertisement, vrid 32, prio 100, authtype simple, intvl 1s, length 20
06:36:09.763419 IP 192.168.233.128 > 224.0.0.112: VRRPv2, Advertisement, vrid 32, prio 100, authtype simple, intvl 1s, length 20
06:36:09.764635 IP 192.168.233.129 > 224.0.0.112: VRRPv2, Advertisement, vrid 31, prio 100, authtype simple, intvl 1s, length 20
06:36:10.763953 IP 192.168.233.128 > 224.0.0.112: VRRPv2, Advertisement, vrid 32, prio 100, authtype simple, intvl 1s, length 20
06:36:10.764830 IP 192.168.233.129 > 224.0.0.112: VRRPv2, Advertisement, vrid 31, prio 100, authtype simple, intvl 1s, length 20
06:36:11.764128 IP 192.168.233.128 > 224.0.0.112: VRRPv2, Advertisement, vrid 32, prio 100, authtype simple, intvl 1s, length 20
06:36:11.765028 IP 192.168.233.129 > 224.0.0.112: VRRPv2, Advertisement, vrid 31, prio 100, authtype simple, intvl 1s, length 20
06:36:12.764737 IP 192.168.233.128 > 224.0.0.112: VRRPv2, Advertisement, vrid 32, prio 100, authtype simple, intvl 1s, length 20
06:36:12.765213 IP 192.168.233.129 > 224.0.0.112: VRRPv2, Advertisement, vrid 31, prio 100, authtype simple, intvl 1s, length 20
06:36:13.765372 IP 192.168.233.128 > 224.0.0.112: VRRPv2, Advertisement, vrid 32, prio 100, authtype simple, intvl 1s, length 20
06:36:13.766607 IP 192.168.233.129 > 224.0.0.112: VRRPv2, Advertisement, vrid 31, prio 100, authtype simple, intvl 1s, length 20
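In VRRP only the current master of a virtual router sends advertisements, so the capture above already tells us who holds each VIP: 192.168.233.129 for vrid 31 and 192.168.233.128 for vrid 32. A small sketch that condenses such a capture into one line per virtual router follows; `vrrp_masters` is a helper name made up for this example, and it assumes the tcpdump line format shown above.

```shell
#!/bin/bash
# Reads tcpdump VRRP advertisement lines on stdin and prints, for each
# virtual router ID, the source address and priority of the latest
# advertisement seen, i.e. the node currently acting as master.
vrrp_masters() {
    awk '
        /VRRPv[23], Advertisement/ {
            src = $3                      # third field is the sender address
            for (i = 1; i < NF; i++) {
                if ($i == "vrid") { vrid = $(i + 1); sub(/,$/, "", vrid) }
                if ($i == "prio") { prio = $(i + 1); sub(/,$/, "", prio) }
            }
            master[vrid] = src
            priority[vrid] = prio
        }
        END {
            for (v in master)
                printf "vrid %s master %s prio %s\n", v, master[v], priority[v]
        }
    ' | sort -k2,2n
}

# Usage (needs root), capturing 20 packets and then summarizing:
#   tcpdump -i eno16777736 -nn -c 20 host 224.0.0.112 | vrrp_masters
```

Running it again after stopping keepalived on one node should show the surviving node as the sender for both virtual router IDs.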
Configuring a keepalived state-change alert script
Inside the vrrp_instance block of keepalived.conf, configure the scripts that send the notification mail
vrrp_instance VI_1 {
state BACKUP
interface eno16777736
virtual_router_id 31
priority 98
advert_int 1
authentication {
auth_type PASS
auth_pass f1GDsVH6
}
virtual_ipaddress {
192.168.233.199/24 dev eno16777736 label eno16777736:1
# 192.168.233.199/24 ## this shorter form also works; view the floating IP with ip a l
}
notify_master "/etc/keepalived/scripts/notify.sh master" ## script run on transition to master
notify_backup "/etc/keepalived/scripts/notify.sh backup" ## script run on transition to backup
notify_fault "/etc/keepalived/scripts/notify.sh fault" ## script run on entering the fault state
}
############################## Script contents ################################
[root@node2 scripts]# cat notify.sh
#!/bin/bash
# Mail a notification whenever the VRRP state of this node changes.
contact='root@localhost'
notify() {
local mailsubject="$(hostname) to be $1, vip floating"
local mailbody="$(date +'%F %T'): vrrp transition, $(hostname) changed to be $1"
echo "$mailbody" | mail -s "$mailsubject" "$contact"
}
case "$1" in
master)
notify master
;;
backup)
notify backup
;;
fault)
notify fault
;;
*)
echo "Usage: $(basename "$0") {master|backup|fault}"
exit 1
;;
esac
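Health-checking each RS of the ipvs cluster (LVS + Keepalived, dual-master)
See: https://blog.51cto.com/swiki/2342624
Running scripts through the script hook to influence cluster behavior (Nginx + Keepalived, dual-master)
Topology:
Client | NGINX1 | NGINX2 |
---|---|---|
192.168.2.1 | 192.168.2.128 VIP:192.168.2.198 | 192.168.2.129 VIP:192.168.2.199 |
The health check relies on killall, which a minimal CentOS 7 install does not include, so the psmisc package has to be installed manually
The earlier mail notification script is modified so that nginx is started automatically when the node becomes master
Configuration: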
######################### NGINX1 #########################
[root@node1 keepalived]# yum install nginx psmisc -y;systemctl start nginx;
[root@node1 keepalived]# echo "192.168.2.128" > /usr/share/nginx/html/index.html
[root@node1 keepalived]# cat keepalived.conf
! Configuration File for keepalived
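global_defs { ## Email alerts: use the local mailbox for now. The built-in mail alerting is of limited use; later we build alerting on keepalived's script hooks or hand it to a dedicated monitoring tool such as zabbix
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1 # set to the hostname; must be unique
vrrp_mcast_group4 224.0.0.112 ## multicast group address
}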
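vrrp_script chk_nginx {
script "killall -0 nginx && exit 0 || exit 1" ## signal 0 sends nothing; it only tests whether an nginx process exists
interval 1 ## run the check every second
weight -5 ## lower the priority by 5 while the check is failing
fall 2 ## two consecutive failures mark the check as failed
rise 1 ## one success marks it as healthy again
}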
vrrp_instance VI_1 {
state MASTER # initial state: MASTER | BACKUP
interface eno16777736 ## physical interface the floating IP is bound to
virtual_router_id 31 ## virtual router ID; must match on the peer node
priority 100 ## priority
advert_int 1 ## advertisement (heartbeat) interval, default 1s
# nopreempt ## non-preemptive mode (disabled here)
authentication {
auth_type PASS
auth_pass f1GDsVH6 ## VRRP authentication password; must match on every node of the same virtual router
}
virtual_ipaddress {
192.168.2.198/24 dev eno16777736 label eno16777736:1 ## the VIP address
}
track_script {
chk_nginx
}
notify_master "/etc/keepalived/scripts/notify.sh master" ## script run on transition to master
notify_backup "/etc/keepalived/scripts/notify.sh backup" ## script run on transition to backup
notify_fault "/etc/keepalived/scripts/notify.sh fault" ## script run on entering the fault state
}
vrrp_instance VI_2 {
state BACKUP
interface eno16777736
virtual_router_id 32
priority 98
advert_int 1
authentication {
auth_type PASS
auth_pass f1GDsV78
}
virtual_ipaddress {
192.168.2.199/24 dev eno16777736 label eno16777736:2
}
notify_master "/etc/keepalived/scripts/notify.sh master" ## script run on transition to master
notify_backup "/etc/keepalived/scripts/notify.sh backup" ## script run on transition to backup
notify_fault "/etc/keepalived/scripts/notify.sh fault" ## script run on entering the fault state
track_script {
chk_nginx
}
}
######################### NGINX2 #########################
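The keepalived active-active configuration on NGINX2 is the mirror image of NGINX1's (the roles of the two instances are swapped), so it is not repeated here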
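To see why `weight -5` is enough to trigger a failover, it helps to work out the effective priorities. With a negative weight, keepalived subtracts the value from the priority while the tracked script is failing; with a positive weight, it adds the value while the script succeeds. The sketch below models just that rule. `effective_priority` is a name invented for this illustration, and the value 98 for VI_1 on NGINX2 is an assumption based on the mirrored configuration, which the article does not show.

```shell
#!/bin/bash
# Effective VRRP priority of an instance that tracks a single vrrp_script.
# $1: configured priority
# $2: weight of the tracked script
# $3: exit status of the script (0 = healthy)
effective_priority() {
    local base=$1 weight=$2 status=$3
    if [ "$weight" -lt 0 ] && [ "$status" -ne 0 ]; then
        echo $((base + weight))     # negative weight applies while the check fails
    elif [ "$weight" -gt 0 ] && [ "$status" -eq 0 ]; then
        echo $((base + weight))     # positive weight applies while the check succeeds
    else
        echo "$base"
    fi
}

# VI_1 after nginx dies on NGINX1 while NGINX2 stays healthy.
# Assumption: NGINX2 runs VI_1 as BACKUP with priority 98.
node1=$(effective_priority 100 -5 1)
node2=$(effective_priority 98 -5 0)
echo "VI_1: NGINX1=$node1 NGINX2=$node2"
```

Because preemption is left enabled in this setup, NGINX2 takes over 192.168.2.198 as soon as its priority is the higher one, and NGINX1 reclaims the VIP once nginx is running again and its priority returns to 100.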
Appendix: contents of notify.sh:
[root@node1 keepalived]# cat scripts/notify.sh
#!/bin/bash
# Mail a notification on every VRRP state change and start nginx when this node becomes master.
contact='root@localhost'
notify() {
local mailsubject="$(hostname) to be $1, vip floating"
local mailbody="$(date +'%F %T'): vrrp transition, $(hostname) changed to be $1"
echo "$mailbody" | mail -s "$mailsubject" "$contact"
}
case "$1" in
master)
# make sure nginx is running once this node holds the VIP
systemctl start nginx
notify master
;;
backup)
#systemctl start nginx
notify backup
;;
fault)
notify fault
;;
*)
echo "Usage: $(basename "$0") {master|backup|fault}"
exit 1
;;
esac