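Keepalived is a solid high-availability tool, originally designed to provide high availability and health checking for LVS. As background, here is the formula used to calculate availability: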

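        HA: High Availability;
            MTBF: mean time between failures, i.e. the average failure-free operating time. The "lifetime" of a system here means the working time between two consecutive failures, not the time until the whole system is scrapped.
            MTTR: mean time to repair, i.e. the average service outage time. For a repairable product it is the time from the moment a failure occurs until the repair is complete.
            A:    availability, usually written A, which can be read as the degree of high availability. It is computed from MTBF and MTTR: A = MTBF/(MTBF + MTTR).

            A=MTBF/(MTBF+MTTR)      availability = mean failure-free time / (mean failure-free time + mean service outage time)
                A lies in (0,1): 90%, 95%, 99%, 99.5%,
                99.9% (525.6 minutes, about 8.76 hours of downtime per year)
                99.99% (52.56 minutes of downtime per year)
                99.999% (5.256 minutes of downtime per year)
                99.9999% (0.5256 minutes, about 31.5 seconds of downtime per year)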
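
The figures in the list above can be reproduced with a few lines of shell. This is a minimal sketch, assuming a 365-day year and an `awk` that supports `-v`; the function names `availability` and `downtime_minutes` are made up for illustration.

```shell
#!/bin/bash
# availability MTBF MTTR
# Prints A = MTBF / (MTBF + MTTR); both arguments must use the same time unit.
availability() {
    awk -v mtbf="$1" -v mttr="$2" 'BEGIN { printf "%.6f\n", mtbf / (mtbf + mttr) }'
}

# downtime_minutes PERCENT
# Prints the downtime in minutes per 365-day year for an availability given in percent.
downtime_minutes() {
    awk -v pct="$1" 'BEGIN { printf "%.4f\n", (100 - pct) / 100 * 365 * 24 * 60 }'
}

availability 999 1        # 999 hours up for every 1 hour of repair
downtime_minutes 99.9
downtime_minutes 99.999
```

With 999 hours of failure-free operation per hour of repair, the first call gives an availability of 0.999, which corresponds to the "three nines" row of the list.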

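Keepalived implements redundancy by running a daemon on each machine of the HA pair and exchanging heartbeats between them. This layer can be called the heartbeat layer, and it is built on the VRRP (Virtual Router Redundancy Protocol) network protocol.

Keepalived is a software implementation of VRRP whose original purpose was to make the ipvs service highly available.
In short, keepalived can:

  1. Float a virtual IP address (VIP) between nodes using VRRP;
  2. Generate ipvs rules on the node that holds the VIP (predefined in the configuration file);
  3. Health-check each RS (real server) in the ipvs cluster (LVS+Keepalived dual-master);
  4. Call external scripts through its script interface to run whatever they define, and thereby influence cluster behavior (Nginx+Keepalived dual-master).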

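Host requirements before configuration:

        Prerequisites for an HA cluster:
            (1) Time must be synchronized on all nodes;
                ntp, chrony
            (2) Make sure iptables, firewalld and selinux do not get in the way;
            (3) Nodes can reach each other by hostname (not strictly required for keepalived);
                using /etc/hosts is recommended;
            (4) Make sure the interface used for cluster traffic on each node supports MULTICAST;
                Class D addresses: first octet 224-239;

A simple implementation of the first item: floating a virtual IP address with VRRP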
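
Prerequisite (4) can be checked from the shell. The sketch below is illustrative only: `is_multicast_v4` and `has_multicast_flag` are hypothetical helper names, and the second one assumes the iproute2 `ip` command is installed and prints the interface flags in its usual `<...,MULTICAST,...>` form.

```shell
#!/bin/bash
# is_multicast_v4 ADDR
# Succeeds if the first octet of ADDR is in the class D range 224..239.
is_multicast_v4() {
    local first_octet="${1%%.*}"
    # Reject anything whose first octet is not a plain decimal number.
    [[ "$first_octet" =~ ^[0-9]+$ ]] || return 1
    (( 10#$first_octet >= 224 && 10#$first_octet <= 239 ))
}

# has_multicast_flag IFACE
# Succeeds if `ip link` lists the MULTICAST flag for IFACE.
has_multicast_flag() {
    ip link show dev "$1" 2>/dev/null | grep -q 'MULTICAST'
}

for addr in 224.0.0.112 192.168.233.199; do
    if is_multicast_v4 "$addr"; then
        echo "$addr: multicast"
    else
        echo "$addr: not multicast"
    fi
done
```

On a node, `has_multicast_flag eno16777736 && echo ok` would confirm the flag on the interface used in this article; if it is missing, `ip link set dev eno16777736 multicast on` is the usual way to enable it.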

[root@node1 keepalived]# cat keepalived.conf
! Configuration File for keepalived

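global_defs {           ## mail alerts: for now just use the local mailbox; the built-in mail alerting is of limited use, and later we build alerts on keepalived's script hooks or rely on a dedicated tool such as zabbix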
   notification_email {
        root@localhost       
   }
   notification_email_from keepalived@localhost
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id node1           # set to the hostname; must be unique
   vrrp_mcast_group4 224.0.0.112     ## multicast group address
}

vrrp_instance VI_1 {
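    state MASTER           # initial state: MASTER | BACKUP
    interface eno16777736    ## physical interface that VRRP runs on and the floating IP binds to
    virtual_router_id 31          ## virtual router id; must be identical on the peer node
    priority 100                       ## priority
    advert_int 1                      ## advertisement (heartbeat) interval, default 1s
    nopreempt                       ## non-preemptive mode; per the keepalived documentation this only takes effect when the initial state is BACKUP
    authentication {
        auth_type PASS
        auth_pass f1GDsVH6      ## VRRP password; must be identical on all nodes of the same virtual router
    }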
    }
    virtual_ipaddress {
        192.168.233.199/24 dev eno16777736 label eno16777736:1    ## the VIP address
    }
}

vrrp_instance VI_2 {
    state BACKUP
    interface eno16777736
    virtual_router_id 32
    priority 98
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass f1GDsV78
    }
    virtual_ipaddress {
        192.168.233.198/24 dev eno16777736 label eno16777736:2
    }
}

Inspect the multicast traffic once both nodes are up

[root@node1 keepalived]# tcpdump -i eno16777736 -nn host 224.0.0.112
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eno16777736, link-type EN10MB (Ethernet), capture size 262144 bytes
06:36:08.762574 IP 192.168.233.129 > 224.0.0.112: VRRPv2, Advertisement, vrid 31, prio 100, authtype simple, intvl 1s, length 20
06:36:08.762750 IP 192.168.233.128 > 224.0.0.112: VRRPv2, Advertisement, vrid 32, prio 100, authtype simple, intvl 1s, length 20
06:36:09.763419 IP 192.168.233.128 > 224.0.0.112: VRRPv2, Advertisement, vrid 32, prio 100, authtype simple, intvl 1s, length 20
06:36:09.764635 IP 192.168.233.129 > 224.0.0.112: VRRPv2, Advertisement, vrid 31, prio 100, authtype simple, intvl 1s, length 20
06:36:10.763953 IP 192.168.233.128 > 224.0.0.112: VRRPv2, Advertisement, vrid 32, prio 100, authtype simple, intvl 1s, length 20
06:36:10.764830 IP 192.168.233.129 > 224.0.0.112: VRRPv2, Advertisement, vrid 31, prio 100, authtype simple, intvl 1s, length 20
06:36:11.764128 IP 192.168.233.128 > 224.0.0.112: VRRPv2, Advertisement, vrid 32, prio 100, authtype simple, intvl 1s, length 20
06:36:11.765028 IP 192.168.233.129 > 224.0.0.112: VRRPv2, Advertisement, vrid 31, prio 100, authtype simple, intvl 1s, length 20
06:36:12.764737 IP 192.168.233.128 > 224.0.0.112: VRRPv2, Advertisement, vrid 32, prio 100, authtype simple, intvl 1s, length 20
06:36:12.765213 IP 192.168.233.129 > 224.0.0.112: VRRPv2, Advertisement, vrid 31, prio 100, authtype simple, intvl 1s, length 20
06:36:13.765372 IP 192.168.233.128 > 224.0.0.112: VRRPv2, Advertisement, vrid 32, prio 100, authtype simple, intvl 1s, length 20
06:36:13.766607 IP 192.168.233.129 > 224.0.0.112: VRRPv2, Advertisement, vrid 31, prio 100, authtype simple, intvl 1s, length 20
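
A capture like the one above is easier to read when it is summarized per virtual router. The following is a rough sketch, assuming the tcpdump line format shown in this article (VRRPv2, source address in the third field); `summarize_vrrp` is a hypothetical helper name.

```shell
#!/bin/bash
# summarize_vrrp
# Reads tcpdump VRRP advertisement lines on stdin and prints one line per
# (vrid, source, priority) combination: "VRID SOURCE PRIO COUNT".
summarize_vrrp() {
    awk '/VRRPv2, Advertisement/ {
        src = $3
        vrid = ""; prio = ""
        for (i = 1; i < NF; i++) {
            if ($i == "vrid") { vrid = $(i + 1); sub(/,$/, "", vrid) }
            if ($i == "prio") { prio = $(i + 1); sub(/,$/, "", prio) }
        }
        count[vrid " " src " " prio]++
    }
    END { for (k in count) print k, count[k] }' | sort
}

# Two lines taken from the capture above, used as sample input.
summarize_vrrp <<'EOF'
06:36:08.762574 IP 192.168.233.129 > 224.0.0.112: VRRPv2, Advertisement, vrid 31, prio 100, authtype simple, intvl 1s, length 20
06:36:08.762750 IP 192.168.233.128 > 224.0.0.112: VRRPv2, Advertisement, vrid 32, prio 100, authtype simple, intvl 1s, length 20
EOF
```

In a healthy pair each vrid should be advertised by a single source, since only the current master sends advertisements. Two sources for the same vrid would suggest that the nodes cannot hear each other, for example because of a firewall rule.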

Configuring a keepalived state-change alert script

Configure the mail-triggering script inside the vrrp_instance block of keepalived.conf
vrrp_instance VI_1 {
    state BACKUP
    interface eno16777736
    virtual_router_id 31
    priority 98
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass f1GDsVH6
    }
    virtual_ipaddress {
        192.168.233.199/24 dev eno16777736 label eno16777736:1
#        192.168.233.199/24    ## this shorter form also works; view the floating IP with `ip a l`
    }

    notify_master "/etc/keepalived/scripts/notify.sh master"      ## run when the state changes to master
    notify_backup "/etc/keepalived/scripts/notify.sh backup"     ## run when the state changes to backup
    notify_fault "/etc/keepalived/scripts/notify.sh fault"     ## run when the instance enters the fault state
}

############################## Script contents below ################################

[root@node2 scripts]# cat notify.sh 
#!/bin/bash
# Mail a notification whenever keepalived reports a VRRP state change.
contact='root@localhost'

# notify STATE: send a mail describing the transition to STATE
notify() {
        local mailsubject="$(hostname) to be $1, vip floating"
        local mailbody="$(date +'%F %T'): vrrp transition, $(hostname) changed to be $1"
        echo "$mailbody" | mail -s "$mailsubject" "$contact"
}

case "$1" in
master)
        notify master
        ;;
backup)
        notify backup
        ;;
fault)
        notify fault
        ;;
*)
        echo "Usage: $(basename "$0") {master|backup|fault}"
        exit 1
        ;;
esac

Health-checking each RS in the ipvs cluster (LVS+Keepalived dual-master)

See: https://blog.51cto.com/swiki/2342624

Calling scripts through the script interface to influence cluster behavior (Nginx+Keepalived dual-master)

Topology:

Host       IP               VIP
Client     192.168.2.1      -
NGINX1     192.168.2.128    192.168.2.198
NGINX2     192.168.2.129    192.168.2.199


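The check script relies on killall, which a minimal CentOS 7 install does not include, so the psmisc package has to be installed manually.
The original mail notification script is modified so that nginx is started automatically when the node becomes master.

Configuration: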

#########################  NGINX1  #########################
[root@node1 keepalived]# yum install nginx psmisc -y;systemctl start nginx;
[root@node1 keepalived]# echo "192.168.2.128" > /usr/share/nginx/html/index.html 
[root@node1 keepalived]# cat keepalived.conf
! Configuration File for keepalived

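global_defs {           ## mail alerts: for now just use the local mailbox; the built-in mail alerting is of limited use, and later we build alerts on keepalived's script hooks or rely on a dedicated tool such as zabbix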
   notification_email {
        root@localhost       
   }
   notification_email_from keepalived@localhost
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id node1           # set to the hostname; must be unique
   vrrp_mcast_group4 224.0.0.112     ## multicast group address
}

vrrp_script chk_nginx {
        script "killall -0 nginx && exit 0 || exit 1"
        interval 1
        weight -5
        fall 2
        rise 1
}

vrrp_instance VI_1 {
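    state MASTER           # initial state: MASTER | BACKUP
    interface eno16777736    ## physical interface that VRRP runs on and the floating IP binds to
    virtual_router_id 31          ## virtual router id; must be identical on the peer node
    priority 100                       ## priority
    advert_int 1                      ## advertisement (heartbeat) interval, default 1s
#    nopreempt                       ## non-preemptive mode (commented out, so preemption stays enabled)
    authentication {
        auth_type PASS
        auth_pass f1GDsVH6      ## VRRP password; must be identical on all nodes of the same virtual router
    }
    virtual_ipaddress {
        192.168.2.198/24 dev eno16777736 label eno16777736:1    ## the VIP address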
    }

    track_script {
        chk_nginx
    }

    notify_master "/etc/keepalived/scripts/notify.sh master"      ## run when the state changes to master
    notify_backup "/etc/keepalived/scripts/notify.sh backup"     ## run when the state changes to backup
    notify_fault "/etc/keepalived/scripts/notify.sh fault"     ## run when the instance enters the fault state
}

vrrp_instance VI_2 {
    state BACKUP
    interface eno16777736
    virtual_router_id 32
    priority 98
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass f1GDsV78
    }
    virtual_ipaddress {
        192.168.2.199/24 dev eno16777736 label eno16777736:2
    }
    notify_master "/etc/keepalived/scripts/notify.sh master"      ## run when the state changes to master
    notify_backup "/etc/keepalived/scripts/notify.sh backup"     ## run when the state changes to backup
    notify_fault "/etc/keepalived/scripts/notify.sh fault"     ## run when the instance enters the fault state
    track_script {
        chk_nginx
    }
}
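
The `vrrp_script chk_nginx` block in the configuration above works by adjusting priorities: with a negative `weight`, keepalived subtracts that amount from the instance priority while the check is failing, and with a positive `weight` it adds the amount while the check succeeds. The sketch below only models that arithmetic. `effective_priority` is a hypothetical helper, and the peer priority of 98 is an assumption based on the BACKUP instances shown in this article.

```shell
#!/bin/bash
# effective_priority BASE WEIGHT CHECK_OK
# Models how a tracked script changes the VRRP priority:
#   negative WEIGHT is applied while the check fails (CHECK_OK=0),
#   positive WEIGHT is applied while the check succeeds (CHECK_OK=1).
effective_priority() {
    local base="$1" weight="$2" check_ok="$3"
    if (( weight < 0 && check_ok == 0 )); then
        echo $(( base + weight ))
    elif (( weight > 0 && check_ok == 1 )); then
        echo $(( base + weight ))
    else
        echo "$base"
    fi
}

# VI_1 on NGINX1: priority 100, weight -5, nginx is down.
local_prio=$(effective_priority 100 -5 0)
# Assumed priority of the VI_1 BACKUP instance on the peer, nginx healthy there.
peer_prio=$(effective_priority 98 -5 1)

if (( local_prio < peer_prio )); then
    echo "VI_1 moves to the peer ($local_prio < $peer_prio)"
else
    echo "VI_1 stays local ($local_prio >= $peer_prio)"
fi
```

With `interval 1` and `fall 2`, the drop to 95 happens after two consecutive failed checks, roughly two seconds after nginx stops. The peer only takes over because preemption is left enabled in this configuration. If nginx were down on both nodes, both priorities would drop by 5 and the VIP would stay where it is.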

#########################  NGINX2  #########################

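The keepalived active-active configuration on NGINX2 mirrors NGINX1 with the roles reversed, so it is not repeated here.

Appendix: contents of the notify.sh script: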

[root@node1 keepalived]# cat scripts/notify.sh 
#!/bin/bash
# Mail a notification on every VRRP state change and start nginx on promotion.
contact='root@localhost'

# notify STATE: send a mail describing the transition to STATE
notify() {
        local mailsubject="$(hostname) to be $1, vip floating"
        local mailbody="$(date +'%F %T'): vrrp transition, $(hostname) changed to be $1"
        echo "$mailbody" | mail -s "$mailsubject" "$contact"
}

case "$1" in
master)
        # Make sure nginx is running before this node serves the VIP.
        systemctl start nginx
        notify master
        ;;
backup)
        #systemctl start nginx
        notify backup
        ;;
fault)
        notify fault
        ;;
*)
        echo "Usage: $(basename "$0") {master|backup|fault}"
        exit 1
        ;;
esac