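I. High-Availability Clusters
1. Concept:
A high-availability cluster (HA cluster) is a server-cluster technique that eliminates single points of failure in order to reduce service downtime.
It keeps the user's business applications serving requests without interruption, minimizing the business impact of failures caused by software, hardware, or human error.
The ultimate goal is uninterrupted 24/7 operation.
2. Metrics for measuring high availability:
Availability is usually assessed with three time values: MTTF, MTTR, and MTBF.
MTTF (Mean Time To Failure): the average time the system runs without failure, that is, the mean of all periods T1 from the moment the system starts running normally to the moment it fails. It measures reliability. MTTF = ΣT1/N, where N is the number of periods.
MTTR (Mean Time To Repair): the mean of all periods from the moment a failure occurs to the moment repair is complete (T2 + T3). It measures maintainability. MTTR = Σ(T2+T3)/N.
MTBF (Mean Time Between Failures): the mean of all periods between two consecutive failures. MTBF = Σ(T2+T3+T1)/N.
MTBF = MTTF + MTTR; availability HA = MTTF/(MTTF+MTTR) × 100%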
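The formulas above can be tried out on a small data set. The following is only a sketch: the function name `availability` and the sample uptime and repair periods are made up for illustration and do not come from any real system.

```python
def availability(uptimes_h, repair_times_h):
    """Return (MTTF, MTTR, MTBF, HA) from observed periods, all in hours.

    uptimes_h      -- the failure-free periods (T1 in the formulas above)
    repair_times_h -- the failure-to-repaired periods (T2 + T3 above)
    """
    if not uptimes_h or len(uptimes_h) != len(repair_times_h):
        raise ValueError("need the same non-zero number of uptime and repair periods")
    n = len(uptimes_h)
    mttf = sum(uptimes_h) / n
    mttr = sum(repair_times_h) / n
    mtbf = mttf + mttr
    return mttf, mttr, mtbf, mttf / mtbf


# Hypothetical observations: three failure cycles.
uptimes = [700.0, 800.0, 900.0]
repairs = [1.0, 2.0, 3.0]

mttf, mttr, mtbf, ha = availability(uptimes, repairs)
print(f"MTTF={mttf} h  MTTR={mttr} h  MTBF={mtbf} h  HA={ha * 100:.3f}%")
```

Because HA depends only on the ratio of MTTF to MTTR, shortening repair time improves availability just as effectively as lengthening the time between failures.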
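Availability level | Nines | Availability | Annual downtime |
Basic availability | 2 nines | 99% | about 87.6 h |
Higher availability | 3 nines | 99.9% | about 8.8 h |
Availability with automatic failure recovery | 4 nines | 99.99% | about 53 min |
Very high availability | 5 nines | 99.999% | about 5 min |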
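The annual downtime figures follow directly from the availability percentage. A minimal sketch, assuming a 365-day year (the constant and function names are chosen here for illustration):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 h, assuming a non-leap year


def annual_downtime_hours(availability_percent):
    """Hours per year a service may be down at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.0, 99.9, 99.99, 99.999):
    hours = annual_downtime_hours(pct)
    print(f"{pct}% -> {hours:.2f} h ({hours * 60:.1f} min)")
```

Each additional nine cuts the allowed downtime by a factor of ten, which is why the jump from three to four nines usually requires automatic rather than manual recovery.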
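3. How high availability is achieved
Single points of failure are the main problem high availability addresses, and the core design principle for solving them is redundancy. Primary and standby devices are configured to form a cluster, with each device being one node. When the master node fails, the backup node takes over the service; once the master node has been repaired, service switches back to it.
Redundancy alone is not enough, however. If switching over requires manual intervention, the time during which the system is unavailable grows. Automatic failover is therefore added so that the switch happens without an operator.
4. High-availability cluster software
(1) International: Red Hat (RHCS), Novell (Novell Cluster Service), SteelEye (LifeKeeper for Linux), and the open-source Keepalived
(2) China: ZTE (NewStart HA), Deepin (Deepin HA)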
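II. Keepalived
1. Concept:
Keepalived is a server high-availability solution that prevents a single point of failure in a server or its network from interrupting service. It was originally designed for the LVS load balancer, to manage and monitor the state of each service node in an LVS cluster. VRRP support was added later, which lets it act as a high-availability solution for other services as well.
2. How it works:
Keepalived is built around VRRP and introduces the concept of a virtual IP (VIP). A cluster of several servers is presented as a single virtual server, which serves clients through the VIP. Within the virtual server, the physical servers decide which one is master and which are backups, either by configuration or by election. The master holds the virtual server's IP and provides the network services; the servers that are not currently serving act as backups. The master sends heartbeat messages to the backups by multicast. If the master fails and a backup receives no heartbeat within the expected interval, an election is held and the winner takes over the master role and continues to provide service.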
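The heartbeat timeout described above can be illustrated with a small simulation. This is a simplified sketch, not keepalived's actual implementation: the function name `takeover_time`, the timestamps, and the fixed timeout value are all assumptions made for the example.

```python
def takeover_time(heartbeat_times, timeout):
    """Return the moment a backup would declare the master dead.

    heartbeat_times -- timestamps (seconds) of every heartbeat the backup received
    timeout         -- how long the backup waits after a heartbeat before taking over
    Returns None if no heartbeat was ever received.
    """
    last = None
    for t in sorted(heartbeat_times):
        # A gap longer than the timeout means the backup already took over
        # before this heartbeat arrived.
        if last is not None and t - last > timeout:
            return last + timeout
        last = t
    # No gap was long enough; the timer started by the final heartbeat
    # expires `timeout` seconds after it.
    return None if last is None else last + timeout


# Hypothetical trace: the master advertises once per second, then dies after t=3.
print(takeover_time([0, 1, 2, 3], timeout=3))        # backup takes over at t=6
# Hypothetical trace: heartbeats stop after t=2 and resume at t=10.
print(takeover_time([0, 1, 2, 10, 11], timeout=3))   # backup took over at t=5
```

The timeout is the trade-off to tune: a short value shortens the outage after a real failure, while a long value avoids spurious takeovers when a few heartbeats are merely delayed or lost.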
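III. VRRP
1. Concept:
VRRP stands for Virtual Router Redundancy Protocol. It was designed to remove the single point of failure of a statically configured route, so that the network keeps running without interruption when an individual node goes down.
2. How it works:
Devices with VRRP enabled form a group and elect a master and backups by comparing priorities. The master sends advertisement packets to the backups by multicast. When the master fails and the backups stop receiving its advertisements, the backups hold a new election and the winner takes over the service. When the original master recovers, it preempts the current master (the default behavior) and resumes serving as master.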
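The election and preemption behavior can be sketched in a few lines. This is an illustrative model only: the `Node` class and `elect_master` function are invented for the example, tie-breaking between equal priorities is left out, and the timing formula is the VRRPv2 one from RFC 3768 (master-down interval = 3 × advertisement interval + skew time, with skew time = (256 − priority)/256). The priorities 100 and 50 match the ones used in the configuration later in this article.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    priority: int
    alive: bool = True


def elect_master(nodes):
    """Pick the live node with the highest priority, or None if all are down."""
    candidates = [n for n in nodes if n.alive]
    return max(candidates, key=lambda n: n.priority, default=None)


def master_down_interval(priority, advert_int=1.0):
    """Seconds a VRRPv2 backup waits without advertisements before taking over."""
    skew_time = (256 - priority) / 256
    return 3 * advert_int + skew_time


master = Node("master", priority=100)
backup = Node("backup", priority=50)
nodes = [master, backup]

print(elect_master(nodes).name)   # normal operation
master.alive = False
print(elect_master(nodes).name)   # master failed, backup takes over
master.alive = True
print(elect_master(nodes).name)   # master recovered and preempts
print(master_down_interval(backup.priority))
```

The skew term makes higher-priority backups time out slightly earlier than lower-priority ones, so when several backups exist the best candidate tends to claim the master role first.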
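IV. Hands-on Keepalived Configuration
1. System environment
master server (primary) | 192.168.49.184 |
backup server (standby) | 192.168.49.185 |
virtual IP (VIP) | 192.168.49.186 |
2. Keepalived configuration
(1) Install the keepalived package on both the master and the backup server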
[root@master ~]# yum install -y keepalived.x86_64
已加载插件:fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* base: mirrors.aliyun.com
* extras: mirrors.aliyun.com
* updates: mirrors.aliyun.com
正在解决依赖关系
--> 正在检查事务
---> 软件包 keepalived.x86_64.0.1.3.5-19.el7 将被 安装
--> 解决依赖关系完成
依赖关系解决
================================================================================
Package 架构 版本 源 大小
================================================================================
正在安装:
keepalived x86_64 1.3.5-19.el7 base 332 k
事务概要
================================================================================
安装 1 软件包
总下载量:332 k
安装大小:1.0 M
Downloading packages:
keepalived-1.3.5-19.el7.x86_64.rpm | 332 kB 00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
正在安装 : keepalived-1.3.5-19.el7.x86_64 1/1
验证中 : keepalived-1.3.5-19.el7.x86_64 1/1
已安装:
keepalived.x86_64 0:1.3.5-19.el7
完毕!
[root@backup ~]# yum install -y keepalived.x86_64
已加载插件:fastestmirror, langpacks
base | 3.6 kB 00:00
extras | 2.9 kB 00:00
updates | 2.9 kB 00:00
(1/2): extras/7/x86_64/primary_db | 230 kB 00:01
(2/2): updates/7/x86_64/primary_db | 6.5 MB 00:01
Determining fastest mirrors
* base: mirrors.bfsu.edu.cn
* extras: mirrors.163.com
* updates: mirrors.bfsu.edu.cn
正在解决依赖关系
--> 正在检查事务
---> 软件包 keepalived.x86_64.0.1.3.5-19.el7 将被 安装
--> 正在处理依赖关系 ipset-libs >= 7.1,它被软件包 keepalived-1.3.5-19.el7.x86_64 需要
--> 正在处理依赖关系 libnetsnmpmibs.so.31()(64bit),它被软件包 keepalived-1.3.5-19.el7.x86_64 需要
--> 正在处理依赖关系 libnetsnmpagent.so.31()(64bit),它被软件包 keepalived-1.3.5-19.el7.x86_64 需要
--> 正在检查事务
---> 软件包 ipset-libs.x86_64.0.6.29-1.el7 将被 升级
--> 正在处理依赖关系 ipset-libs(x86-64) = 6.29-1.el7,它被软件包 ipset-6.29-1.el7.x86_64 需要
--> 正在处理依赖关系 libipset.so.3()(64bit),它被软件包 ipset-6.29-1.el7.x86_64 需要
--> 正在处理依赖关系 libipset.so.3(LIBIPSET_1.0)(64bit),它被软件包 ipset-6.29-1.el7.x86_64 需要
--> 正在处理依赖关系 libipset.so.3(LIBIPSET_2.0)(64bit),它被软件包 ipset-6.29-1.el7.x86_64 需要
--> 正在处理依赖关系 libipset.so.3(LIBIPSET_3.0)(64bit),它被软件包 ipset-6.29-1.el7.x86_64 需要
---> 软件包 ipset-libs.x86_64.0.7.1-1.el7 将被 更新
---> 软件包 net-snmp-agent-libs.x86_64.1.5.7.2-49.el7_9.1 将被 安装
--> 正在处理依赖关系 net-snmp-libs = 1:5.7.2-49.el7_9.1,它被软件包 1:net-snmp-agent-libs-5.7.2-49.el7_9.1.x86_64 需要
--> 正在检查事务
---> 软件包 ipset.x86_64.0.6.29-1.el7 将被 升级
---> 软件包 ipset.x86_64.0.7.1-1.el7 将被 更新
---> 软件包 net-snmp-libs.x86_64.1.5.7.2-28.el7 将被 升级
---> 软件包 net-snmp-libs.x86_64.1.5.7.2-49.el7_9.1 将被 更新
--> 解决依赖关系完成
依赖关系解决
================================================================================
Package 架构 版本 源 大小
================================================================================
正在安装:
keepalived x86_64 1.3.5-19.el7 base 332 k
为依赖而安装:
net-snmp-agent-libs x86_64 1:5.7.2-49.el7_9.1 updates 707 k
为依赖而更新:
ipset x86_64 7.1-1.el7 base 39 k
ipset-libs x86_64 7.1-1.el7 base 64 k
net-snmp-libs x86_64 1:5.7.2-49.el7_9.1 updates 751 k
事务概要
================================================================================
安装 1 软件包 (+1 依赖软件包)
升级 ( 3 依赖软件包)
总下载量:1.8 M
Downloading packages:
No Presto metadata available for base
No Presto metadata available for updates
(1/5): ipset-7.1-1.el7.x86_64.rpm | 39 kB 00:00
(2/5): ipset-libs-7.1-1.el7.x86_64.rpm | 64 kB 00:00
(3/5): net-snmp-agent-libs-5.7.2-49.el7_9.1.x86_64.rpm | 707 kB 00:00
(4/5): keepalived-1.3.5-19.el7.x86_64.rpm | 332 kB 00:00
(5/5): net-snmp-libs-5.7.2-49.el7_9.1.x86_64.rpm | 751 kB 00:01
--------------------------------------------------------------------------------
总计 1.2 MB/s | 1.8 MB 00:01
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
正在更新 : ipset-libs-7.1-1.el7.x86_64 1/8
正在更新 : 1:net-snmp-libs-5.7.2-49.el7_9.1.x86_64 2/8
正在安装 : 1:net-snmp-agent-libs-5.7.2-49.el7_9.1.x86_64 3/8
正在安装 : keepalived-1.3.5-19.el7.x86_64 4/8
正在更新 : ipset-7.1-1.el7.x86_64 5/8
清理 : ipset-6.29-1.el7.x86_64 6/8
清理 : ipset-libs-6.29-1.el7.x86_64 7/8
清理 : 1:net-snmp-libs-5.7.2-28.el7.x86_64 8/8
验证中 : keepalived-1.3.5-19.el7.x86_64 1/8
验证中 : ipset-7.1-1.el7.x86_64 2/8
验证中 : 1:net-snmp-agent-libs-5.7.2-49.el7_9.1.x86_64 3/8
验证中 : 1:net-snmp-libs-5.7.2-49.el7_9.1.x86_64 4/8
验证中 : ipset-libs-7.1-1.el7.x86_64 5/8
验证中 : ipset-libs-6.29-1.el7.x86_64 6/8
验证中 : ipset-6.29-1.el7.x86_64 7/8
验证中 : 1:net-snmp-libs-5.7.2-28.el7.x86_64 8/8
已安装:
keepalived.x86_64 0:1.3.5-19.el7
作为依赖被安装:
net-snmp-agent-libs.x86_64 1:5.7.2-49.el7_9.1
作为依赖被升级:
ipset.x86_64 0:7.1-1.el7 ipset-libs.x86_64 0:7.1-1.el7
net-snmp-libs.x86_64 1:5.7.2-49.el7_9.1
完毕!
(2) Edit the /etc/keepalived/keepalived.conf configuration file on both the master and the backup
[root@master ~]# cd /etc/keepalived/
[root@master keepalived]# cp keepalived.conf keepalived.conf.bak
[root@master keepalived]# vim keepalived.conf
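! Configuration File for keepalived   # lines starting with ! or # are comments
global_defs {                         # global configuration section
    notification_email {              # recipients of notification emails
        [email protected]
        [email protected]
        [email protected]
    }
    notification_email_from [email protected]   # sender address of notification emails
    smtp_server 192.168.200.1         # IP address or hostname of the SMTP server
    smtp_connect_timeout 30           # give up if the SMTP server cannot be reached within 30 s
    router_id master                  # identifier of this host in the HA cluster (unique)
    vrrp_skip_check_adv_addr
    vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_instance VI_1 {                  # VRRP instance configuration section
    state MASTER                      # initial role of this node in the instance (MASTER/BACKUP)
    interface ens33                   # network interface the VIP is bound to
    virtual_router_id 51              # ID of the VRRP instance; must match on master and backup
    priority 100                      # priority (the higher value wins the election)
    advert_int 1                      # interval between VRRP advertisements, in seconds
    authentication {                  # authentication between the nodes
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {               # the VIP
        192.168.49.186
    }
}
#virtual_server 192.168.200.100 443 {   # LVS virtual-server section; can be deleted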
# delay_loop 6
# lb_algo rr
# lb_kind NAT
# persistence_timeout 50
# protocol TCP
# real_server 192.168.201.100 443 {
# weight 1
# SSL_GET {
# url {
# path /
# digest ff20ad2481f97b1754ef3e12ecd3a9cc
# }
# url {
# path /mrtg/
# digest 9b3a0c85a887a256d6939da88aabd8cd
# }
# connect_timeout 3
# nb_get_retry 3
# delay_before_retry 3
# }
# }
[root@backup ~]# cd /etc/keepalived/
[root@backup keepalived]# cp keepalived.conf keepalived.conf.bak
[root@backup keepalived]# vim keepalived.conf
! Configuration File for keepalived
global_defs {
    notification_email {
        [email protected]
        [email protected]
        [email protected]
    }
    notification_email_from [email protected]
    smtp_server 192.168.200.1
    smtp_connect_timeout 30
    router_id BACKUP
    vrrp_skip_check_adv_addr
    vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.49.186
    }
}
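The two configuration files differ only in router_id, state, and priority. One way to make that explicit is to render both from a single template. The sketch below is an assumption-laden illustration: the `render` helper and the trimmed-down template (which omits the email settings and the extra global options) are not part of keepalived, and the default interface, virtual router ID, and VIP are simply the values used in this article.

```python
TEMPLATE = """\
global_defs {{
    router_id {router_id}
}}

vrrp_instance VI_1 {{
    state {state}
    interface {interface}
    virtual_router_id {vrid}
    priority {priority}
    advert_int 1
    authentication {{
        auth_type PASS
        auth_pass 1111
    }}
    virtual_ipaddress {{
        {vip}
    }}
}}
"""


def render(router_id, state, priority,
           interface="ens33", vrid=51, vip="192.168.49.186"):
    """Render a minimal keepalived.conf for one node (hypothetical helper)."""
    if state not in ("MASTER", "BACKUP"):
        raise ValueError("state must be MASTER or BACKUP (upper case)")
    if not 1 <= priority <= 255:
        raise ValueError("priority must be between 1 and 255")
    return TEMPLATE.format(router_id=router_id, state=state, priority=priority,
                           interface=interface, vrid=vrid, vip=vip)


master_conf = render("master", "MASTER", 100)
backup_conf = render("BACKUP", "BACKUP", 50)

# Show only the lines that differ between the two nodes.
for m, b in zip(master_conf.splitlines(), backup_conf.splitlines()):
    if m != b:
        print(f"{m.strip():<20} | {b.strip()}")
```

Everything else, in particular virtual_router_id, the authentication block, and the VIP, has to be identical on both nodes so that they join the same VRRP instance.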
(3) Start the service
[root@master keepalived]# systemctl start keepalived
[root@master keepalived]# systemctl enable keepalived
[root@backup keepalived]# systemctl start keepalived
[root@backup keepalived]# systemctl enable keepalived
3. Verification
(1) Check that the VIP is reachable and inspect the network interfaces on both nodes
[root@master keepalived]# ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:ec:e1:f6 brd ff:ff:ff:ff:ff:ff
inet 192.168.49.184/22 brd 192.168.51.255 scope global ens33
valid_lft forever preferred_lft forever
inet 192.168.49.186/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::dac1:9ced:f76b:d918/64 scope link
valid_lft forever preferred_lft forever
3: virbr0: mtu 1500 qdisc noqueue state DOWN qlen 1000
link/ether 52:54:00:64:52:cc brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
4: virbr0-nic: mtu 1500 qdisc pfifo_fast master virbr0 state DOWN qlen 1000
link/ether 52:54:00:64:52:cc brd ff:ff:ff:ff:ff:ff
[root@master keepalived]# ping 192.168.49.186
PING 192.168.49.186 (192.168.49.186) 56(84) bytes of data.
64 bytes from 192.168.49.186: icmp_seq=1 ttl=64 time=0.048 ms
64 bytes from 192.168.49.186: icmp_seq=2 ttl=64 time=0.099 ms
64 bytes from 192.168.49.186: icmp_seq=3 ttl=64 time=0.070 ms
64 bytes from 192.168.49.186: icmp_seq=4 ttl=64 time=0.069 ms
^C
--- 192.168.49.186 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3000ms
rtt min/avg/max/mdev = 0.048/0.071/0.099/0.020 ms
[root@backup keepalived]# ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:68:14:41 brd ff:ff:ff:ff:ff:ff
inet 192.168.49.185/22 brd 192.168.51.255 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::8613:4934:eafb:1eb6/64 scope link
valid_lft forever preferred_lft forever
inet6 fe80::dac1:9ced:f76b:d918/64 scope link tentative dadfailed
valid_lft forever preferred_lft forever
3: virbr0: mtu 1500 qdisc noqueue state DOWN qlen 1000
link/ether 52:54:00:64:52:cc brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
4: virbr0-nic: mtu 1500 qdisc pfifo_fast master virbr0 state DOWN qlen 1000
link/ether 52:54:00:64:52:cc brd ff:ff:ff:ff:ff:ff
[root@backup keepalived]# ping 192.168.49.186
PING 192.168.49.186 (192.168.49.186) 56(84) bytes of data.
64 bytes from 192.168.49.186: icmp_seq=1 ttl=64 time=0.495 ms
64 bytes from 192.168.49.186: icmp_seq=2 ttl=64 time=0.777 ms
64 bytes from 192.168.49.186: icmp_seq=3 ttl=64 time=0.811 ms
64 bytes from 192.168.49.186: icmp_seq=4 ttl=64 time=0.792 ms
64 bytes from 192.168.49.186: icmp_seq=5 ttl=64 time=0.980 ms
^C
--- 192.168.49.186 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4002ms
rtt min/avg/max/mdev = 0.495/0.771/0.980/0.156 ms
(2) Simulate a failure of the master, then check VIP connectivity and which node now holds the VIP
[root@master keepalived]# systemctl stop network
[root@backup keepalived]# ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:68:14:41 brd ff:ff:ff:ff:ff:ff
inet 192.168.49.185/22 brd 192.168.51.255 scope global ens33
valid_lft forever preferred_lft forever
inet 192.168.49.186/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::8613:4934:eafb:1eb6/64 scope link
valid_lft forever preferred_lft forever
inet6 fe80::dac1:9ced:f76b:d918/64 scope link tentative dadfailed
valid_lft forever preferred_lft forever
3: virbr0: mtu 1500 qdisc noqueue state DOWN qlen 1000
link/ether 52:54:00:64:52:cc brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
4: virbr0-nic: mtu 1500 qdisc pfifo_fast master virbr0 state DOWN qlen 1000
link/ether 52:54:00:64:52:cc brd ff:ff:ff:ff:ff:ff
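Checking by eye which node holds the VIP can be automated by scanning the `ip a` output for the VIP. The following is a rough sketch: the function name `holds_vip` is made up, and the two sample strings are short excerpts of the backup's ens33 addresses before and after the failover shown above.

```python
import re


def holds_vip(ip_addr_output, vip, interface):
    """Return True if `vip` is listed as an inet address on `interface`."""
    pattern = re.compile(
        r"^\s*inet\s+" + re.escape(vip) + r"/\d+\b.*\b" + re.escape(interface) + r"\s*$"
    )
    return any(pattern.match(line) for line in ip_addr_output.splitlines())


# Excerpt of the backup's addresses before the master was stopped.
before = """\
inet 192.168.49.185/22 brd 192.168.51.255 scope global ens33
valid_lft forever preferred_lft forever
"""

# Excerpt of the backup's addresses after the master was stopped.
after = """\
inet 192.168.49.185/22 brd 192.168.51.255 scope global ens33
valid_lft forever preferred_lft forever
inet 192.168.49.186/32 scope global ens33
valid_lft forever preferred_lft forever
"""

print(holds_vip(before, "192.168.49.186", "ens33"))
print(holds_vip(after, "192.168.49.186", "ens33"))
```

On a real node the text could come from running `ip -4 addr show dev ens33`, for example through `subprocess.run`, and passing its standard output to the function.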
(3) Try verifying with a web service while the master is cut off
V. References: