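Every high-availability (HA) cluster has to deal with split-brain. The nodes are meant to cooperate as a single system, but when they lose contact each one concludes that its peer has failed, and the cluster splits into two independent systems. The split nodes then compete for resources, which leads to chaos and corrupted data. For stateful services such as databases, split-brain must be strictly prevented.
In Keepalived, a BACKUP host promotes itself to master as soon as it stops receiving advertisements from the MASTER host. If the cause is a fault on the link between them, so that neither can receive the other's multicast advertisements while both nodes are actually healthy, both nodes become master and bind the virtual IP, with unpredictable results. This is split-brain, and it is the first problem an HA setup has to solve.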
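1) Two keepalived nodes that can reach each other directly
Add more detection paths, such as a redundant heartbeat link (two NICs used for health monitoring) or pinging the peer, to reduce the chance of split-brain. (This treats the symptom rather than the cause; it only raises the probability that a failure is detected correctly.)
2) Three keepalived nodes
Rely on an algorithm, for example a voting mechanism (keepalived does not implement one).
3) Four machines
Set up arbitration. If neither side can be trusted, rely on a third party, for example a lock on a shared disk or pinging the gateway. (Each method needs its own analysis.)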
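The arbitration idea in 3) can be sketched as a small decision helper. This is only an illustration under assumptions: `decide_action` and the strings it prints are hypothetical names, keepalived does not ship such a function, and the two inputs would come from whatever probes you choose (for example a ping to the peer and a ping to the gateway).

```shell
#!/bin/sh
# Hypothetical arbitration helper (not part of keepalived).
# decide_action PEER_OK GATEWAY_OK
#   PEER_OK    : 1 if the peer's heartbeat is still seen, 0 otherwise
#   GATEWAY_OK : 1 if the third party (e.g. the gateway) answers, 0 otherwise
decide_action() {
    peer_ok="$1"
    gateway_ok="$2"
    if [ "$gateway_ok" -eq 0 ]; then
        # The arbiter is unreachable: assume this node is the isolated side.
        echo "release-vip"
    elif [ "$peer_ok" -eq 0 ]; then
        # The arbiter answers but the peer is silent: the peer is probably down.
        echo "take-vip"
    else
        # Both are visible: nothing to change.
        echo "no-change"
    fi
}

# A probe could look like this (the gateway address is a placeholder):
# ping -c 1 -W 1 192.168.93.2 >/dev/null 2>&1 && gw=1 || gw=0
```

The point of the design is that a node which cannot reach the third party gives up the VIP instead of claiming it, so two isolated nodes never both act as master.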
keepalived is a service used in cluster management to keep a cluster highly available. Its role is similar to heartbeat: it prevents single points of failure.
keepalived is built on VRRP (Virtual Router Redundancy Protocol), a protocol for making routers highly available. N routers that provide the same function form a router group with one master and several backups. The master holds a VIP that serves clients (other machines on the router's LAN use this VIP as their default route) and sends multicast advertisements. When the backups stop receiving VRRP packets they consider the master down, and a new master is elected among the backups according to VRRP priority. This keeps the router function highly available.
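The priority-based election can be illustrated with a few lines of shell. This is a simplified sketch, not keepalived's code: `elect_master` is a hypothetical helper that takes `name:priority` pairs and prints the name with the highest priority. Real VRRP breaks priority ties by comparing the senders' IP addresses, while this sketch simply keeps the first entry.

```shell
#!/bin/sh
# elect_master NAME:PRIORITY...
# Prints the name whose priority is highest (first one wins on a tie).
elect_master() {
    best_name=""
    best_prio=-1
    for entry in "$@"; do
        name="${entry%%:*}"   # text before the first colon
        prio="${entry##*:}"   # text after the last colon
        if [ "$prio" -gt "$best_prio" ]; then
            best_name="$name"
            best_prio="$prio"
        fi
    done
    echo "$best_name"
}

elect_master web1:100 web2:99   # prints web1
```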
Environment:
web1: 192.168.93.137
web2: 192.168.93.138
vip: 192.168.93.222
client: 192.168.93.140
yum -y install nginx
echo "<h1>137</h1>" >/usr/share/nginx/html/index.html    # use 138 on web2
sed -ri '/^SELINUX=/cSELINUX=disabled' /etc/selinux/config && setenforce 0
systemctl stop firewalld && systemctl disable firewalld
yum -y install keepalived
cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id Director1
}
#vrrp_script chk_nginx {
# script "/etc/keepalived/ck_ng.sh"
# interval 2
# weight -5
# fall 3
#}
vrrp_instance VI_1 {
state MASTER
interface ens33
mcast_src_ip 192.168.93.137
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1234
}
virtual_ipaddress {
192.168.93.222/24
}
# track_script {
# chk_nginx
# }
}
scp /etc/keepalived/keepalived.conf root@192.168.93.138:/etc/keepalived/    # copy the config to web2
Change on web2:
state MASTER to state BACKUP
mcast_src_ip 192.168.93.137 to mcast_src_ip 192.168.93.138
priority 100 to priority 99
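The three manual changes can also be applied with sed. The helper below is a sketch under assumptions: `to_backup_conf` is a hypothetical name, it reads a master configuration on standard input, and it rewrites every matching `state`, `mcast_src_ip`, and `priority` line, so it only fits a file with a single vrrp_instance like the one above.

```shell
#!/bin/sh
# to_backup_conf BACKUP_IP BACKUP_PRIORITY < master.conf > backup.conf
to_backup_conf() {
    backup_ip="$1"
    backup_prio="$2"
    sed -E \
        -e 's/^([[:space:]]*state[[:space:]]+)MASTER/\1BACKUP/' \
        -e "s/^([[:space:]]*mcast_src_ip[[:space:]]+).*/\1${backup_ip}/" \
        -e "s/^([[:space:]]*priority[[:space:]]+)[0-9]+/\1${backup_prio}/"
}

# Example, run on web2 after copying the file:
# to_backup_conf 192.168.93.138 99 < keepalived.conf > keepalived.conf.new
```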
curl 192.168.93.222    # VIP
<h1>137</h1>
Stop the keepalived service on web1, or take web1's network down (to simulate a web1 failure)
curl 192.168.93.222
<h1>138</h1>    # automatically switched to web2's page
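To watch a failover from the client, it helps to poll the VIP repeatedly and count which page comes back. The following is a minimal sketch: `poll_vip` is a hypothetical helper that runs any fetch command a given number of times and tallies the distinct responses, printing `NO-RESPONSE` whenever the command fails.

```shell
#!/bin/sh
# poll_vip COUNT CMD [ARGS...]
# Runs CMD COUNT times and prints how often each distinct response was seen.
poll_vip() {
    count="$1"
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" 2>/dev/null || echo "NO-RESPONSE"
        i=$((i + 1))
    done | sort | uniq -c
}

# Example from the client while web1 is being stopped:
# poll_vip 20 curl -s --max-time 1 192.168.93.222
```

During a switchover the tally would be expected to show a mix of both pages and possibly a few `NO-RESPONSE` lines for requests that hit the gap.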
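If nginx stops for some other reason while keepalived keeps running, the VIP stays on that node and clients can no longer reach the site. To avoid this, keepalived can run a script that watches over nginx.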
cat /etc/keepalived/ck_ng.sh    # watchdog script
#!/bin/bash
# Check whether any nginx process exists
counter=$(ps -C nginx --no-heading | wc -l)
if [ "${counter}" = "0" ]; then
    # Try to start nginx once, wait 5 seconds, then check again
    systemctl restart nginx
    sleep 5
    counter=$(ps -C nginx --no-heading | wc -l)
    if [ "${counter}" = "0" ]; then
        # nginx did not come back: stop keepalived to trigger a master/backup switch
        systemctl stop keepalived
    fi
fi
chmod a+x /etc/keepalived/ck_ng.sh    # make the script executable
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived    # comment line ("!" and "#" both start a comment)
# Part 1: global definitions
global_defs {
notification_email {
root@localhost    # recipient of the notification emails keepalived sends on a state change
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
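router_id Director1    # identifier of the machine running keepalived; unique within the cluster
}
# Health check
vrrp_script chk_nginx {    # nginx watchdog script
script "/etc/keepalived/ck_ng.sh"    # check script, absolute path
interval 2    # run the check every 2 seconds
weight -5    # lower the priority by 5 while the check is failing
fall 3    # 3 consecutive failures mark the check as failed
}
# Instance configuration
vrrp_instance VI_1 {    # instance VI_1
state MASTER    # initial role: master
interface ens33    # interface VRRP runs on
mcast_src_ip 192.168.93.137    # source address for advertisements (this host's IP)
virtual_router_id 51    # virtual router ID; must match on master and backup
priority 100    # priority; the highest value wins the election
advert_int 1    # advertisement (heartbeat) interval in seconds
authentication {    # authentication; keeps other devices from joining this group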
auth_type PASS
auth_pass 1234
}
virtual_ipaddress {    # VIP
192.168.93.222/24
}
track_script {    # script that monitors the nginx service
chk_nginx    # must match the vrrp_script name
}
}
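How `weight` interacts with `priority` can be worked through with a little arithmetic. The sketch below assumes keepalived's usual rule: a negative weight is subtracted while the tracked script is failing, and a positive weight is added while it is succeeding. `effective_priority` is a hypothetical helper, not a keepalived command. Keep in mind that the adjustment only happens when the check script exits non-zero.

```shell
#!/bin/sh
# effective_priority BASE WEIGHT SCRIPT_OK
#   SCRIPT_OK: 1 if the tracked script currently succeeds, 0 if it fails
effective_priority() {
    base="$1"
    weight="$2"
    script_ok="$3"
    if [ "$weight" -lt 0 ] && [ "$script_ok" -eq 0 ]; then
        echo $((base + weight))   # failing check with negative weight: lower
    elif [ "$weight" -gt 0 ] && [ "$script_ok" -eq 1 ]; then
        echo $((base + weight))   # passing check with positive weight: raise
    else
        echo "$base"
    fi
}

effective_priority 100 -5 0   # prints 95
```

With the values used here, a failing check drops the master from 100 to 95, which is below the backup's 99, so the backup would take over the VIP.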
Environment:
lvs1: 192.168.93.137
lvs2: 192.168.93.138
vip: 192.168.93.222
web1: 192.168.93.139
web2: 192.168.93.140
client: 192.168.93.136
yum -y install keepalived ipvsadm    # install on both LVS hosts
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id Director1
}
vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 51
priority 100    # priority; any value from 1 to 255
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.93.222/24 dev ens33    # VIP
}
}
virtual_server 192.168.93.222 80 {
delay_loop 3    # health-check polling interval, in seconds
lb_algo rr    # scheduling algorithm: round robin
lb_kind DR    # LVS forwarding mode: direct routing
protocol TCP
real_server 192.168.93.139 80 {
weight 1
TCP_CHECK {
connect_timeout 3
}
}
real_server 192.168.93.140 80 {
weight 1
TCP_CHECK {
connect_timeout 3
}
}
}
scp /etc/keepalived/keepalived.conf root@node-3:/etc/keepalived/
Changes needed on the backup:
state MASTER to state BACKUP
priority 100 to priority 80 (any value below 100 works)
yum -y install nginx
echo web1111111111111 >/usr/share/nginx/html/index.html    # set a home page that makes the test result obvious
ifconfig lo:0 192.168.93.222/32    # bind the VIP
# Configure ARP
echo 1 >/proc/sys/net/ipv4/conf/all/arp_ignore
echo 1 >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo 2 >/proc/sys/net/ipv4/conf/all/arp_announce
echo 2 >/proc/sys/net/ipv4/conf/lo/arp_announce
Note: the settings above are temporary and are lost when the machine reboots.
To make them permanent, do the following (on both web1 and web2):
cp /etc/sysconfig/network-scripts/{ifcfg-lo,ifcfg-lo:0}
vim /etc/sysconfig/network-scripts/ifcfg-lo:0
DEVICE=lo:0
IPADDR=192.168.93.222
NETMASK=255.255.255.255
ONBOOT=yes
vim /etc/sysctl.conf
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.lo.arp_announce = 2
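The four ARP settings follow one pattern, so they can be generated and applied in a loop. This is a sketch: `arp_settings` is a hypothetical helper that prints the `key=value` pairs for `all` plus one interface, and the output can be fed to `sysctl -w`, which needs root.

```shell
#!/bin/sh
# arp_settings [IFACE]
# Prints the ARP sysctl settings a DR real server needs (IFACE defaults to lo).
#   arp_ignore=1   : answer ARP only if the target IP is configured on the
#                    interface the request arrived on (so the VIP on lo stays silent)
#   arp_announce=2 : always use the best local address as the ARP source,
#                    never the VIP
arp_settings() {
    for iface in all "${1:-lo}"; do
        echo "net.ipv4.conf.${iface}.arp_ignore=1"
        echo "net.ipv4.conf.${iface}.arp_announce=2"
    done
}

# Apply at runtime (as root):
# arp_settings lo | xargs -n1 sysctl -w
```

After editing /etc/sysctl.conf, `sysctl -p` loads the file without a reboot.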
systemctl start nginx
curl 192.168.93.222
ipvsadm -L    # show the LVS forwarding table
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP node-2:http rr
-> node-4:http Route 1 0 2
-> 192.168.93.140:http Route 1 0 3
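The listing can also be checked from a script, for instance to confirm how many real servers are currently in the table. The snippet below is a sketch: `count_real_servers` is a hypothetical helper that reads `ipvsadm -L` style output on standard input and counts the `->` lines, skipping the column header.

```shell
#!/bin/sh
# count_real_servers < ipvsadm-output
# Counts real-server rows: lines whose first field is "->", header excluded.
count_real_servers() {
    awk '$1 == "->" && $2 != "RemoteAddress:Port" { n++ } END { print n + 0 }'
}

# Example on a director (as root):
# ipvsadm -Ln | count_real_servers
```

When a real server fails its TCP_CHECK, keepalived removes it from the table, so the count would be expected to drop by one.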
ip a    # run after stopping keepalived on the master
The VIP 192.168.93.222 sits on the master; if the master goes down, the VIP moves to the backup.
# stop nginx on web1
On the client: curl 192.168.93.222
The dual-master cluster is deployed on top of the single-master cluster.
Environment:
lvs1: 192.168.93.137
lvs2: 192.168.93.138
vip: 192.168.93.222
vip: 192.168.93.223
web1: 192.168.93.139
web2: 192.168.93.140
web3: 192.168.93.141
web4: 192.168.93.142
client: 192.168.93.136
vip: 192.168.93.222  instance: VI_1  lvs1 is MASTER, lvs2 is BACKUP
vip: 192.168.93.223  instance: VI_2  lvs1 is BACKUP, lvs2 is MASTER
vim /etc/keepalived/keepalived.conf    # on lvs1
#### Instance where lvs1 is MASTER and lvs2 is BACKUP
! Configuration File for keepalived
global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id Director1
}
vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 51
priority 100
advert_int 2
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.93.222/24 dev ens33
}
}
virtual_server 192.168.93.222 80 {
delay_loop 3
lb_algo rr
lb_kind DR
protocol TCP
real_server 192.168.93.139 80 {
weight 1
TCP_CHECK {
connect_timeout 3
}
}
real_server 192.168.93.140 80 {
weight 1
TCP_CHECK {
connect_timeout 3
}
}
}
#### Instance where lvs1 is BACKUP and lvs2 is MASTER
vrrp_instance VI_2 {
state BACKUP
interface ens33
virtual_router_id 55
priority 90
advert_int 2
authentication {
auth_type PASS
auth_pass 1234
}
virtual_ipaddress {
192.168.93.223/24 dev ens33
}
}
virtual_server 192.168.93.223 80 {
delay_loop 3
lb_algo rr
lb_kind DR
protocol TCP
real_server 192.168.93.141 80 {
weight 1
TCP_CHECK {
connect_timeout 3
}
}
real_server 192.168.93.142 80 {
weight 1
TCP_CHECK {
connect_timeout 3
}
}
}
vim /etc/keepalived/keepalived.conf    # on lvs2
#### Instance where lvs1 is MASTER and lvs2 is BACKUP
! Configuration File for keepalived
global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id Director1
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
priority 90
advert_int 2
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.93.222/24 dev ens33
}
}
virtual_server 192.168.93.222 80 {
delay_loop 3
lb_algo rr
lb_kind DR
protocol TCP
real_server 192.168.93.139 80 {
weight 1
TCP_CHECK {
connect_timeout 3
}
}
real_server 192.168.93.140 80 {
weight 1
TCP_CHECK {
connect_timeout 3
}
}
}
#### Instance where lvs1 is BACKUP and lvs2 is MASTER
vrrp_instance VI_2 {
state MASTER
interface ens33
virtual_router_id 55
priority 100
advert_int 2
authentication {
auth_type PASS
auth_pass 1234
}
virtual_ipaddress {
192.168.93.223/24 dev ens33
}
}
virtual_server 192.168.93.223 80 {
delay_loop 3
lb_algo rr
lb_kind DR
protocol TCP
real_server 192.168.93.141 80 {
weight 1
TCP_CHECK {
connect_timeout 3
}
}
real_server 192.168.93.142 80 {
weight 1
TCP_CHECK {
connect_timeout 3
}
}
}
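ifconfig lo:1 192.168.93.223/32    # lo:0 already carries VIP 192.168.93.222
Notes:
1. There must be two instances, VI_1 and VI_2.
2. The two clusters need different virtual_router_id values; within each cluster, master and backup share the same virtual_router_id.
3. There must be two VIPs.
4. In each instance, the MASTER's priority must be higher than the BACKUP's.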
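Note 2 is easy to get wrong when copying instance blocks, so a quick check on each director can help. The following is a sketch: `duplicate_vrids` is a hypothetical helper that lists any virtual_router_id value appearing more than once in a configuration file. Empty output means every instance in that file has its own ID.

```shell
#!/bin/sh
# duplicate_vrids CONFIG_FILE
# Prints each virtual_router_id value that occurs more than once in the file.
duplicate_vrids() {
    awk '$1 == "virtual_router_id" { print $2 }' "$1" | sort | uniq -d
}

# Example:
# duplicate_vrids /etc/keepalived/keepalived.conf
```

The check only covers one file; that VI_1 and VI_2 use the same IDs on lvs1 and lvs2 still has to be compared across the two hosts.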