1. Install CentOS 7 with a graphical user interface on VMware. Download:
https://www.centos.org/download/
Version: CentOS-7-x86_64-Everything-1804.iso
2. Keepalived source package. Download:
http://www.keepalived.org/download.html
Version: keepalived-2.0.8.tar.gz
This article uses two CentOS virtual machines, keepalived1 (IP: 192.168.197.146) and keepalived2 (IP: 192.168.197.147), with 192.168.197.148 as the virtual IP.
To install Keepalived, upload the source package to the Linux server and extract it:
[root@keepalived1 ~]# tar -zxvf keepalived-2.0.8.tar.gz
[root@keepalived1 ~]# cd keepalived-2.0.8
Run the configure script:
[root@keepalived1 keepalived-2.0.8]# ./configure --prefix=/usr/local/keepalived
Here /usr/local/keepalived is the installation prefix.
The configure step may fail with:
configure: error:
!!! OpenSSL is not properly installed on your system. !!!
!!! Can not include OpenSSL headers files. !!!
This error means the openssl-devel package is missing; installing it resolves the problem:
yum install openssl-devel
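The same prerequisites can be checked before running configure. The following is only a minimal sketch: the helper names missing_commands and has_openssl_headers are hypothetical, and the header path /usr/include/openssl/ssl.h is an assumption about a typical CentOS 7 layout, not something taken from the Keepalived build system.

```shell
#!/bin/sh
# Hypothetical pre-flight helper: print each named command that is not on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Hypothetical helper: succeed when the OpenSSL headers are present.
# The default path is an assumption for CentOS 7 with openssl-devel installed.
has_openssl_headers() {
    [ -f "${1:-/usr/include/openssl/ssl.h}" ]
}

# Example use before ./configure:
#   missing_commands gcc make
#   has_openssl_headers || echo "install openssl-devel first"
```

If missing_commands prints nothing and has_openssl_headers succeeds, the OpenSSL error above should not appear, although configure may still check for other things.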
Once configure succeeds, compile and install:
[root@keepalived1 keepalived-2.0.8]# make
[root@keepalived1 keepalived-2.0.8]# make install
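After installation, the etc directory under the installation prefix contains sample configuration files for a variety of scenarios.
This article starts with the simplest possible configuration and then adds options step by step to show what each one does, ending with a practical Keepalived configuration.
keepalived1 server: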
! Configuration File for keepalived
global_defs {
# Name identifying this node, used to tell nodes apart in alert notifications
router_id SERVER_146
}
vrrp_instance VI_1 {
# Initial state, either MASTER or BACKUP (must be upper case); MASTER is the active node, BACKUP the standby
state MASTER
# Network interface that serves traffic, i.e. the interface the virtual IP is bound to; verify the name carefully, for example with ifconfig
interface ens32
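# Virtual router ID (1 to 255); it must be identical on every node of a group, and nodes sharing a VRID form one group. The last octet of the virtual IP is a convenient choice. The VRID also determines the VRRP virtual MAC address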
virtual_router_id 148
# Node priority, configurable from 1 to 254; the MASTER must be higher than the BACKUP
priority 100
# Interval between the VRRP advertisements exchanged by MASTER and BACKUP, in seconds
advert_int 1
# Virtual IP addresses, one per line; several may be listed, and the netmask may be omitted
virtual_ipaddress {
192.168.197.148
}
}
keepalived2 server:
! Configuration File for keepalived
global_defs {
router_id SERVER_147
}
vrrp_instance VI_1 {
state BACKUP
interface ens32
virtual_router_id 148
priority 90
advert_int 1
virtual_ipaddress {
192.168.197.148
}
}
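Because the two files differ only in router_id, state and priority, they can be generated from one template. This is a sketch, not part of Keepalived: render_minimal_conf is a hypothetical helper, and the interface name, VRID and virtual IP are simply copied from the configuration above.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal keepalived.conf for one node.
# Arguments: router_id, initial state, priority.
render_minimal_conf() {
    cat <<EOF
! Configuration File for keepalived
global_defs {
    router_id $1
}
vrrp_instance VI_1 {
    state $2
    interface ens32
    virtual_router_id 148
    priority $3
    advert_int 1
    virtual_ipaddress {
        192.168.197.148
    }
}
EOF
}

# Example: write both node configurations to local files for review.
#   render_minimal_conf SERVER_146 MASTER 100 > keepalived1.conf
#   render_minimal_conf SERVER_147 BACKUP 90  > keepalived2.conf
```

Generating both files from the same function makes it harder for the shared values (interface, virtual_router_id, virtual IP) to drift apart between the nodes.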
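Change to the sbin directory under the installation prefix and run keepalived directly:
[root@keepalived1 sbin]# ./keepalived
Or start it as a service:
service keepalived start
Both methods start Keepalived with the default configuration file path, /etc/keepalived/keepalived.conf. Create the configuration file at that path manually; otherwise Keepalived starts with its built-in defaults.
To start with a specific configuration file, pass the -f option:
[root@keepalived1 sbin]# ./keepalived -f /usr/local/keepalived/etc/keepalived/keepalived.conf
To stop Keepalived:
pkill keepalived
While Keepalived starts and stops, related messages are written to the Linux system log, /var/log/messages. Starting with the "-D" option records more detailed logs:
[root@keepalived1 sbin]# ./keepalived -D -f /usr/local/keepalived/etc/keepalived/keepalived.conf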
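Using the configuration from section 1.3, start Keepalived on both servers in turn (disable the firewall on both servers first).
Connect to 192.168.197.148 with XShell: the session lands on keepalived1 (the hostname shows which server you are on). Stop the Keepalived process on keepalived1, or reboot that server, and the XShell session drops. Reconnect to 192.168.197.148 and the session now lands on keepalived2, which means the virtual IP has moved to the backup.
In production, both servers run the same application and serve clients through the virtual IP. When the master fails, the backup takes over the virtual IP immediately, so external clients see no interruption (or only a very brief one). This is how high availability is achieved.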
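Besides reconnecting with XShell, the node currently holding the virtual IP can be identified from its address list. The helper below is a sketch with a hypothetical name, holds_vip; it assumes the one-line-per-address output of "ip -o -4 addr show" is piped in on stdin.

```shell
#!/bin/sh
# Hypothetical helper: read `ip -o -4 addr show` output on stdin and
# succeed if the given address is configured on any interface.
holds_vip() {
    # -F: match the address literally, so dots are not regex wildcards.
    grep -qF "inet $1/"
}

# Example, run on each node:
#   ip -o -4 addr show dev ens32 | holds_vip 192.168.197.148 && echo "VIP is here"
```

Run on both nodes, this should report the virtual IP on exactly one of them at any time.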
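By default, when the master fails, the backup becomes master and takes over the virtual IP. When the original master recovers, the nodes compare their priority values; if the original master's priority is higher, a second switchover occurs: the original master becomes master again and the original backup drops back to backup.
This is usually undesirable, because another switchover means another interruption of service. The preferred behavior is that the roles stay as they are after the original master recovers: it rejoins as backup, and the original backup continues as master.
This is what the non-preemptive mode, nopreempt, is for: a server configured with nopreempt does not take back the MASTER role after it recovers.
Server keepalived1: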
! Configuration File for keepalived
global_defs {
router_id SERVER_146
}
vrrp_instance VI_1 {
state BACKUP
# Non-preemptive mode
nopreempt
interface ens32
virtual_router_id 148
priority 100
advert_int 1
virtual_ipaddress {
192.168.197.148
}
}
Server keepalived2:
! Configuration File for keepalived
global_defs {
router_id SERVER_147
}
vrrp_instance VI_1 {
state BACKUP
interface ens32
virtual_router_id 148
priority 90
advert_int 1
virtual_ipaddress {
192.168.197.148
}
}
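If keepalived1 fails and keepalived2 becomes master, keepalived1 does not preempt when it recovers, because it is configured with nopreempt. If keepalived2 fails and keepalived1 becomes master, keepalived2 does not become master when it recovers either, because its priority is lower. Either way, a recovering server never triggers a switchover.
Note: nopreempt only takes effect on a server whose initial state is BACKUP, which is why both keepalived1 and keepalived2 are set to BACKUP.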
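The two cases can be summarized as a small decision function. This is a simplified model written for illustration, not Keepalived's actual election code; the function name takes_over and its yes/no flag are assumptions.

```shell
#!/bin/sh
# Simplified model (an assumption, not Keepalived's actual code):
# decide whether a recovering node takes the MASTER role back.
# Arguments: own priority, current master's priority, nopreempt flag (yes|no).
takes_over() {
    if [ "$3" = "yes" ]; then
        echo no          # nopreempt: never displace a running master
    elif [ "$1" -gt "$2" ]; then
        echo yes         # default preemption: higher priority wins
    else
        echo no
    fi
}

takes_over 100 90 no     # keepalived1 recovers without nopreempt
takes_over 100 90 yes    # keepalived1 recovers with nopreempt
takes_over 90 100 no     # keepalived2 recovers while keepalived1 is master
```

The three calls print yes, no and no: only the first case, a higher-priority node without nopreempt, causes a second switchover.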
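A high-availability cluster built with Keepalived usually needs some protection; otherwise any outside server could join the cluster, compete for the master role, and disrupt its operation.
Password authentication can be configured so that only nodes sharing the same password communicate normally.
Server keepalived1: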
! Configuration File for keepalived
global_defs {
router_id SERVER_146
}
vrrp_instance VI_1 {
state BACKUP
nopreempt
interface ens32
virtual_router_id 148
priority 100
advert_int 1
virtual_ipaddress {
192.168.197.148
}
authentication {
auth_type PASS
# The password need not be long; Keepalived uses only the first 8 characters
auth_pass abc123
}
}
Server keepalived2:
! Configuration File for keepalived
global_defs {
router_id SERVER_147
}
vrrp_instance VI_1 {
state BACKUP
interface ens32
virtual_router_id 148
priority 90
advert_int 1
virtual_ipaddress {
192.168.197.148
}
authentication {
auth_type PASS
auth_pass abc123
}
}
You can verify for yourself what happens when the two servers use different passwords.
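One consequence of the 8-character limit mentioned in the configuration comment is that two long passwords sharing their first 8 characters would be indistinguishable. The sketch below only illustrates that truncation; effective_auth_pass is a hypothetical helper and the sample passwords are made up.

```shell
#!/bin/sh
# Hypothetical helper: show the part of an auth_pass value that is
# actually compared, assuming only the first 8 characters are used.
effective_auth_pass() {
    printf '%s\n' "$1" | cut -c1-8
}

effective_auth_pass abc123
effective_auth_pass keepalived-secret-1
effective_auth_pass keepalived-secret-2
```

The last two calls print the same string, so under this assumption those two passwords would count as equal. When testing mismatched passwords, make them differ within the first 8 characters.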
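In the previous sections, keepalived1 was the master and keepalived2 the backup. The network and hardware conditions of the two servers can change over time. If keepalived1's network degrades, keepalived2, with the better network, should take over as master and serve clients. With the configuration so far, however, keepalived1 keeps the master role purely because its configured priority is higher, which is clearly not what we want.
The vrrp_script block addresses this: it runs a given script at a set interval and adjusts the server's priority according to the result.
Create the script check.sh on both master and backup: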
#!/bin/bash
ping -c 1 192.168.197.137
exit $?
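In this script, "ping -c 1" sends a single ping to the target address, and "exit $?" passes on ping's exit status: 0 if the ping succeeded, non-zero if it failed. Keepalived treats exit code 0 as success and any non-zero exit code as failure. 192.168.197.137 is the IP of another virtual machine used for testing.
Note: after creating the script, make it executable with "chmod +x".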
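A slightly more defensive variant of check.sh might silence the probe's output and collapse every failure to exit code 1. This is a sketch under assumptions: run_probe is a hypothetical helper, and the -W 2 option (wait at most 2 seconds for a reply, as provided by the iputils ping on Linux) is an addition that the original script does not use.

```shell
#!/bin/sh
# Hypothetical variant of check.sh: run any probe command quietly and
# collapse its result to 0 (healthy) or 1 (unhealthy).
run_probe() {
    if "$@" >/dev/null 2>&1; then
        return 0
    fi
    return 1
}

# In check.sh this could be used as follows; -W 2 is an assumption
# and is not part of the original script:
#   run_probe ping -c 1 -W 2 192.168.197.137
#   exit $?
```

Bounding the ping with its own short timeout lets the script fail by itself instead of relying on the vrrp_script timeout to kill it.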
keepalived1 configuration file:
! Configuration File for keepalived
global_defs {
router_id SERVER_146
# User that runs the scripts
script_user root
}
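# Declare the script
vrrp_script check {
# Path of the script to run
script "/usr/local/keepalived/script/check.sh"
# Interval between script runs, in seconds; the script runs every 10 seconds
interval 10
# Script timeout in seconds; a run longer than 10 seconds counts as a failure
timeout 10
# Amount by which this node's priority is reduced while the script fails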
weight -20
}
vrrp_instance VI_1 {
state BACKUP
# Use preemptive mode
# nopreempt
interface ens32
virtual_router_id 148
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass abc123
}
virtual_ipaddress {
192.168.197.148
}
# Scripts to track; a script runs periodically only when it is tracked
track_script {
check
}
}
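The effect of weight on the priority can be worked through with a few lines of arithmetic. The function below is a simplified model for illustration, not Keepalived's implementation; effective_priority and its ok/fail argument are assumptions.

```shell
#!/bin/sh
# Simplified model (an assumption, not Keepalived's actual code) of how a
# tracked script's weight changes the effective priority.
# Arguments: base priority, weight, script result (ok|fail).
effective_priority() {
    base=$1 weight=$2 result=$3
    if [ "$weight" -lt 0 ] && [ "$result" = "fail" ]; then
        echo $((base + weight))   # negative weight: lowered while the script fails
    elif [ "$weight" -gt 0 ] && [ "$result" = "ok" ]; then
        echo $((base + weight))   # positive weight: raised while the script succeeds
    else
        echo "$base"
    fi
}

effective_priority 100 -20 ok
effective_priority 100 -20 fail
```

The two calls print 100 and 80. With keepalived2 at priority 90, a failing check on keepalived1 therefore drops it below the backup, which is what allows the switchover in preemptive mode.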
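Leave keepalived2 on the configuration from section 2.2 for now and power on the server 192.168.197.137. After starting Keepalived on keepalived1 and then on keepalived2, keepalived1 becomes master and keepalived2 backup.
Shut down the server 192.168.197.137 and check.sh starts failing. The log /var/log/messages shows the whole switchover (the log names the script check_network rather than check):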
# The initial state is BACKUP, so the node enters the backup state
Nov 16 21:47:50 keepalived1 Keepalived_vrrp[3825]: (VI_1) Entering BACKUP STATE (init)
# keepalived2 is found to have a lower priority
Nov 16 21:47:51 keepalived1 Keepalived_vrrp[3825]: (VI_1) received lower priority (90) advert from 192.168.197.147 - discarding
# Becomes master
Nov 16 21:47:54 keepalived1 Keepalived_vrrp[3825]: (VI_1) Entering MASTER STATE
# ... server 192.168.197.137 is shut down
# The script times out (the ping gets no reply)
Nov 16 21:55:47 keepalived1 Keepalived_vrrp[4160]: VRRP_Script(check_network) timed_out
# The script failed; per the configuration, the priority drops by 20 to 80
Nov 16 21:55:47 keepalived1 Keepalived_vrrp[4160]: (VI_1) Changing effective priority from 100 to 80
# The backup is found to have a higher priority
Nov 16 21:55:51 keepalived1 Keepalived_vrrp[4160]: (VI_1) Master received advert from 192.168.197.147 with higher priority 90, ours 80
# Enters the backup state; the original backup becomes master
Nov 16 21:55:51 keepalived1 Keepalived_vrrp[4160]: (VI_1) Entering BACKUP STATE
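On a busy system the state transitions are easier to follow when filtered out of the log. The filter below is a sketch with a hypothetical name, state_transitions; it assumes the syslog line format shown in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and print
# "timestamp instance state" for every VRRP state transition.
state_transitions() {
    sed -nE 's/^([A-Za-z]+ +[0-9]+ [0-9:]+) .*\(([^)]+)\) Entering ([A-Z]+) STATE.*$/\1 \2 \3/p'
}

# Example:
#   state_transitions < /var/log/messages
```

Lines that do not contain an "Entering ... STATE" message, such as the priority advertisements, are dropped.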
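In practice, check.sh can monitor network health (for example by pinging the gateway), application health, and so on, so that a node in poor health gives up the master role.
The backup gets the same monitoring script, so the final keepalived2 configuration becomes: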
! Configuration File for keepalived
global_defs {
router_id SERVER_147
script_user root
}
vrrp_script check {
script "/usr/local/keepalived/script/check.sh"
interval 10
timeout 10
weight -20
}
vrrp_instance VI_1 {
state BACKUP
interface ens32
virtual_router_id 148
priority 90
advert_int 1
virtual_ipaddress {
192.168.197.148
}
authentication {
auth_type PASS
auth_pass abc123
}
track_script {
check
}
}
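The network interface is essential to a server's communication; if it fails, the server's connectivity is directly affected. Keepalived can monitor interfaces with the track_interface block: when a tracked interface goes down, the instance enters the FAULT state and its priority drops to 0.
keepalived1 server configuration: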
! Configuration File for keepalived
global_defs {
router_id SERVER_146
script_user root
}
vrrp_script check {
script "/usr/local/keepalived/script/check.sh"
interval 10
timeout 10
weight -20
}
vrrp_instance VI_1 {
state BACKUP
# nopreempt
interface ens32
virtual_router_id 148
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass abc123
}
virtual_ipaddress {
192.168.197.148
}
track_script {
check
}
# Interfaces to monitor
track_interface {
ens32
lo
}
}
keepalived2 server configuration:
! Configuration File for keepalived
global_defs {
router_id SERVER_147
script_user root
}
vrrp_script check {
script "/usr/local/keepalived/script/check.sh"
interval 10
timeout 10
weight -20
}
vrrp_instance VI_1 {
state BACKUP
interface ens32
virtual_router_id 148
priority 90
advert_int 1
virtual_ipaddress {
192.168.197.148
}
authentication {
auth_type PASS
auth_pass abc123
}
track_script {
check
}
track_interface {
ens32
lo
}
}
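ens32 is listed in track_interface here, but this is not strictly necessary: the interface used by the vrrp_instance puts the instance into the FAULT state when it goes down even if it is not listed. To test this setting, the loopback interface lo can be used:
# Bring the interface down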
[root@keepalived1 ~]# ifdown lo
# Bring the interface up
[root@keepalived1 ~]# ifup lo
After the loopback interface lo is brought down, the keepalived1 log shows:
Nov 19 10:45:51 keepalived1 Keepalived_vrrp[6652]: Netlink reports lo down
Nov 19 10:45:51 keepalived1 Keepalived_vrrp[6652]: (VI_1) Entering FAULT STATE
Nov 19 10:45:51 keepalived1 Keepalived_vrrp[6652]: (VI_1) sent 0 priority
Nov 19 10:45:51 keepalived1 Keepalived_vrrp[6652]: Netlink: error: data remnant size 1148
Nov 19 10:45:51 keepalived1 avahi-daemon[675]: Withdrawing address record for 192.168.197.148 on ens32.
Because lo went down, keepalived1 entered the FAULT state and its priority became 0. The keepalived2 log shows that keepalived2 became master, since keepalived1's priority was now lower than its own:
Nov 19 10:45:51 keepalived2 Keepalived_vrrp[4161]: (VI_1) Backup received priority 0 advertisement
Nov 19 10:45:51 keepalived2 Keepalived_vrrp[4161]: (VI_1) Backup received priority 0 advertisement
Nov 19 10:45:52 keepalived2 Keepalived_vrrp[4161]: (VI_1) Entering MASTER STATE
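When testing interface tracking it can help to read the state of an interface as the kernel reports it. The helper below is a sketch: iface_state is a hypothetical name, and it relies on the Linux sysfs file /sys/class/net/<interface>/operstate. The loopback interface typically reports unknown rather than up, so the helper is mostly useful for physical interfaces such as ens32.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel's operational state of an interface.
# The second argument overrides the sysfs directory (useful for dry runs).
iface_state() {
    state_file="${2:-/sys/class/net}/$1/operstate"
    if [ -r "$state_file" ]; then
        cat "$state_file"
    else
        echo missing
    fi
}

# Example:
#   iface_state ens32
```

Comparing this value before and after ifdown/ifup gives a quick cross-check against the "Netlink reports ... down" messages in the Keepalived log.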
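In practice, a master/backup switchover usually has to trigger a series of actions. For example, the application on the backup is typically stopped; when the backup becomes master, the application must be started so it can serve clients. To start it promptly, Keepalived should launch the application automatically after the switchover.
Keepalived provides notify scripts: whenever Keepalived changes state, it runs the configured scripts automatically. Keepalived configuration file: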
! Configuration File for keepalived
global_defs {
router_id SERVER_146
script_user root
}
vrrp_script check {
script "/usr/local/keepalived/script/check.sh"
interval 10
timeout 10
weight -20
}
vrrp_instance VI_1 {
state BACKUP
# nopreempt
interface ens32
virtual_router_id 148
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass abc123
}
virtual_ipaddress {
192.168.197.148
}
track_script {
check
}
track_interface {
ens32
lo
}
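# Script run when this server enters the MASTER state
notify_master "/usr/local/keepalived/script/notify_master.sh"
# Script run when this server enters the BACKUP state
notify_backup "/usr/local/keepalived/script/notify_backup.sh"
# Script run when this server enters the FAULT state
notify_fault "/usr/local/keepalived/script/notify_fault.sh"
# Script run when Keepalived on this server stops
notify_stop "/usr/local/keepalived/script/notify_stop.sh"
# Runs after every state transition, once the scripts above have finished. Three arguments are passed automatically: $1 = GROUP|INSTANCE, whether a VRRP sync group or a VRRP instance changed state; $2 = the name of the group or instance; $3 = MASTER|BACKUP|FAULT, the state being entered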
notify "/usr/local/keepalived/script/notify.sh"
}
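A generic notify.sh can dispatch on the three arguments described in the configuration comment. The script below is only a sketch: handle_notify is a hypothetical function, and the echoed actions are placeholders standing in for whatever commands start or stop the real application.

```shell
#!/bin/sh
# Hypothetical notify.sh: Keepalived passes
#   $1 = GROUP or INSTANCE, $2 = name, $3 = MASTER, BACKUP or FAULT.
# The actions below are placeholders; replace them with real commands.
handle_notify() {
    case "$3" in
        MASTER) echo "$1 $2 is MASTER: start the application" ;;
        BACKUP) echo "$1 $2 is BACKUP: stop the application" ;;
        FAULT)  echo "$1 $2 is FAULT: stop the application and alert" ;;
        *)      echo "$1 $2: unknown state $3" ;;
    esac
}

# Run only when Keepalived (or a tester) supplies all three arguments.
if [ "$#" -ge 3 ]; then
    handle_notify "$1" "$2" "$3"
fi
```

The script can be exercised by hand, for example with "./notify.sh INSTANCE VI_1 MASTER", before it is wired into the configuration. As with check.sh, it needs execute permission.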
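Keepalived offers more advanced configuration beyond this, enabling a wide range of features. For the systems I have worked with, the configuration above has been sufficient. For more options, see the Keepalived website: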
http://www.keepalived.org/manpage.html