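The master/slave switchover practice introduced in the previous section is simple in both implementation and logic and is relatively reliable, but it is time-consuming and laborious, and in many cases it is prone to human error. Is there a way to avoid such incidents as far as possible? One obvious approach is to let a program perform the switchover instead of a person. This section describes how to meet this need with the open source software Keepalived and a set of lightweight custom scripts.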
Installing Keepalived
Introduction to Keepalived
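Keepalived is open source software that works at the IP layer, the TCP layer, and the application layer of the TCP/IP stack, supporting what switching software commonly calls Layer 3, Layer 4, and Layer 5 switching. Using the Virtual Router Redundancy Protocol (VRRP), a Keepalived cluster shares a single VIP that serves external traffic. Within the cluster, the single master server communicates with one or more backup servers by multicast or unicast, establishing a heartbeat through which the nodes confirm that each other is alive. If the master host goes down or becomes abnormal, the remaining backup hosts elect a new master through VRRP, which prevents a single point of failure and keeps the service highly available.
In the industry, Keepalived is widely used in the back-end architecture of application stacks such as LAMP and LNMP.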
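Keepalived consists of three main modules:
CORE module: starts and maintains the main process, and loads and parses the global configuration file;
CHECK module: performs health checks, and supports custom scripts or programs for specific monitoring and detection;
VRRP module: implements the Virtual Router Redundancy Protocol.
Together, Keepalived's health checking and heartbeat detection can be used to switch MySQL over automatically.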
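The election described above can be illustrated with a deliberately simplified sketch. It is not Keepalived's implementation: it only assumes that, among the nodes still alive, the one with the highest priority becomes the new master. The function name elect_master and the node names are hypothetical, and real VRRP also handles ties, timers, and preemption settings.

```shell
#!/bin/bash
# Simplified illustration of a VRRP-style election (not Keepalived's code).
# Input on stdin: one line per node, "name priority alive", where alive is 0 or 1.
# Output: the name of the live node with the highest priority, or an empty line.
elect_master() {
    local name prio alive best_name="" best_prio=-1
    while read -r name prio alive; do
        # Skip nodes that are down.
        [ "$alive" = "1" ] || continue
        if [ "$prio" -gt "$best_prio" ]; then
            best_prio=$prio
            best_name=$name
        fi
    done
    echo "$best_name"
}

# The former master (priority 100) is down, so the priority 90 node is chosen.
printf '%s\n' "node110 100 0" "node111 90 1" "node112 80 1" | elect_master
```

In the two-node topology used later in this section, the same rule means that the slave node takes over the VIP as soon as the master stops advertising.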
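Installing Keepalived
Keepalived can be installed either from an RPM package or by compiling the source code. On open source platforms such as CentOS, it can be installed directly from the RPM package, which is the approach we recommend. On other platforms, compile it from source.
First, install the packages that Keepalived depends on: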
shell> yum install -y openssl openssl-devel libnl libnl-devel libnfnetlink
shell> wget http://mirror.centos.org/centos/7/os/x86_64/Packages/libnfnetlink-devel-1.0.1-4.el7.x86_64.rpm
shell> rpm -ivh libnfnetlink-devel-1.0.1-4.el7.x86_64.rpm
To install from the RPM package (recommended), run:
shell> yum install -y keepalived
Alternatively, to compile and install from source, run:
shell> test -d /usr/local/keepalived || mkdir /usr/local/keepalived
shell> cd /soft
shell> wget http://www.keepalived.org/software/keepalived-1.3.5.tar.gz
shell> tar -zxvf keepalived-1.3.5.tar.gz
shell> cd ./keepalived-1.3.5/
shell> ./configure --prefix=/usr/local/keepalived
shell> make && make install
shell> mkdir /usr/local/keepalived/var
shell> ln -s /run /usr/local/keepalived/var/run
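Note: If the message configure: error: libnfnetlink headers missing appears during source compilation, the libnfnetlink-devel package is missing. Because the operating system ISO image does not include this package, it must be downloaded and installed separately.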
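To catch this error before running ./configure, the presence of the header can be checked first. The sketch below assumes that libnfnetlink-devel installs its header as libnfnetlink/libnfnetlink.h under /usr/include; the function name have_nfnetlink_headers is hypothetical, and the path should be adjusted if your distribution places the header elsewhere.

```shell
#!/bin/bash
# Returns 0 if the libnfnetlink header exists under the given include
# directory (default /usr/include, an assumed location), 1 otherwise.
have_nfnetlink_headers() {
    local include_dir=${1:-/usr/include}
    [ -f "$include_dir/libnfnetlink/libnfnetlink.h" ]
}

if have_nfnetlink_headers; then
    echo "libnfnetlink headers found"
else
    echo "libnfnetlink headers missing: install libnfnetlink-devel first"
fi
```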
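Preparing the MySQL Check Script and the Master/Slave Switchover Scripts
Before configuring Keepalived itself, the mysql_check.sh script that the CHECK module will call must be prepared in advance. At the interval specified in the Keepalived configuration, this script checks the running state of the MySQL instance and writes the result to a log for later review. Note that all scripts in this section are written in bash, the most broadly compatible native shell on Linux. The mysql_check.sh script is as follows: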
shell> mkdir -p /etc/keepalived/{scripts,log}
shell> vi /etc/keepalived/scripts/mysql_check.sh
#!/bin/bash
. $HOME/.bash_profile >/dev/null 2>&1
LogDir=/etc/keepalived/log
logfile=$LogDir/mysql_check.log
Lhost=localhost
Luser=root
Lpasswd=Abcd321#
Lsocket=/mysql/product/data/mysql.sock
ConnCMD="-h$Lhost -u$Luser -p$Lpasswd -S$Lsocket"
shopt -s expand_aliases
alias cdate='date +"%Y-%m-%d %H:%M:%S"'
InstSuffix=''
startMySQLcmd="systemctl start mysqld$InstSuffix"
stopKeepAliveDcmd="systemctl stop keepalived"
WaitTime=5
WaitCount=12
j=1
# Retry loop: try to restart MySQL up to $WaitCount times, $WaitTime seconds apart
while true
do
    mysqld_status=`mysqladmin $ConnCMD ping 2>/dev/null`
    mysql_status=`mysql $ConnCMD -e"select 1;" 2>/dev/null|awk 'NR>1'`
    # Quoted string comparisons avoid a test error when mysql_status is empty
    if [ "$mysqld_status" == "mysqld is alive" ] && [ "$mysql_status" == "1" ];then
        echo "$(cdate) [Note] MySQL Service is Normal." >> $logfile
        exit 0
    elif [ "$mysqld_status" == "mysqld is alive" ] && [ "$mysql_status" != "1" ];then
        echo "$(cdate) [Warning] MySQL Service may be Hung." >> $logfile
        exit 0
    else
        if [ $j -le $WaitCount ];then
            echo "$(cdate) [Error] MySQL Service is Abnormal and will be restart." >> $logfile
            $startMySQLcmd
        else
            echo "$(cdate) [Error] MySQL Service is still Abnormal After $WaitCount restarts." >> $logfile
            # Stopping Keepalived releases the VIP and forces the failover
            $stopKeepAliveDcmd
            echo "$(cdate) [Action] KeepAlived is forced to perform a Failover." >> $logfile
            break
        fi
    fi
    sleep $WaitTime
    let j++
done
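Keepalived's check module calls mysql_check.sh, which writes a log entry according to the result of each check. The keyword Note means that both the MySQL main process and the instance are normal; Warning means that the main process is normal but the instance may be hanging; Error means that the main process is abnormal or down, that is, the instance is unavailable; and Action means that the main process is still abnormal after a set number of attempts to restart the MySQL service, so Keepalived is forced to perform a failover. In addition, the WaitTime and WaitCount variables are introduced deliberately to keep the check script from being overly sensitive, and users can tune them to their own situation.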
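The classification above, and the effect of WaitTime and WaitCount, can be isolated into two small functions for experimentation. This is a minimal sketch rather than part of the original script: the function names classify_mysql and max_wait_seconds are hypothetical, and the inputs stand in for the output of mysqladmin ping and of the select 1 query.

```shell
#!/bin/bash
# classify_mysql PING_OUTPUT SELECT_OUTPUT
# Maps the two probe results to the log keyword used by mysql_check.sh.
classify_mysql() {
    local ping_out=$1 select_out=$2
    if [ "$ping_out" = "mysqld is alive" ]; then
        if [ "$select_out" = "1" ]; then
            echo "Note"       # process and instance are both healthy
        else
            echo "Warning"    # process answers, but the query did not return 1
        fi
    else
        echo "Error"          # process is abnormal or down
    fi
}

# max_wait_seconds WAIT_TIME WAIT_COUNT
# Total time spent sleeping between restart attempts before failover is forced.
max_wait_seconds() {
    echo $(( $1 * $2 ))
}

classify_mysql "mysqld is alive" "1"
classify_mysql "mysqld is alive" ""
classify_mysql "" ""
max_wait_seconds 5 12
```

With the values used in this section (WaitTime=5, WaitCount=12), the script therefore sleeps for about 60 seconds in total, plus the time taken by the restart commands themselves, before it stops Keepalived.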
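During the switchover phase, Keepalived's notify feature automatically invokes the corresponding scripts, which ultimately carry out the planned master/slave switchover.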
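As a rough illustration of this hook mechanism, the sketch below maps a VRRP state change to the script this section attaches to it through notify_master and notify_stop. It is not how Keepalived dispatches hooks internally; the function name script_for_state is hypothetical and only mirrors the mapping configured in keepalived.conf.

```shell
#!/bin/bash
# script_for_state STATE
# Prints the script this section associates with a state change:
#   MASTER (notify_master) -> promote the local slave to the new master
#   STOP   (notify_stop)   -> put the local master into read-only mode
# Any other state prints an empty line because no script is attached to it.
script_for_state() {
    case "$1" in
        MASTER) echo "/etc/keepalived/scripts/to_NewMaster.sh" ;;
        STOP)   echo "/etc/keepalived/scripts/to_setROmode.sh" ;;
        *)      echo "" ;;
    esac
}

script_for_state MASTER
script_for_state STOP
```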
The script to_setROmode.sh, which sets the master to read-only mode, is created as follows:
shell> vi /etc/keepalived/scripts/to_setROmode.sh
#!/bin/bash
. $HOME/.bash_profile >/dev/null 2>&1
ctime=$(date +"%Y-%m-%d_%H-%M-%S")
LogDir=/etc/keepalived/log
logfile=$LogDir/to_setROmode_$ctime.log
dblog=$LogDir/to_setROmode_$ctime.bin
Lhost=localhost
Luser=root
Lpasswd=Abcd321#
Lsocket=/mysql/product/data/mysql.sock
ConnCMD="-h$Lhost -u$Luser -p$Lpasswd -S$Lsocket"
shopt -s expand_aliases
alias cdate='date +"%Y-%m-%d %H:%M:%S"'
echo "$(cdate) [Action] To set Master Read Only mode after KeepAlived Failover is begining." >> $logfile
mysql $ConnCMD >/dev/null 2>&1 <<EOF
flush table with read lock;
tee $dblog;
show master status\G;
notee;
set global read_only=on;
set global super_read_only=on;
unlock tables;
EOF
echo "..." >> $logfile
echo "$(cdate) [Action] To set Master Read Only mode after KeepAlived Failover is end." >> $logfile
On the master node, stopping the Keepalived service triggers to_setROmode.sh through notify_stop. The script's main purpose is to block writes on the master so that the data on all nodes in the replication topology converges as quickly as possible. Other implementations are possible, of course, such as those mentioned earlier: stopping the business on the front end, killing threads on the back end, or opening a thread that executes FTWRL and holds it. To make it easier to resume data synchronization after the switchover, and to cope with other extreme cases, the script also records the binary log file name and position at the moment writes are blocked. After the switchover, logs with the suffixes ".bin" and ".log" are generated on the master, for example:
shell> cat to_setROmode_2020-05-24_07-54-05.log
2020-05-24 07:54:05 [Action] To set Master Read Only mode after KeepAlived Failover is begining.
...
2020-05-24 07:54:05 [Action] To set Master Read Only mode after KeepAlived Failover is end.
shell> cat to_setROmode_2020-05-24_07-54-05.bin
*************************** 1. row ***************************
File: mysql-binlog.000005
Position: 194
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set: b1cf5c71-9da6-11ea-9b6e-000c2919db73:1-1527
The script to_NewMaster.sh, which promotes the slave to the new master, is created as follows:
shell> vi /etc/keepalived/scripts/to_NewMaster.sh
#!/bin/bash
. $HOME/.bash_profile >/dev/null 2>&1
ctime=$(date +"%Y-%m-%d_%H-%M-%S")
logfile=/etc/keepalived/log/to_NewMaster_$ctime.log
dblog=/etc/keepalived/log/to_NewMaster_$ctime.bin
Mhost=192.168.113.110
Chost=192.168.113.111
VRRPvip=192.168.113.200
LconnCMD="-hlocalhost -uroot -pAbcd321# -S/mysql/product/data/mysql.sock"
MconnCMD="-h$Mhost -urepl -pAbcd321# -P3306"
shopt -s expand_aliases
alias cdate='date +"%Y-%m-%d %H:%M:%S"'
# vipmatchCMD prints how many times the VIP appears in the output of "ip a"
alias vipmatchCMD=$(echo 'ip a|awk -v RS="@#$j" '"'"'{print gsub (/'"$VRRPvip"'/,"&")}'"'")
BinPrefixOpt=$(mysql $LconnCMD -e"show global variables where Variable_name='log_bin_basename';" 2>/dev/null|awk '{print $NF}'|awk 'NR>1')
BinPrefix=${BinPrefixOpt##*/}
gtid_modeOpt=$(mysql $LconnCMD -e"show global variables where Variable_name='gtid_mode';" 2>/dev/null|awk '{print $NF}'|awk 'NR>1')
# Test the variable that was actually assigned above (gtid_modeOpt)
if [ "$gtid_modeOpt" == "ON" ];then
    CHANGEOPTS="MASTER_AUTO_POSITION = 1"
else
    CHANGEOPTS="MASTER_LOG_FILE='$BinPrefix.000001', MASTER_LOG_POS=154"
fi
WaitTime=5
WaitCount=12
count=1
while [ $count -le $WaitCount ]
do
    vipmatch=$(vipmatchCMD)
    MasterLogFile=$(mysql $MconnCMD -e"show master status\G;" 2>/dev/null|grep -w "File:"|awk '{print $2}')
    MasterLogPos=$(mysql $MconnCMD -e"show master status\G;" 2>/dev/null|grep -w "Position:"|awk '{print $2}')
    ReadMasterLogFile=$(mysql $LconnCMD -e"show slave status\G;" 2>/dev/null|grep -w "Master_Log_File"|awk '{print $2}')
    ReadMasterLogPos=$(mysql $LconnCMD -e"show slave status\G;" 2>/dev/null|grep -w "Read_Master_Log_Pos"|awk '{print $2}')
    RelayMasterLogFile=$(mysql $LconnCMD -e"show slave status\G;" 2>/dev/null|grep -w "Relay_Master_Log_File"|awk '{print $2}')
    ExecMasterLogPos=$(mysql $LconnCMD -e"show slave status\G;" 2>/dev/null|grep -w "Exec_Master_Log_Pos"|awk '{print $2}')
    if [ "$vipmatch" == "1" ];then
        echo "$(cdate) [Prerequisite 1] KeepAlived VRRP vip is already existed." >> $logfile
        if [ -n "$MasterLogFile" ] && [ -n "$MasterLogPos" ];then
            if [ "$MasterLogFile" == "$ReadMasterLogFile" ] && [ "$MasterLogPos" == "$ReadMasterLogPos" ];then
                echo "$(cdate) [Prerequisite 2] Master and Slave io_thread's data has been synchronized." >> $logfile
                if [ "$MasterLogFile" == "$RelayMasterLogFile" ] && [ "$MasterLogPos" == "$ExecMasterLogPos" ];then
                    echo "$(cdate) [Prerequisite 3] Master and Slave sql_thread's data has been synchronized." >> $logfile
                    echo "$(cdate) [Action] MySQL Replication Switchover is begining." >> $logfile
                    # Promote the local slave to the new master
                    mysql $LconnCMD >/dev/null 2>&1 <<EOF
stop slave;
reset slave all;
reset master;
flush tables with read lock;
tee $dblog;
show master status\G;
notee;
set global read_only = off;
set global super_read_only = off;
unlock tables;
EOF
                    echo "$(cdate) [Action] MySQL Replication Switchover is end." >> $logfile
                    echo "$(cdate) [Action] MySQL Replication resync is begining." >> $logfile
                    # Point the original master at the new master as its slave
                    mysql $MconnCMD >/dev/null 2>&1 <<EOF
change master to MASTER_HOST='$Chost', MASTER_USER='repl', MASTER_PASSWORD='Abcd321#', MASTER_PORT=3306, $CHANGEOPTS;
start slave;
EOF
                    echo "$(cdate) [Action] MySQL Replication resync is end." >> $logfile
                    exit 0
                else
                    echo "$(cdate) [Warning] Master and Slave sql_thread's data has not yet been synchronized." >> $logfile
                fi
            else
                echo "$(cdate) [Warning] Master and Slave io_thread's data has not yet been synchronized." >> $logfile
            fi
        else
            echo "$(cdate) [Warning] Master is probably down." >> $logfile
        fi
    else
        echo "$(cdate) [Warning] KeepAlived VRRP vip has not been obtained." >> $logfile
    fi
    sleep $WaitTime
    let count++
done
Note: During the switchover, this script uses not only the local node's default system account root@localhost but also the replication account shared across the topology. That account must exist on every node of the replication architecture, and because it performs the necessary queries and changes, it needs the SUPER, REPLICATION SLAVE, and REPLICATION CLIENT privileges.
In general, to_NewMaster.sh only needs to be deployed on the slave. However, when users rehearse a MySQL switchover, they often want to switch back as well as switch over. The script therefore also needs to be deployed on the new slave, that is, the original master (assuming a one-master, one-slave topology). Unlike mysql_check.sh and to_setROmode.sh, which can be reused on different nodes without change, to_NewMaster.sh requires the values of Mhost (the master node) and Chost (the candidate node) to be adjusted for each node; the remaining lines can be used as they are unless there are special circumstances.
After the switchover, Keepalived on the slave node is arbitrated by VRRP into the new master, so it holds the cluster's VIP address. notify_master is used to trigger the to_NewMaster.sh script, which promotes the node to the new master. In its log, the keyword Warning means that the slave node has not obtained the VIP address, that the master has gone down abnormally, that the slave's I/O thread is still lagging, or that the slave's SQL thread is still lagging. Prerequisite marks the preconditions for the switchover: [1] means the VIP address is already held, [2] means the I/O thread is fully synchronized, and [3] means the SQL thread is fully synchronized. Action marks the start and end of the switchover itself and of the resynchronization between the new master and slave. As before, the script introduces the WaitTime and WaitCount variables: if any precondition is not met, the switchover is not performed, and users can set these two variables to control the maximum time the script waits.
Because this is a planned switchover, data synchronization is not broken, so once the replication environment has recovered, the same scripts can be used to switch back. To perform a failover instead, users only need to remove the lines that verify the preconditions. In general, however, failover is not the recommended approach, because it may lead to inconsistent or lost data, and it should be used only in extreme cases.
Keepalived Configuration on the Master Node
On the master node, back up the keepalived.conf file and create a custom keepalived.conf. When the Keepalived service stops, it triggers the to_setROmode.sh script, which sets the master to read-only mode. The configuration is as follows:
shell> cd /etc/keepalived
shell> mv keepalived.conf keepalived.conf.original
shell> vi keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script mysql_check {
    script "/etc/keepalived/scripts/mysql_check.sh"
    interval 30
}
vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface ens33
    virtual_router_id 51
    priority 100
    unicast_src_ip 192.168.113.110
    unicast_peer {
        192.168.113.111
    }
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        mysql_check
    }
    virtual_ipaddress {
        192.168.113.200
    }
    notify_master /etc/keepalived/scripts/to_NewMaster.sh
    notify_stop /etc/keepalived/scripts/to_setROmode.sh
}
Keepalived Configuration on the Slave Node
On the slave node, create the keepalived.conf configuration file. When the VIP arbitrated by VRRP moves to the slave node, the to_NewMaster.sh script is triggered and promotes the slave to the new master. The configuration is as follows:
shell> cd /etc/keepalived
shell> mv keepalived.conf keepalived.conf.original
shell> vi keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script mysql_check {
    script "/etc/keepalived/scripts/mysql_check.sh"
    interval 30
}
vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface ens33
    virtual_router_id 51
    priority 90
    unicast_src_ip 192.168.113.111
    unicast_peer {
        192.168.113.110
    }
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        mysql_check
    }
    virtual_ipaddress {
        192.168.113.200
    }
    notify_master /etc/keepalived/scripts/to_NewMaster.sh
    notify_stop /etc/keepalived/scripts/to_setROmode.sh
}
Testing and Verification
Practice shows that if the nodes of a Keepalived cluster run in preemptive mode, then for a variety of reasons and in many scenarios this mode leads to inconsistent data between the master and the slave in the replication architecture. For production environments, non-preemptive mode is therefore strongly recommended. The Keepalived best-practice configuration shown above is based on the most typical one-master, one-slave replication topology. If a multi-node cluster is configured and unicast mode is still used, each node's unicast_peer block must list all of the other nodes, one per line (on the master, that is every node except the master itself).
Start the Keepalived service on the master node and then on the slave node. Both VRRP instances start in the BACKUP state, but the master node's instance soon transitions to the MASTER state, takes the VIP address, and begins sending gratuitous ARP broadcasts. At the same time, Keepalived's check function starts, that is, it invokes the mysql_check.sh script. By default, the log output goes to the host's /var/log/messages file, as shown below:
shell> systemctl start keepalived
Master node:
May 25 09:18:46 localhost Keepalived_vrrp[1384]: VRRP_Instance(VI_1) Entering BACKUP STATE
May 25 09:18:46 localhost Keepalived_healthcheckers[1380]: Opening file '/etc/keepalived/keepalived.conf'.
May 25 09:18:46 localhost Keepalived_vrrp[1384]: VRRP_Script(mysql_check) succeeded
May 25 09:18:50 localhost Keepalived_vrrp[1384]: VRRP_Instance(VI_1) Transition to MASTER STATE
May 25 09:18:51 localhost Keepalived_vrrp[1384]: VRRP_Instance(VI_1) Entering MASTER STATE
May 25 09:18:51 localhost Keepalived_vrrp[1384]: VRRP_Instance(VI_1) setting protocol VIPs.
May 25 09:18:51 localhost Keepalived_vrrp[1384]: Sending gratuitous ARP on ens33 for 192.168.113.200
Slave node:
May 25 09:19:52 localhost Keepalived_vrrp[1321]: VRRP_Instance(VI_1) Entering BACKUP STATE
May 25 09:19:52 localhost Keepalived_healthcheckers[1320]: Opening file '/etc/keepalived/keepalived.conf'.
May 25 09:19:53 localhost Keepalived_vrrp[1321]: VRRP_Script(mysql_check) succeeded
Once the Keepalived service has started successfully on all nodes, listen on the ens33 interface of both nodes with tcpdump to capture the VRRP advertisement packets. Each packet records the source and destination of the advertisement, together with the protocol version, virtual router ID, priority, authentication type, and heartbeat interval configured on the VRRP master. The detailed output is as follows:
shell> tcpdump -i ens33 vrrp -n
Master node:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
09:32:44.348297 IP 192.168.113.110 > 192.168.113.111: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
Slave node:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
09:32:36.305103 IP 192.168.113.110 > 192.168.113.111: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
All of the scripts above are invoked through the Keepalived service, so that service acts as the switch that controls the switchover. Manually stopping the Keepalived service on the master node (restarting it is the more recommended way) triggers a planned master/slave switchover. The switchover logs are shown below.
The master node actively releases the cluster's VIP and shuts down its VRRP instance:
shell> tail -20f /var/log/messages
May 25 15:13:27 localhost Keepalived[8013]: Stopping
May 25 15:13:27 localhost systemd: Stopping LVS and VRRP High Availability Monitor...
May 25 15:13:27 localhost Keepalived_vrrp[8015]: VRRP_Instance(VI_1) sent 0 priority
May 25 15:13:27 localhost Keepalived_vrrp[8015]: VRRP_Instance(VI_1) removing protocol VIPs.
May 25 15:13:27 localhost Keepalived_healthcheckers[8014]: Stopped
May 25 15:13:28 localhost Keepalived_vrrp[8015]: Stopped
May 25 15:13:28 localhost Keepalived[8013]: Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
May 25 15:13:28 localhost systemd: Stopped LVS and VRRP High Availability Monitor.
The master is set to read-only mode, and the current binary log file name, position, and GTID set are recorded:
shell> cat to_setROmode_2020-05-25_15-13-28.log
2020-05-25 15:13:28 [Action] To set Master Read Only mode after KeepAlived Failover is begining.
...
2020-05-25 15:13:28 [Action] To set Master Read Only mode after KeepAlived Failover is end.
shell> cat to_setROmode_2020-05-25_15-13-28.bin
*************************** 1. row ***************************
File: mysql-binlog.000001
Position: 55278
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set: b1cf5c71-9da6-11ea-9b6e-000c2919db73:1-158
The slave node is arbitrated into the master state within the Keepalived cluster, obtains the VIP, and then begins sending gratuitous ARP broadcasts:
shell> tail -20f /var/log/messages
May 25 15:13:22 localhost Keepalived_vrrp[8834]: VRRP_Instance(VI_1) Transition to MASTER STATE
May 25 15:13:23 localhost Keepalived_vrrp[8834]: VRRP_Instance(VI_1) Entering MASTER STATE
May 25 15:13:23 localhost Keepalived_vrrp[8834]: VRRP_Instance(VI_1) setting protocol VIPs.
May 25 15:13:23 localhost Keepalived_vrrp[8834]: Sending gratuitous ARP on ens33 for 192.168.113.200
The slave is promoted to the new master and resumes data synchronization with the original master:
shell> cat to_NewMaster_2020-05-25_15-13-23.log
2020-05-25 15:13:23 [Prerequisite 1] KeepAlived VRRP vip is already existed.
2020-05-25 15:13:23 [Prerequisite 2] Master and Slave io_thread's data has been synchronized.
2020-05-25 15:13:23 [Prerequisite 3] Master and Slave sql_thread's data has been synchronized.
2020-05-25 15:13:23 [Action] MySQL Replication Switchover is begining.
2020-05-25 15:13:23 [Action] MySQL Replication Switchover is end.
2020-05-25 15:13:23 [Action] MySQL Replication resync is begining.
2020-05-25 15:13:23 [Action] MySQL Replication resync is end.
shell> cat to_NewMaster_2020-05-25_15-13-23.bin
*************************** 1. row ***************************
File: mysql-binlog.000001
Position: 154
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set:
The switchback procedure and its log output are similar to the above. Note that the Keepalived service on the original master node must be started and running normally before the Keepalived service on the new master node is stopped; otherwise the switchback will fail.
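The switchback precondition can be turned into a small guard that is run before Keepalived is stopped on the new master. This is a minimal sketch under stated assumptions, not part of the original scripts: the function name ready_for_switchback is hypothetical, the first argument is assumed to be the output of systemctl is-active keepalived collected on the original master, and the position comparison is an extra illustrative check that mirrors the prerequisites in to_NewMaster.sh.

```shell
#!/bin/bash
# ready_for_switchback PEER_STATE MASTER_FILE MASTER_POS EXEC_FILE EXEC_POS
#   PEER_STATE              output of "systemctl is-active keepalived" on the original master
#   MASTER_FILE/MASTER_POS  binary log coordinates reported by the current master
#   EXEC_FILE/EXEC_POS      coordinates already executed by the current slave
# Prints "ok" and returns 0 only when a switchback is safe to start.
ready_for_switchback() {
    local peer_state=$1 m_file=$2 m_pos=$3 s_file=$4 s_pos=$5
    if [ "$peer_state" != "active" ]; then
        echo "blocked: Keepalived is not running on the original master"
        return 1
    fi
    if [ -z "$m_file" ] || [ "$m_file" != "$s_file" ] || [ "$m_pos" != "$s_pos" ]; then
        echo "blocked: the slave has not caught up with the master"
        return 1
    fi
    echo "ok"
}

# Example calls with literal values; "|| true" keeps the demo running after a refusal.
ready_for_switchback inactive mysql-binlog.000001 154 mysql-binlog.000001 154 || true
ready_for_switchback active mysql-binlog.000001 154 mysql-binlog.000001 154
```

In practice the arguments would be gathered from the two nodes first, for example over SSH and with show master status and show slave status, and the guard would simply refuse to proceed when either condition fails.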