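Our company is about to roll out multi-instance Redis master/slave replication. Only two machines are available for Redis: front-end 1 is the Redis master host and front-end 2 is the Redis slave host. The master host runs three master Redis instances and the backup host runs three matching slave instances, so keepalived has to handle failover and recovery for all three masters. 192.168.100.154 is the master Redis host and 192.168.100.156 is the slave Redis host. The deployment is shown in the figure below: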
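2. keepalived master and backup configuration
There are only these two machines and no spares, and nginx master/backup failover will later go live on the same pair. Both machines therefore have to support Redis failover as well as nginx failover. The configuration is actually quite simple.
keepalived configuration on the master: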
! Configuration File for keepalived
global_defs {
notification_email {
}
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
}
vrrp_script check_run {
script "/usr/local/nginx/check_nginx.sh"
interval 2
}
vrrp_sync_group VG1 {
group {
VI_1
}
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
track_script {
check_run
}
virtual_ipaddress {
192.168.1.57/24
}
}
vrrp_sync_group VG2 {
group {
VI_2
}
}
vrrp_script check_run2 {
script "/usr/redis/script/redis_check.sh"
interval 2
}
vrrp_instance VI_2 {
state MASTER
interface eth1
virtual_router_id 52
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
track_script {
check_run2
}
virtual_ipaddress {
192.168.100.100/24
}
notify_master /usr/redis/script/redis_master.sh
notify_backup /usr/redis/script/redis_backup.sh
notify_fault /usr/redis/script/redis_fault.sh
notify_stop /usr/redis/script/redis_stop.sh
}
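Note that the interface and virtual_router_id in VI_2 must not be the same as those in VI_1, otherwise failover goes wrong. Also, the vrrp_script block that triggers the check has to be defined up front, ahead of the instance that tracks it. When I first set this up I did not know that and put it elsewhere, and the VIP would not float automatically. Here we only configure check_run2; check_run above is reserved for nginx.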
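The ordering rule described above can be checked mechanically before reloading keepalived. The following is only a rough sketch: the function name check_script_order is made up for this example, and it assumes the one-item-per-line layout used in the configs in this post (a name on its own line inside track_script).

```shell
#!/bin/bash
# check_script_order FILE
# Prints every script name that a track_script block uses before a
# matching "vrrp_script NAME" block has appeared in FILE.
# Returns 1 if at least one such name was found, 0 otherwise.
check_script_order() {
    awk '
        # Remember each script name as soon as its definition is seen.
        $1 == "vrrp_script" {
            name = $2
            sub(/[{]$/, "", name)      # tolerate "check_run2{" without a space
            defined[name] = 1
            next
        }
        # Entering and leaving a track_script block.
        $1 ~ /^track_script/      { in_track = 1; next }
        in_track && $1 == "}"     { in_track = 0; next }
        # Inside track_script: the first word of the line is the script name.
        in_track && NF > 0 && !($1 in defined) { print $1; bad = 1 }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

A possible use is `check_script_order /etc/keepalived/keepalived.conf || echo "vrrp_script defined too late"`, where the config path is an assumption about the local install.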
keepalived configuration on the backup:
! Configuration File for keepalived
global_defs {
notification_email {
}
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
}
vrrp_sync_group VG1 {
group {
VI_1
}
notify_master "/usr/local/nginx/nginx.sh master"
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 80
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.1.57/24
}
}
vrrp_script check_run2 {
script "/usr/redis/script/redis_check.sh"
interval 2
}
vrrp_instance VI_2 {
state BACKUP
interface eth1
virtual_router_id 52
priority 80
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
track_script {
check_run2
}
virtual_ipaddress {
192.168.100.100/24
}
notify_master /usr/redis/script/redis_master.sh
notify_backup /usr/redis/script/redis_backup.sh
notify_fault /usr/redis/script/redis_fault.sh
notify_stop /usr/redis/script/redis_stop.sh
}
2.1. Redis master host configuration
The scripts here were copied from a few experienced people; I only changed parts of them to fit our needs.
1>. vim /usr/redis/script/redis_check.sh
#!/bin/bash
ALIVE=`/usr/local/redis/bin/redis-cli -p 6379 PING`
ALIVE1=`/usr/local/redis/bin/redis-cli -p 6380 PING`
ALIVE2=`/usr/local/redis/bin/redis-cli -p 6381 PING`
if [ "$ALIVE" == "PONG" ] && [ "$ALIVE1" == "PONG" ] && [ "$ALIVE2" == "PONG" ]; then
echo "$ALIVE,$ALIVE1,$ALIVE2"
exit 0
else
killall -9 redis-server
exit 1
fi
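The check works as follows: if any one of the three master Redis instances answers PING with something other than PONG, the script kills every redis-server process and exits non-zero, which puts keepalived into failover.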
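The same check can be written as a loop over a port list, which avoids the copy-and-paste spacing mistakes that are easy to make in a long `[ ... ] && [ ... ]` chain. This is a minimal sketch, not the script used in production: the names all_redis_alive, REDISCLI and PORTS are introduced here for illustration, and the function only reports the result, leaving the `killall` decision to the caller.

```shell
#!/bin/bash
# Both variables can be overridden from the environment.
REDISCLI="${REDISCLI:-/usr/local/redis/bin/redis-cli}"
PORTS="${PORTS:-6379 6380 6381}"

# all_redis_alive
# Returns 0 only if every port in $PORTS answers PING with PONG.
# Stops at the first port that does not and returns 1.
all_redis_alive() {
    local port reply
    for port in $PORTS; do
        reply=$("$REDISCLI" -p "$port" PING 2>/dev/null)
        if [ "$reply" != "PONG" ]; then
            echo "redis on port $port did not answer PONG (got: '$reply')"
            return 1
        fi
    done
    echo "all instances answered PONG"
    return 0
}
```

In a check script this could be called as `all_redis_alive || { killall -9 redis-server; exit 1; }` to keep the behaviour of the master-side script above, or as `all_redis_alive || exit 1` for a check that only reports.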
2>. vim /usr/redis/script/redis_master.sh
#!/bin/bash
#/usr/redis/script/redis_master.sh
REDISCLI="/usr/local/redis/bin/redis-cli"
LOGFILE="/usr/local/redis/logs/keepalived-redis-state-6379.log"
LOGFILE1="/usr/local/redis/logs/keepalived-redis-state-6380.log"
LOGFILE2="/usr/local/redis/logs/keepalived-redis-state-6381.log"
echo "[master]" >> $LOGFILE
echo "[master]" >>$LOGFILE1
echo "[master]" >>$LOGFILE2
date >> $LOGFILE
date >> $LOGFILE1
date >> $LOGFILE2
echo "Being master...." >>$LOGFILE 2>&1
echo "Being master...." >>$LOGFILE1 2>&1
echo "Being master...." >>$LOGFILE2 2>&1
echo "Run SLAVEOF cmd ...">> $LOGFILE
echo "Run SLAVEOF cmd ...">> $LOGFILE1
echo "Run SLAVEOF cmd ...">> $LOGFILE2
$REDISCLI -p 6379 SLAVEOF 192.168.100.156 6379 >> $LOGFILE 2>&1
$REDISCLI -p 6380 SLAVEOF 192.168.100.156 6380 >> $LOGFILE1 2>&1
$REDISCLI -p 6381 SLAVEOF 192.168.100.156 6381 >> $LOGFILE2 2>&1
sleep 10
echo "Run SLAVEOF NO ONE cmd ...">> $LOGFILE
echo "Run SLAVEOF NO ONE cmd ...">> $LOGFILE1
echo "Run SLAVEOF NO ONE cmd ...">> $LOGFILE2
$REDISCLI -p 6379 SLAVEOF NO ONE >> $LOGFILE 2>&1
$REDISCLI -p 6380 SLAVEOF NO ONE >> $LOGFILE1 2>&1
$REDISCLI -p 6381 SLAVEOF NO ONE >> $LOGFILE2 2>&1
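Pay attention to the -p arguments: every redis-cli call must be given the port of the instance it is meant for, otherwise the ports get mixed up during a master/backup switch.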
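Because the three instances differ only by port, the promotion sequence can also be expressed as a loop, so that the port passed to -p and the port in the SLAVEOF target can never drift apart. The sketch below follows the same steps as redis_master.sh (sync from the peer, wait, then SLAVEOF NO ONE); promote_all, LOGDIR, PORTS and SYNC_WAIT are names made up for this example.

```shell
#!/bin/bash
REDISCLI="${REDISCLI:-/usr/local/redis/bin/redis-cli}"
LOGDIR="${LOGDIR:-/usr/local/redis/logs}"
PORTS="${PORTS:-6379 6380 6381}"
SYNC_WAIT="${SYNC_WAIT:-10}"    # seconds to wait for the resync, 10 as in redis_master.sh

# promote_all PEER_IP
# For each port: first replicate from the same port on PEER_IP to pick up
# its data, then, after SYNC_WAIT seconds, detach and become a master.
promote_all() {
    local peer="$1" port log
    for port in $PORTS; do
        log="$LOGDIR/keepalived-redis-state-$port.log"
        { echo "[master]"; date; echo "Being master...."; echo "Run SLAVEOF cmd ..."; } >> "$log"
        "$REDISCLI" -p "$port" SLAVEOF "$peer" "$port" >> "$log" 2>&1
    done
    sleep "$SYNC_WAIT"
    for port in $PORTS; do
        log="$LOGDIR/keepalived-redis-state-$port.log"
        echo "Run SLAVEOF NO ONE cmd ..." >> "$log"
        "$REDISCLI" -p "$port" SLAVEOF NO ONE >> "$log" 2>&1
    done
}
```

On the master host the call would be `promote_all 192.168.100.156`. One wait covers all three instances here, whereas the original issues the commands in the same order but spelled out line by line; the effect should be the same.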
3>. vim /usr/redis/script/redis_backup.sh
#!/bin/bash
REDISCLI="/usr/local/redis/bin/redis-cli"
LOGFILE="/usr/local/redis/logs/keepalived-redis-state-6379.log"
LOGFILE1="/usr/local/redis/logs/keepalived-redis-state-6380.log"
LOGFILE2="/usr/local/redis/logs/keepalived-redis-state-6381.log"
echo "[backup]" >> $LOGFILE
echo "[backup]" >>$LOGFILE1
echo "[backup]" >>$LOGFILE2
date >> $LOGFILE
date >> $LOGFILE1
date >> $LOGFILE2
echo "Being slave...." >>$LOGFILE 2>&1
echo "Being slave...." >>$LOGFILE1 2>&1
echo "Being slave...." >>$LOGFILE2 2>&1
sleep 15
echo "Run SLAVEOF cmd ...">> $LOGFILE
echo "Run SLAVEOF cmd ...">> $LOGFILE1
echo "Run SLAVEOF cmd ...">> $LOGFILE2
$REDISCLI -p 6379 SLAVEOF 192.168.100.156 6379 >> $LOGFILE 2>&1
$REDISCLI -p 6380 SLAVEOF 192.168.100.156 6380 >> $LOGFILE1 2>&1
$REDISCLI -p 6381 SLAVEOF 192.168.100.156 6381 >> $LOGFILE2 2>&1
Likewise, note that the IP after SLAVEOF is the address of the slave host.
4>. vim /usr/redis/script/redis_fault.sh
#!/bin/bash
LOGFILE="/usr/local/redis/logs/keepalived-redis-state-6379.log"
LOGFILE1="/usr/local/redis/logs/keepalived-redis-state-6380.log"
LOGFILE2="/usr/local/redis/logs/keepalived-redis-state-6381.log"
echo "[fault]" >> $LOGFILE
echo "[fault]" >> $LOGFILE1
echo "[fault]" >> $LOGFILE2
date >> $LOGFILE
date >> $LOGFILE1
date >> $LOGFILE2
5>. vim /usr/redis/script/redis_stop.sh
#!/bin/bash
LOGFILE="/usr/local/redis/logs/keepalived-redis-state-6379.log"
LOGFILE1="/usr/local/redis/logs/keepalived-redis-state-6380.log"
LOGFILE2="/usr/local/redis/logs/keepalived-redis-state-6381.log"
echo "[stop]" >> $LOGFILE
echo "[stop]" >> $LOGFILE1
echo "[stop]" >> $LOGFILE2
date >> $LOGFILE
date >> $LOGFILE1
date >> $LOGFILE2
2.2. Redis slave host configuration
1. /usr/redis/script/redis_check.sh
#!/bin/bash
ALIVE=`/usr/local/redis/bin/redis-cli -p 6379 PING`
ALIVE1=`/usr/local/redis/bin/redis-cli -p 6380 PING`
ALIVE2=`/usr/local/redis/bin/redis-cli -p 6381 PING`
if [ "$ALIVE" == "PONG" ] && [ "$ALIVE1" == "PONG" ] && [ "$ALIVE2" == "PONG" ]; then
echo "$ALIVE,$ALIVE1,$ALIVE2"
exit 0
else
echo "$ALIVE,$ALIVE1,$ALIVE2"
exit 1
fi
The remaining scripts (redis_backup.sh, redis_fault.sh, redis_master.sh and redis_stop.sh) are the same as above; only the IP has to be changed to that of the master host, 192.168.100.154.
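Since only the SLAVEOF target differs between the two hosts, the slave-side copies can be derived from the master-side scripts instead of being edited by hand. This is a small sketch under that assumption; retarget_peer is a hypothetical helper name, and it rewrites only the address that directly follows SLAVEOF.

```shell
#!/bin/bash
# retarget_peer OLD_IP NEW_IP
# Reads a script on stdin and writes it to stdout with every
# "SLAVEOF OLD_IP " replaced by "SLAVEOF NEW_IP ".
# Lines such as "SLAVEOF NO ONE" are left untouched.
retarget_peer() {
    local old new
    # Escape the dots so they match literally instead of "any character".
    old=$(printf '%s' "$1" | sed 's/\./\\./g')
    new="$2"
    sed "s/SLAVEOF $old /SLAVEOF $new /g"
}
```

For example, `retarget_peer 192.168.100.156 "$MASTER_IP" < redis_master.sh > redis_master.slave.sh`, where MASTER_IP holds the master host's address and the output file name is arbitrary.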
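2.3. Failure simulation
Start keepalived and redis on 192.168.100.3 and 192.168.100.2.
Start keepalived: service keepalived start
Start redis: redis-server /usr/local/redis/conf/redis.conf
1. Check the IP state on the master Redis host; you can see that it has acquired the VIP.
Check the Redis info: all three instances report the master role, and their corresponding slaves are online.
2. Check the IP state on the slave Redis host; the three slave Redis instances are up.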
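The role and replication state mentioned in these two steps come from the `INFO replication` command. As a rough sketch, its output can be reduced to a one-line summary per instance; repl_state is a name invented for this example, and it relies only on the `role`, `connected_slaves` and `master_link_status` fields.

```shell
#!/bin/bash
# repl_state
# Reads the output of "redis-cli INFO replication" on stdin and prints
# a one-line summary: the role plus the field that matters for that role.
repl_state() {
    # INFO output uses CRLF line endings, so strip the carriage returns first.
    tr -d '\r' | awk -F: '
        $1 == "role"               { role = $2 }
        $1 == "connected_slaves"   { slaves = $2 }
        $1 == "master_link_status" { link = $2 }
        END {
            if (role == "master")     print "master connected_slaves=" slaves
            else if (role == "slave") print "slave master_link_status=" link
            else                      print "unknown"
        }'
}
```

Typical use would be a loop such as `for p in 6379 6380 6381; do /usr/local/redis/bin/redis-cli -p $p INFO replication | repl_state; done`. On a healthy master each line should read `master connected_slaves=1`, and on a healthy slave `slave master_link_status=up`.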
3. Failure simulation
Start the three Redis services on the master host and on the slave host respectively.
redis-server /usr/local/redis/conf/redis_6379.conf
redis-server /usr/local/redis/conf/redis_6380.conf
redis-server /usr/local/redis/conf/redis_6381.conf
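This way of starting has a flaw: when the three Redis instances are brought back up after a failure, they often fail to start. The cause is the mechanism I built into redis_check.sh, presumably because the check runs every 2 seconds and kills every redis-server as soon as any one instance fails PING, so instances started one by one are killed before all three are up. I also tried wrapping the three instances in a single startup script, but the result was not satisfactory. Raising the check interval in keepalived (interval 2) to a larger value might work better. I would welcome advice from anyone more experienced with this problem!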
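When experimenting with a startup script, it helps to know when each instance is actually ready rather than guessing with a fixed sleep. The helper below is a minimal sketch that polls an instance until it answers PONG or a timeout expires. wait_for_pong and its default timeout of 30 seconds are assumptions made for this example, and it does not by itself stop redis_check.sh from killing instances while they are still starting.

```shell
#!/bin/bash
REDISCLI="${REDISCLI:-/usr/local/redis/bin/redis-cli}"

# wait_for_pong PORT [TIMEOUT_SECONDS]
# Polls the instance on PORT once per second.
# Returns 0 as soon as it answers PONG, 1 if TIMEOUT_SECONDS pass first.
wait_for_pong() {
    local port="$1" timeout="${2:-30}" waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if [ "$("$REDISCLI" -p "$port" PING 2>/dev/null)" = "PONG" ]; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

A startup script could launch each instance with `redis-server /usr/local/redis/conf/redis_6379.conf` (assuming the config daemonizes the server) and then call `wait_for_pong 6379 || echo "6379 did not come up"` before moving on to the next port. The time it reports is also a reasonable basis for choosing a larger keepalived check interval.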