Handling failover for multi-instance redis master-slave with Keepalived

1. redis three-master, three-slave deployment diagram

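   Our company is about to put multi-instance redis master-slave replication into production. Only two machines are available for redis: front-end 1 is the redis master host and front-end 2 is the redis slave host. The master host has to run three master redis instances and the backup host runs the three corresponding slave instances, so keepalived must handle failover and recovery for all three masters. 192.168.100.154 is the master redis host and 192.168.100.156 is the slave redis host. The deployment is shown in the figure below:

[Figure: deployment diagram (wKioL1X6XnygonChAAHXcpZdsO0789.jpg)]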

2. keepalived master and backup host configuration

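   Only these two machines are available, and an nginx master/backup pair will go live on them later, so both machines must be able to fail over redis as well as nginx. The configuration is actually quite simple.

   keepalived configuration on the master host: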

! Configuration File for keepalived

global_defs {
   notification_email {
     [email protected]
     [email protected]
     [email protected]
   }
   notification_email_from [email protected]
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
}

vrrp_script check_run {
   script "/usr/local/nginx/check_nginx.sh"
   interval 2
}

vrrp_sync_group VG1 {
    group {
          VI_1
    }
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
       check_run
    }
    virtual_ipaddress {
        192.168.1.57/24
    }
}

vrrp_sync_group VG2 {
    group {
          VI_2
    }
}

vrrp_script check_run2 {
   script "/usr/redis/script/redis_check.sh"
   interval 2
}

vrrp_instance VI_2 {
    state MASTER
    interface eth1
    virtual_router_id 52
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
       check_run2
    }
    virtual_ipaddress {
        192.168.100.100/24
    }
    notify_master /usr/redis/script/redis_master.sh
    notify_backup /usr/redis/script/redis_backup.sh
    notify_fault /usr/redis/script/redis_fault.sh
    notify_stop /usr/redis/script/redis_stop.sh
}

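Note that the interface and virtual_router_id in VI_2 must not be the same as in VI_1, otherwise the switchover goes wrong. Also, the vrrp_script block that triggers the check must be placed before the vrrp_instance that tracks it. When I first set this up I did not know that and put it elsewhere, and the VIP would not float automatically. Here we only configure check_run2; check_run above is reserved for nginx.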
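The two pitfalls above (a reused virtual_router_id, and a vrrp_script defined after the instance that tracks it) can be caught with a quick lint before restarting keepalived. The following is a minimal sketch, not part of the original setup: the function names duplicate_vrids and scripts_defined_first are made up for illustration, and it assumes the config is written one directive per line as in the listing above.

```shell
#!/bin/bash
# Hypothetical lint helpers for a keepalived.conf written one directive per line.

# Print every virtual_router_id value that is used more than once.
duplicate_vrids() {
  awk '$1 == "virtual_router_id" { print $2 }' "$1" | sort | uniq -d
}

# Exit with status 1 if a track_script block names a vrrp_script
# that has not been defined earlier in the file.
scripts_defined_first() {
  awk '
    $1 == "vrrp_script" { name = $2; sub(/[{]$/, "", name); defined[name] = 1; next }
    $1 ~ /^track_script/ { in_track = 1; next }
    in_track && substr($1, 1, 1) == "}" { in_track = 0; next }
    in_track && NF > 0 && !($1 in defined) {
      print "line " NR ": " $1 " is tracked before it is defined"
      bad = 1
    }
    END { exit bad + 0 }
  ' "$1"
}
```

Assuming the config lives at the default /etc/keepalived/keepalived.conf, running duplicate_vrids /etc/keepalived/keepalived.conf should print nothing, and scripts_defined_first on the same file should return 0.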

keepalived configuration on the backup host:

! Configuration File for keepalived

global_defs {
   notification_email {
     [email protected]
     [email protected]
     [email protected]
   }
   notification_email_from [email protected]
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
}

vrrp_sync_group VG1 {
    group {
          VI_1
    }
    notify_master "/usr/local/nginx/nginx.sh master"
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 80
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.57/24
    }
}

vrrp_script check_run2 {
   script "/usr/redis/script/redis_check.sh"
   interval 2
}

vrrp_instance VI_2 {
    state BACKUP
    interface eth1
    virtual_router_id 52
    priority 80
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
       check_run2
    }
    virtual_ipaddress {
        192.168.100.100/24
    }
    notify_master /usr/redis/script/redis_master.sh
    notify_backup /usr/redis/script/redis_backup.sh
    notify_fault /usr/redis/script/redis_fault.sh
    notify_stop /usr/redis/script/redis_stop.sh
}



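2.1 redis master host configuration

The scripts here are copied from a few experienced people's scripts, with only some changes made for our needs.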

1> vim /usr/redis/script/redis_check.sh

#!/bin/bash
# Health check: every master instance must answer PING with PONG.
ALIVE=`/usr/local/redis/bin/redis-cli -p 6379 PING`
ALIVE1=`/usr/local/redis/bin/redis-cli -p 6380 PING`
ALIVE2=`/usr/local/redis/bin/redis-cli -p 6381 PING`
if [ "$ALIVE" == "PONG" ] && [ "$ALIVE1" == "PONG" ] && [ "$ALIVE2" == "PONG" ]; then
  echo "$ALIVE,$ALIVE1,$ALIVE2"
  exit 0
else
  # One instance is down: kill them all so the whole host fails over.
  killall -9 redis-server
  exit 1
fi

The check works like this: if PING to any one of the three master redis instances returns anything other than PONG, the host goes into failover.
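The same all-or-nothing check can be written as a loop over the ports, which avoids repeating one variable per instance. This is only a sketch under assumptions: PORTS and all_alive are illustrative names that do not appear in the original script, and what to do on failure (for example the killall above) is left to the caller.

```shell
#!/bin/bash
# Sketch of a loop-based health check; PORTS and all_alive are illustrative names.
REDISCLI="${REDISCLI:-/usr/local/redis/bin/redis-cli}"
PORTS="${PORTS:-6379 6380 6381}"

# Return 0 only when every port answers PING with PONG;
# report the first port that does not and return 1.
all_alive() {
  local port reply
  for port in $PORTS; do
    reply=$("$REDISCLI" -p "$port" PING 2>/dev/null)
    if [ "$reply" != "PONG" ]; then
      echo "redis on port $port did not answer PONG"
      return 1
    fi
  done
  return 0
}
```

A check script built on it could end with a line such as: all_alive || { killall -9 redis-server; exit 1; }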

2> vim /usr/redis/script/redis_master.sh

#!/bin/bash
#/usr/redis/script/redis_master.sh
REDISCLI="/usr/local/redis/bin/redis-cli"
LOGFILE="/usr/local/redis/logs/keepalived-redis-state-6379.log"
LOGFILE1="/usr/local/redis/logs/keepalived-redis-state-6380.log"
LOGFILE2="/usr/local/redis/logs/keepalived-redis-state-6381.log"
echo "[master]" >> $LOGFILE
echo "[master]" >> $LOGFILE1
echo "[master]" >> $LOGFILE2
date >> $LOGFILE
date >> $LOGFILE1
date >> $LOGFILE2
echo "Being master...." >> $LOGFILE 2>&1
echo "Being master...." >> $LOGFILE1 2>&1
echo "Being master...." >> $LOGFILE2 2>&1
# First resync from the peer host, one instance per port.
echo "Run SLAVEOF cmd ..." >> $LOGFILE
echo "Run SLAVEOF cmd ..." >> $LOGFILE1
echo "Run SLAVEOF cmd ..." >> $LOGFILE2
$REDISCLI -p 6379 SLAVEOF 192.168.100.156 6379 >> $LOGFILE 2>&1
$REDISCLI -p 6380 SLAVEOF 192.168.100.156 6380 >> $LOGFILE1 2>&1
$REDISCLI -p 6381 SLAVEOF 192.168.100.156 6381 >> $LOGFILE2 2>&1
sleep 10
# Then promote each instance to master.
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE1
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE2
$REDISCLI -p 6379 SLAVEOF NO ONE >> $LOGFILE 2>&1
$REDISCLI -p 6380 SLAVEOF NO ONE >> $LOGFILE1 2>&1
$REDISCLI -p 6381 SLAVEOF NO ONE >> $LOGFILE2 2>&1

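Pay attention to the -p options: redis-cli must always be given the port of the instance it is talking to, otherwise the ports get mixed up during a master/backup switchover.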
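Because each command has to carry its own port, the script is a natural candidate for a loop in which the port is written once per iteration. The sketch below is a hedged rewrite rather than the author's script: PEER_IP, PORTS, LOGDIR, SYNC_WAIT and promote_all are illustrative names, with defaults taken from the values used above.

```shell
#!/bin/bash
# Sketch: redis_master.sh rewritten as loops. PEER_IP, PORTS, LOGDIR, SYNC_WAIT
# and promote_all are illustrative names, not part of the original scripts.
REDISCLI="${REDISCLI:-/usr/local/redis/bin/redis-cli}"
PEER_IP="${PEER_IP:-192.168.100.156}"
PORTS="${PORTS:-6379 6380 6381}"
LOGDIR="${LOGDIR:-/usr/local/redis/logs}"
SYNC_WAIT="${SYNC_WAIT:-10}"

promote_all() {
  local port log
  # Step 1: resync every instance from the peer instance on the same port.
  for port in $PORTS; do
    log="$LOGDIR/keepalived-redis-state-$port.log"
    {
      echo "[master]"
      date
      echo "Run SLAVEOF cmd ..."
      "$REDISCLI" -p "$port" SLAVEOF "$PEER_IP" "$port"
    } >> "$log" 2>&1
  done
  # Give replication time to catch up, as the original script does.
  sleep "$SYNC_WAIT"
  # Step 2: promote every instance to master.
  for port in $PORTS; do
    log="$LOGDIR/keepalived-redis-state-$port.log"
    {
      echo "Run SLAVEOF NO ONE cmd ..."
      "$REDISCLI" -p "$port" SLAVEOF NO ONE
    } >> "$log" 2>&1
  done
}
```

Calling promote_all from the notify_master script would then replace the three copies of each command with one, so a port can no longer be forgotten on a single line.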

3> vim /usr/redis/script/redis_backup.sh

#!/bin/bash
REDISCLI="/usr/local/redis/bin/redis-cli"
LOGFILE="/usr/local/redis/logs/keepalived-redis-state-6379.log"
LOGFILE1="/usr/local/redis/logs/keepalived-redis-state-6380.log"
LOGFILE2="/usr/local/redis/logs/keepalived-redis-state-6381.log"

echo "[backup]" >> $LOGFILE
echo "[backup]" >> $LOGFILE1
echo "[backup]" >> $LOGFILE2
date >> $LOGFILE
date >> $LOGFILE1
date >> $LOGFILE2
echo "Being slave...." >> $LOGFILE 2>&1
echo "Being slave...." >> $LOGFILE1 2>&1
echo "Being slave...." >> $LOGFILE2 2>&1

sleep 15
# Become a slave of the peer host, one instance per port.
echo "Run SLAVEOF cmd ..." >> $LOGFILE
echo "Run SLAVEOF cmd ..." >> $LOGFILE1
echo "Run SLAVEOF cmd ..." >> $LOGFILE2
$REDISCLI -p 6379 SLAVEOF 192.168.100.156 6379 >> $LOGFILE 2>&1
$REDISCLI -p 6380 SLAVEOF 192.168.100.156 6380 >> $LOGFILE1 2>&1
$REDISCLI -p 6381 SLAVEOF 192.168.100.156 6381 >> $LOGFILE2 2>&1

Here too, note that the IP after SLAVEOF is that of the slave host.

4> vim /usr/redis/script/redis_fault.sh

#!/bin/bash
LOGFILE="/usr/local/redis/logs/keepalived-redis-state-6379.log"
LOGFILE1="/usr/local/redis/logs/keepalived-redis-state-6380.log"
LOGFILE2="/usr/local/redis/logs/keepalived-redis-state-6381.log"
echo "[fault]" >> $LOGFILE
echo "[fault]" >> $LOGFILE1
echo "[fault]" >> $LOGFILE2
date >> $LOGFILE
date >> $LOGFILE1
date >> $LOGFILE2

5> vim /usr/redis/script/redis_stop.sh

#!/bin/bash
LOGFILE="/usr/local/redis/logs/keepalived-redis-state-6379.log"
LOGFILE1="/usr/local/redis/logs/keepalived-redis-state-6380.log"
LOGFILE2="/usr/local/redis/logs/keepalived-redis-state-6381.log"
echo "[stop]" >> $LOGFILE
echo "[stop]" >> $LOGFILE1
echo "[stop]" >> $LOGFILE2
date >> $LOGFILE
date >> $LOGFILE1
date >> $LOGFILE2
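The fault and stop scripts differ only in the tag they write, and three near-identical log variables are easy to mix up. One possible simplification is a shared helper that takes the state as an argument. This is a sketch only: log_state, PORTS and LOGDIR are illustrative names, with the log path pattern taken from the scripts above.

```shell
#!/bin/bash
# Sketch: one helper that appends "[state]" and a timestamp to every per-port log.
# log_state, PORTS and LOGDIR are illustrative names.
PORTS="${PORTS:-6379 6380 6381}"
LOGDIR="${LOGDIR:-/usr/local/redis/logs}"

log_state() {
  local state="$1" port
  for port in $PORTS; do
    {
      echo "[$state]"
      date
    } >> "$LOGDIR/keepalived-redis-state-$port.log"
  done
}
```

With this in place, redis_fault.sh would reduce to log_state fault and redis_stop.sh to log_state stop.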

2.2 redis slave host configuration

1> vim /usr/redis/script/redis_check.sh

#!/bin/bash
# Health check on the slave host: report failure but do not kill anything.
ALIVE=`/usr/local/redis/bin/redis-cli -p 6379 PING`
ALIVE1=`/usr/local/redis/bin/redis-cli -p 6380 PING`
ALIVE2=`/usr/local/redis/bin/redis-cli -p 6381 PING`
if [ "$ALIVE" == "PONG" ] && [ "$ALIVE1" == "PONG" ] && [ "$ALIVE2" == "PONG" ]; then
   echo "$ALIVE,$ALIVE1,$ALIVE2"
   exit 0
else
   echo "$ALIVE,$ALIVE1,$ALIVE2"
   exit 1
fi

    The remaining scripts (redis_backup.sh, redis_fault.sh, redis_master.sh and redis_stop.sh) are the same as above; only the IP needs to be changed to the master host's, 192.168.100.154.
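Since the slave-side scripts are copies with a different peer address, the copy can be scripted instead of edited by hand. A minimal sketch, assuming the four script names used above; retarget_scripts is an illustrative name, and the addresses in the usage line are the master and slave hosts given at the start of the article.

```shell
#!/bin/bash
# Sketch: copy the notify scripts into another directory and swap the peer IP.
# retarget_scripts is an illustrative name.
retarget_scripts() {
  local src_dir="$1" dst_dir="$2" old_ip="$3" new_ip="$4" name
  # Escape the dots so they match literally in the sed pattern.
  local old_re="${old_ip//./\\.}"
  mkdir -p "$dst_dir"
  for name in redis_master.sh redis_backup.sh redis_fault.sh redis_stop.sh; do
    sed "s/$old_re/$new_ip/g" "$src_dir/$name" > "$dst_dir/$name"
    chmod +x "$dst_dir/$name"
  done
}
```

For example, retarget_scripts /usr/redis/script /tmp/slave-script 192.168.100.156 192.168.100.154 would produce the slave host's versions in /tmp/slave-script (a made-up staging directory), ready to be copied over.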

2.3 Failure simulation

Start keepalived and redis on 192.168.100.3 and 192.168.100.2.

Start keepalived: service keepalived start

Start redis: redis-server /usr/local/redis/conf/redis.conf


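1. Check the IP state on the master redis host; you can see that it has acquired the VIP.

[Figure: VIP bound on the master redis host (wKioL1X6aeWjgHKAAAG8rl96OEw094.jpg)]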
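The same check can be done without reading the output by eye. A small sketch, assuming the VIP 192.168.100.100 and interface eth1 from the VI_2 block; holds_vip is an illustrative name.

```shell
#!/bin/bash
# Sketch: succeed when the `ip -o addr` output on stdin lists the given address.
# holds_vip is an illustrative name.
holds_vip() {
  grep -qF "inet $1/"
}
```

Usage on either host: ip -4 -o addr show dev eth1 | holds_vip 192.168.100.100 && echo "this host holds the VIP"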

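Check the redis info: it shows the state of the three master redis instances, and the slave corresponding to each one is online.

[Figure: redis info on the master host (wKiom1X6aEHxdA0vAAGa6L_VK4g337.jpg)]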
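To check all three instances in one go, the role and slave count can be pulled out of the INFO replication output. A sketch, assuming the role and connected_slaves fields that redis reports in that section; replication_summary is an illustrative name.

```shell
#!/bin/bash
# Sketch: read `INFO replication` output on stdin and print "<role> <connected_slaves>".
# replication_summary is an illustrative name.
replication_summary() {
  tr -d '\r' | awk -F: '
    $1 == "role" { role = $2 }
    $1 == "connected_slaves" { slaves = $2 }
    END { print role, slaves + 0 }
  '
}
```

Usage: for p in 6379 6380 6381; do /usr/local/redis/bin/redis-cli -p $p INFO replication | replication_summary; done. On a healthy master host each line would be expected to read "master 1".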

2. Check the IP state on the slave redis host; the three slave redis instances show a status of up.

[Figure: wKiom1X6aQ2iYKx8AADPLwwAM20633.jpg]

[Figure: wKioL1X6a0XDPAnpAADYopBO8w8599.jpg]

[Figure: wKiom1X6aQ7jv2QfAADiEJOcUlg264.jpg]


3. Failure simulation

We start the three redis services on the master redis host and on the slave redis host.

redis-server /usr/local/redis/conf/redis_6379.conf

redis-server /usr/local/redis/conf/redis_6380.conf

redis-server /usr/local/redis/conf/redis_6381.conf

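   This way of starting has a flaw: after recovering from a failure, the three redis instances often fail to start, because the redis_check.sh script already has its own mechanism built in. It runs every 2 seconds and, in the master host version shown above, kills every redis-server as soon as one instance fails PING, so instances that are still starting can be killed again. Wrapping the three instances in a startup script did not work well either. Raising the interval 2 check value in keepalived might help. Advice from anyone more experienced is very welcome!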
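One untested idea for the recovery problem is to bring the three instances up while keepalived is stopped, wait until each one answers PONG, and only then start keepalived so the check never sees a half-started host. The helper below is a sketch under that assumption; wait_for_pong and its tries/delay parameters are illustrative names.

```shell
#!/bin/bash
# Sketch: poll one redis port until it answers PONG or the tries run out.
# wait_for_pong and its parameters are illustrative names.
REDISCLI="${REDISCLI:-/usr/local/redis/bin/redis-cli}"

wait_for_pong() {
  local port="$1" tries="${2:-30}" delay="${3:-1}"
  while [ "$tries" -gt 0 ]; do
    if [ "$("$REDISCLI" -p "$port" PING 2>/dev/null)" = "PONG" ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}
```

A recovery script could start the three redis-server processes, call wait_for_pong 6379 && wait_for_pong 6380 && wait_for_pong 6381, and run service keepalived start only if all three succeed.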









