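Persistence in a Redis master-slave architecture has a problem, which was the conclusion of the previous test: persistence has to be configured on the master instance for the data to be protected against loss across instances. While the master is writing its data to disk, however, it inevitably causes disk I/O waits, and actual testing showed that the impact of these writes on the application is unacceptable. In most scenarios, therefore, persistence is configured on the slave instance instead. When the master goes down, the slave is promoted to master, manually or automatically, and keeps serving requests. When the original master recovers, it first synchronizes its data from the former slave, and once that is complete the pair returns to the original master-slave roles.

Meeting these requirements takes the help of keepalived. On the one hand, keepalived provides a VIP, so the application's connection settings do not have to change; the listening address in each Redis instance's configuration file also has to be changed to listen on all interfaces. On the other hand, keepalived runs a script at a fixed interval to monitor the state of the master and slave instances and switches roles as the situation requires. This article focuses on using keepalived to implement automatic master-slave failover for Redis.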

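Environment
Operating system on both hosts: RHEL 5.4, 64-bit
Redis version: 2.6.4
Redis port on both instances: 6379
Redis password on both instances: 123
VIP: 192.168.1.120
Master instance: server11 (192.168.1.112)
Slave instance: server12 (192.168.1.113, snapshot persistence enabled)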
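
The split described above (no snapshots on the master, snapshots on the slave, both instances listening on all interfaces) could be expressed in the two redis.conf files roughly as follows. This is only a sketch: `print_redis_conf` is a hypothetical helper that is not part of the original setup, and the `save 900 1` threshold is an assumed placeholder. The addresses, port and password are the ones listed above.

```shell
#!/bin/bash
# Hypothetical helper: print the redis.conf lines relevant to this setup.
# Usage: print_redis_conf master|slave
print_redis_conf() {
    local role=$1
    # Common to both instances (values from the environment list above)
    echo "bind 0.0.0.0"
    echo "port 6379"
    echo "requirepass 123"
    # Either instance may end up replicating from the other after a failover
    echo "masterauth 123"
    case "$role" in
        master)
            # No "save" lines: the master does not write snapshots
            ;;
        slave)
            echo "slaveof 192.168.1.112 6379"
            # Assumed snapshot threshold; tune it to the write rate
            echo "save 900 1"
            ;;
        *)
            echo "unknown role: $role" >&2
            return 1
            ;;
    esac
}

print_redis_conf slave
```

Running `print_redis_conf master` prints the same lines without `slaveof` and `save`.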

1: Install keepalived. Once it is installed on server11, copy it to server12 with scp.
   
    [root@server11 ~]# wget http://keepalived.org/software/keepalived-1.1.19.tar.gz
    [root@server11 ~]# tar -zxvf ../tarbag/keepalived-1.1.19.tar.gz
    [root@server11 ~]# cd keepalived-1.1.19/
    [root@server11 keepalived-1.1.19]# ./configure --prefix=/usr/local/keepalived && make && make install

2: Configure keepalived on the master node, server11
   
    [root@server11 ~]# cat /usr/local/keepalived/etc/keepalived/keepalived.conf
    ! Configuration File for keepalived

    global_defs {
        router_id LVS_DEVEL
    }

    vrrp_script Monitor_redis {
        script "/usr/local/scripts/redis_monitor.sh"
        interval 2
        weight 2
    }

    vrrp_instance VI_1{
        state MASTER
        interface eth0
        virtual_router_id 51
        mcast_src_ip 192.168.1.112
        priority  100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass password_123
        }
        track_script {
            Monitor_redis
        }
        virtual_ipaddress {
            192.168.1.120
        }
        notify_fault  /usr/local/scripts/redis_fault.sh
        notify_stop   /usr/local/scripts/redis_stop.sh
    }

3: Configure keepalived on the slave node, server12
   
    [root@server12 ~]# cat /usr/local/keepalived/etc/keepalived/keepalived.conf
    ! Configuration File for keepalived

    global_defs {
        router_id LVS_DEVEL
    }

    vrrp_script Monitor_redis {
        script "/usr/local/scripts/redis_monitor.sh"
        interval 2
        weight 2
    }

    vrrp_instance VI_1{
        state BACKUP
        interface eth0
        virtual_router_id 51
        mcast_src_ip 192.168.1.113
        priority  99
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass password_123
        }
        track_script {
            Monitor_redis
        }
        virtual_ipaddress {
            192.168.1.120
        }
        notify_master /usr/local/scripts/redis_master.sh
        notify_backup /usr/local/scripts/redis_backup.sh
        notify_fault  /usr/local/scripts/redis_fault.sh
        notify_stop   /usr/local/scripts/redis_stop.sh
    }
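
How the two configurations interact can be worked through numerically. In keepalived, a tracked script with a positive `weight` adds that weight to the instance priority while the script succeeds. The sketch below only reproduces that arithmetic for the values used here; `effective_priority` is a hypothetical helper, and the behaviour is worth confirming against the keepalived version in use.

```shell
#!/bin/bash
# effective_priority BASE WEIGHT SCRIPT_EXIT_STATUS
# Positive weight: added to the priority while the script succeeds.
# Negative weight: added (i.e. subtracted) while the script fails.
effective_priority() {
    local base=$1 weight=$2 status=$3
    if [ "$weight" -gt 0 ] && [ "$status" -eq 0 ]; then
        echo $((base + weight))
    elif [ "$weight" -lt 0 ] && [ "$status" -ne 0 ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

effective_priority 100 2 0   # server11, Redis check passing
effective_priority 99 2 0    # server12, Redis check passing
effective_priority 100 2 1   # server11, Redis check failing
```

With both checks passing the priorities are 102 and 101, so server11 holds the VIP. With the check failing on server11 its priority falls back to 100, below server12's 101, so server12 would be expected to win the election on priority alone.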

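4: Prepare the scripts. They must exist on both the master and the slave instance, and they must be executable.

The listings below are the copies on server11. Judging by the role-change logs in the later steps, the copies on server12 have the two IP addresses swapped.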
   
    [root@server11 ~]# cat /usr/local/scripts/redis_monitor.sh
    #!/bin/bash
    # Health check run by keepalived: exit 0 if the local Redis answers PING.
    ALIVE=$(/usr/local/redis2/bin/redis-cli -h 192.168.1.112 -p 6379 -a 123 PING)

    if [ "$ALIVE" == "PONG" ]; then
        echo "$ALIVE"
        exit 0
    else
        echo "$ALIVE"
        # Redis is not answering: kill keepalived and restart networking
        # so that this host drops the VIP.
        killall -9 keepalived
        service network restart
        exit 1
    fi

    [root@server11 ~]# sh /usr/local/scripts/redis_monitor.sh
    PONG

    [root@server11 ~]# cat /usr/local/scripts/redis_master.sh
    #!/bin/bash
    # Run when this node becomes MASTER: resync from the peer, then promote.
    REDISCLI="/usr/local/redis2/bin/redis-cli -h 192.168.1.112 -p 6379 -a 123"
    LOGFILE="/usr/local/redis2/var/keepalived-redis-state.log"

    echo "[master]" >> "$LOGFILE"
    date >> "$LOGFILE"
    echo "Being master...." >> "$LOGFILE" 2>&1
    echo "Run SLAVEOF cmd ..." >> "$LOGFILE"
    $REDISCLI SLAVEOF 192.168.1.113 6379 >> "$LOGFILE" 2>&1
    sleep 10
    echo "Run SLAVEOF NO ONE cmd ..." >> "$LOGFILE"
    $REDISCLI SLAVEOF NO ONE >> "$LOGFILE" 2>&1

    [root@server11 ~]# cat /usr/local/scripts/redis_backup.sh
    #!/bin/bash
    # Run when this node becomes BACKUP: wait, then replicate from the peer.
    REDISCLI="/usr/local/redis2/bin/redis-cli -h 192.168.1.112 -p 6379 -a 123"
    LOGFILE="/usr/local/redis2/var/keepalived-redis-state.log"

    echo "[backup]" >> "$LOGFILE"
    date >> "$LOGFILE"
    echo "Being slave...." >> "$LOGFILE" 2>&1
    sleep 15
    echo "Run SLAVEOF cmd ..." >> "$LOGFILE"
    $REDISCLI SLAVEOF 192.168.1.113 6379 >> "$LOGFILE" 2>&1

    [root@server11 ~]# cat /usr/local/scripts/redis_stop.sh
    #!/bin/bash
    # Run when keepalived stops: only records the event.
    LOGFILE="/usr/local/redis2/var/keepalived-redis-state.log"
    echo "[stop]" >> "$LOGFILE"
    date >> "$LOGFILE"

    [root@server11 ~]# cat /usr/local/scripts/redis_fault.sh
    #!/bin/bash
    # Run when keepalived enters the FAULT state: only records the event.
    LOGFILE="/usr/local/redis2/var/keepalived-redis-state.log"
    echo "[fault]" >> "$LOGFILE"
    date >> "$LOGFILE"
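
The monitor script above kills keepalived on the first failed PING. If a single slow reply should not trigger a failover, the check could retry a few times first. The following is a minimal sketch of that idea, not the script used in this test; `redis_alive` and the `RETRY_DELAY` variable are hypothetical names introduced here.

```shell
#!/bin/bash
# redis_alive ATTEMPTS CMD [ARGS...]
# Runs CMD up to ATTEMPTS times and succeeds as soon as it prints PONG.
# RETRY_DELAY (seconds between attempts, default 1) is an assumed knob.
redis_alive() {
    local attempts=$1 i reply
    shift
    for ((i = 1; i <= attempts; i++)); do
        reply=$("$@" 2>/dev/null)
        if [ "$reply" = "PONG" ]; then
            return 0
        fi
        # Pause before the next attempt, but not after the last one
        if [ "$i" -lt "$attempts" ]; then
            sleep "${RETRY_DELAY:-1}"
        fi
    done
    return 1
}
```

In redis_monitor.sh it could be called as `redis_alive 3 /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -p 6379 -a 123 PING`, with the existing failure branch run only when it returns non-zero. The total retry time should probably stay short relative to the `interval 2` setting of the vrrp_script.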

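5: Start the keepalived process on the master and on the slave, then check that the VIP works (this is the point at which the listening address in the Redis configuration file has to be changed to 0.0.0.0)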
   
    [root@server11 ~]# /usr/local/keepalived/sbin/keepalived -D -f /usr/local/keepalived/etc/keepalived/keepalived.conf
    [root@server11 ~]# tail -f /var/log/messages
    Dec 12 09:25:49 server11 Keepalived_healthcheckers[7710]: Configuration is using : 5499 Bytes
    Dec 12 09:25:49 server11 Keepalived_healthcheckers[7710]: Using LinkWatch kernel netlink reflector...
    Dec 12 09:25:49 server11 Keepalived_vrrp[7712]: VRRP sockpool: [ifindex(2), proto(112), fd(12,13)]
    Dec 12 09:25:49 server11 Keepalived_vrrp[7712]: VRRP_Script(Monitor_redis) succeeded
    Dec 12 09:25:50 server11 Keepalived_vrrp[7712]: VRRP_Instance(VI_1{) Transition to MASTER STATE
    Dec 12 09:25:51 server11 Keepalived_vrrp[7712]: VRRP_Instance(VI_1{) Entering MASTER STATE
    Dec 12 09:25:51 server11 Keepalived_vrrp[7712]: VRRP_Instance(VI_1{) setting protocol VIPs.
    Dec 12 09:25:51 server11 Keepalived_vrrp[7712]: VRRP_Instance(VI_1{) Sending gratuitous ARPs on eth0 for 192.168.1.120
    Dec 12 09:25:51 server11 avahi-daemon[4519]: Registering new address record for 192.168.1.120 on eth0.
    Dec 12 09:25:51 server11 Keepalived_healthcheckers[7710]: Netlink reflector reports IP 192.168.1.120 added
    Dec 12 09:25:51 server11 Keepalived_vrrp[7712]: Netlink reflector reports IP 192.168.1.120 added
    Dec 12 09:25:56 server11 Keepalived_vrrp[7712]: VRRP_Instance(VI_1{) Sending gratuitous ARPs on eth0 for 192.168.1.120

    [root@server11 ~]# ip a |grep 192
        inet 192.168.1.112/24 brd 192.168.1.255 scope global eth0
        inet 192.168.1.120/32 scope global eth0

    [root@server12 ~]# /usr/local/keepalived/sbin/keepalived -D -f /usr/local/keepalived/etc/keepalived/keepalived.conf
    [root@server12 ~]# tail -f /var/log/messages
    Dec 12 09:26:55 server12 Keepalived_healthcheckers[3106]: Configuration is using : 5595 Bytes
    Dec 12 09:26:55 server12 Keepalived_vrrp[3108]: VRRP_Instance(VI_1{) Entering BACKUP STATE
    Dec 12 09:26:55 server12 Keepalived_healthcheckers[3106]: Using LinkWatch kernel netlink reflector...
    Dec 12 09:26:55 server12 Keepalived_vrrp[3108]: VRRP sockpool: [ifindex(2), proto(112), fd(12,13)]
    Dec 12 09:26:55 server12 Keepalived_vrrp[3108]: VRRP_Script(Monitor_redis) succeeded

    [root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.120 -a 123 info |grep -A 3 'Replication'
    # Replication
    role:master
    connected_slaves:1
    slave0:192.168.1.113,6379,online
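
The check above reads the replication role by eye. For use in a script, the role can be pulled out of the INFO output with a small filter. This is a sketch under the assumption that the output has the `role:<value>` line shown above; `redis_role` is a hypothetical helper name.

```shell
#!/bin/bash
# redis_role: read INFO output on stdin and print the value of the "role" field.
redis_role() {
    # Lines in the INFO reply end in CRLF, so strip carriage returns first
    tr -d '\r' | awk -F: '$1 == "role" { print $2 }'
}

# Sample input modelled on the output above
printf '# Replication\r\nrole:master\r\nconnected_slaves:1\r\n' | redis_role
```

Against a live instance the same filter would be fed from the command used above, for example `/usr/local/redis2/bin/redis-cli -h 192.168.1.120 -a 123 info | redis_role`.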

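6: Write test data through the master instance. In principle the script writes 250,000 test entries (50,000 iterations of 5 SET commands each), but because the default Redis concurrency limit had not been tuned, some write requests failed and 231,839 entries were written in the end, taking up about 25 MB of memory. While the script runs you can watch the persistence files on both instances: the master's file stays at about 30 KB, while the slave's keeps growing.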
   
    [root@server11 ~]# cat test.sh
    #!/bin/bash
    # Writes 5 keys per iteration to database 1 through the VIP.
    REDISCLI="/usr/local/redis2/bin/redis-cli -h 192.168.1.120 -a 123 -n 1 SET"
    ID=1
    while (($ID<50001))
    do
        INSTANCE_NAME="i-2-$ID-VM"
        UUID=`cat /proc/sys/kernel/random/uuid`
        PRIVATE_IP_ADDRESS=10.`echo "$RANDOM % 255 + 1" | bc`.`echo "$RANDOM % 255 + 1" | bc`.`echo "$RANDOM % 255 + 1" | bc`
        CREATED=`date "+%Y-%m-%d %H:%M:%S"`
        $REDISCLI vm_instance:$ID:instance_name "$INSTANCE_NAME"
        $REDISCLI vm_instance:$ID:uuid "$UUID"
        $REDISCLI vm_instance:$ID:private_ip_address "$PRIVATE_IP_ADDRESS"
        $REDISCLI vm_instance:$ID:created "$CREATED"
        $REDISCLI vm_instance:$INSTANCE_NAME:id "$ID"
        ID=$(($ID+1))
    done

    [root@server11 ~]# sh test.sh
    [root@server11 redis2]# /usr/local/redis2/bin/redis-cli -h 192.168.1.120 -a 123 info |egrep 'used_memory_peak_human|db1:keys'
    used_memory_peak_human:24.98M
    db1:keys=231839,expires=0
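
The test script does not record which writes failed; the shortfall (250,000 - 231,839 = 18,161 keys) only shows up in the key count afterwards. One way to make failures visible while writing is to count them by exit status. The sketch below assumes that the write command exits non-zero when a request fails, which holds for a refused connection but should be checked for other error cases with the redis-cli version in use. `write_batch` is a hypothetical helper.

```shell
#!/bin/bash
# write_batch CMD [ARGS...]
# Reads "key value" lines from stdin, runs CMD ARGS key value for each line,
# and prints how many invocations succeeded or failed (judged by exit status).
write_batch() {
    local ok=0 failed=0 key value
    while read -r key value; do
        # </dev/null keeps CMD from consuming the remaining input lines
        if "$@" "$key" "$value" </dev/null >/dev/null 2>&1; then
            ok=$((ok + 1))
        else
            failed=$((failed + 1))
        fi
    done
    echo "ok=$ok failed=$failed"
}
```

A generator that prints one `key value` pair per line could then be piped into `write_batch /usr/local/redis2/bin/redis-cli -h 192.168.1.120 -a 123 -n 1 SET`.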


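7: Simulate a failure of the master instance and watch the log output to verify that the slave takes over the VIP and switches to read-write mode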
   
    [root@server11 ~]# killall -9  redis-server
    [root@server11 ~]# ip a |grep 192
        inet 192.168.1.112/24 brd 192.168.1.255 scope global eth0

    [root@server11 ~]# ps -ef |grep redis
    root     15886  6458  0 09:49 pts/0    00:00:00 grep redis
    [root@server11 ~]# ps -ef |grep keep
    root     16029  6458  0 09:49 pts/0    00:00:00 grep keep

    [root@server12 ~]# tail -f /usr/local/redis2/var/keepalived-redis-state.log
    [master]
    Wed Dec 12 09:48:52 CST 2012
    Being master....
    Run SLAVEOF cmd ...
    OK Already connected to specified master
    Run SLAVEOF NO ONE cmd ...
    OK

    [root@server12 ~]# tail -f /var/log/messages
    Dec 12 09:48:51 server12 Keepalived_vrrp[3108]: VRRP_Instance(VI_1{) Transition to MASTER STATE
    Dec 12 09:48:52 server12 Keepalived_vrrp[3108]: VRRP_Instance(VI_1{) Entering MASTER STATE
    Dec 12 09:48:52 server12 Keepalived_vrrp[3108]: VRRP_Instance(VI_1{) setting protocol VIPs.
    Dec 12 09:48:52 server12 Keepalived_vrrp[3108]: VRRP_Instance(VI_1{) Sending gratuitous ARPs on eth0 for 192.168.1.120
    Dec 12 09:48:52 server12 Keepalived_vrrp[3108]: Netlink reflector reports IP 192.168.1.120 added
    Dec 12 09:48:52 server12 avahi-daemon[2921]: Registering new address record for 192.168.1.120 on eth0.
    Dec 12 09:48:52 server12 Keepalived_healthcheckers[3106]: Netlink reflector reports IP 192.168.1.120 added
    Dec 12 09:48:57 server12 Keepalived_vrrp[3108]: VRRP_Instance(VI_1{) Sending gratuitous ARPs on eth0 for 192.168.1.120

    [root@server12 ~]# ip a |grep 192
        inet 192.168.1.113/24 brd 192.168.1.255 scope global eth0
        inet 192.168.1.120/32 scope global eth0

    [root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.120 -a 123 info |grep -A 3 'Replication'
    # Replication
    role:master
    connected_slaves:0

   
   
   
   
8: Run the test script again through the VIP to confirm that the promoted instance accepts writes

    [root@server12 ~]# sh test.sh
    [root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.120 -a 123 info |egrep 'used_memory_peak_human|db1:keys'
    used_memory_peak_human:26.78M
    db1:keys=249925,expires=0

9: Restore the master instance to its original role, using a shell script to automate the recovery
   
    [root@server11 ~]# ssh-keygen
    [root@server11 ~]# cd .ssh/
    [root@server11 .ssh]# ssh-copy-id -i id_rsa.pub [email protected]
    [root@server11 ~]# cat /usr/local/scripts/recover_mastart.sh
    #!/bin/sh
    # Recover the master: fetch the slave's snapshot, then start Redis and keepalived.
    ALIVE=$(/usr/local/redis2/bin/redis-cli -h 192.168.1.113 -p 6379 -a 123 PING)
    MDB=/usr/local/redis2/master_dump.rdb
    SDB=/usr/local/redis2/slave_dump.rdb

    if [ "$ALIVE" = "PONG" ]; then
        echo "$ALIVE"
        # Copy the slave's RDB file over the master's before Redis starts
        scp [email protected]:$SDB  $MDB
    else
        echo "$ALIVE"
        exit 1
    fi

    /usr/local/redis2/bin/redis-server /usr/local/redis2/etc/redis.conf
    /usr/local/keepalived/sbin/keepalived -D -f /usr/local/keepalived/etc/keepalived/keepalived.conf

    [root@server11 ~]# chmod +x  /usr/local/scripts/recover_mastart.sh
    [root@server11 ~]# sh /usr/local/scripts/recover_mastart.sh
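
The recovery script starts Redis right after the scp without checking what was copied. A cheap sanity check is possible because an RDB snapshot begins with the ASCII magic string `REDIS`. The sketch below tests only for that prefix and for a non-empty file, so it does not prove the snapshot is complete; `looks_like_rdb` is a hypothetical helper.

```shell
#!/bin/bash
# looks_like_rdb FILE
# Succeeds if FILE is non-empty and starts with the RDB magic string "REDIS".
looks_like_rdb() {
    local file=$1
    # Reject a missing or empty file
    [ -s "$file" ] || return 1
    # Compare the first five bytes with the magic string
    [ "$(head -c 5 "$file")" = "REDIS" ]
}
```

In recover_mastart.sh it could be called after the copy, for example `looks_like_rdb "$MDB" || exit 1`, so that Redis is not started on an empty or truncated transfer.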

10: Verify data integrity and that the master and slave roles have been restored
   
    [root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.120 -a 123 info |egrep 'used_memory_peak_human|db1:keys'
    used_memory_peak_human:26.78M
    db1:keys=249925,expires=0

    [root@server11 ~]#  /usr/local/redis2/bin/redis-cli -h 192.168.1.120 -a 123 info |grep -A 3 'Replication'
    # Replication
    role:master
    connected_slaves:1
    slave0:192.168.1.113,6379,online

    [root@server12 ~]#  /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 info |grep -A 3 'Replication'
    # Replication
    role:slave
    master_host:192.168.1.112
    master_port:6379

    [root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.120 -a 123 info |egrep 'used_memory_peak_human|db1:keys'
    used_memory_peak_human:26.78M
    db1:keys=249925,expires=0

keepalived log on the master instance:

    [root@server11 ~]# tail -f /var/log/messages
    Dec 12 10:08:13 server11 Keepalived_vrrp[20231]: VRRP sockpool: [ifindex(2), proto(112), fd(11,12)]
    Dec 12 10:08:13 server11 Keepalived_vrrp[20231]: VRRP_Script(Monitor_redis) succeeded
    Dec 12 10:08:13 server11 Keepalived_vrrp[20231]: VRRP_Instance(VI_1{) Transition to MASTER STATE
    Dec 12 10:08:13 server11 Keepalived_vrrp[20231]: VRRP_Instance(VI_1{) Received higher prio advert
    Dec 12 10:08:13 server11 Keepalived_vrrp[20231]: VRRP_Instance(VI_1{) Entering BACKUP STATE
    Dec 12 10:08:15 server11 Keepalived_vrrp[20231]: VRRP_Instance(VI_1{) forcing a new MASTER election
    Dec 12 10:08:16 server11 Keepalived_vrrp[20231]: VRRP_Instance(VI_1{) Transition to MASTER STATE
    Dec 12 10:08:17 server11 Keepalived_vrrp[20231]: VRRP_Instance(VI_1{) Entering MASTER STATE
    Dec 12 10:08:17 server11 Keepalived_vrrp[20231]: VRRP_Instance(VI_1{) setting protocol VIPs.
    Dec 12 10:08:17 server11 Keepalived_healthcheckers[20230]: Netlink reflector reports IP 192.168.1.120 added
    Dec 12 10:08:17 server11 Keepalived_vrrp[20231]: VRRP_Instance(VI_1{) Sending gratuitous ARPs on eth0 for 192.168.1.120
    Dec 12 10:08:17 server11 Keepalived_vrrp[20231]: Netlink reflector reports IP 192.168.1.120 added
    Dec 12 10:08:17 server11 avahi-daemon[4519]: Registering new address record for 192.168.1.120 on eth0.

    [root@server11 ~]# ip a |grep 192
        inet 192.168.1.112/24 brd 192.168.1.255 scope global eth0
        inet 192.168.1.120/32 scope global eth0

keepalived log on the slave instance:

    [root@server12 ~]# tail -f /var/log/messages
    Dec 12 09:56:01 server12 last message repeated 4 times
    Dec 12 10:08:13 server12 Keepalived_vrrp[3108]: VRRP_Instance(VI_1{) Received lower prio advert, forcing new election
    Dec 12 10:08:13 server12 Keepalived_vrrp[3108]: VRRP_Instance(VI_1{) Sending gratuitous ARPs on eth0 for 192.168.1.120
    Dec 12 10:08:15 server12 Keepalived_vrrp[3108]: VRRP_Instance(VI_1{) Received higher prio advert
    Dec 12 10:08:15 server12 Keepalived_vrrp[3108]: VRRP_Instance(VI_1{) Entering BACKUP STATE
    Dec 12 10:08:15 server12 Keepalived_vrrp[3108]: VRRP_Instance(VI_1{) removing protocol VIPs.
    Dec 12 10:08:15 server12 Keepalived_healthcheckers[3106]: Netlink reflector reports IP 192.168.1.120 removed
    Dec 12 10:08:15 server12 Keepalived_vrrp[3108]: Netlink reflector reports IP 192.168.1.120 removed
    Dec 12 10:08:15 server12 avahi-daemon[2921]: Withdrawing address record for 192.168.1.120 on eth0.

Role-change log on the slave instance:

    [root@server12 ~]# tail -f /usr/local/redis2/var/keepalived-redis-state.log
    [backup]
    Wed Dec 12 10:08:15 CST 2012
    Being slave....
    Run SLAVEOF cmd ...
    OK
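
The integrity check above compares the `db1:keys` figures by eye. The comparison can also be scripted by extracting the key count from the INFO output. This is a sketch that assumes the `dbN:keys=<count>,expires=<count>` line format shown above; `db_keys` is a hypothetical helper.

```shell
#!/bin/bash
# db_keys N: read INFO output on stdin and print the key count of database N.
db_keys() {
    local db=$1
    tr -d '\r' | sed -n "s/^db${db}:keys=\([0-9][0-9]*\),.*/\1/p"
}

# Sample lines copied from the outputs in this article
before=$(echo 'db1:keys=249925,expires=0' | db_keys 1)   # count after the failover test
after=$(echo 'db1:keys=249925,expires=0' | db_keys 1)    # count after the recovery
if [ "$before" -eq "$after" ]; then
    echo "key counts match: $after"
fi
```

Equal key counts are only a coarse signal: they show that no keys were dropped or added, not that every value is identical.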

Reference:
http://heylinux.com/archives/1932.html