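This article walks through a manual failover in a Redis master-slave setup, including the troubleshooting along the way. The goal: when the master instance goes down, promote the slave to master so that writes can continue; once the original master is back, have it pick up the data written on the former slave, then restore the original master and slave roles.
Environment
Operating system (all hosts): RHEL 5.4, 64-bit
Redis version: 2.6.4
Redis port (all instances): 6379
Redis password (all instances): 123
Master instance: server11 (192.168.1.112)
Slave instance: server12 (192.168.1.113)
I: Manual failover with no persistence configured
1: Under normal conditions, server11 is the master, server12 is the slave, and replication is working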
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 info |grep -A 3 'Replication'
# Replication
role:master
connected_slaves:1
slave0:192.168.1.113,6379,online

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 config get save
1) "save"
2) ""

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 config get save
1) "save"
2) ""

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 set 5 e
OK

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 get 5
"e"

[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 5
"e"

2: When the master goes down, the slave still serves reads but rejects writes
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 shutdown
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 get 5
Could not connect to Redis at 192.168.1.112:6379: Connection refused

[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 5
"e"
[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 set 6 f
(error) READONLY You can't write against a read only slave.

3: Promote the slave to master so that writes can proceed
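If the role check around this promotion is scripted rather than read by eye, the comparison has to be exact, and Redis ends every INFO line with CRLF, so the trailing carriage return must be stripped before comparing. A minimal sketch, assuming a helper named `repl_field` (hypothetical, not part of Redis or of the original procedure) that reads INFO text on stdin:

```shell
# repl_field is a hypothetical helper, not part of redis-cli.
# It reads INFO output on stdin and prints the value of the first
# line whose field name equals the first argument.
repl_field() {
    # INFO lines end in CRLF, so drop the carriage returns before matching.
    tr -d '\r' | awk -v key="$1" '
        !found && index($0, key ":") == 1 {
            print substr($0, length(key) + 2)
            found = 1
        }
    '
}

# Intended use against the slave from this article (not run here):
#   /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 info \
#       | repl_field role
```

A script could then run `SLAVEOF NO ONE` and compare the helper's output against `master` to confirm that the promotion took effect.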
[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 SLAVEOF NO ONE
OK
[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 info |grep -A 3 'Replication'
# Replication
role:master
connected_slaves:0

[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 5
"e"
[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 set 6 f
OK
[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 6
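"f"

4: After the original master comes back, try to have it pull the latest data from server12. In testing this did not work: server11 and server12 ended up with inconsistent data. Forcing the original roles back at this point would lose data, because server11 restarted empty (no persistence was configured) and a slave discards its own dataset when it resyncs from its master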
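An `OK` reply to `SLAVEOF` only means the command was accepted; whether replication actually came up shows in the `master_link_status` field. The source does not say why the link stays down in this test. One thing worth ruling out (an assumption, not something the original test checked) is whether server11 has `masterauth` set, since server12 requires a password. A minimal sketch of the status check, using a hypothetical helper named `replica_link_up` that reads INFO text on stdin:

```shell
# replica_link_up is a hypothetical helper, not part of redis-cli.
# It reads INFO output on stdin and succeeds (exit 0) only if the
# instance reports itself as a slave whose link to the master is up.
replica_link_up() {
    # Strip the CR that ends each INFO line, then match whole lines.
    tr -d '\r' | awk '
        $0 == "role:slave"            { is_slave = 1 }
        $0 == "master_link_status:up" { link_up = 1 }
        END { exit !(is_slave && link_up) }
    '
}

# Intended use on server11 after SLAVEOF (not run here):
#   /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 info \
#       | replica_link_up || echo "replication link is not up"
```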
[root@server11 ~]# /usr/local/redis2/bin/redis-server /usr/local/redis2/etc/redis.conf
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 info |grep -A 3 'Replication'
# Replication
role:master
connected_slaves:0

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 get 5
(nil)
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 get 6
(nil)

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 5
"e"
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 6
"f"

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -p 6379 -a 123 SLAVEOF 192.168.1.113 6379
OK

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 info |grep -A 10 'Replication'
# Replication
role:slave
master_host:192.168.1.113
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
master_link_down_since_seconds:517
slave_priority:100
slave_read_only:1
connected_slaves:0

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 info |grep -A 3 'Replication'
# Replication
role:master
connected_slaves:0

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 get 5
(nil)
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 get 6
(nil)

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 6
"f"
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 5
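"e"

II: Testing with snapshot persistence enabled on the slave
1: After restoring the original test environment, enable snapshot (RDB) persistence on the slave. Because this is a test environment, it is configured to save a snapshot whenever at least 1 key has changed within 60 seconds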
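The `save` setting is a list of `<seconds> <changes>` pairs, and an empty string means no automatic snapshot rule is configured. The source does not show how the rule was applied; one option, assuming it is set at runtime rather than in redis.conf, is `config set save "60 1"`. The sketch below uses a hypothetical helper named `describe_save_rules` to spell out what a value returned by `config get save` means:

```shell
# describe_save_rules is a hypothetical helper, not part of redis-cli.
# Its single argument is the value CONFIG GET save returns,
# e.g. "" (no rules), "60 1", or "900 1 300 10".
describe_save_rules() {
    if [ -z "$1" ]; then
        echo "no automatic snapshots"
        return 0
    fi
    # Split the value on whitespace into positional parameters,
    # then consume them two at a time: <seconds> <changes>.
    set -- $1
    while [ "$#" -ge 2 ]; do
        echo "snapshot after $1 seconds if at least $2 key(s) changed"
        shift 2
    done
}

# One way to apply the rule from this step at runtime (not run here):
#   /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 config set save "60 1"
```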
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 config get save
1) "save"
2) ""

[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 config get save
1) "save"
2) "60 1"

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 info |grep -A 3 'Replication'
# Replication
role:master
connected_slaves:1
slave0:192.168.1.113,6379,online

[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 info |grep -A 3 'Replication'
# Replication
role:slave
master_host:192.168.1.112
master_port:6379

2: Write test data and check that the master and slave stay in sync
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 set 5 e
OK

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 get 5
"e"

[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 5
"e"

3: Simulate a master crash, manually promote the slave to master, and keep writing new data
[root@server11 ~]# killall -9 redis-server
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 info |grep -A 3 'Replication'
Could not connect to Redis at 192.168.1.112:6379: Connection refused

[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 5
"e"
[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 set 6 f
(error) READONLY You can't write against a read only slave

[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 slaveof no one
OK
[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 info |grep -A 3 'Replication'
# Replication
role:master
connected_slaves:0

[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 5
"e"

[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 set 6 f
OK

[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 6
"f"

4: Data synchronization and role restoration after the original master recovers. Here the data is synchronized by copying the slave's snapshot file over to the master
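With a `60 1` rule, a write made less than a minute before the copy may not be in the snapshot file yet, so it can be worth confirming that the dump is current before running `scp`, and forcing one with `SAVE` if it is not. A minimal sketch, assuming a hypothetical helper named `snapshot_is_current` and assuming the instance reports the `rdb_changes_since_last_save` field in its INFO output:

```shell
# snapshot_is_current is a hypothetical helper, not part of redis-cli.
# It reads INFO output on stdin and succeeds (exit 0) only if the
# rdb_changes_since_last_save field is present and equal to 0.
snapshot_is_current() {
    # Strip the CR that ends each INFO line, then split on the colon.
    tr -d '\r' | awk -F: '
        $1 == "rdb_changes_since_last_save" { seen = 1; pending = $2 + 0 }
        END { exit !(seen && pending == 0) }
    '
}

# Intended use on the promoted slave before the copy (not run here):
#   cli="/usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123"
#   $cli info | snapshot_is_current || $cli save
```

Any write that reaches server12 after the file is copied will still be missing on server11, so writes should be paused for the duration of the switch back.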
[root@server12 ~]# scp /usr/local/redis2/slave_dump.rdb server11:/usr/local/redis2/master_dump.rdb
[root@server11 ~]# /usr/local/redis2/bin/redis-server /usr/local/redis2/etc/redis.conf
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 info |grep -A 2 'Replication'
# Replication
role:master
connected_slaves:0
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 get 5
"e"
[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 get 6
"f"

[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 slaveof 192.168.1.112 6379
OK
[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 info |grep -A 10 'Replication'
# Replication
role:slave
master_host:192.168.1.112
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_priority:100
slave_read_only:1
connected_slaves:0
[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 5
"e"
[root@server12 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.113 -a 123 get 6
"f"

[root@server11 ~]# /usr/local/redis2/bin/redis-cli -h 192.168.1.112 -a 123 info |grep -A 3 'Replication'
# Replication
role:master
connected_slaves:1
slave0:192.168.1.113,6379,online