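Redis: analyzing a master-slave failover experiment with 3 Sentinels from the Sentinel log files
Environment: Redis server and Redis Sentinel are installed and configured on three CentOS virtual machines; the configuration files are redis.conf and sentinel.conf respectively.
Goal: test Sentinel's disaster-recovery capability by manually killing the master process and observing how the Sentinels carry out the master-slave failover.
The three virtual machines are called A, B, and C and are configured as follows.
VM A:
IP: 192.168.0.101
redis.conf (directives not shown are left at their defaults):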
bind 192.168.0.101
protected-mode yes
port 6379
daemonize yes
pidfile "/var/run/redis_6379.pid"
rdbcompression yes
# Compress string objects using LZF when dumping .rdb databases
dbfilename "dump.rdb"
dir "/var/redis/6379"
masterauth "redis_pass"
slave-priority 70
requirepass "redis_pass"
appendonly yes
appendfilename "appendonly.aof"
sentinel.conf:
bind 192.168.0.101
protected-mode no
port 5000
sentinel myid 9bcf66ffe1cf831a8bf1089805306ad972ec710e
dir "/var/sentinel/5000"
sentinel monitor mymaster 192.168.0.103 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 10000
sentinel auth-pass mymaster redis_pass
VM B:
IP: 192.168.0.102
redis.conf (directives not shown are left at their defaults):
bind 192.168.0.102
protected-mode yes
port 6379
daemonize yes
pidfile "/var/run/redis_6379.pid"
rdbcompression yes
# Compress string objects using LZF when dumping .rdb databases
dbfilename "dump.rdb"
dir "/var/redis/6379"
masterauth "redis_pass"
slave-priority 100
requirepass "redis_pass"
appendonly yes
appendfilename "appendonly.aof"
sentinel.conf:
bind 192.168.0.102
protected-mode no
port 5000
sentinel myid 32c6e25d252cfeb6ab8163794c012efd1fcf0456
dir "/var/sentinel/5000"
sentinel monitor mymaster 192.168.0.103 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 10000
sentinel auth-pass mymaster redis_pass
VM C:
IP: 192.168.0.103
redis.conf (directives not shown are left at their defaults):
bind 192.168.0.103
protected-mode yes
port 6379
daemonize yes
pidfile "/var/run/redis_6379.pid"
rdbcompression yes
# Compress string objects using LZF when dumping .rdb databases
dbfilename "dump.rdb"
dir "/var/redis/6379"
masterauth "redis_pass"
slave-priority 5
requirepass "redis_pass"
appendonly yes
appendfilename "appendonly.aof"
sentinel.conf:
bind 192.168.0.103
protected-mode no
port 5000
sentinel myid 32d3371febb9063058c4ac3f4f7a3a93490b7eae
dir "/var/sentinel/5000"
sentinel monitor mymaster 192.168.0.103 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 10000
sentinel auth-pass mymaster redis_pass
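According to the redis.conf listings above, the slave-priority of the redis-server is 70 on VM A, 100 (the default) on VM B, and 5 on VM C. Initially VM B (IP address 192.168.0.102) was set up as the master and the other two as slaves. The sentinel.conf listings show 192.168.0.103 as the monitored master, presumably because they were captured after the failover: Sentinel rewrites that line when the master changes.
So you will see the following (unless stated otherwise, the analysis below uses VM A's log):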
1140:X 13 Dec 21:47:27.132 * +slave slave 192.168.0.101:6379 192.168.0.101 6379 @ mymaster 192.168.0.102 6379
1140:X 13 Dec 21:47:27.144 * +slave slave 192.168.0.103:6379 192.168.0.103 6379 @ mymaster 192.168.0.102 6379
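Here VM A and VM C are registered as slaves.
Meanwhile, the Sentinels discover each other and communicate through Redis pub/sub (the __sentinel__:hello channel). On each Sentinel you will see: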
1140:X 13 Dec 21:47:33.491 * +sentinel sentinel 32c6e25d252cfeb6ab8163794c012efd1fcf0456 192.168.0.102 5000 @ mymaster 192.168.0.102 6379
1140:X 13 Dec 21:47:34.868 * +sentinel sentinel 32d3371febb9063058c4ac3f4f7a3a93490b7eae 192.168.0.103 5000 @ mymaster 192.168.0.102 6379
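The log lines above all share the same layout: pid and role, date, time, a level character, then the event name and its payload. For instance events the payload has the form <instance-type> <name> <ip> <port> @ <master-name> <master-ip> <master-port>. The following is a minimal Python sketch for splitting such a line into fields; the names LINE_RE, SentinelEvent, and parse_line are made up for this illustration and are not part of Redis.

```python
import re
from typing import NamedTuple, Optional

# Layout assumed from the log lines in this post:
#   <pid>:<role> <day> <month> <time> <level> <+/-event> <payload>
LINE_RE = re.compile(
    r"^(?P<pid>\d+):(?P<role>\w) "
    r"(?P<date>\d+ \w+) (?P<time>[\d:.]+) "
    r"(?P<level>[*#.\-]) "
    r"(?P<event>[+-][\w-]+) ?(?P<payload>.*)$"
)


class SentinelEvent(NamedTuple):
    pid: int
    time: str
    event: str
    payload: str


def parse_line(line: str) -> Optional[SentinelEvent]:
    """Parse one Sentinel log line; return None for lines that are not +/- events."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return SentinelEvent(int(m["pid"]), m["time"], m["event"], m["payload"])


sample = ("1140:X 13 Dec 21:47:27.132 * +slave slave 192.168.0.101:6379 "
          "192.168.0.101 6379 @ mymaster 192.168.0.102 6379")
event = parse_line(sample)
print(event.event, "|", event.payload)
```

Lines that do not start with a + or - event, such as the "Sentinel ID is ..." line or the "voted for" lines, are deliberately returned as None by this sketch.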
1. Then, on VM B, run
ps -ef | grep redis
which prints:
[root@eshop-cache02 ~]# ps -ef | grep redis
root 1052 1 0 08:42 ? 00:00:02 /usr/local/bin/redis-server 192.168.0.102:6379
root 1150 1107 0 09:26 pts/0 00:00:00 redis-sentinel 192.168.0.102:5000 [sentinel]
root 1154 1122 0 09:26 pts/1 00:00:00 grep redis
Kill the redis-server with kill -9 1052, and don't forget to clean up the pid file with rm -rf /var/run/redis_6379.pid.
The Sentinel log then shows:
1140:X 13 Dec 21:48:06.705 # +sdown master mymaster 192.168.0.102 6379
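This means that this Sentinel subjectively considers the master down (sdown): the master has not answered it within down-after-milliseconds (5000 ms here).
2. Once the number of Sentinels that consider the master down reaches the quorum, the state automatically changes to objectively down (odown):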
1140:X 13 Dec 21:48:07.791 # +odown master mymaster 192.168.0.102 6379 #quorum 3/2
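The "#quorum 3/2" suffix reads as "3 Sentinels agree, 2 are required". As a rough illustration of the sdown-to-odown rule, here is a small Python sketch that extracts the two numbers and applies the comparison. The helper names are invented for this example; the real check lives inside Sentinel itself.

```python
from typing import Tuple


def parse_quorum_field(line: str) -> Tuple[int, int]:
    """Extract (agreeing, required) from a '+odown ... #quorum 3/2' log line."""
    field = line.rsplit("#quorum", 1)[1].strip()
    agreeing, required = field.split("/")
    return int(agreeing), int(required)


def is_objectively_down(agreeing: int, required: int) -> bool:
    """odown is reached once enough Sentinels report the master as down."""
    return agreeing >= required


line = ("1140:X 13 Dec 21:48:07.791 # +odown master mymaster "
        "192.168.0.102 6379 #quorum 3/2")
agreeing, required = parse_quorum_field(line)
print(agreeing, required, is_objectively_down(agreeing, required))
```

With the quorum of 2 configured in sentinel.conf, a single Sentinel seeing the master as down is not enough; a second one has to agree.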
3. The epoch (the version number of the configuration) is incremented:
1140:X 13 Dec 21:48:06.809 # +new-epoch 1
4. The Sentinels vote for a leader Sentinel to carry out the master-slave failover (VM B's log shows this most clearly):
1141:X 13 Dec 21:48:05.645 # +vote-for-leader 32c6e25d252cfeb6ab8163794c012efd1fcf0456 1
1141:X 13 Dec 21:48:05.652 # 32d3371febb9063058c4ac3f4f7a3a93490b7eae voted for 32c6e25d252cfeb6ab8163794c012efd1fcf0456 1
1141:X 13 Dec 21:48:05.652 # 9bcf66ffe1cf831a8bf1089805306ad972ec710e voted for 32c6e25d252cfeb6ab8163794c012efd1fcf0456 1
So the Sentinel with myid 32c6e25d252cfeb6ab8163794c012efd1fcf0456 (VM B's Sentinel) is elected with all three votes ;D
1141:X 13 Dec 21:48:05.722 # +elected-leader master mymaster 192.168.0.102 6379
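The vote lines above can be tallied mechanically. The sketch below counts the votes found in a list of log lines and returns a winner only if it has both a majority of all Sentinels and at least the configured quorum, which is the rule Sentinel applies for leader election as far as I understand it. The function name elect_leader and its parameters are made up for this illustration.

```python
from collections import Counter
from typing import List, Optional


def elect_leader(log_lines: List[str], total_sentinels: int, quorum: int) -> Optional[str]:
    """Tally votes from Sentinel log lines and return the elected leader ID, if any."""
    votes = Counter()
    for line in log_lines:
        parts = line.split()
        if "+vote-for-leader" in parts:
            # The local Sentinel's own vote: "+vote-for-leader <id> <epoch>"
            votes[parts[parts.index("+vote-for-leader") + 1]] += 1
        elif "voted" in parts and "for" in parts:
            # A vote received from a peer: "<voter-id> voted for <id> <epoch>"
            votes[parts[parts.index("for") + 1]] += 1
    if not votes:
        return None
    candidate, count = votes.most_common(1)[0]
    # Assumption: the winner needs a majority of all Sentinels and at least `quorum` votes.
    needed = max(total_sentinels // 2 + 1, quorum)
    return candidate if count >= needed else None


lines = [
    "1141:X 13 Dec 21:48:05.645 # +vote-for-leader 32c6e25d252cfeb6ab8163794c012efd1fcf0456 1",
    "1141:X 13 Dec 21:48:05.652 # 32d3371febb9063058c4ac3f4f7a3a93490b7eae voted for 32c6e25d252cfeb6ab8163794c012efd1fcf0456 1",
    "1141:X 13 Dec 21:48:05.652 # 9bcf66ffe1cf831a8bf1089805306ad972ec710e voted for 32c6e25d252cfeb6ab8163794c012efd1fcf0456 1",
]
print(elect_leader(lines, total_sentinels=3, quorum=2))
```

With three Sentinels and a quorum of 2, two votes would already have been enough; here the candidate received all three.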
5. Since VM B's Sentinel was elected leader, we follow its log (the analysis below uses VM B's Sentinel log):
1141:X 13 Dec 21:48:05.781 # +selected-slave slave 192.168.0.103:6379 192.168.0.103 6379 @ mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:05.781 * +failover-state-send-slaveof-noone slave 192.168.0.103:6379 192.168.0.103 6379 @ mymaster 192.168.0.102 6379
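VM C was selected for promotion: its slave-priority of 5 is the lowest among the slaves, and a lower value is preferred. What follows is the completion of the switch and the rewriting of some configuration file entries.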
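Why VM C is chosen can be sketched in a few lines of Python. Sentinel prefers the slave with the lowest non-zero priority (0 means "never promote"), then the one with the largest replication offset, then the lexicographically smallest run ID. The priorities below come from the redis.conf listings in this post; the offsets and run IDs are hypothetical placeholders, and this sketch leaves out the health checks (disconnected or stale slaves) that the real Sentinel performs first.

```python
from typing import List, NamedTuple, Optional


class Replica(NamedTuple):
    addr: str
    priority: int      # slave-priority from redis.conf
    repl_offset: int   # hypothetical value, not taken from the logs
    run_id: str        # hypothetical value, not taken from the logs


def select_replica(replicas: List[Replica]) -> Optional[Replica]:
    """Pick the promotion candidate: lowest priority, then highest offset, then smallest run ID."""
    eligible = [r for r in replicas if r.priority != 0]  # priority 0 is never promoted
    if not eligible:
        return None
    return min(eligible, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


# VM B is the failed master, so only VM A and VM C are candidates.
candidates = [
    Replica("192.168.0.101:6379", priority=70, repl_offset=1000, run_id="aaa"),
    Replica("192.168.0.103:6379", priority=5, repl_offset=1000, run_id="bbb"),
]
print(select_replica(candidates).addr)
```

Because the priority is compared first, VM C wins here regardless of the placeholder offsets and run IDs.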
6. Finally, the old master is added back as a slave:
1141:X 13 Dec 21:48:08.877 * +slave slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379
Because it is still shut down, we get:
1141:X 13 Dec 21:48:13.928 # +sdown slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379
We start it manually on VM B:
redis-server /etc/redis/6379.conf
Once it has started successfully, you will see:
1141:X 13 Dec 21:49:58.343 # -sdown slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379
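Note the -sdown: it means the redis-server on VM B is running again. (A + prefix means an instance enters a state, a - prefix means it leaves that state.)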
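The +/- convention makes it easy to replay a log and see which instances are currently considered down. Below is a minimal Python sketch that does this for sdown events only; the function name apply_sdown_events is invented for this example.

```python
from typing import Dict, List


def apply_sdown_events(log_lines: List[str]) -> Dict[str, bool]:
    """Replay +sdown/-sdown events; map 'ip:port' to True while the instance is sdown."""
    state: Dict[str, bool] = {}
    for line in log_lines:
        parts = line.split()
        for i, token in enumerate(parts):
            if token in ("+sdown", "-sdown"):
                # Payload layout: <event> <instance-type> <name> <ip> <port> ...
                ip, port = parts[i + 3], parts[i + 4]
                state[f"{ip}:{port}"] = token.startswith("+")
                break
    return state


lines = [
    "1141:X 13 Dec 21:48:05.586 # +sdown master mymaster 192.168.0.102 6379",
    "1141:X 13 Dec 21:48:13.928 # +sdown slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379",
    "1141:X 13 Dec 21:49:58.343 # -sdown slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379",
]
print(apply_sdown_events(lines))
```

After the last line, 192.168.0.102:6379 is no longer marked as down, matching the restart of redis-server on VM B.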
Sentinel log on VM A
[root@eshop-cache01 ~]# redis-sentinel /etc/sentinel/5000.conf
1140:X 13 Dec 21:47:27.129 * Increased maximum number of open files to 10032 (it was originally set to 1024).
1140:X 13 Dec 21:47:27.131 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1140:X 13 Dec 21:47:27.131 # Sentinel ID is 9bcf66ffe1cf831a8bf1089805306ad972ec710e
1140:X 13 Dec 21:47:27.131 # +monitor master mymaster 192.168.0.102 6379 quorum 2
1140:X 13 Dec 21:47:27.132 * +slave slave 192.168.0.101:6379 192.168.0.101 6379 @ mymaster 192.168.0.102 6379
1140:X 13 Dec 21:47:27.144 * +slave slave 192.168.0.103:6379 192.168.0.103 6379 @ mymaster 192.168.0.102 6379
1140:X 13 Dec 21:47:33.491 * +sentinel sentinel 32c6e25d252cfeb6ab8163794c012efd1fcf0456 192.168.0.102 5000 @ mymaster 192.168.0.102 6379
1140:X 13 Dec 21:47:34.868 * +sentinel sentinel 32d3371febb9063058c4ac3f4f7a3a93490b7eae 192.168.0.103 5000 @ mymaster 192.168.0.102 6379
1140:X 13 Dec 21:48:06.705 # +sdown master mymaster 192.168.0.102 6379
1140:X 13 Dec 21:48:07.791 # +odown master mymaster 192.168.0.102 6379 #quorum 3/2
1140:X 13 Dec 21:48:06.809 # +new-epoch 1
1140:X 13 Dec 21:48:06.813 # +vote-for-leader 32c6e25d252cfeb6ab8163794c012efd1fcf0456 1
1140:X 13 Dec 21:48:07.946 # +config-update-from sentinel 32c6e25d252cfeb6ab8163794c012efd1fcf0456 192.168.0.102 5000 @ mymaster 192.168.0.102 6379
1140:X 13 Dec 21:48:07.946 # +switch-master mymaster 192.168.0.102 6379 192.168.0.103 6379
1140:X 13 Dec 21:48:07.947 * +slave slave 192.168.0.101:6379 192.168.0.101 6379 @ mymaster 192.168.0.103 6379
1140:X 13 Dec 21:48:07.947 * +slave slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379
1140:X 13 Dec 21:48:12.975 # +sdown slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379
1140:X 13 Dec 21:49:59.076 # -sdown slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379
1140:X 13 Dec 21:50:09.058 * +convert-to-slave slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379
Sentinel log on VM B
[root@eshop-cache02 ~]# redis-sentinel /etc/sentinel/5000.conf
1141:X 13 Dec 21:47:30.282 * Increased maximum number of open files to 10032 (it was originally set to 1024).
1141:X 13 Dec 21:47:30.284 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1141:X 13 Dec 21:47:30.284 # Sentinel ID is 32c6e25d252cfeb6ab8163794c012efd1fcf0456
1141:X 13 Dec 21:47:30.284 # +monitor master mymaster 192.168.0.102 6379 quorum 2
1141:X 13 Dec 21:47:30.298 * +slave slave 192.168.0.101:6379 192.168.0.101 6379 @ mymaster 192.168.0.102 6379
1141:X 13 Dec 21:47:30.309 * +slave slave 192.168.0.103:6379 192.168.0.103 6379 @ mymaster 192.168.0.102 6379
1141:X 13 Dec 21:47:32.013 * +sentinel sentinel 9bcf66ffe1cf831a8bf1089805306ad972ec710e 192.168.0.101 5000 @ mymaster 192.168.0.102 6379
1141:X 13 Dec 21:47:33.706 * +sentinel sentinel 32d3371febb9063058c4ac3f4f7a3a93490b7eae 192.168.0.103 5000 @ mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:05.586 # +sdown master mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:05.638 # +odown master mymaster 192.168.0.102 6379 #quorum 3/2
1141:X 13 Dec 21:48:05.638 # +new-epoch 1
1141:X 13 Dec 21:48:05.638 # +try-failover master mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:05.645 # +vote-for-leader 32c6e25d252cfeb6ab8163794c012efd1fcf0456 1
1141:X 13 Dec 21:48:05.652 # 32d3371febb9063058c4ac3f4f7a3a93490b7eae voted for 32c6e25d252cfeb6ab8163794c012efd1fcf0456 1
1141:X 13 Dec 21:48:05.652 # 9bcf66ffe1cf831a8bf1089805306ad972ec710e voted for 32c6e25d252cfeb6ab8163794c012efd1fcf0456 1
1141:X 13 Dec 21:48:05.722 # +elected-leader master mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:05.722 # +failover-state-select-slave master mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:05.781 # +selected-slave slave 192.168.0.103:6379 192.168.0.103 6379 @ mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:05.781 * +failover-state-send-slaveof-noone slave 192.168.0.103:6379 192.168.0.103 6379 @ mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:05.844 * +failover-state-wait-promotion slave 192.168.0.103:6379 192.168.0.103 6379 @ mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:06.736 # +promoted-slave slave 192.168.0.103:6379 192.168.0.103 6379 @ mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:06.736 # +failover-state-reconf-slaves master mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:06.783 * +slave-reconf-sent slave 192.168.0.101:6379 192.168.0.101 6379 @ mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:07.784 * +slave-reconf-inprog slave 192.168.0.101:6379 192.168.0.101 6379 @ mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:07.850 # -odown master mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:08.787 * +slave-reconf-done slave 192.168.0.101:6379 192.168.0.101 6379 @ mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:08.877 # +failover-end master mymaster 192.168.0.102 6379
1141:X 13 Dec 21:48:08.877 # +switch-master mymaster 192.168.0.102 6379 192.168.0.103 6379
1141:X 13 Dec 21:48:08.877 * +slave slave 192.168.0.101:6379 192.168.0.101 6379 @ mymaster 192.168.0.103 6379
1141:X 13 Dec 21:48:08.877 * +slave slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379
1141:X 13 Dec 21:48:13.928 # +sdown slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379
1141:X 13 Dec 21:49:58.343 # -sdown slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379
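As a rough way to measure how long the failover took from the leader's point of view, the timestamps in VM B's log can be compared directly. The Python sketch below takes the first +sdown and the first +switch-master line and computes the elapsed time; the helper names are made up for this example, and it assumes both events fall on the same day.

```python
from datetime import datetime
from typing import List, Optional


def event_time(log_lines: List[str], event: str) -> Optional[datetime]:
    """Return the time of the first line that contains the given event token."""
    for line in log_lines:
        parts = line.split()
        if event in parts:
            # parts looks like ['1141:X', '13', 'Dec', '21:48:05.586', '#', '+sdown', ...]
            return datetime.strptime(parts[3], "%H:%M:%S.%f")
    return None


def failover_seconds(log_lines: List[str]) -> Optional[float]:
    """Seconds between the first +sdown and the first +switch-master (same day assumed)."""
    start = event_time(log_lines, "+sdown")
    end = event_time(log_lines, "+switch-master")
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


lines = [
    "1141:X 13 Dec 21:48:05.586 # +sdown master mymaster 192.168.0.102 6379",
    "1141:X 13 Dec 21:48:08.877 # +switch-master mymaster 192.168.0.102 6379 192.168.0.103 6379",
]
print(failover_seconds(lines))  # about 3.291 seconds
```

This covers only the time after the Sentinel noticed the failure; the 5000 ms of down-after-milliseconds that pass before +sdown come on top of that.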
Sentinel log on VM C
[root@eshop-cahce03 ~]# redis-sentinel /etc/sentinel/5000.conf
1143:X 13 Dec 21:47:32.339 * Increased maximum number of open files to 10032 (it was originally set to 1024).
1143:X 13 Dec 21:47:32.339 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1143:X 13 Dec 21:47:32.339 # Sentinel ID is 32d3371febb9063058c4ac3f4f7a3a93490b7eae
1143:X 13 Dec 21:47:32.339 # +monitor master mymaster 192.168.0.102 6379 quorum 2
1143:X 13 Dec 21:47:32.342 * +slave slave 192.168.0.101:6379 192.168.0.101 6379 @ mymaster 192.168.0.102 6379
1143:X 13 Dec 21:47:32.353 * +slave slave 192.168.0.103:6379 192.168.0.103 6379 @ mymaster 192.168.0.102 6379
1143:X 13 Dec 21:47:32.707 * +sentinel sentinel 9bcf66ffe1cf831a8bf1089805306ad972ec710e 192.168.0.101 5000 @ mymaster 192.168.0.102 6379
1143:X 13 Dec 21:47:33.025 * +sentinel sentinel 32c6e25d252cfeb6ab8163794c012efd1fcf0456 192.168.0.102 5000 @ mymaster 192.168.0.102 6379
1143:X 13 Dec 21:48:06.274 # +sdown master mymaster 192.168.0.102 6379
1143:X 13 Dec 21:48:06.343 # +new-epoch 1
1143:X 13 Dec 21:48:06.346 # +vote-for-leader 32c6e25d252cfeb6ab8163794c012efd1fcf0456 1
1143:X 13 Dec 21:48:06.358 # +odown master mymaster 192.168.0.102 6379 #quorum 2/2
1143:X 13 Dec 21:48:06.358 # Next failover delay: I will not start a failover before Wed Dec 13 21:48:26 2017
1143:X 13 Dec 21:48:07.478 # +config-update-from sentinel 32c6e25d252cfeb6ab8163794c012efd1fcf0456 192.168.0.102 5000 @ mymaster 192.168.0.102 6379
1143:X 13 Dec 21:48:07.478 # +switch-master mymaster 192.168.0.102 6379 192.168.0.103 6379
1143:X 13 Dec 21:48:07.478 * +slave slave 192.168.0.101:6379 192.168.0.101 6379 @ mymaster 192.168.0.103 6379
1143:X 13 Dec 21:48:07.478 * +slave slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379
1143:X 13 Dec 21:48:12.515 # +sdown slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379
1143:X 13 Dec 21:49:59.023 # -sdown slave 192.168.0.102:6379 192.168.0.102 6379 @ mymaster 192.168.0.103 6379