Introduction

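Redis is a key-value store. It is similar to Memcached but supports more value types: string, list (linked list), set, zset (sorted set), and hash. These types support push/pop, add/remove, intersection, union, difference, and many other operations, all of which are atomic. Redis can also sort data in several ways. As with Memcached, data is held in memory for speed. The difference is that Redis periodically writes updated data to disk, or appends each modifying operation to an append-only log file, and on top of this it implements master-slave replication.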

Redis Sentinel

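Redis Sentinel is a tool that monitors the state of the Master in a Redis master-slave deployment. It has shipped with Redis since version 2.6, and the current implementation (Sentinel 2) since Redis 2.8.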
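What Sentinel does
  1): Monitors the state of the Master
  2): If the Master fails, performs a Master-Slave switch: one of the Slaves is promoted to Master, and the former Master becomes a Slave
  3): After the switch, the contents of master_redis.conf, slave_redis.conf, and sentinel.conf all change: master_redis.conf gains a slaveof line, and the monitored target in sentinel.conf changes to the new Master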
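How Sentinel works
  1): Once per second, each Sentinel sends a PING command to every Master, Slave, and other Sentinel instance it knows about
  2): If the time since an instance last returned a valid reply to PING exceeds the value of the down-after-milliseconds option, the Sentinel marks that instance as subjectively down
  3): If a Master is marked as subjectively down, every Sentinel monitoring that Master checks once per second whether the Master really has entered the subjectively down state
  4): When enough Sentinels (at least the number set in the configuration file) confirm within the specified time range that the Master is subjectively down, the Master is marked as objectively down
  5): Under normal conditions, each Sentinel sends an INFO command to every Master and Slave it knows about once every 10 seconds
  6): When a Master is marked as objectively down, the Sentinel sends INFO to all Slaves of that Master once per second instead of once every 10 seconds
  7): If not enough Sentinels agree that the Master is down, the Master's objectively down state is cleared. If the Master again returns a valid reply to a Sentinel's PING, its subjectively down state is cleared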
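Subjectively down and objectively down
  Subjectively down (SDOWN): the down judgment that a single Sentinel instance makes about a Redis server on its own
  Objectively down (ODOWN): the down judgment reached when multiple Sentinel instances have each marked the Master as SDOWN and have exchanged that view through the SENTINEL is-master-down-by-addr command. Once a Master is ODOWN, a failover can start
  SDOWN applies to both Masters and Slaves. Once a Sentinel finds that a Master has entered ODOWN, that Sentinel may be elected by the other Sentinels to carry out the automatic failover of the failed Master
  ODOWN applies only to Masters. Sentinels do not need to negotiate before judging a Slave to be down, so Slaves (and other Sentinels) never reach ODOWN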
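
The SDOWN and ODOWN rules described above can be sketched in a few lines of Python. This is only an illustrative model, not Sentinel's actual implementation: the SentinelView class and the is_sdown and is_odown functions are hypothetical names, and timestamps are plain integers in milliseconds.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SentinelView:
    """One Sentinel's view of a monitored master (hypothetical model)."""
    name: str
    last_valid_ping_reply_ms: int  # time of the last valid PING reply


def is_sdown(view: SentinelView, now_ms: int, down_after_ms: int) -> bool:
    """SDOWN: this Sentinel alone has seen no valid reply for too long."""
    return now_ms - view.last_valid_ping_reply_ms > down_after_ms


def is_odown(views: Sequence[SentinelView], now_ms: int,
             down_after_ms: int, quorum: int) -> bool:
    """ODOWN: at least `quorum` Sentinels currently report SDOWN."""
    sdown_count = sum(is_sdown(v, now_ms, down_after_ms) for v in views)
    return sdown_count >= quorum


# Two of the three Sentinels have not heard from the master for over 5 s.
views = [
    SentinelView("sentinel-a", last_valid_ping_reply_ms=1_000),
    SentinelView("sentinel-b", last_valid_ping_reply_ms=2_000),
    SentinelView("sentinel-c", last_valid_ping_reply_ms=9_500),
]
print(is_odown(views, now_ms=10_000, down_after_ms=5_000, quorum=2))
```

With a quorum of 2 the master counts as objectively down here, while a quorum of 3 would leave it only subjectively down for two of the Sentinels.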

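Preparation

Prepare three CentOS 7 virtual machines. On each one, configure the IP address and hostname, synchronize the system time, disable the firewall and SELinux, and add the IP-to-hostname mappings below.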

ip hostname
192.168.29.131 node1
192.168.29.146 node2
192.168.29.147 node3

Deploy the Redis service

[root@node1 ~]# yum install redis -y
[root@node2 ~]# yum install redis -y
[root@node3 ~]# yum install redis -y

Edit the configuration file

[root@node1 ~]# vi /etc/redis.conf
[root@node2 ~]# vi /etc/redis.conf
[root@node3 ~]# vi /etc/redis.conf
#Bind address (listen on all interfaces)
bind 0.0.0.0
#Disable protected mode
protected-mode no
#Run as a background process
daemonize yes

Deploy Redis master-slave replication

Edit the configuration file

[root@node2 ~]# vi /etc/redis.conf
slaveof 192.168.29.131 6379
[root@node3 ~]# vi /etc/redis.conf
slaveof 192.168.29.131 6379
#Start the service
[root@node1 ~]# systemctl start redis
[root@node2 ~]# systemctl start redis
[root@node3 ~]# systemctl start redis

Verify Redis master-slave replication

[root@node1 ~]# redis-cli info
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.29.147,port=6379,state=online,offset=15,lag=0
slave1:ip=192.168.29.146,port=6379,state=online,offset=15,lag=0
master_repl_offset:15
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:14
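
If you would rather check the replication state from a script than by eye, the text above is easy to parse. The following is a minimal sketch, assuming the output of redis-cli info has already been captured as a string. The parse_replication_info function is a hypothetical helper, and the sample text is copied from the output above.

```python
def parse_replication_info(text):
    """Parse the '# Replication' section of INFO output into a dict."""
    info = {"slaves": []}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        if key.startswith("slave") and key[5:].isdigit():
            # slaveN lines hold comma-separated key=value pairs
            pairs = (pair.split("=", 1) for pair in value.split(","))
            info["slaves"].append(dict(pairs))
        else:
            info[key] = value
    return info


sample = """\
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.29.147,port=6379,state=online,offset=15,lag=0
slave1:ip=192.168.29.146,port=6379,state=online,offset=15,lag=0
master_repl_offset:15
"""

info = parse_replication_info(sample)
master_offset = int(info["master_repl_offset"])
for slave in info["slaves"]:
    # A slave whose offset equals the master's has received every write.
    behind = master_offset - int(slave["offset"])
    print(slave["ip"], slave["state"], "bytes behind:", behind)
```

Both slaves report the same offset as the master in this sample, so neither is behind.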
#Add data on node1
[root@node1 ~]# redis-cli set hello world
OK
#Read the data on node2
[root@node2 ~]# redis-cli get hello
"world"
#Read the data on node3
[root@node3 ~]# redis-cli get hello
"world"

Deploy Redis Sentinel

Edit the configuration file

[root@node1 ~]# vi /etc/redis-sentinel.conf 
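#Monitor the master named mymaster; the final 2 is the quorum, the number of Sentinels that must agree the master is down before it is marked objectively down and a new master is chosen
sentinel monitor mymaster 192.168.29.131 6379 2
#Time in milliseconds an instance must be unreachable before this Sentinel marks it as subjectively down
sentinel down-after-milliseconds mymaster 5000
#Failover timeout in milliseconds; a Sentinel waits twice this value before retrying a failover of the same master
sentinel failover-timeout mymaster 10000
#Disable protected mode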
protected-mode no
[root@node2 ~]# vi /etc/redis-sentinel.conf 
sentinel monitor mymaster 192.168.29.131 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 10000
protected-mode no
[root@node3 ~]# vi /etc/redis-sentinel.conf 
sentinel monitor mymaster 192.168.29.131 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 10000
protected-mode no
#Start the service
[root@node1 ~]# systemctl start redis-sentinel.service
[root@node2 ~]# systemctl start redis-sentinel.service
[root@node3 ~]# systemctl start redis-sentinel.service
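
The quorum of 2 configured above only decides when the master is marked objectively down. According to the Redis Sentinel documentation, the Sentinel that performs the failover must also be authorized by a majority of all known Sentinels. The sketch below works through that arithmetic for this three-Sentinel setup. The majority and can_fail_over functions are hypothetical helpers, not part of Redis.

```python
def majority(total_sentinels: int) -> int:
    """Smallest number of Sentinels that is more than half of them."""
    return total_sentinels // 2 + 1


def can_fail_over(total_sentinels: int, reachable_sentinels: int,
                  quorum: int) -> bool:
    """A failover needs both the quorum and a majority of Sentinels."""
    needed = max(quorum, majority(total_sentinels))
    return reachable_sentinels >= needed


# Three Sentinels with quorum 2, as configured in this walkthrough.
for reachable in (3, 2, 1):
    print(reachable, "reachable ->", can_fail_over(3, reachable, quorum=2))
```

With three Sentinels the setup tolerates the loss of one Sentinel. A single surviving Sentinel can still mark the master as subjectively down, but it cannot start a failover.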

Verify Redis Sentinel

Check the Sentinel status

[root@node1 ~]# redis-cli -p 26379 INFO Sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=192.168.29.131:6379,slaves=2,sentinels=3

Simulate a master node failure

Stop the Redis service on node1

[root@node1 ~]# systemctl stop redis

Check the cluster status

[root@node3 ~]# redis-cli info
# Replication
role:master
connected_slaves:1
slave0:ip=192.168.29.146,port=6379,state=online,offset=3913,lag=0
master_repl_offset:4199
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:4198

Check the log output

[root@node1 ~]# cat /var/log/redis/sentinel.log 
6493:X 24 Jul 23:42:01.714 # +new-epoch 1
6493:X 24 Jul 23:42:01.716 # +vote-for-leader 1e928f9b00cb1fa317a98330da73e468cf201003 1
6493:X 24 Jul 23:42:01.717 # +sdown master mymaster 192.168.29.131 6379
6493:X 24 Jul 23:42:01.773 # +odown master mymaster 192.168.29.131 6379 #quorum 3/2
6493:X 24 Jul 23:42:01.773 # Next failover delay: I will not start a failover before Fri Jul 24 23:42:22 2020
6493:X 24 Jul 23:42:01.962 # +config-update-from sentinel 1e928f9b00cb1fa317a98330da73e468cf201003 192.168.29.146 26379 @ mymaster 192.168.29.131 6379
6493:X 24 Jul 23:42:01.962 # +switch-master mymaster 192.168.29.131 6379 192.168.29.147 6379
6493:X 24 Jul 23:42:01.963 * +slave slave 192.168.29.146:6379 192.168.29.146 6379 @ mymaster 192.168.29.147 6379
6493:X 24 Jul 23:42:01.963 * +slave slave 192.168.29.131:6379 192.168.29.131 6379 @ mymaster 192.168.29.147 6379
6493:X 24 Jul 23:42:06.997 # +sdown slave 192.168.29.131:6379 192.168.29.131 6379 @ mymaster 192.168.29.147 6379
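
The line to look for in this log is +switch-master, which records the old and new master addresses. As a rough sketch, assuming the log text has been read into a string, the events can be pulled out with a regular expression. The find_switches function is a hypothetical helper, and the sample lines are copied from the log above.

```python
import re

# +switch-master <master name> <old ip> <old port> <new ip> <new port>
SWITCH_RE = re.compile(
    r"\+switch-master (?P<name>\S+) (?P<old_ip>\S+) (?P<old_port>\d+) "
    r"(?P<new_ip>\S+) (?P<new_port>\d+)"
)


def find_switches(log_text):
    """Return (master name, old address, new address) for each switch."""
    events = []
    for line in log_text.splitlines():
        m = SWITCH_RE.search(line)
        if m:
            old = f'{m["old_ip"]}:{m["old_port"]}'
            new = f'{m["new_ip"]}:{m["new_port"]}'
            events.append((m["name"], old, new))
    return events


sample_log = """\
6493:X 24 Jul 23:42:01.773 # +odown master mymaster 192.168.29.131 6379 #quorum 3/2
6493:X 24 Jul 23:42:01.962 # +switch-master mymaster 192.168.29.131 6379 192.168.29.147 6379
6493:X 24 Jul 23:42:01.963 * +slave slave 192.168.29.146:6379 192.168.29.146 6379 @ mymaster 192.168.29.147 6379
"""

for name, old, new in find_switches(sample_log):
    print(f"{name}: {old} -> {new}")
```

For this sample the only event is the switch of mymaster from 192.168.29.131:6379 to 192.168.29.147:6379, which matches the role:master output seen on node3.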

Stop the Redis service on node3

[root@node3 ~]# systemctl stop redis 

Check the cluster status

[root@node2 ~]# redis-cli info
# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

Check the log output

[root@node1 ~]# cat /var/log/redis/sentinel.log
6493:X 24 Jul 23:45:42.529 # +sdown master mymaster 192.168.29.147 6379
6493:X 24 Jul 23:45:42.613 # +odown master mymaster 192.168.29.147 6379 #quorum 2/2
6493:X 24 Jul 23:45:42.613 # +new-epoch 2
6493:X 24 Jul 23:45:42.613 # +try-failover master mymaster 192.168.29.147 6379
6493:X 24 Jul 23:45:42.615 # +vote-for-leader ee7c7bf525d58010d579820a0a36c78af6df688a 2
6493:X 24 Jul 23:45:42.617 # 1e928f9b00cb1fa317a98330da73e468cf201003 voted for 1e928f9b00cb1fa317a98330da73e468cf201003 2
6493:X 24 Jul 23:45:42.620 # 993f8941dc7b8223131145c5b8f6edc3a153eb5a voted for 1e928f9b00cb1fa317a98330da73e468cf201003 2
6493:X 24 Jul 23:45:43.720 # +config-update-from sentinel 1e928f9b00cb1fa317a98330da73e468cf201003 192.168.29.146 26379 @ mymaster 192.168.29.147 6379
6493:X 24 Jul 23:45:43.721 # +switch-master mymaster 192.168.29.147 6379 192.168.29.146 6379
6493:X 24 Jul 23:45:43.721 * +slave slave 192.168.29.131:6379 192.168.29.131 6379 @ mymaster 192.168.29.146 6379
6493:X 24 Jul 23:45:43.721 * +slave slave 192.168.29.147:6379 192.168.29.147 6379 @ mymaster 192.168.29.146 6379
6493:X 24 Jul 23:45:48.759 # +sdown slave 192.168.29.147:6379 192.168.29.147 6379 @ mymaster 192.168.29.146 6379
6493:X 24 Jul 23:45:48.760 # +sdown slave 192.168.29.131:6379 192.168.29.131 6379 @ mymaster 192.168.29.146 6379
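
This log also shows the leader election: each "voted for" line is a vote from another Sentinel, and +vote-for-leader is assumed here to record the local Sentinel's own vote. Under that assumption, the votes for epoch 2 can be tallied as in the sketch below. The tally_votes function is a hypothetical helper, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# "<voter id> voted for <candidate id> <epoch>" (vote from a peer Sentinel)
VOTED_RE = re.compile(
    r"(?P<voter>\S+) voted for (?P<candidate>\S+) (?P<epoch>\d+)")
# "+vote-for-leader <candidate id> <epoch>" (assumed: the local vote)
LOCAL_VOTE_RE = re.compile(
    r"\+vote-for-leader (?P<candidate>\S+) (?P<epoch>\d+)")


def tally_votes(log_text, epoch):
    """Count the votes each candidate received in the given epoch."""
    votes = Counter()
    for line in log_text.splitlines():
        m = VOTED_RE.search(line) or LOCAL_VOTE_RE.search(line)
        if m and int(m["epoch"]) == epoch:
            votes[m["candidate"]] += 1
    return votes


sample_log = """\
6493:X 24 Jul 23:45:42.615 # +vote-for-leader ee7c7bf525d58010d579820a0a36c78af6df688a 2
6493:X 24 Jul 23:45:42.617 # 1e928f9b00cb1fa317a98330da73e468cf201003 voted for 1e928f9b00cb1fa317a98330da73e468cf201003 2
6493:X 24 Jul 23:45:42.620 # 993f8941dc7b8223131145c5b8f6edc3a153eb5a voted for 1e928f9b00cb1fa317a98330da73e468cf201003 2
"""

votes = tally_votes(sample_log, epoch=2)
leader, count = votes.most_common(1)[0]
needed = 3 // 2 + 1  # majority of the three Sentinels
print(leader, count, "elected" if count >= needed else "not elected")
```

The Sentinel with ID 1e928f9b... collects two of the three votes, which agrees with the +config-update-from line naming the same Sentinel at 192.168.29.146.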

Add data on node2

127.0.0.1:6379> set name tom
OK
127.0.0.1:6379> set beautiful girl
OK

Restart the Redis service on node1 and node3

[root@node1 ~]# systemctl start redis
[root@node3 ~]# systemctl start redis

Check the cluster status

[root@node2 ~]# redis-cli info
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.29.131,port=6379,state=online,offset=8983,lag=1
slave1:ip=192.168.29.147,port=6379,state=online,offset=9126,lag=1
master_repl_offset:9126
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:9125

Check the data

[root@node1 ~]# redis-cli get name
"tom"
[root@node1 ~]# redis-cli get beautiful
"girl"

[root@node3 ~]# redis-cli get name
"tom"
[root@node3 ~]# redis-cli get beautiful
"girl"

Simulate two nodes failing at the same time

Stop the Redis service on node2 and node3 at the same time

[root@node2 ~]# systemctl stop redis
[root@node3 ~]# systemctl stop redis

Check the log output

[root@node1 ~]# cat /var/log/redis/sentinel.log
6493:X 24 Jul 23:53:23.065 # +sdown slave 192.168.29.147:6379 192.168.29.147 6379 @ mymaster 192.168.29.146 6379
6493:X 24 Jul 23:53:24.007 # +sdown master mymaster 192.168.29.146 6379
6493:X 24 Jul 23:53:24.048 # +new-epoch 3
6493:X 24 Jul 23:53:24.050 # +vote-for-leader 993f8941dc7b8223131145c5b8f6edc3a153eb5a 3
6493:X 24 Jul 23:53:24.069 # +odown master mymaster 192.168.29.146 6379 #quorum 3/2
6493:X 24 Jul 23:53:24.069 # Next failover delay: I will not start a failover before Fri Jul 24 23:53:44 2020
6493:X 24 Jul 23:53:25.198 # +config-update-from sentinel 993f8941dc7b8223131145c5b8f6edc3a153eb5a 192.168.29.147 26379 @ mymaster 192.168.29.146 6379
6493:X 24 Jul 23:53:25.199 # +switch-master mymaster 192.168.29.146 6379 192.168.29.131 6379
6493:X 24 Jul 23:53:25.199 * +slave slave 192.168.29.147:6379 192.168.29.147 6379 @ mymaster 192.168.29.131 6379
6493:X 24 Jul 23:53:25.199 * +slave slave 192.168.29.146:6379 192.168.29.146 6379 @ mymaster 192.168.29.131 6379
6493:X 24 Jul 23:53:30.206 # +sdown slave 192.168.29.147:6379 192.168.29.147 6379 @ mymaster 192.168.29.131 6379
6493:X 24 Jul 23:53:30.206 # +sdown slave 192.168.29.146:6379 192.168.29.146 6379 @ mymaster 192.168.29.131 6379

Add data

[root@node1 ~]# redis-cli set handsome boy
OK

Start the Redis service on node2 and node3

[root@node2 ~]# systemctl start redis
[root@node3 ~]# systemctl start redis

Check the cluster status

[root@node1 ~]# redis-cli info
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.29.146,port=6379,state=online,offset=6238,lag=0
slave1:ip=192.168.29.147,port=6379,state=online,offset=6238,lag=1
master_repl_offset:6381
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:6380

Check the log output

[root@node1 ~]# cat /var/log/redis/sentinel.log
6493:X 24 Jul 23:57:14.544 # -sdown slave 192.168.29.146:6379 192.168.29.146 6379 @ mymaster 192.168.29.131 6379
6493:X 24 Jul 23:57:17.470 # -sdown slave 192.168.29.147:6379 192.168.29.147 6379 @ mymaster 192.168.29.131 6379
6493:X 24 Jul 23:57:27.472 * +fix-slave-config slave 192.168.29.147:6379 192.168.29.147 6379 @ mymaster 192.168.29.131 6379

Check the data

[root@node2 ~]# redis-cli get handsome
"boy"
[root@node3 ~]# redis-cli get handsome
"boy"