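When a system runs only a single Redis instance, the whole system stops working as soon as that instance goes down.
Because a single Redis instance is a single point of failure, the natural remedy is redundancy: when one instance fails, another can keep serving requests.
- Adding a second instance looks like a complete solution, but Redis currently supports only master-slave replication (not master-master). When the master goes down, a slave can still serve reads but cannot accept writes, so we also need a way to promote a slave to master.
- That calls for automatic failover, which Redis Sentinel provides. When a master can no longer serve requests, Sentinel promotes one of its slaves to master and reconfigures the remaining slaves to replicate from the new master.
- Redis Sentinel is the high-availability (HA) solution officially recommended by Redis. In a master-slave setup, neither Redis itself nor many of its clients switch over automatically when the master goes down. Sentinel runs as an independent process, can monitor several master-slave groups, and performs the switch automatically once it detects that a master is down. Its main functions are:
  - Monitoring: it continuously checks whether the Redis instances are running as expected.
  - Notification: when a Redis node misbehaves, it can notify another process (for example, a client).
  - Automatic failover: when a master becomes unavailable, it picks one of the master's slaves (if there is more than one) as the new master, and the other slaves change the master address they follow to the address of the promoted slave.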
Three servers are used here. Each runs one redis-server on port 8000 and one redis-sentinel on port 6800; moving off the default ports is a small first step toward hardening.
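redis-server | Role | redis-sentinel | Role |
---|---|---|---|
192.168.200.132:8000 | redis-master | 192.168.200.132:6800 | sentinel |
192.168.200.139:8000 | redis-slaveA | 192.168.200.139:6800 | sentinel |
192.168.200.140:8000 | redis-slaveB | 192.168.200.140:6800 | sentinel |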
# Deployment environment
[root@redis-master redis-4.0.10]# cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)
[root@redis-master redis-4.0.10]# uname -r
3.10.0-514.el7.x86_64
# Disable the firewall and SELinux
[root@redis-slaveA ~]# systemctl stop firewalld
[root@redis-slaveA ~]# systemctl disable firewalld
[root@redis-slaveA ~]# setenforce 0
# Build and install from source as follows on all three servers
[root@redis-master ~]# yum -y install gcc gcc-c++ make automake autoconf
[root@redis-master ~]# tar xf redis-4.0.10.tar.gz -C /usr/src/
[root@redis-master ~]# cd /usr/src/redis-4.0.10/
[root@redis-master redis-4.0.10]# make MALLOC=jemalloc
[root@redis-master redis-4.0.10]# make PREFIX=/usr/local/redis install
[root@redis-master redis-4.0.10]# mkdir -p /usr/local/redis/conf
[root@redis-master redis-4.0.10]# cp redis.conf /usr/local/redis/conf/
[root@redis-master redis-4.0.10]# cp sentinel.conf /usr/local/redis/conf/
Changes to the redis-master configuration file:
port 8000
daemonize yes
bind 0.0.0.0
pidfile /var/run/redis-8000.pid
logfile /var/log/redis/redis-8000.log
Changes to the redis-slave configuration file:
port 8000
daemonize yes
bind 0.0.0.0
pidfile /var/run/redis-8000.pid
logfile /var/log/redis/redis-8000.log
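# The slaves need this one extra line compared with the master. Keep the note on its own line: Redis config files do not support trailing comments.
slaveof 192.168.200.132 8000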
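# Start redis-master first, then the two slaves (the log directory /var/log/redis must already exist, otherwise redis-server cannot open its log file and exits)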
redis-server /usr/local/redis/conf/redis.conf
# Check the redis-slave log to confirm that synchronization succeeded
[root@redis-slaveA ~]# tail /var/log/redis/redis-8000.log
15348:S 27 Jun 11:52:47.401 * DB loaded from disk: 0.000 seconds
15348:S 27 Jun 11:52:47.401 * Before turning into a slave, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
15348:S 27 Jun 11:52:47.401 * Ready to accept connections
15348:S 27 Jun 11:52:47.401 * Connecting to MASTER 192.168.200.132:8000
15348:S 27 Jun 11:52:47.401 * MASTER <-> SLAVE sync started # synchronization begins
15348:S 27 Jun 11:52:47.401 * Non blocking connect for SYNC fired the event.
15348:S 27 Jun 11:52:47.402 * Master replied to PING, replication can continue...
15348:S 27 Jun 11:52:47.402 * Trying a partial resynchronization (request 96f402a0df6695a404d5df5d88f2ef111caf33e0:5865).
15348:S 27 Jun 11:52:47.402 * Successful partial resynchronization with master. # synchronization succeeded
15348:S 27 Jun 11:52:47.402 * MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization.
# Check the replication status from the command line
[root@redis-master ~]# redis-cli -p 8000 info replication
# Replication
role:master # this host is the master
connected_slaves:2 # two slaves
slave0:ip=192.168.200.140,port=8000,state=online,offset=6498,lag=0
slave1:ip=192.168.200.139,port=8000,state=online,offset=6498,lag=1
master_replid:96f402a0df6695a404d5df5d88f2ef111caf33e0
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:6498
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:6498
# Run on redis-master
[root@redis-master ~]# redis-cli -p 8000 set aaa 111
OK
# Run on the redis-slave hosts
[root@redis-slaveA ~]# redis-cli -p 8000 get aaa
"111"
[root@redis-slaveB ~]# redis-cli -p 8000 get aaa
"111"
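The replication check above can also be scripted. The following is a minimal sketch, not part of the original setup: `replication_summary` is a hypothetical helper name, and it only assumes the `role` and `connected_slaves` fields that appear in the `info replication` output shown earlier.

```shell
# Summarise `redis-cli info replication` output read from stdin.
# Prints "<role> <connected_slaves>", or "unknown" / "-" for missing fields.
# replication_summary is a hypothetical helper, not a Redis command.
replication_summary() {
    awk -F: '
        { sub(/\r$/, "") }                      # redis-cli output may use CRLF line endings
        $1 == "role"             { role = $2 }
        $1 == "connected_slaves" { n = $2 }
        END {
            if (role == "") role = "unknown"
            if (n == "")    n = "-"
            print role, n
        }
    '
}
```

On the master in this setup, `redis-cli -p 8000 info replication | replication_summary` would be expected to print the role followed by the number of connected slaves, which makes it easy to compare against the planned topology in a monitoring script.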
(1) Edit the sentinel.conf configuration file
# The following lines of the configuration file need to be changed
[root@redis-master ~]# cat -n /usr/local/redis/conf/sentinel.conf | sed -n '21p;69p;98p;106p;131p'
21 port 26379
69 sentinel monitor mymaster 127.0.0.1 6379 2
98 sentinel down-after-milliseconds mymaster 30000
106 sentinel parallel-syncs mymaster 1
131 sentinel failover-timeout mymaster 180000
# Change them to the following
[root@redis-master ~]# cat -n /usr/local/redis/conf/sentinel.conf | sed -n '21p;69p;98p;106p;131p'
21 port 6800
69 sentinel monitor master8000 192.168.200.132 8000 2
98 sentinel down-after-milliseconds master8000 5000
106 sentinel parallel-syncs master8000 1
131 sentinel failover-timeout master8000 15000
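# Then append the following four lines to the end of sentinel.conf ("daemonize yes" runs Sentinel as a daemon; "protected-mode no" lets the other hosts connect to it)
[root@localhost ~]# tail -4 /usr/local/redis/conf/sentinel.conf
daemonize yes
logfile "/var/log/redis/sentinel.log"
pidfile "/var/run/sentinel.pid"
protected-mode no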
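Configuration notes:
sentinel monitor master8000 192.168.200.132 8000 2
master8000: the name given to the monitored master (any name will do)
192.168.200.132 8000: the master's IP address and port
2: the quorum, i.e. the number of Sentinels that must agree the master is unreachable before it is marked objectively down (+odown) and a failover is attempted; the failover itself must still be authorized by a majority of the Sentinels
sentinel down-after-milliseconds master8000 5000 (5 seconds)
How long the master must fail to respond before this Sentinel considers it subjectively down (+sdown)
sentinel parallel-syncs master8000 1
How many slaves may be reconfigured to sync from the new master at the same time after a failover; 1 means one at a time, which lightens the load on the new master
sentinel failover-timeout master8000 15000 (15 seconds)
The failover timeout; among other things, it determines how soon a failover of the same master can be retried (after twice this value)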
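To make the relationship between the quorum and the Sentinel majority concrete, here is a small arithmetic sketch. The function names `majority` and `can_failover` are hypothetical helpers for illustration only; the rule they encode is that a failover leader needs votes from at least a majority of all Sentinels, and from at least `quorum` Sentinels if the quorum is set higher than the majority.

```shell
# Smallest number of Sentinels that forms a majority out of $1 Sentinels.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# Print "yes" if a failover can be authorized, "no" otherwise.
#   $1 = total number of Sentinels, $2 = configured quorum,
#   $3 = number of Sentinels that are still reachable and can vote
can_failover() {
    local total=$1
    local quorum=$2
    local alive=$3
    local need
    need=$(majority "$total")
    # A quorum larger than the majority raises the number of votes required.
    if [ "$quorum" -gt "$need" ]; then
        need=$quorum
    fi
    if [ "$alive" -ge "$need" ]; then
        echo yes
    else
        echo no
    fi
}
```

With the three Sentinels and quorum 2 used in this article, `can_failover 3 2 2` yields `yes` and `can_failover 3 2 1` yields `no`: losing one Sentinel host is tolerated, losing two is not. This is also why three Sentinels are deployed rather than two.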
(2) Start redis-sentinel
# Start it on all three servers
[root@localhost ~]# redis-sentinel /usr/local/redis/conf/sentinel.conf &
# After startup, check the Sentinel status
[root@localhost ~]# redis-cli -p 6800 info sentinel
# Sentinel
sentinel_masters:1 # one master
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=master8000,status=ok,address=192.168.200.132:8000,slaves=2,sentinels=3 # 2 slaves, 3 Sentinels
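When the current master address is needed in a script, the `masterN:` line of `info sentinel` can be parsed. The sketch below is one possible way to do it; `sentinel_master_addr` is a hypothetical helper name, and it assumes IPv4 addresses in the `address=ip:port` form shown in the output above.

```shell
# Print "ip port" of the master named $1, reading `info sentinel` output from stdin.
# sentinel_master_addr is a hypothetical helper; it assumes IPv4 "address=ip:port".
sentinel_master_addr() {
    awk -v name="$1" '
        { sub(/\r$/, "") }                      # tolerate CRLF line endings
        /^master[0-9]+:/ {
            sub(/^master[0-9]+:/, "")           # drop the "masterN:" prefix
            n = split($0, kv, ",")
            found = 0
            addr = ""
            for (i = 1; i <= n; i++) {
                if (kv[i] == ("name=" name)) found = 1
                if (kv[i] ~ /^address=/)     addr = substr(kv[i], 9)
            }
            if (found && addr != "") {
                sub(/:/, " ", addr)             # "ip:port" -> "ip port"
                print addr
                exit
            }
        }
    '
}
```

A typical call would be `redis-cli -p 6800 info sentinel | sentinel_master_addr master8000`. Sentinel also answers this question directly with `redis-cli -p 6800 sentinel get-master-addr-by-name master8000`, which is usually the simpler choice when no other fields are needed.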
(3) Test a redis-master failure
We stop the redis-server service on redis-master and then look at the Sentinel log.
# Stop redis-master
[root@redis-master ~]# redis-cli -p 8000 shutdown
# View the redis-sentinel log
[root@redis-slaveB ~]# cat /var/log/redis/sentinel.log
16705:X 28 Jun 12:32:11.756 # +sdown master master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:11.828 # +odown master master8000 192.168.200.132 8000 #quorum 2/2
16705:X 28 Jun 12:32:11.828 # +new-epoch 3
16705:X 28 Jun 12:32:11.828 # +try-failover master master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:11.829 # +vote-for-leader c475ebe6805f41a8cbc84b67fd4faede5791703a 3
16705:X 28 Jun 12:32:11.831 # 6c742feab7744f6aa1ff975429ccc7d6e5a3bad1 voted for c475ebe6805f41a8cbc84b67fd4faede5791703a 3
16705:X 28 Jun 12:32:11.832 # a65cb3bfc9385b5bb91ddcd7c38f9ba2c21691b0 voted for c475ebe6805f41a8cbc84b67fd4faede5791703a 3
16705:X 28 Jun 12:32:11.886 # +elected-leader master master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:11.886 # +failover-state-select-slave master master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:11.943 # +selected-slave slave 192.168.200.140:8000 192.168.200.140 8000 @ master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:11.943 * +failover-state-send-slaveof-noone slave 192.168.200.140:8000 192.168.200.140 8000 @ master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:12.009 * +failover-state-wait-promotion slave 192.168.200.140:8000 192.168.200.140 8000 @ master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:12.898 # +promoted-slave slave 192.168.200.140:8000 192.168.200.140 8000 @ master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:12.898 # +failover-state-reconf-slaves master master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:12.994 * +slave-reconf-sent slave 192.168.200.139:8000 192.168.200.139 8000 @ master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:13.906 * +slave-reconf-inprog slave 192.168.200.139:8000 192.168.200.139 8000 @ master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:13.906 * +slave-reconf-done slave 192.168.200.139:8000 192.168.200.139 8000 @ master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:13.968 # -odown master master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:13.968 # +failover-end master master8000 192.168.200.132 8000
16705:X 28 Jun 12:32:13.968 # +switch-master master8000 192.168.200.132 8000 192.168.200.140 8000
16705:X 28 Jun 12:32:13.968 * +slave slave 192.168.200.139:8000 192.168.200.139 8000 @ master8000 192.168.200.140 8000
16705:X 28 Jun 12:32:13.969 * +slave slave 192.168.200.132:8000 192.168.200.132 8000 @ master8000 192.168.200.140 8000
16705:X 28 Jun 12:32:19.025 # +sdown slave 192.168.200.132:8000 192.168.200.132 8000 @ master8000 192.168.200.140 8000
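The log above is long, and the line that matters most is `+switch-master`, which records the old and new master addresses. As a rough sketch (the helper name `switch_events` is made up for this example), the events can be pulled out of the log like this:

```shell
# Print "old_ip:old_port -> new_ip:new_port" for every +switch-master event on stdin.
# switch_events is a hypothetical helper; it relies on the field order
# "+switch-master <name> <old-ip> <old-port> <new-ip> <new-port>" seen in the log above.
switch_events() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i == "+switch-master" && i + 5 <= NF)
                print $(i+2) ":" $(i+3) " -> " $(i+4) ":" $(i+5)
    }'
}
```

Running `switch_events < /var/log/redis/sentinel.log` on any Sentinel host then gives a compact failover history.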
# Check the Sentinel status
[root@redis-master ~]# redis-cli -p 6800 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=master8000,status=ok,address=192.168.200.140:8000,slaves=2,sentinels=3 # the master has switched to 192.168.200.140
# Start redis-server on redis-master again, then stop the redis-server on 192.168.200.140
[root@redis-slaveB ~]# redis-cli -p 8000 shutdown
# Check the Sentinel status
[root@redis-master ~]# redis-cli -p 6800 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=master8000,status=ok,address=192.168.200.132:8000,slaves=2,sentinels=3 # the master role is back on 192.168.200.132
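Suppose we also set up a VIP (virtual IP) for Redis: once redis-master goes down, the VIP that was bound on the old master floats to the new master.
That way, developers connecting to Redis do not need to change their code even when the master is switched.
For this we can use Sentinel's client-reconfig-script parameter, which names a script that Sentinel runs during a failover. Sentinel passes the script seven arguments, of which the sixth, `<to-ip>`, is the IP address of the new master, so the script can move the VIP there.
`<master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>`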
# Add one line to sentinel.conf that points client-reconfig-script at the VIP failover script
[root@localhost ~]# sed -n '170,175p' /usr/local/redis/conf/sentinel.conf
# CLIENTS RECONFIGURATION SCRIPT
#
# sentinel client-reconfig-script <master-name> <script-path>
sentinel client-reconfig-script master8000 /usr/local/redis/notify_master6800.sh
# When the master changed because of a failover a script can be called in
# Write the VIP failover script
[root@localhost redis]# cat /usr/local/redis/notify_master6800.sh
#!/bin/bash
MASTER_IP=$6 # the sixth argument passed in by Sentinel is the new master's IP
LOCAL_IP="192.168.200.132" # local IP of the server this script runs on (different on every server)
VIP="192.168.200.244"
NETMASK="24"
INTERFACE="ens32"
if [[ "${MASTER_IP}" == "${LOCAL_IP}" ]];then
/usr/sbin/ip addr add ${VIP}/${NETMASK} dev ${INTERFACE}
/usr/sbin/arping -q -c 3 -A ${VIP} -I ${INTERFACE}
exit 0
else
/usr/sbin/ip addr del ${VIP}/${NETMASK} dev ${INTERFACE}
exit 0
fi
exit 1
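The script above always runs `ip addr add` or `ip addr del`, even when the VIP is already in the desired state, in which case `ip` reports an error that the script ignores. A slightly more defensive variant could first check whether the VIP is present and only then decide what to do. The following is a sketch of that decision logic only; `has_vip` and `plan_vip_change` are hypothetical helper names, not part of Sentinel or of the original script.

```shell
# Succeed if "<vip>/<prefix>" ($1) appears as an inet address in the
# output of `ip -o -4 addr show dev <interface>` read from stdin.
has_vip() {
    awk -v want="$1" '
        { for (i = 1; i < NF; i++) if ($i == "inet" && $(i+1) == want) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Print the action to take with the VIP: add, del, or none.
#   $1 = this host's IP (LOCAL_IP in the script above)
#   $2 = new master IP (the <to-ip> argument from Sentinel)
#   $3 = "yes" if the VIP is currently bound on this host, anything else otherwise
plan_vip_change() {
    if [ "$1" = "$2" ]; then
        # This host is the new master: it should hold the VIP.
        if [ "$3" = "yes" ]; then echo none; else echo add; fi
    else
        # Another host is the new master: release the VIP if we hold it.
        if [ "$3" = "yes" ]; then echo del; else echo none; fi
    fi
}
```

Inside the failover script, the presence check could be fed with `ip -o -4 addr show dev "${INTERFACE}" | has_vip "${VIP}/${NETMASK}"`, and the `ip addr add` / `ip addr del` commands from the original script would then run only when `plan_vip_change` prints `add` or `del`.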
# Make the script executable
[root@localhost redis]# chmod +x notify_master6800.sh
# Restart all redis-sentinel processes
[root@localhost ~]# pkill redis-sentinel
[root@localhost ~]# redis-sentinel /usr/local/redis/conf/sentinel.conf
# The first time, add the VIP to the master by hand
[root@redis-master ~]# ip addr add 192.168.200.244/24 dev ens32
# Announce the address right away with gratuitous ARP
[root@redis-master ~]# arping -q -c 3 -A 192.168.200.244 -I ens32
Next we test the VIP failover.
# Check which server holds the VIP
[root@redis-master ~]# ip addr show ens32
2: ens32: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:73:91:29 brd ff:ff:ff:ff:ff:ff
inet 192.168.200.132/24 brd 192.168.200.255 scope global dynamic ens32
valid_lft 1711sec preferred_lft 1711sec
inet 192.168.200.244/24 scope global secondary ens32 # the VIP is on 132
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe73:9129/64 scope link
valid_lft forever preferred_lft forever
# Check which server is currently the Redis master
[root@redis-master ~]# redis-cli -p 6800 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=master8000,status=ok,address=192.168.200.132:8000,slaves=2,sentinels=3 # 132 is the master
# Stop the redis-server service on server 132
[root@redis-master ~]# redis-cli -p 8000 shutdown
# Check which server is now the Redis master
[root@redis-master ~]# redis-cli -p 6800 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=master8000,status=ok,address=192.168.200.140:8000,slaves=2,sentinels=3 # the master has been switched to 140
# On server 140, check the VIP and the replication status
[root@localhost ~]# ip addr show ens32
2: ens32: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:03:ee:e5 brd ff:ff:ff:ff:ff:ff
inet 192.168.200.140/24 brd 192.168.200.255 scope global dynamic ens32
valid_lft 1416sec preferred_lft 1416sec
inet 192.168.200.244/24 scope global secondary ens32 # the VIP has moved to 140
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe03:eee5/64 scope link
valid_lft forever preferred_lft forever
[root@localhost ~]# redis-cli -p 8000 info replication
# Replication
role:master # 140 is the master
connected_slaves:1
slave0:ip=192.168.200.139,port=8000,state=online,offset=7897768,lag=0 # 139 is a slave
master_replid:3be1d64af444e02aabc486f83a07f0b66c2671d1
master_replid2:9d70bc1f8d6c0fd720b6f35dc9a8a900593d6577
master_repl_offset:7897914
second_repl_offset:7857768
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:6849339
repl_backlog_histlen:1048576
At this point the redis-sentinel VIP failover test has succeeded, and redis-sentinel keeps the Redis service highly available.