1. To set up a single Redis server, see: Redis server setup
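2. Under /usr/local, create the directory redis-cluster. Inside it, create the directories 6379, 6380, and 6381, each with its own data and temp subdirectories.
# cd /usr/local
# mkdir redis-cluster
-- The remaining directories are created the same way and are not listed here.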
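The directory layout from step 2 can also be created in one pass. The following is a minimal sketch; the helper name create_cluster_dirs is hypothetical, and the base directory is passed as an argument so the same function can be tried out in a scratch location first.

```shell
# create_cluster_dirs BASE
# Hypothetical helper: creates BASE/<port>/data and BASE/<port>/temp
# for the three instance ports used in this walkthrough.
create_cluster_dirs() {
    base=$1
    for port in 6379 6380 6381; do
        # -p creates missing parent directories and does not fail
        # if the directory already exists
        mkdir -p "$base/$port/data" "$base/$port/temp" || return 1
    done
}
```

Running create_cluster_dirs /usr/local/redis-cluster as a user with write access to /usr/local should produce the layout that the later configuration files refer to.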
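3. Copy the redis.conf and sentinel configuration files of the installed Redis (in /etc/ on my machine, where the sentinel file is named redis-sentinel.conf) into the 6379, 6380, and 6381 directories. The sentinel file is stored as sentinel.conf, the name used at startup in step 6.
# cp /etc/redis.conf /usr/local/redis-cluster/6379
# cp /etc/redis-sentinel.conf /usr/local/redis-cluster/6379/sentinel.conf
-- 6380 and 6381 are handled like 6379 and are not listed here.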
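The copy in step 3 has to be repeated for every instance directory, so it can be written as a loop. This is a sketch only: copy_cluster_confs is a hypothetical helper, and it assumes the source directory holds redis.conf and redis-sentinel.conf as on the author's machine.

```shell
# copy_cluster_confs SRC BASE
# Hypothetical helper: copies the installed config files from SRC into
# each instance directory under BASE.
copy_cluster_confs() {
    src=$1
    base=$2
    for port in 6379 6380 6381; do
        cp "$src/redis.conf" "$base/$port/redis.conf" || return 1
        # Stored as sentinel.conf so it can be started under that name later
        cp "$src/redis-sentinel.conf" "$base/$port/sentinel.conf" || return 1
    done
}
```

With the paths from this walkthrough, the call would be copy_cluster_confs /etc /usr/local/redis-cluster.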
4. Master-replica configuration: edit redis.conf
Master Redis, 6379 directory:
protected-mode yes
port 6379
daemonize yes
pidfile /var/run/redis_6379.pid
logfile /var/log/redis/redis_6379.log
dir /usr/local/redis-cluster/6379/data
slave-read-only yes
# Password that clients must supply to connect
requirepass foo
# Required when the replicas have a password set
masterauth foo
Replica slave1, 6380:
protected-mode yes
port 6380
daemonize yes
pidfile /var/run/redis_6380.pid
logfile /var/log/redis/redis_6380.log
dir /usr/local/redis-cluster/6380/data
slaveof 127.0.0.1 6379
# Required when the master has a password set
masterauth foo
# Password for this replica
requirepass foo
Replica slave2, 6381:
protected-mode yes
port 6381
daemonize yes
pidfile /var/run/redis_6381.pid
logfile /var/log/redis/redis_6381.log
dir /usr/local/redis-cluster/6381/data
slaveof 127.0.0.1 6379
# Required when the master has a password set; master and replicas also need it to connect to each other once Sentinel is in use
masterauth foo
# Password for this replica
requirepass foo
The master accepts both reads and writes, while the replicas are read-only and reject writes.
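Since the two replica configurations differ only in the port number, the settings can be generated from a template. The sketch below prints the replica directives listed above for a given port; render_replica_conf is a hypothetical helper, and the password foo and the paths are simply the values used in this walkthrough.

```shell
# render_replica_conf PORT [MASTER_PORT] [BASE]
# Hypothetical helper: prints the replica settings for one instance.
render_replica_conf() {
    port=$1
    master_port=${2:-6379}
    base=${3:-/usr/local/redis-cluster}
    cat <<EOF
protected-mode yes
port ${port}
daemonize yes
pidfile /var/run/redis_${port}.pid
logfile /var/log/redis/redis_${port}.log
dir ${base}/${port}/data
slaveof 127.0.0.1 ${master_port}
masterauth foo
requirepass foo
EOF
}
```

The output of render_replica_conf 6380 can be compared against the edited redis.conf in the 6380 directory to check that no directive was missed.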
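5. Sentinel configuration.
Master Redis 6379, sentinel.conf:
protected-mode no
port 26379
dir "/usr/local/redis-cluster/6379/temp"
sentinel monitor redis1 127.0.0.1 6379 2
# Password of the monitored instances, needed because they set requirepass
sentinel auth-pass redis1 foo
sentinel down-after-milliseconds redis1 10000
sentinel failover-timeout redis1 60000
The name redis1 can be chosen freely; the trailing 2 in sentinel monitor is the quorum.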
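Replica slave1 6380, sentinel.conf:
protected-mode no
port 26380
dir "/usr/local/redis-cluster/6380/temp"
sentinel monitor redis1 127.0.0.1 6379 2
# Password of the monitored instances, needed because they set requirepass
sentinel auth-pass redis1 foo
sentinel down-after-milliseconds redis1 10000
sentinel failover-timeout redis1 60000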
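Replica slave2 6381, sentinel.conf (like the other two, it monitors the master on 6379):
protected-mode no
port 26381
dir "/usr/local/redis-cluster/6381/temp"
sentinel monitor redis1 127.0.0.1 6379 2
# Password of the monitored instances, needed because they set requirepass
sentinel auth-pass redis1 foo
sentinel down-after-milliseconds redis1 10000
sentinel failover-timeout redis1 60000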
Sentinel configuration is complete.
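All three sentinel files must point at the same master and differ only in their own port and working directory, which is easy to get wrong when editing by hand. The sketch below generates them from one template. render_sentinel_conf is a hypothetical helper; the port offset of 20000 mirrors the 6379 to 26379 pattern used here, and the auth-pass line is included on the assumption that the instances set requirepass foo as in step 4.

```shell
# render_sentinel_conf PORT [BASE]
# Hypothetical helper: prints sentinel.conf for the instance directory PORT.
render_sentinel_conf() {
    port=$1
    base=${2:-/usr/local/redis-cluster}
    # 6379 -> 26379, 6380 -> 26380, 6381 -> 26381
    sentinel_port=$((port + 20000))
    cat <<EOF
protected-mode no
port ${sentinel_port}
dir "${base}/${port}/temp"
sentinel monitor redis1 127.0.0.1 6379 2
sentinel auth-pass redis1 foo
sentinel down-after-milliseconds redis1 10000
sentinel failover-timeout redis1 60000
EOF
}
```

Every generated file monitors 127.0.0.1 6379, regardless of which instance directory it is written to.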
6. Startup
Start Redis
In each of the 6379, 6380, and 6381 directories, run the start command:
redis-server ./redis.conf
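Start the sentinels, again in each of the three directories (redis-server only runs in Sentinel mode when given the --sentinel option, so redis-sentinel is used here):
redis-sentinel ./sentinel.conf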
7. Check the replication info:
127.0.0.1:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6380,state=online,offset=420,lag=1
slave1:ip=127.0.0.1,port=6381,state=online,offset=420,lag=1
master_repl_offset:434
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:433
127.0.0.1:6380> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:168
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
127.0.0.1:6381> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:6379
master_link_status:up
master_last_io_seconds_ago:9
master_sync_in_progress:0
slave_repl_offset:406
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
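When checking several instances, it can be handy to pull a single field such as role or connected_slaves out of this output instead of reading it in full. The following is a small sketch; repl_field is a hypothetical helper that reads the info replication text on standard input.

```shell
# repl_field NAME
# Hypothetical helper: reads "info replication" output on stdin and
# prints the value of the field NAME (for example role).
repl_field() {
    # redis-cli output uses CRLF line endings, so drop the carriage returns
    tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2; exit }'
}
```

A possible use, with the port and password from this setup, is redis-cli -p 6379 -a foo info replication | repl_field role.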
8. Verification
127.0.0.1:6379> set test 123
OK
127.0.0.1:6380> get test
"123"
127.0.0.1:6380> set hh 6380
(error) READONLY You can't write against a read only slave.
127.0.0.1:6381> get test
"123"
127.0.0.1:6381> set xx 6381
(error) READONLY You can't write against a read only slave.
Now we can stop the master or a replica and observe what happens.
Stop the master, 6379:
[root@VM_0_15_centos 6379]# ps -ef |grep redis
root 10690 30720 0 18:24 pts/4 00:00:00 tailf -n 200 /var/log/redis/sentinel.log
root 15650 1 0 19:01 ? 00:00:01 redis-sentinel *:26379 [sentinel]
root 16404 1 0 19:06 ? 00:00:00 redis-sentinel *:26381 [sentinel]
root 16565 1 0 19:07 ? 00:00:00 redis-sentinel *:26380 [sentinel]
root 17248 28933 0 19:12 pts/0 00:00:00 grep --color=auto redis
root 31060 1 0 16:55 ? 00:00:06 redis-server *:6379
root 31488 1 0 16:58 ? 00:00:06 redis-server *:6380
root 31563 29043 0 16:58 pts/1 00:00:00 redis-cli -c -p 6380 -a foo
root 31952 1 0 17:01 ? 00:00:06 redis-server *:6381
root 32057 29125 0 17:02 pts/2 00:00:00 redis-cli -c -p 6381 -a foo
[root@VM_0_15_centos 6379]# kill 31060
Sentinel log (in this capture the master is named mymaster rather than redis1):
15650:X 18 Dec 19:13:44.834 # +sdown master mymaster 127.0.0.1 6379
16565:X 18 Dec 19:13:44.852 # +sdown master mymaster 127.0.0.1 6379
16404:X 18 Dec 19:13:44.861 # +sdown master mymaster 127.0.0.1 6379
16404:X 18 Dec 19:13:44.920 # +odown master mymaster 127.0.0.1 6379 #quorum 3/2
16404:X 18 Dec 19:13:44.920 # +new-epoch 1
16404:X 18 Dec 19:13:44.920 # +try-failover master mymaster 127.0.0.1 6379
16404:X 18 Dec 19:13:44.927 # +vote-for-leader 6d5a34396cd5912cbfe1134a70cd3e14338ebf24 1
15650:X 18 Dec 19:13:44.934 # +new-epoch 1
16565:X 18 Dec 19:13:44.934 # +new-epoch 1
16565:X 18 Dec 19:13:44.940 # +vote-for-leader 6d5a34396cd5912cbfe1134a70cd3e14338ebf24 1
15650:X 18 Dec 19:13:44.940 # +vote-for-leader 6d5a34396cd5912cbfe1134a70cd3e14338ebf24 1
16404:X 18 Dec 19:13:44.940 # 69c00be66f0461192b2db901ece6282e00b6462c voted for 6d5a34396cd5912cbfe1134a70cd3e14338ebf24 1
16404:X 18 Dec 19:13:44.940 # f51e8307952eba4264cc9089adf3c716e658609f voted for 6d5a34396cd5912cbfe1134a70cd3e14338ebf24 1
16565:X 18 Dec 19:13:44.942 # +odown master mymaster 127.0.0.1 6379 #quorum 2/2
16565:X 18 Dec 19:13:44.942 # Next failover delay: I will not start a failover before Tue Dec 18 19:14:05 2018
16404:X 18 Dec 19:13:45.003 # +elected-leader master mymaster 127.0.0.1 6379
16404:X 18 Dec 19:13:45.003 # +failover-state-select-slave master mymaster 127.0.0.1 6379
16404:X 18 Dec 19:13:45.075 # +selected-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
16404:X 18 Dec 19:13:45.075 * +failover-state-send-slaveof-noone slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
16404:X 18 Dec 19:13:45.166 * +failover-state-wait-promotion slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
15650:X 18 Dec 19:13:45.900 # +odown master mymaster 127.0.0.1 6379 #quorum 3/2
15650:X 18 Dec 19:13:45.900 # Next failover delay: I will not start a failover before Tue Dec 18 19:14:05 2018
16404:X 18 Dec 19:13:45.960 # +promoted-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
16404:X 18 Dec 19:13:45.960 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6379
16404:X 18 Dec 19:13:46.011 * +slave-reconf-sent slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
16565:X 18 Dec 19:13:46.012 # +config-update-from sentinel 6d5a34396cd5912cbfe1134a70cd3e14338ebf24 127.0.0.1 26381 @ mymaster 127.0.0.1 6379
16565:X 18 Dec 19:13:46.012 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381
16565:X 18 Dec 19:13:46.012 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
16565:X 18 Dec 19:13:46.012 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
15650:X 18 Dec 19:13:46.013 # +config-update-from sentinel 6d5a34396cd5912cbfe1134a70cd3e14338ebf24 127.0.0.1 26381 @ mymaster 127.0.0.1 6379
15650:X 18 Dec 19:13:46.013 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381
15650:X 18 Dec 19:13:46.013 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
15650:X 18 Dec 19:13:46.013 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
16404:X 18 Dec 19:13:47.002 * +slave-reconf-inprog slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
16404:X 18 Dec 19:13:47.002 * +slave-reconf-done slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
16404:X 18 Dec 19:13:47.067 # -odown master mymaster 127.0.0.1 6379
16404:X 18 Dec 19:13:47.067 # +failover-end master mymaster 127.0.0.1 6379
16404:X 18 Dec 19:13:47.067 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381
16404:X 18 Dec 19:13:47.067 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
16404:X 18 Dec 19:13:47.067 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
16565:X 18 Dec 19:14:16.015 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
15650:X 18 Dec 19:14:16.017 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
16404:X 18 Dec 19:14:17.154 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
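The key event in this log is +switch-master, whose last two fields are the address and port of the newly promoted master. As a rough sketch, the most recent promotion can be extracted with awk. new_master_from_log is a hypothetical helper, and it assumes the log line layout shown above, where the event name is the sixth whitespace-separated field.

```shell
# new_master_from_log
# Hypothetical helper: reads a Sentinel log on stdin and prints the
# address of the most recently promoted master as ip:port.
new_master_from_log() {
    awk '$6 == "+switch-master" { addr = $(NF - 1) ":" $NF }
         END { if (addr != "") print addr }'
}
```

With the log path visible in the ps output above, the call would be new_master_from_log < /var/log/redis/sentinel.log.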
Check replication:
127.0.0.1:6380> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:6381
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:72129
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
127.0.0.1:6381> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=6380,state=online,offset=78422,lag=0
master_repl_offset:78422
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:78421
At this point the former replica 6381 has become the master.
Restart the 6379 service:
[root@VM_0_15_centos 6379]# redis-server ./redis.conf
Sentinel log:
16404:X 18 Dec 19:22:33.181 * +convert-to-slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
Check replication:
127.0.0.1:6379> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:6381
master_link_status:up
master_last_io_seconds_ago:-1
master_sync_in_progress:0
slave_repl_offset:1
master_link_down_since_seconds:1545132401
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
Stopping 6379 again, now that it is a replica, lets you test what happens when a replica goes down.
9. Java integration
Add the dependency:
<dependency>
    <groupId>redis.clients</groupId>
    <artifactId>jedis</artifactId>
    <version>2.9.0</version>
</dependency>
For concrete tests, see: Java integration test reference
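10. Notes
- The quorum set in sentinel monitor only decides when the master is marked objectively down. The failover itself must additionally be authorized by a majority of the Sentinels, so even when the quorum is met, no failover takes place unless a majority can be reached.
- The official recommendation is to deploy at least three Sentinels on different machines, mainly for the availability of Sentinel itself. With only two Sentinels and a quorum of 1, automatic failover still works, but as soon as one Sentinel goes down it can never be triggered, because the remaining Sentinel can no longer obtain a majority.
- Avoid 127.0.0.1 or localhost as the master IP in sentinel monitor. Clients such as Jedis obtain the master address from Sentinel based on this setting; configured this way, Jedis is handed 127.0.0.1, and the application may fail to connect to the master.
- Once started, Sentinel rewrites sentinel.conf on its own, recording for example the discovered replicas of the master and the other Sentinels in the group, so the state survives a Sentinel restart. When the deployment changes, for example when a Sentinel moves to another machine or a new Sentinel is added, do not copy the sentinel.conf of a Sentinel that has already run, because it contains that generated information. Start from a fresh configuration file or delete the generated entries.
- When a failover happens, the change of master and the replicas' switch to the new master are written back to the Redis configuration files automatically, so the new topology survives a Redis restart.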
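The first two notes come down to simple arithmetic: a failover needs agreement from at least the quorum and at least a majority of all configured Sentinels. The sketch below encodes that rule so the two-Sentinel example can be checked. can_failover is a hypothetical helper, and the rule it applies (required votes = the larger of majority and quorum) is a simplification of Sentinel's leader election, not its actual code.

```shell
# can_failover TOTAL ALIVE QUORUM
# Hypothetical helper: succeeds (exit status 0) when ALIVE Sentinels out
# of TOTAL configured ones are enough to reach both the quorum and a
# majority.
can_failover() {
    total=$1
    alive=$2
    quorum=$3
    majority=$((total / 2 + 1))
    # Required agreement is the larger of majority and quorum
    needed=$majority
    if [ "$quorum" -gt "$needed" ]; then
        needed=$quorum
    fi
    [ "$alive" -ge "$needed" ]
}
```

For the setup in this walkthrough, can_failover 3 2 2 succeeds, so losing one of three Sentinels is tolerated. With two Sentinels and a quorum of 1, can_failover 2 1 1 fails, because the single surviving Sentinel is not a majority of two.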