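The master-slave architecture in the previous chapter cannot switch the master and slave roles automatically. When the master becomes unusable because the Redis service fails, the host loses power, or a disk is damaged, there is no automatic failover (promoting a slave to master), and the environment configuration has to be changed by hand to switch over to a slave Redis server. It also cannot scale the parallel write performance of the Redis service horizontally. Once a single Redis server can no longer meet the write demand of the business, some mechanism is needed to solve these two core problems, namely:
1) Seamless switching of the master and slave roles, so that the business does not notice the change and is not affected
2) Dynamic horizontal scaling of Redis servers, so that multiple servers accept writes in parallel and achieve higher concurrency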
Ways to implement a Redis cluster: client-side sharding, proxy-based sharding, Redis Cluster
Redis distributed deployment options:
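The sentinel mechanism solves the Redis high-availability problem: when the master fails, a slave is automatically promoted to master so that the Redis service keeps working. It does not solve the single-node write bottleneck, because the write performance of one Redis instance is limited by that machine's memory size, concurrency, NIC speed, and similar factors. For this reason Redis introduced Redis Cluster, a decentralized architecture, starting with version 3.0.
In a decentralized Redis cluster, each node stores its own data together with the state of the whole cluster, and every node is connected to every other node.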
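Suppose there are three master nodes, A, B, and C, and the 16384 hash slots are distributed among them. The slot ranges the three nodes are responsible for are:
Node A covers: 0-5460
Node B covers: 5461-10922
Node C covers: 10923-16383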
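The ranges above follow from dividing the 16384 slots as evenly as possible among the masters. The Python sketch below illustrates that arithmetic. The function name `split_slots` is hypothetical, and the rounding rule is an assumption chosen to reproduce the ranges listed above, not code taken from Redis.

```python
import math
from typing import List, Tuple

TOTAL_SLOTS = 16384  # number of hash slots in a Redis Cluster


def split_slots(masters: int) -> List[Tuple[int, int]]:
    """Return one inclusive (first, last) slot range per master."""
    if masters < 1:
        raise ValueError("at least one master is required")
    per_node = TOTAL_SLOTS / masters  # fractional share per master
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(masters):
        # Round the running boundary to the nearest slot number.
        last = math.floor(cursor + per_node - 1 + 0.5)
        if i == masters - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1  # the last master takes the remainder
        ranges.append((first, last))
        first = last + 1
        cursor += per_node
    return ranges


for name, (first, last) in zip("ABC", split_slots(3)):
    print(f"node {name}: {first}-{last} ({last - first + 1} slots)")
```

With three masters this prints the same three ranges as above. Node B ends up with 5462 slots, one more than A and C, because 16384 is not divisible by 3.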
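The Redis Cluster architecture solves the concurrency problem but introduces a new one: how is high availability provided for each Redis master?
Environment: three servers, each running two Redis instances, on ports 6379 and 6380.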
192.168.56.199:6379/6380 192.168.56.200:6379/6380 192.168.56.201:6379/6380
cluster-enabled yes #must be enabled for cluster mode; the redis process is then shown with [cluster]
cluster-config-file nodes-6380.conf #created and maintained automatically by Redis Cluster; no manual editing is needed
[root@gbase8c_private cluster]# /usr/local/redis/bin/redis-server /usr/local/redis/etc/redis_cluster_6380.conf
[root@gbase8c_private cluster]# /usr/local/redis/bin/redis-server /usr/local/redis/etc/redis_cluster_6379.conf
[root@gbase8c_private cluster]# ss -ntl
State Recv-Q Send-Q Local Address:Port Peer Address:Port
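LISTEN 0 128 *:16379 *:* #cluster bus port (node-to-node communication, client port + 10000)
LISTEN 0 128 *:16380 *:* #cluster bus port (node-to-node communication, client port + 10000)
LISTEN 0 128 *:6379 *:* #client port
LISTEN 0 128 *:6380 *:* #client port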
[root@gbase8c_private cluster]# ps -ef | grep redis
root 13167 1 0 22:30 ? 00:00:00 /usr/local/redis/bin/redis-server 0.0.0.0:6380 [cluster]
root 13194 1 0 22:30 ? 00:00:00 /usr/local/redis/bin/redis-server 0.0.0.0:6379 [cluster]
root 13306 2525 0 22:30 pts/0 00:00:00 grep --color=auto redis
Settings common to both configuration files:
bind 0.0.0.0
daemonize yes
requirepass 123456
dir "/usr/local/redis/data/cluster"
cluster-enabled yes
Settings that differ between the two configuration files:
[root@gbase8c_private etc]# diff redis_cl*
< port 6379
> port 6380
< pidfile /var/run/redis_6379.pid
> pidfile /var/run/redis_6380.pid
< logfile "/usr/local/redis/logs/redis_cluster_199_6379.log"
> logfile "/usr/local/redis/logs/redis_cluster_199_6380.log"
< dbfilename "dump_6379.rdb"
> dbfilename "dump_6380.rdb"
< cluster-config-file nodes-6379.conf
> cluster-config-file nodes-6380.conf
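Since the two files differ only in values derived from the port, they could be generated from a single template. The Python sketch below shows one way to do that. `render_cluster_conf` and its `host_tag` parameter (the "199" in the log file name) are hypothetical names, and the settings and paths are the ones listed above for this environment.

```python
CLUSTER_CONF_TEMPLATE = """\
bind 0.0.0.0
port {port}
daemonize yes
requirepass 123456
pidfile /var/run/redis_{port}.pid
logfile "/usr/local/redis/logs/redis_cluster_{host_tag}_{port}.log"
dir "/usr/local/redis/data/cluster"
dbfilename "dump_{port}.rdb"
cluster-enabled yes
cluster-config-file nodes-{port}.conf
"""


def render_cluster_conf(port: int, host_tag: str) -> str:
    """Fill in the template for one instance; host_tag is e.g. "199"."""
    return CLUSTER_CONF_TEMPLATE.format(port=port, host_tag=host_tag)


if __name__ == "__main__":
    # Print both files for the host whose address ends in .199.
    for port in (6379, 6380):
        print(f"# redis_cluster_{port}.conf")
        print(render_cluster_conf(port, host_tag="199"))
```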
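The cluster management tool redis-trib.rb is required. It is the official tool for managing Redis clusters, written in Ruby by the author of Redis, and it ships in the src directory of the Redis source tree. On CentOS, the Ruby version installed by yum is too old.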
yum install ruby rubygems -y
find / -name redis-trib.rb
cp <path printed by find above> /usr/bin/
gem install redis
Fixing the outdated Ruby version:
yum remove ruby rubygems -y
wget https://cache.ruby-lang.org/pub/ruby/2.5/ruby-2.5.5.tar.gz
tar xf ruby-2.5.5.tar.gz
cd ruby-2.5.5
./configure
make -j 2
make install
gem install redis
Verify that the redis-trib.rb command can be executed:
redis-trib.rb
Change the password in the redis gem's client.rb to the Redis login password:
vim /usr/local/lib/ruby/gems/2.5.0/gems/redis-4.1.0/lib/redis/client.rb
Create the Redis cluster (Redis 4, with redis-trib.rb):
redis-trib.rb create --replicas 1 172.18.200.101:6379 172.18.200.102:6379 172.18.200.103:6379 172.18.200.104:6379 172.18.200.105:6379 172.18.200.106:6379
If earlier operations cause cluster creation to fail, run the following commands to clear the data and reset the cluster state:
flushall
cluster reset
On Redis 5, create the cluster with redis-cli instead:
redis-cli -a 123456 --cluster create 192.168.56.199:6379 192.168.56.199:6380 192.168.56.200:6379 192.168.56.200:6380 192.168.56.201:6379 192.168.56.201:6380 --cluster-replicas 1
Output:
[root@gbase8c_private etc]# redis-cli -a 123456 --cluster create 192.168.56.199:6379 192.168.56.199:6380 192.168.56.200:6379 192.168.56.200:6380 192.168.56.201:6379 192.168.56.201:6380 --cluster-replicas 1
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 192.168.56.200:6380 to 192.168.56.199:6379
Adding replica 192.168.56.199:6380 to 192.168.56.200:6379
Adding replica 192.168.56.201:6380 to 192.168.56.201:6379
>>> Trying to optimize slaves allocation for anti-affinity
[OK] Perfect anti-affinity obtained!
M: ff147ab31cb3984ec48d5354a184633c0594808e 192.168.56.199:6379 #M marks a master
slots:[0-5460] (5461 slots) master #first and last slot of this master
S: 282761264f247f7cec59b5d8f0dbe922050865aa 192.168.56.199:6380 #S marks a slave
replicates 5b46c04d7fa5edba767392c51be13e0f0db522e9
M: 0809910b37930a25aa32776bba8272c4493ecc62 192.168.56.200:6379
slots:[5461-10922] (5462 slots) master
S: 858b0507025e970ffcc2855b85e676fa3a0db34f 192.168.56.200:6380
replicates ff147ab31cb3984ec48d5354a184633c0594808e
M: 5b46c04d7fa5edba767392c51be13e0f0db522e9 192.168.56.201:6379
slots:[10923-16383] (5461 slots) master
S: 596112ae9a792595f61dcb63d19b0911c7b20797 192.168.56.201:6380
replicates 0809910b37930a25aa32776bba8272c4493ecc62
Can I set the above configuration? (type 'yes' to accept): yes #type yes to create the cluster automatically
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
....
>>> Performing Cluster Check (using node 192.168.56.199:6379)
M: ff147ab31cb3984ec48d5354a184633c0594808e 192.168.56.199:6379 #master ID and port
slots:[0-5460] (5461 slots) master #slots assigned to it
1 additional replica(s) #one slave assigned
M: 5b46c04d7fa5edba767392c51be13e0f0db522e9 192.168.56.201:6379
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: 596112ae9a792595f61dcb63d19b0911c7b20797 192.168.56.201:6380
slots: (0 slots) slave #slaves are not assigned any slots
replicates 0809910b37930a25aa32776bba8272c4493ecc62
S: 858b0507025e970ffcc2855b85e676fa3a0db34f 192.168.56.200:6380
slots: (0 slots) slave
replicates ff147ab31cb3984ec48d5354a184633c0594808e
S: 282761264f247f7cec59b5d8f0dbe922050865aa 192.168.56.199:6380
slots: (0 slots) slave
replicates 5b46c04d7fa5edba767392c51be13e0f0db522e9
M: 0809910b37930a25aa32776bba8272c4493ecc62 192.168.56.200:6379
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
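[OK] All nodes agree about slots configuration. #slot assignment is consistent on all nodes
>>> Check for open slots... #check for open slots (slots being migrated or imported)
>>> Check slots coverage... #check slot coverage
[OK] All 16384 slots covered. #all 16384 slots are assigned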
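The node list passed to `redis-cli --cluster create` above is simply every host combined with every port. The sketch below builds that argument list in Python. `build_create_command` is a hypothetical helper, and the hosts, ports, and password are the ones used in this environment.

```python
from itertools import product
from typing import List, Sequence


def build_create_command(hosts: Sequence[str], ports: Sequence[int],
                         replicas: int = 1, password: str = "") -> List[str]:
    """Build the argument list for `redis-cli --cluster create`."""
    # Host-major order: all ports of the first host, then the next host.
    nodes = [f"{host}:{port}" for host, port in product(hosts, ports)]
    cmd = ["redis-cli"]
    if password:
        cmd += ["-a", password]
    cmd += ["--cluster", "create", *nodes, "--cluster-replicas", str(replicas)]
    return cmd


cmd = build_create_command(
    hosts=["192.168.56.199", "192.168.56.200", "192.168.56.201"],
    ports=[6379, 6380],
    password="123456",
)
print(" ".join(cmd))
```

The printed line matches the create command used above. The list form can be passed to `subprocess.run` unchanged, although redis-cli still prompts for the `yes` confirmation.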
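Because masterauth was not set, the master-slave replication links have not been established, even though the cluster is already running. Set masterauth on every slave, either from the console with config set or in each Redis configuration file. The best approach is to set it from the console first and then write it into the configuration file.
Confirm that the slave's link status becomes up: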
[root@gbase8c_private etc]# redis-cli -h 192.168.56.199 -p 6380 -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.56.199:6380> info replication
# Replication
role:slave
master_host:192.168.56.201
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
slave_repl_offset:1
master_link_down_since_seconds:1698160154
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:b7bb223ac4f984f16a0f97e9ba6306097981b1da
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
192.168.56.199:6380> config set masterauth 123456
OK
192.168.56.199:6380> info replication
# Replication
role:slave
master_host:192.168.56.201
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:14
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:90b7cf85b6c3b7f5fe7af18615f215fc7907872d
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:14
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:14
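The masterauth step shown above has to be repeated on every slave. The sketch below prints the redis-cli commands for that. It assumes the three slaves are the port 6380 instances of this environment and uses `CONFIG REWRITE` to persist the value, which only works when the server was started with a configuration file. `masterauth_commands` is a hypothetical helper.

```python
from typing import List

# Assumed slave instances in this environment (port 6380 on each host).
SLAVES = [
    ("192.168.56.199", 6380),
    ("192.168.56.200", 6380),
    ("192.168.56.201", 6380),
]
PASSWORD = "123456"


def masterauth_commands(host: str, port: int, password: str) -> List[List[str]]:
    """Return the redis-cli invocations that set and persist masterauth."""
    base = ["redis-cli", "-h", host, "-p", str(port), "-a", password]
    return [
        base + ["config", "set", "masterauth", password],  # takes effect at once
        base + ["config", "rewrite"],  # writes the running config back to the file
    ]


for host, port in SLAVES:
    for cmd in masterauth_commands(host, port, PASSWORD):
        print(" ".join(cmd))
```

Running the same commands against the masters as well is harmless and covers the case where a master later becomes a slave after a failover.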
Check the master status:
[root@gbase8c_private etc]# redis-cli -h 192.168.56.199 -p 6379 -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.56.199:6379> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=192.168.56.200,port=6380,state=online,offset=84,lag=1
master_replid:f67fae9bbb0230507fcc7490626814d8df13fa64
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:84
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:84
Check the cluster status:
cluster info
192.168.56.199:6379> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:628
cluster_stats_messages_pong_sent:684
cluster_stats_messages_sent:1312
cluster_stats_messages_ping_received:679
cluster_stats_messages_pong_received:628
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:1312
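For monitoring, the `cluster info` output above can be reduced to a single health flag. The Python sketch below parses the `key:value` lines and checks a few fields. The choice of fields and the helper names `parse_cluster_info` and `is_healthy` are assumptions made for illustration, and the sample is an excerpt of the output above.

```python
from typing import Dict

SAMPLE = """\
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
"""


def parse_cluster_info(text: str) -> Dict[str, str]:
    """Turn `cluster info` output into a dict of strings."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:  # skip blank or malformed lines
            info[key] = value
    return info


def is_healthy(info: Dict[str, str], total_slots: int = 16384) -> bool:
    """Healthy here means: state ok, every slot served, no failed slots."""
    return (
        info.get("cluster_state") == "ok"
        and int(info.get("cluster_slots_ok", 0)) == total_slots
        and int(info.get("cluster_slots_fail", 0)) == 0
    )


info = parse_cluster_info(SAMPLE)
print(is_healthy(info), info["cluster_known_nodes"], info["cluster_size"])
```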
View the relationships between the cluster nodes:
cluster nodes
192.168.56.199:6379> cluster nodes
5b46c04d7fa5edba767392c51be13e0f0db522e9 192.168.56.201:6379@16379 master - 0 1698160396206 5 connected 10923-16383
596112ae9a792595f61dcb63d19b0911c7b20797 192.168.56.201:6380@16380 slave 0809910b37930a25aa32776bba8272c4493ecc62 0 1698160395179 6 connected
858b0507025e970ffcc2855b85e676fa3a0db34f 192.168.56.200:6380@16380 slave ff147ab31cb3984ec48d5354a184633c0594808e 0 1698160394065 4 connected
ff147ab31cb3984ec48d5354a184633c0594808e 192.168.56.199:6379@16379 myself,master - 0 1698160391000 1 connected 0-5460
282761264f247f7cec59b5d8f0dbe922050865aa 192.168.56.199:6380@16380 slave 5b46c04d7fa5edba767392c51be13e0f0db522e9 0 1698160393049 5 connected
0809910b37930a25aa32776bba8272c4493ecc62 192.168.56.200:6379@16379 master - 0 1698160393000 3 connected 5461-10922
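In the `cluster nodes` output above, the node IDs make it hard to see which slave follows which master. The sketch below resolves the IDs into a master-to-slaves map. It assumes the field order seen in that output (node ID, address, flags, master ID, and so on), and `master_to_slaves` is a hypothetical helper name.

```python
from typing import Dict, List

SAMPLE = """\
5b46c04d7fa5edba767392c51be13e0f0db522e9 192.168.56.201:6379@16379 master - 0 1698160396206 5 connected 10923-16383
596112ae9a792595f61dcb63d19b0911c7b20797 192.168.56.201:6380@16380 slave 0809910b37930a25aa32776bba8272c4493ecc62 0 1698160395179 6 connected
858b0507025e970ffcc2855b85e676fa3a0db34f 192.168.56.200:6380@16380 slave ff147ab31cb3984ec48d5354a184633c0594808e 0 1698160394065 4 connected
ff147ab31cb3984ec48d5354a184633c0594808e 192.168.56.199:6379@16379 myself,master - 0 1698160391000 1 connected 0-5460
282761264f247f7cec59b5d8f0dbe922050865aa 192.168.56.199:6380@16380 slave 5b46c04d7fa5edba767392c51be13e0f0db522e9 0 1698160393049 5 connected
0809910b37930a25aa32776bba8272c4493ecc62 192.168.56.200:6379@16379 master - 0 1698160393000 3 connected 5461-10922
"""


def master_to_slaves(text: str) -> Dict[str, List[str]]:
    """Map each master's ip:port to the ip:port of its slaves."""
    addr_by_id = {}
    masters = []
    slave_links = []  # (slave address, master node ID)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a node line
        node_id, addr, flags, master_id = fields[:4]
        addr = addr.split("@")[0]  # drop the cluster bus port
        addr_by_id[node_id] = addr
        flag_set = flags.split(",")
        if "master" in flag_set:
            masters.append(node_id)
        elif "slave" in flag_set:
            slave_links.append((addr, master_id))
    topology = {addr_by_id[m]: [] for m in masters}
    for addr, master_id in slave_links:
        # Fall back to the raw ID if the master line is missing.
        key = addr_by_id.get(master_id, master_id)
        topology.setdefault(key, []).append(addr)
    return topology


for master, slaves in master_to_slaves(SAMPLE).items():
    print(master, "<-", ", ".join(slaves))
```

For the output above, each master is paired with a slave on a different host, which is what the anti-affinity step during cluster creation aims for.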
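192.168.56.199:6379> set key1 value1 #the slot computed for this key belongs to a specific node
(error) MOVED 9189 192.168.56.200:6379 #the slot is not served by the current node, so the write is rejected
192.168.56.200:6379> set key1 value1 #the write succeeds on the node that owns the slot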
OK
192.168.56.201:6379> set key1 value1
(error) MOVED 9189 192.168.56.200:6379
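The slot number in the MOVED reply can be reproduced by hand. Redis Cluster maps a key to a slot with CRC16 (the XMODEM variant) modulo 16384. The Python sketch below implements that for plain keys and looks up the owner in the slot ranges of this cluster. The helper names are hypothetical, and hash tags (the `{...}` syntax) are deliberately not handled.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Slot of a key that contains no hash tag."""
    return crc16_xmodem(key.encode("utf-8")) % 16384


# Slot ranges of the three masters in this environment.
SLOT_OWNERS = [
    (0, 5460, "192.168.56.199:6379"),
    (5461, 10922, "192.168.56.200:6379"),
    (10923, 16383, "192.168.56.201:6379"),
]


def owner_of(slot: int) -> str:
    """Return the master address whose range contains the slot."""
    for first, last, addr in SLOT_OWNERS:
        if first <= slot <= last:
            return addr
    raise ValueError(f"slot {slot} is out of range")


slot = key_slot("key1")
print(slot, owner_of(slot))  # 9189 192.168.56.200:6379
```

This agrees with the MOVED reply above: key1 hashes to slot 9189, which falls in the 5461-10922 range owned by 192.168.56.200:6379. Starting redis-cli with the -c option makes it follow such redirections automatically.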
Checking the cluster with the management tools:
1) Redis 4
redis-trib.rb check 172.18.200.105:6379
redis-trib.rb info 172.18.200.105:6379
2) Redis 5
redis-cli -a 123456 --cluster check 192.168.56.199:6379
[root@gbase8c_private etc]# redis-cli -a 123456 --cluster check 192.168.56.199:6379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.56.199:6379 (ff147ab3...) -> 0 keys | 5461 slots | 1 slaves.
192.168.56.201:6379 (5b46c04d...) -> 0 keys | 5461 slots | 1 slaves.
192.168.56.200:6379 (0809910b...) -> 1 keys | 5462 slots | 1 slaves.
[OK] 1 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 192.168.56.199:6379)
M: ff147ab31cb3984ec48d5354a184633c0594808e 192.168.56.199:6379
slots:[0-5460] (5461 slots) master
1 additional replica(s)
M: 5b46c04d7fa5edba767392c51be13e0f0db522e9 192.168.56.201:6379
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: 596112ae9a792595f61dcb63d19b0911c7b20797 192.168.56.201:6380
slots: (0 slots) slave
replicates 0809910b37930a25aa32776bba8272c4493ecc62
S: 858b0507025e970ffcc2855b85e676fa3a0db34f 192.168.56.200:6380
slots: (0 slots) slave
replicates ff147ab31cb3984ec48d5354a184633c0594808e
S: 282761264f247f7cec59b5d8f0dbe922050865aa 192.168.56.199:6380
slots: (0 slots) slave
replicates 5b46c04d7fa5edba767392c51be13e0f0db522e9
M: 0809910b37930a25aa32776bba8272c4493ecc62 192.168.56.200:6379
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
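Besides the Redis Cluster that ships with Redis, there are several open-source clustering solutions worth considering.
Codis
Codis is a distributed Redis solution. For the application layer, connecting to a Codis Proxy is not significantly different from connecting to a native Redis server.
Applications can use it just like a standalone Redis. Codis handles request forwarding, online data migration, and similar work underneath. Everything behind the proxy is transparent to the client, which can simply treat the backend as a Redis service with unlimited memory.
codis-proxy behaves like Redis: connecting to codis-proxy is no different from connecting to Redis. codis-proxy is stateless and does not record where data is stored. That information is kept in ZooKeeper: the proxy queries ZooKeeper for the location of a key and forwards the request to a group for processing. A group consists of one master and one or more slaves. Codis has 1024 slots by default (Redis Cluster has 16384), and the contents of different slots are placed in different groups.
GitHub: https://github.com/CodisLabs/codis/blob/release3.2/doc/tutorial_zh.md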
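To make the 1024-slot scheme concrete, the sketch below computes a Codis-style slot for a key. It assumes the formula given in the Codis documentation, `crc32(key) % 1024`, and ignores hash tags. The function name `codis_slot` is hypothetical.

```python
import zlib

CODIS_SLOTS = 1024  # default number of slots in Codis


def codis_slot(key: str) -> int:
    """Slot of a key under the assumed crc32(key) % 1024 rule."""
    return zlib.crc32(key.encode("utf-8")) % CODIS_SLOTS


for key in ("key1", "key2", "user:1000"):
    print(key, "->", codis_slot(key))
```

Which group serves a slot is not derived from the key. It comes from the slot-to-group assignment that Codis keeps in ZooKeeper.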
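twemproxy
twemproxy sits between clients and the backend servers and performs the sharding: it distributes reads and writes across the backend servers on the client's behalf. It also supports memcached, and the sharding algorithm can be configured on the proxy. Its drawbacks are that twemproxy itself becomes a bottleneck and that it does not support data migration.
GitHub: https://github.com/twitter/twemproxy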