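Part 1: Installing Redis
Part 2: The five Redis data structures (String, Hash, List, Set, ZSet)
Part 3: Redis persistence
Part 4: Redis master-slave architecture
Part 5: Redis Sentinel high-availability architecture
This part introduces the Redis Cluster high-availability architecture, how it differs from the Sentinel architecture, and how to set it up.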
Characteristics of the Sentinel architecture:
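The cluster built here uses three masters and three slaves: each of the three master nodes is paired with one slave node, spread over three machines with one master and one slave per machine.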
IP | Ports |
---|---|
192.168.75.200 | 7001, 7002 |
192.168.75.201 | 7001, 7002 |
192.168.75.202 | 7001, 7002 |
The steps below use one machine as an example; the other two are set up the same way:
cd /root/packages/redis-5.0.4
mkdir redis-cluster
cd redis-cluster
mkdir 7001 7002
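# Copy the stock redis.conf into 7001/ and change the settings listed below
cp ../redis.conf 7001/
vi 7001/redis.conf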
daemonize yes
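# Redis port
port 7001
# File the process ID (pid) is written to
pidfile /var/run/redis_7001.pid
# Data directory; each instance must use its own directory, otherwise data will be lost
dir /root/packages/redis-5.0.4/redis-cluster/7001/
# Enable cluster mode
cluster-enabled yes
# Cluster node state file; the 700x in the name should match the port
cluster-config-file nodes-7001.conf
cluster-node-timeout 10000
# bind 127.0.0.1
# (bind takes the IPs of this machine's own network interfaces, several if there are several NICs, and controls which of them clients may connect through. On an internal network it can usually be left commented out.)
# Disable protected mode
protected-mode no
appendonly yes
# Password authentication requires the next two settings
# Password clients must supply
requirepass admin
# Password the cluster nodes use to authenticate to each other; must match requirepass
masterauth admin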
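Copy the edited redis.conf from the 7001 directory into the 7002 directory and change it as in step 2, replacing 7001 with 7002. In vi a global substitution does this in one go:
:%s/source-string/target-string/g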
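The same substitution can be done without opening an editor. The sketch below assumes the directory layout from the steps above (an already edited 7001/redis.conf) and that the string 7001 appears in the file only where the port-specific value is meant. The function name `clone_node_conf` is made up for illustration.

```shell
# Hypothetical helper: derive the config for one port from another by
# replacing every occurrence of the source port with the target port.
clone_node_conf() {
  src_port=$1
  dst_port=$2
  mkdir -p "$dst_port"
  sed "s/$src_port/$dst_port/g" "$src_port/redis.conf" > "$dst_port/redis.conf"
}

# Usage, run from the redis-cluster directory:
#   clone_node_conf 7001 7002
#   grep -n 7002 7002/redis.conf   # review port, pidfile, dir, cluster-config-file
```

Reviewing the result with grep is worthwhile because a blanket replace would also touch any unrelated value that happens to contain the same digits.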
Repeat steps 1, 2, and 3 above on the other two machines.
cd /root/packages/redis-5.0.4
# Start all nodes on this machine
src/redis-server ./redis-cluster/7001/redis.conf
src/redis-server ./redis-cluster/7002/redis.conf
# Check that they started
ps -ef | grep redis
Output on 192.168.75.200:
[root@node1 redis-5.0.4]# ps -ef | grep redis
root 48737 1 0 16:37 ? 00:00:00 src/redis-server 0.0.0.0:7001 [cluster]
root 48742 1 0 16:37 ? 00:00:00 src/redis-server 0.0.0.0:7002 [cluster]
root 48747 4808 0 16:37 pts/0 00:00:00 grep --color=auto redis
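Before Redis 5, clusters were created with the Ruby script redis-trib.rb. Since Redis 5 the same functionality is built into redis-cli through the --cluster option: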
cd /root/packages/redis-5.0.4
src/redis-cli -a admin --cluster create --cluster-replicas 1 192.168.75.200:7001 192.168.75.200:7002 192.168.75.201:7001 192.168.75.201:7002 192.168.75.202:7001 192.168.75.202:7002
Output:
[root@node1 redis-5.0.4]# src/redis-cli -a admin --cluster create --cluster-replicas 1 192.168.75.200:7001 192.168.75.200:7002 192.168.75.201:7001 192.168.75.201:7002 192.168.75.202:7001 192.168.75.202:7002
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 192.168.75.201:7002 to 192.168.75.200:7001
Adding replica 192.168.75.202:7002 to 192.168.75.201:7001
Adding replica 192.168.75.200:7002 to 192.168.75.202:7001
M: abe9d74cda6ad37836bcfe55ad510fb6245cdd66 192.168.75.200:7001
slots:[0-5460] (5461 slots) master
S: 60164c7d06ff4f7f5849eef54e8641642ed38a16 192.168.75.200:7002
replicates 7d5058f3a940809e9446cbe5ac992d2ab634d8f1
M: 35eeab345e68a07bbfea25e7f24fa37d893ea7ed 192.168.75.201:7001
slots:[5461-10922] (5462 slots) master
S: e3ebb97acfb574e040d22022233f6c6c8bcfaa3a 192.168.75.201:7002
replicates abe9d74cda6ad37836bcfe55ad510fb6245cdd66
M: 7d5058f3a940809e9446cbe5ac992d2ab634d8f1 192.168.75.202:7001
slots:[10923-16383] (5461 slots) master
S: 85ebe7f5fb85329b20bf37c91ede9fa16e52ccd2 192.168.75.202:7002
replicates 35eeab345e68a07bbfea25e7f24fa37d893ea7ed
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
....
>>> Performing Cluster Check (using node 192.168.75.200:7001)
M: abe9d74cda6ad37836bcfe55ad510fb6245cdd66 192.168.75.200:7001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
M: 7d5058f3a940809e9446cbe5ac992d2ab634d8f1 192.168.75.202:7001
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: 60164c7d06ff4f7f5849eef54e8641642ed38a16 192.168.75.200:7002
slots: (0 slots) slave
replicates 7d5058f3a940809e9446cbe5ac992d2ab634d8f1
S: e3ebb97acfb574e040d22022233f6c6c8bcfaa3a 192.168.75.201:7002
slots: (0 slots) slave
replicates abe9d74cda6ad37836bcfe55ad510fb6245cdd66
M: 35eeab345e68a07bbfea25e7f24fa37d893ea7ed 192.168.75.201:7001
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 85ebe7f5fb85329b20bf37c91ede9fa16e52ccd2 192.168.75.202:7002
slots: (0 slots) slave
replicates 35eeab345e68a07bbfea25e7f24fa37d893ea7ed
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
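Before running the command above, make sure the Redis instances on the three machines can reach one another. The simplest way is to turn off the firewall on every machine. If the firewall must stay on, open both the Redis service port and the cluster gossip port, which by default is the service port plus 10000: for a Redis port of 6379, the gossip port is 16379.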
# Turn off the firewall
systemctl stop firewalld # stop it for the current boot
systemctl disable firewalld # do not start it at boot
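If the firewall has to stay on, the alternative described above is to open each Redis port together with its gossip port (port + 10000). A minimal sketch, assuming firewalld and the two ports used in this setup; `firewall_rules` is a hypothetical helper that only prints the commands so they can be reviewed before being run as root.

```shell
# Hypothetical helper: print the firewall-cmd calls that open each Redis
# service port and its cluster gossip port, which is the port plus 10000.
firewall_rules() {
  for port in "$@"; do
    bus_port=$(( port + 10000 ))
    echo "firewall-cmd --permanent --add-port=${port}/tcp"
    echo "firewall-cmd --permanent --add-port=${bus_port}/tcp"
  done
  # Permanent rules only take effect after a reload.
  echo "firewall-cmd --reload"
}

# Review the commands first, then pipe them to a root shell on every machine:
#   firewall_rules 7001 7002
#   firewall_rules 7001 7002 | sh
```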
# -a: server password, -c: cluster mode, -h/-p: IP address and port of the node
./redis-cli -a <password> -c -h <ip> -p <port>
For example:
cd /root/packages/redis-5.0.4
src/redis-cli -a admin -c -h 192.168.75.200 -p 7001
[root@node1 redis-5.0.4]# src/redis-cli -a admin -c -h 192.168.75.200 -p 7001
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.75.200:7001>
cluster info (show cluster information)
cluster nodes (list the nodes)
192.168.75.200:7001> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:114
cluster_stats_messages_pong_sent:119
cluster_stats_messages_sent:233
cluster_stats_messages_ping_received:114
cluster_stats_messages_pong_received:114
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:233
192.168.75.200:7001> cluster nodes
7d5058f3a940809e9446cbe5ac992d2ab634d8f1 192.168.75.202:7001@17001 master - 0 1657010794598 5 connected 10923-16383
60164c7d06ff4f7f5849eef54e8641642ed38a16 192.168.75.200:7002@17002 slave 7d5058f3a940809e9446cbe5ac992d2ab634d8f1 0 1657010795207 5 connected
abe9d74cda6ad37836bcfe55ad510fb6245cdd66 192.168.75.200:7001@17001 myself,master - 0 1657010794000 1 connected 0-5460
e3ebb97acfb574e040d22022233f6c6c8bcfaa3a 192.168.75.201:7002@17002 slave abe9d74cda6ad37836bcfe55ad510fb6245cdd66 0 1657010794193 4 connected
35eeab345e68a07bbfea25e7f24fa37d893ea7ed 192.168.75.201:7001@17001 master - 0 1657010794000 3 connected 5461-10922
85ebe7f5fb85329b20bf37c91ede9fa16e52ccd2 192.168.75.202:7002@17002 slave 35eeab345e68a07bbfea25e7f24fa37d893ea7ed 0 1657010793686 6 connected
The output shows that every slave sits on a different machine from its master, so the master-slave pairs are staggered across hosts.
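This placement can also be checked mechanically. The sketch below is an illustration only: `check_replica_placement` is a made-up name, and it assumes the `cluster nodes` line format shown above (node ID, `ip:port@busport`, flags, master ID, and so on). It prints each slave with its master and marks any pair that shares an IP.

```shell
# Hypothetical helper: read `cluster nodes` output on stdin and report,
# for every slave, which master it follows and whether both share a host.
check_replica_placement() {
  awk '
    {
      addr = $2
      sub(/@.*/, "", addr)          # drop the @busport suffix
      split(addr, parts, ":")
      host[$1] = parts[1]
      address[$1] = addr
      flags[$1] = $3
      master_of[$1] = $4
    }
    END {
      for (id in flags) {
        if (flags[id] ~ /slave/) {
          m = master_of[id]
          status = (host[id] == host[m]) ? "SAME-HOST" : "ok"
          print address[id] " -> " address[m] " " status
        }
      }
    }'
}

# Usage:
#   src/redis-cli -a admin -h 192.168.75.200 -p 7001 cluster nodes | check_replica_placement
```

The order of the printed lines is not fixed, since awk does not guarantee an iteration order for arrays.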
# To stop the nodes on this machine, send shutdown to each instance
cd /root/packages/redis-5.0.4
src/redis-cli -a admin -c -h 192.168.75.200 -p 7001 shutdown
src/redis-cli -a admin -c -h 192.168.75.200 -p 7002 shutdown
[root@node1 redis-5.0.4]# src/redis-cli -a admin -c -h 192.168.75.200 -p 7002
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.75.200:7002> cluster nodes
60164c7d06ff4f7f5849eef54e8641642ed38a16 192.168.75.200:7002@17002 myself,slave 7d5058f3a940809e9446cbe5ac992d2ab634d8f1 0 1657025544000 2 connected
85ebe7f5fb85329b20bf37c91ede9fa16e52ccd2 192.168.75.202:7002@17002 slave 35eeab345e68a07bbfea25e7f24fa37d893ea7ed 0 1657025544000 6 connected
e3ebb97acfb574e040d22022233f6c6c8bcfaa3a 192.168.75.201:7002@17002 slave abe9d74cda6ad37836bcfe55ad510fb6245cdd66 0 1657025545721 4 connected
7d5058f3a940809e9446cbe5ac992d2ab634d8f1 192.168.75.202:7001@17001 master - 0 1657025544706 5 connected 10923-16383
abe9d74cda6ad37836bcfe55ad510fb6245cdd66 192.168.75.200:7001@17001 master - 0 1657025544000 1 connected 0-5460
35eeab345e68a07bbfea25e7f24fa37d893ea7ed 192.168.75.201:7001@17001 master - 0 1657025544000 3 connected 5461-10922
After 192.168.75.202:7001 goes offline:
60164c7d06ff4f7f5849eef54e8641642ed38a16 192.168.75.200:7002@17002 myself,master - 0 1657025580000 7 connected 10923-16383
85ebe7f5fb85329b20bf37c91ede9fa16e52ccd2 192.168.75.202:7002@17002 slave 35eeab345e68a07bbfea25e7f24fa37d893ea7ed 0 1657025579000 6 connected
e3ebb97acfb574e040d22022233f6c6c8bcfaa3a 192.168.75.201:7002@17002 slave abe9d74cda6ad37836bcfe55ad510fb6245cdd66 0 1657025580000 4 connected
7d5058f3a940809e9446cbe5ac992d2ab634d8f1 192.168.75.202:7001@17001 master,fail - 1657025570527 1657025569512 5 disconnected
abe9d74cda6ad37836bcfe55ad510fb6245cdd66 192.168.75.200:7001@17001 master - 0 1657025580582 1 connected 0-5460
35eeab345e68a07bbfea25e7f24fa37d893ea7ed 192.168.75.201:7001@17001 master - 0 1657025580177 3 connected 5461-10922
192.168.75.200:7002 has become the new master. After 192.168.75.202:7001 is brought back, it rejoins as a slave:
192.168.75.200:7002> cluster nodes
60164c7d06ff4f7f5849eef54e8641642ed38a16 192.168.75.200:7002@17002 myself,master - 0 1657025854000 7 connected 10923-16383
85ebe7f5fb85329b20bf37c91ede9fa16e52ccd2 192.168.75.202:7002@17002 slave 35eeab345e68a07bbfea25e7f24fa37d893ea7ed 0 1657025855000 6 connected
e3ebb97acfb574e040d22022233f6c6c8bcfaa3a 192.168.75.201:7002@17002 slave abe9d74cda6ad37836bcfe55ad510fb6245cdd66 0 1657025854000 4 connected
7d5058f3a940809e9446cbe5ac992d2ab634d8f1 192.168.75.202:7001@17001 slave 60164c7d06ff4f7f5849eef54e8641642ed38a16 0 1657025855742 7 connected
abe9d74cda6ad37836bcfe55ad510fb6245cdd66 192.168.75.200:7001@17001 master - 0 1657025855540 1 connected 0-5460
35eeab345e68a07bbfea25e7f24fa37d893ea7ed 192.168.75.201:7001@17001 master - 0 1657025855000 3 connected 5461-10922
By default, the key is hashed with the CRC16 algorithm to obtain an integer, and that integer is taken modulo 16384 to determine the slot.
HASH_SLOT = CRC16(key) mod 16384
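The formula can be reproduced in a few lines of shell. This is a sketch, not Redis's own code: it assumes the CRC16 variant is CCITT/XMODEM (polynomial 0x1021, initial value 0), it does not handle hash tags (the `{...}` part of a key), and the function names `crc16` and `hash_slot` are chosen here for illustration.

```shell
# Bitwise CRC16, XMODEM variant: polynomial 0x1021, initial value 0.
crc16() {
  crc=0
  # od prints the key's bytes as unsigned decimal numbers.
  for byte in $(printf '%s' "$1" | od -An -v -tu1); do
    crc=$(( crc ^ (byte << 8) ))
    j=0
    while [ "$j" -lt 8 ]; do
      if [ $(( crc & 0x8000 )) -ne 0 ]; then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
      j=$(( j + 1 ))
    done
  done
  echo "$crc"
}

# HASH_SLOT = CRC16(key) mod 16384
hash_slot() {
  echo $(( $(crc16 "$1") % 16384 ))
}

# Usage:
#   hash_slot name    # prints 5798
```

The value for the key `name` matches the slot that redis-cli reports for the same key in the redirect example in this article.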
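When a client sends a command to the wrong node, that node sees that the slot of the command's key is not one it manages. It replies with a special redirection (a MOVED error) carrying the address of the node that owns the slot, telling the client to go there for the data. On receiving it, the client not only retries on the correct node but also corrects its locally cached slot map, which it then uses for all subsequent keys.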
[root@node1 redis-5.0.4]# src/redis-cli -a admin -c -h 192.168.75.200 -p 7001
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.75.200:7001> set name zhangsang
-> Redirected to slot [5798] located at 192.168.75.201:7001
OK
192.168.75.201:7001>
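With `-c`, redis-cli follows the redirect itself. Without it, the node's reply is an error of the form `MOVED <slot> <host>:<port>`, and the client has to parse it. A rough sketch of that parsing step, with `parse_moved` as a hypothetical name; a real cluster client would also update its cached slot map at this point.

```shell
# Hypothetical helper: split a MOVED reply into slot, host and port.
# Returns non-zero if the text is not a MOVED redirection.
parse_moved() {
  # Re-split the single argument into whitespace-separated fields.
  set -- $1
  [ "$1" = "MOVED" ] && [ $# -eq 3 ] || return 1
  slot=$2
  host=${3%:*}    # everything before the last colon
  port=${3##*:}   # everything after the last colon
  echo "slot=$slot host=$host port=$port"
}

# Usage:
#   parse_moved 'MOVED 5798 192.168.75.201:7001'
#   # prints: slot=5798 host=192.168.75.201 port=7001
```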
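Redis Cluster nodes communicate with one another using the gossip protocol. There are two ways to maintain cluster metadata (node information, master/slave roles, node count, data shared among nodes, and so on): centralized and gossip.
The gossip protocol comprises several message types, including ping, pong, meet, and fail.
Pros and cons: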
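Each Redis node has a dedicated port for gossip communication between nodes: the service port plus 10000. For a service port of 7001, gossip traffic uses port 17001. At regular intervals every node sends ping messages to a few other nodes, and each node that receives a ping answers with a pong.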
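Network jitter is very common: some connections suddenly become unreachable and then quickly recover.
For this case Redis Cluster provides the cluster-node-timeout option (in milliseconds; 10000 in the configuration above). A node is considered failed, and a master-slave failover triggered, only after it has been unreachable for the full timeout. Without this option, network jitter would cause frequent failovers and repeated data resynchronization.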