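A Detailed Guide to redis-trib.rb, the Redis Cluster Management Tool
This guide uses Redis 4.0.9.
Installing Redis is not covered here, since plenty of tutorials exist online. The focus is on building a cluster.
Six server instances run on a single Ubuntu machine, with one slave per master.
2. Put the following files into each of the folders 7000-7005
//Client executable
redis-cli
//Configuration file
redis.conf
//Server executable
redis-server
3. Edit redis.conf in each folder
//Port; set it to 7000-7005 according to the folder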
port 7000
//Enable cluster mode
cluster-enabled yes
//The node persists the cluster state it knows about to this file
cluster-config-file nodes.conf
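The three settings above differ between nodes only by port, so they can be generated instead of edited by hand. The following is a minimal sketch, not part of Redis: node_conf is a hypothetical helper, and the per-port file name nodes-<port>.conf is an assumption borrowed from the names used later in this guide. It only prints the fragment; merging it into each folder's redis.conf is left to the reader.

```shell
#!/bin/sh
# node_conf is a hypothetical helper, not part of Redis.
# It prints the cluster-related settings for the node on the given port.
node_conf() {
    port=$1
    printf 'port %s\n' "$port"
    printf 'cluster-enabled yes\n'
    # Assumption: one state file name per port, as in nodes-7006.conf later on.
    printf 'cluster-config-file nodes-%s.conf\n' "$port"
}

# Example: show the fragment for the first node.
node_conf 7000
```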
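Configuration is complete. The two command lists below are the contents of start-all.sh (start every node) and stop-all.sh (shut every node down).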
cd 7000
./redis-server redis.conf
cd ..
cd 7001
./redis-server redis.conf
cd ..
cd 7002
./redis-server redis.conf
cd ..
cd 7003
./redis-server redis.conf
cd ..
cd 7004
./redis-server redis.conf
cd ..
cd 7005
./redis-server redis.conf
cd ..
cd 7000
./redis-cli -p 7000 shutdown
./redis-cli -p 7001 shutdown
./redis-cli -p 7002 shutdown
./redis-cli -p 7003 shutdown
./redis-cli -p 7004 shutdown
./redis-cli -p 7005 shutdown
Note: both script files must be placed in the directory that contains the folders 7000-7005.
./start-all.sh
./stop-all.sh
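The repeated cd and shutdown lines can also be written as loops over a port list, which makes it easier to add nodes later. This is a sketch under assumptions: PORTS, start_cmds and stop_cmds are hypothetical names, and the functions only print the commands so they can be reviewed first. As with the original scripts, it presumes that each redis-server call returns to the shell (for example because daemonize yes is set in redis.conf), which this guide does not state.

```shell
#!/bin/sh
# Ports of the six nodes; extend this list when nodes are added.
PORTS="7000 7001 7002 7003 7004 7005"

# Print the command that starts each node from inside its own folder.
start_cmds() {
    for port in $PORTS; do
        echo "(cd $port && ./redis-server redis.conf)"
    done
}

# Print the command that shuts each node down, using the client in 7000.
stop_cmds() {
    for port in $PORTS; do
        echo "./7000/redis-cli -p $port shutdown"
    done
}

# Show what would be run to start the cluster.
start_cmds
```

Piping the output to sh (start_cmds | sh) runs the commands from the directory that contains the node folders.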
4. Log in with the client and check the cluster status
//Log in
redis-cli -p 7000
//Check the status
info cluster
1. Install Ruby
//Install Ruby
sudo apt-get install ruby
//Install the Ruby client library for Redis
gem install redis
If the library is not installed, later steps fail with this error:
/usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- redis (LoadError)
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from ./redis-trib.rb:25:in `<main>'
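2. Initialize the cluster
cluster_enabled:1 in the info cluster output means cluster mode is enabled on that node. At this point every node is still independent, so they have to be joined into one cluster.
The src folder of Redis ships a helper tool, redis-trib.rb, written in Ruby.
Copy it into the directory that contains 7000-7005 and create an initialization script, init-cluster.sh.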
init-cluster.sh
./redis-trib.rb create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005
--replicas 1: every master gets one slave, so the cluster ends up with three masters and three slaves.
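The relation between the node count and --replicas is simple arithmetic: each master together with its slaves forms a group of replicas + 1 nodes. A small sketch, with masters_for as a hypothetical helper name:

```shell
#!/bin/sh
# masters_for NODES REPLICAS prints how many masters the cluster will have.
masters_for() {
    nodes=$1
    replicas=$2
    # Each master plus its slaves occupies (replicas + 1) nodes.
    echo $(( nodes / (replicas + 1) ))
}

# Six nodes with one slave per master.
masters_for 6 1
```

For the six nodes used here this gives three masters, which matches the "Using 3 masters" line in the tool's output.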
Run the script
./init-cluster.sh
This may fail with the error Node 127.0.0.1:7001 is not empty. Either the node already knows other nodes (check with CLUSTER NODES) or contains some key in database 0.
Fix:
1. Delete dump.rdb and nodes-700x.conf from the node's directory.
2. Log in with the client and run flushdb.
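The file cleanup in step 1 can be wrapped in a small function so it is applied the same way to every node folder. This is a sketch: reset_node is a hypothetical name, and the nodes*.conf pattern is an assumption chosen to cover both nodes.conf and nodes-700x.conf.

```shell
#!/bin/sh
# reset_node DIR removes the persisted data and cluster state of one node.
# Shut the node down before calling this, so the files are not in use.
reset_node() {
    dir=$1
    rm -f "$dir/dump.rdb"
    # Assumption: the cluster state file is named nodes.conf or nodes-<port>.conf.
    rm -f "$dir"/nodes*.conf
}
```

A call such as reset_node 7001 clears one node; redis.conf and the executables in the folder are left untouched.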
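Running the script again prints the warning Trying to optimize slaves allocation for anti-affinity [WARNING] Some slaves are in the same host as their master
This is harmless here. It only advises against running a master and its slave on the same machine, because losing that machine would take down both.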
>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
127.0.0.1:7000
127.0.0.1:7001
127.0.0.1:7002
Adding replica 127.0.0.1:7004 to 127.0.0.1:7000
Adding replica 127.0.0.1:7005 to 127.0.0.1:7001
Adding replica 127.0.0.1:7003 to 127.0.0.1:7002
>>> Trying to optimize slaves allocation for anti-affinity
[WARNING] Some slaves are in the same host as their master
M: 77bde185502595509d9aa000cb67b0aa55a9439f 127.0.0.1:7000
slots:0-5460,5474,9494,9499,9842,11037,12291,12933,13310,14635 (5470 slots) master
M: daea47a8df47877859078837ca88d8b39f62c41b 127.0.0.1:7001
slots:5282,5461-10922,11037,12291,12933,13310,14635 (5468 slots) master
M: a4d6171872e7393180441031bdfd5b5f00809a45 127.0.0.1:7002
slots:5282,5474,9494,9499,9842,10923-16383 (5466 slots) master
S: 7fbccf4b2c6b614784c2943eb4775834fe9ebcfd 127.0.0.1:7003
replicates a4d6171872e7393180441031bdfd5b5f00809a45
S: 174f0b1128f479578db4c51c448cb8f6c31d0a84 127.0.0.1:7004
replicates 77bde185502595509d9aa000cb67b0aa55a9439f
S: b26dd23a6c2f21d3cbc4df1e846e373b4749b9cc 127.0.0.1:7005
replicates daea47a8df47877859078837ca88d8b39f62c41b
Can I set the above configuration? (type 'yes' to accept):
Type yes. The following output indicates that the cluster was created successfully.
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join...
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: 052d58c37eb64a73b3f4ca881bba2dc20706094e 127.0.0.1:7000
slots:0-5460 (5461 slots) master
1 additional replica(s)
S: 5be70c15ffbda0d2d45a8616b75655e330c84a92 127.0.0.1:7005
slots: (0 slots) slave
replicates 389e4349c62d4fa0548d31c087bc44306a4726e0
M: 435aab857cd933296b501876a950408d25832f9a 127.0.0.1:7001
slots:5461-10922 (5462 slots) master
1 additional replica(s)
S: 7f07f770ce5bd1b78cb1f5033be846c8c85d4b5b 127.0.0.1:7003
slots: (0 slots) slave
replicates 052d58c37eb64a73b3f4ca881bba2dc20706094e
M: 389e4349c62d4fa0548d31c087bc44306a4726e0 127.0.0.1:7002
slots:10923-16383 (5461 slots) master
1 additional replica(s)
S: 29846d40aaf965794e50a62cbfd5598a0d9e5659 127.0.0.1:7004
slots: (0 slots) slave
replicates 435aab857cd933296b501876a950408d25832f9a
[OK] All nodes agree about slots configuration.
The slots assigned to each master are visible in the output:
127.0.0.1:7000 --> slots:0-5460
127.0.0.1:7001 --> slots:5461-10922
127.0.0.1:7002 --> slots:10923-16383
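The three ranges follow from dividing the 16384 slots evenly among the masters. The sketch below is one way to arrive at the same numbers, assuming a fractional share per master that is rounded to the nearest slot, with the last master taking whatever remains. slot_ranges is a hypothetical helper, and the rounding rule is an assumption that reproduces the output above, not a statement about the tool's code.

```shell
#!/bin/sh
# slot_ranges MASTERS prints one "first-last" slot range per master.
slot_ranges() {
    awk -v n="$1" 'BEGIN {
        total = 16384
        per = total / n                          # fractional share per master
        first = 0
        cursor = 0
        for (i = 0; i < n; i++) {
            last = int(cursor + per - 1 + 0.5)   # round to the nearest slot
            if (i == n - 1) last = total - 1     # last master takes the rest
            printf "%d-%d\n", first, last
            first = last + 1
            cursor += per
        }
    }'
}

slot_ranges 3
```

For three masters this prints 0-5460, 5461-10922 and 10923-16383, the same ranges as listed above.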
If the following appears instead, creation failed:
/var/lib/gems/2.3.0/gems/redis-4.0.2/lib/redis/client.rb:119:in `call': ERR Slot 5282 is already busy (Redis::CommandError)
from /var/lib/gems/2.3.0/gems/redis-4.0.2/lib/redis.rb:2854:in `block in method_missing'
from /var/lib/gems/2.3.0/gems/redis-4.0.2/lib/redis.rb:45:in `block in synchronize'
from /usr/lib/ruby/2.3.0/monitor.rb:214:in `mon_synchronize'
from /var/lib/gems/2.3.0/gems/redis-4.0.2/lib/redis.rb:45:in `synchronize'
from /var/lib/gems/2.3.0/gems/redis-4.0.2/lib/redis.rb:2853:in `method_missing'
from ./redis-trib.rb:212:in `flush_node_config'
from ./redis-trib.rb:906:in `block in flush_nodes_config'
from ./redis-trib.rb:905:in `each'
from ./redis-trib.rb:905:in `flush_nodes_config'
from ./redis-trib.rb:1426:in `create_cluster_cmd'
from ./redis-trib.rb:1830:in `<main>'
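If creation fails, delete dump.rdb and nodes-700x.conf in every folder, shut down all nodes, and repeat the steps above.
The cluster is now fully created.
Connect to any node with the client. -c logs in in cluster mode and -p specifies the port.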
./7000/redis-cli -c -p 7000
//No arguments: standalone mode, default host 127.0.0.1, default port 6379
./7000/redis-cli
//Cluster mode: -h specifies the host (127.1.2.3 here), -p specifies the port (7000 here)
./7000/redis-cli -c -h 127.1.2.3 -p 7000
After logging in, query the cluster status and the node information
cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:74
cluster_stats_messages_pong_sent:73
cluster_stats_messages_sent:147
cluster_stats_messages_ping_received:68
cluster_stats_messages_pong_received:74
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:147
cluster_state:ok means the cluster is working normally
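The same check can be scripted by reading the cluster info output. A minimal sketch, with cluster_ok as a hypothetical helper name; it treats the cluster as ready only when the state is ok and all 16384 slots are assigned, and it strips the carriage returns that the server puts at the end of each line.

```shell
#!/bin/sh
# cluster_ok reads `cluster info` output on stdin and succeeds only when
# cluster_state is ok and cluster_slots_assigned is 16384.
cluster_ok() {
    tr -d '\r' | awk -F: '
        $1 == "cluster_state"          { state = $2 }
        $1 == "cluster_slots_assigned" { assigned = $2 }
        END { exit !(state == "ok" && assigned == 16384) }'
}
```

Typical use would be ./7000/redis-cli -p 7000 cluster info | cluster_ok && echo ready.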
cluster nodes
5be70c15ffbda0d2d45a8616b75655e330c84a92 127.0.0.1:7005@17005 slave 389e4349c62d4fa0548d31c087bc44306a4726e0 0 1535617661593 6 connected
435aab857cd933296b501876a950408d25832f9a 127.0.0.1:7001@17001 master - 0 1535617662594 2 connected 5461-10922
7f07f770ce5bd1b78cb1f5033be846c8c85d4b5b 127.0.0.1:7003@17003 slave 052d58c37eb64a73b3f4ca881bba2dc20706094e 0 1535617664599 4 connected
389e4349c62d4fa0548d31c087bc44306a4726e0 127.0.0.1:7002@17002 master - 0 1535617663000 3 connected 10923-16383
29846d40aaf965794e50a62cbfd5598a0d9e5659 127.0.0.1:7004@17004 slave 435aab857cd933296b501876a950408d25832f9a 0 1535617663598 5 connected
052d58c37eb64a73b3f4ca881bba2dc20706094e 127.0.0.1:7000@17000 myself,master - 0 1535617662000 1 connected 0-5460
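In the cluster nodes output each line holds the node ID, the address, the flags, the master ID, two timestamps, the config epoch and the link state, followed by the slots for masters. The sketch below pulls out just the masters and their slots; list_masters is a hypothetical helper name.

```shell
#!/bin/sh
# list_masters reads `cluster nodes` output on stdin and prints
# "address slots..." for every line whose flags contain "master".
list_masters() {
    awk '$3 ~ /master/ {
        split($2, addr, "@")            # drop the @cluster-bus-port suffix
        line = addr[1]
        # Slot ranges start at the ninth field, after the link state.
        for (i = 9; i <= NF; i++) line = line " " $i
        print line
    }'
}
```

Typical use would be ./7000/redis-cli -p 7000 cluster nodes | list_masters.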
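Copy two of the folders above and name them 7006 (to be configured as a master) and 7007 (to be configured as a slave). Delete dump.rdb and nodes-700x.conf from both.
Edit the configuration files and change the port to 7006 and 7007 respectively.
Set cluster-config-file nodes-7006.conf and cluster-config-file nodes-7007.conf respectively.
The add-node command adds a new node to the cluster, either as a master or as the slave of an existing master.
add-node new_host:new_port existing_host:existing_port
--slave
--master-id
add-node takes two optional parameters:
--slave: the new node joins the cluster as a slave.
--master-id: only takes effect together with --slave and specifies the master of the new node. If it is omitted, redis-trib.rb chooses a master itself, preferring the one with the fewest slaves.
new_host: IP of the node being added
new_port: port of the node being added
existing_host: IP of a node already in the cluster
existing_port: port of a node already in the cluster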
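The two forms of add-node can be assembled from the same arguments. The sketch below only prints the command line so it can be checked before it is run; add_node_cmd is a hypothetical helper name, and the options come first because that is the order used in the command format shown in this guide.

```shell
#!/bin/sh
# add_node_cmd NEW EXISTING [MASTER_ID] prints the redis-trib.rb command.
# With a third argument the node is added as a slave of that master.
add_node_cmd() {
    new=$1
    existing=$2
    master_id=${3:-}
    if [ -n "$master_id" ]; then
        echo "./redis-trib.rb add-node --slave --master-id $master_id $new $existing"
    else
        echo "./redis-trib.rb add-node $new $existing"
    fi
}

# Example: add 7006 as a master through the existing node 7000.
add_node_cmd 127.0.0.1:7006 127.0.0.1:7000
```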
Adding the node
Start the node first
lgj@lgj-Lenovo-G470:~/redis-4.0.9/7006$ redis-server redis.conf
Then add the node to the cluster
lgj@lgj-Lenovo-G470:~/redis-4.0.9$ ./redis-trib.rb add-node 127.0.0.1:7006 127.0.0.1:7000
Output; the line [OK] New node added correctly indicates success
>>> Adding node 127.0.0.1:7006 to cluster 127.0.0.1:7000
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: 052d58c37eb64a73b3f4ca881bba2dc20706094e 127.0.0.1:7000
slots:0-5460 (5461 slots) master
1 additional replica(s)
S: 5be70c15ffbda0d2d45a8616b75655e330c84a92 127.0.0.1:7005
slots: (0 slots) slave
replicates 389e4349c62d4fa0548d31c087bc44306a4726e0
M: 435aab857cd933296b501876a950408d25832f9a 127.0.0.1:7001
slots:5461-10922 (5462 slots) master
1 additional replica(s)
S: 7f07f770ce5bd1b78cb1f5033be846c8c85d4b5b 127.0.0.1:7003
slots: (0 slots) slave
replicates 052d58c37eb64a73b3f4ca881bba2dc20706094e
M: 389e4349c62d4fa0548d31c087bc44306a4726e0 127.0.0.1:7002
slots:10923-16383 (5461 slots) master
1 additional replica(s)
S: 29846d40aaf965794e50a62cbfd5598a0d9e5659 127.0.0.1:7004
slots: (0 slots) slave
replicates 435aab857cd933296b501876a950408d25832f9a
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:7006 to make it join the cluster.
[OK] New node added correctly.
Check the node information: 7006 has been added to the cluster as a master
lgj@lgj-Lenovo-G470:~/redis-4.0.9$ ./7007/redis-cli -p 7006
127.0.0.1:7006> cluster nodes
be3e6b774fc7f3c394d444fef43d24437525953d 127.0.0.1:7006@17006 myself,master - 0 1535622141000 0 connected
5be70c15ffbda0d2d45a8616b75655e330c84a92 127.0.0.1:7005@17005 slave 389e4349c62d4fa0548d31c087bc44306a4726e0 0 1535622143178 3 connected
435aab857cd933296b501876a950408d25832f9a 127.0.0.1:7001@17001 master - 0 1535622143000 2 connected 5461-10922
29846d40aaf965794e50a62cbfd5598a0d9e5659 127.0.0.1:7004@17004 slave 435aab857cd933296b501876a950408d25832f9a 0 1535622144181 2 connected
052d58c37eb64a73b3f4ca881bba2dc20706094e 127.0.0.1:7000@17000 master - 0 1535622142000 1 connected 0-5460
389e4349c62d4fa0548d31c087bc44306a4726e0 127.0.0.1:7002@17002 master - 0 1535622140170 3 connected 10923-16383
7f07f770ce5bd1b78cb1f5033be846c8c85d4b5b 127.0.0.1:7003@17003 slave 052d58c37eb64a73b3f4ca881bba2dc20706094e 0 1535622141174 1 connected
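After a master has been added, hash slots must be assigned to it before it can store data.
A Redis cluster has 16384 slots, and every master is assigned some of them. cluster nodes shows which slots each node holds.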
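Slot ranges in the cluster nodes output are inclusive, so a range first-last covers last - first + 1 slots. A small sketch for adding up the slots a node holds, with count_slots as a hypothetical helper name:

```shell
#!/bin/sh
# count_slots RANGE... sums the slots in "first-last" ranges.
# A single slot number without a dash counts as one slot.
count_slots() {
    total=0
    for range in "$@"; do
        first=${range%-*}
        last=${range#*-}
        total=$(( total + last - first + 1 ))
    done
    echo "$total"
}

# Slots held by 127.0.0.1:7001 after cluster creation.
count_slots 5461-10922
```

This prints 5462, the figure reported for 7001 in the cluster check output. The three initial ranges together add up to 16384.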
reshard
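The reshard command moves slots online from the nodes that currently own them to another node. It is the way to scale the cluster out or in without downtime.
reshard takes many parameters, explained one by one below: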
reshard host:port
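--from
--to
--slots
--yes
--timeout
--pipeline
host:port: required. This node is the entry point from which information about the whole cluster is fetched.
--from: the source nodes to move slots from, given as node IDs separated by commas. --from all uses all the nodes in the cluster as sources. If omitted, the user is prompted during the reshard.
--to: the node ID of the destination node. Only one destination is allowed. If omitted, the user is prompted during the reshard.
--slots: the number of slots to move. If omitted, the user is prompted during the reshard.
--yes: run the reshard plan as soon as it has been printed, without asking the user to type yes to confirm.
--timeout: the timeout of the migrate command.
--pipeline: the number of keys fetched by each cluster getkeysinslot call. The default is 10.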
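When --from, --to, --slots and --yes are all given, the reshard can run without interactive answers. The sketch below prints such a command line; reshard_cmd is a hypothetical helper name, and defaulting the source to all is a choice made for this example only.

```shell
#!/bin/sh
# reshard_cmd ENTRY TARGET_ID SLOTS [SOURCES] prints a reshard command
# with every answer supplied as an option. SOURCES defaults to "all".
reshard_cmd() {
    entry=$1
    target=$2
    slots=$3
    sources=${4:-all}
    echo "./redis-trib.rb reshard --from $sources --to $target --slots $slots --yes $entry"
}

# Example: move 800 slots to the node with the given ID, via 127.0.0.1:7001.
reshard_cmd 127.0.0.1:7001 be3e6b774fc7f3c394d444fef43d24437525953d 800
```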
//Start the reshard, fetching the cluster information from 127.0.0.1:7001
lgj@lgj-Lenovo-G470:~/redis-4.0.9$ ./redis-trib.rb reshard 127.0.0.1:7001
>>> Performing Cluster Check (using node 127.0.0.1:7001)
M: 435aab857cd933296b501876a950408d25832f9a 127.0.0.1:7001
slots:5461-10922 (5462 slots) master
1 additional replica(s)
S: 5be70c15ffbda0d2d45a8616b75655e330c84a92 127.0.0.1:7005
slots: (0 slots) slave
replicates 389e4349c62d4fa0548d31c087bc44306a4726e0
M: be3e6b774fc7f3c394d444fef43d24437525953d 127.0.0.1:7006
slots: (0 slots) master
0 additional replica(s)
M: 052d58c37eb64a73b3f4ca881bba2dc20706094e 127.0.0.1:7000
slots:0-5460 (5461 slots) master
1 additional replica(s)
S: 7f07f770ce5bd1b78cb1f5033be846c8c85d4b5b 127.0.0.1:7003
slots: (0 slots) slave
replicates 052d58c37eb64a73b3f4ca881bba2dc20706094e
S: 29846d40aaf965794e50a62cbfd5598a0d9e5659 127.0.0.1:7004
slots: (0 slots) slave
replicates 435aab857cd933296b501876a950408d25832f9a
M: 389e4349c62d4fa0548d31c087bc44306a4726e0 127.0.0.1:7002
slots:10923-16383 (5461 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
//Number of slots to move
How many slots do you want to move (from 1 to 16384)? 800
What is the receiving node ID? be3e6b774fc7f3c394d444fef43d24437525953d
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
//Which nodes to take the slots from; all nodes are chosen here
Source node #1:all
Check the slot layout after the reshard
lgj@lgj-Lenovo-G470:~/redis-4.0.9$ ./7007/redis-cli -p 7006
127.0.0.1:7006> cluster nodes
be3e6b774fc7f3c394d444fef43d24437525953d 127.0.0.1:7006@17006 myself,master - 0 1535624156000 7 connected 0-265 5461-5727 10923-11188
5be70c15ffbda0d2d45a8616b75655e330c84a92 127.0.0.1:7005@17005 slave 389e4349c62d4fa0548d31c087bc44306a4726e0 0 1535624156415 3 connected
435aab857cd933296b501876a950408d25832f9a 127.0.0.1:7001@17001 master - 0 1535624158422 2 connected 5728-10922
29846d40aaf965794e50a62cbfd5598a0d9e5659 127.0.0.1:7004@17004 slave 435aab857cd933296b501876a950408d25832f9a 0 1535624157418 2 connected
052d58c37eb64a73b3f4ca881bba2dc20706094e 127.0.0.1:7000@17000 master - 0 1535624159424 1 connected 266-5460
389e4349c62d4fa0548d31c087bc44306a4726e0 127.0.0.1:7002@17002 master - 0 1535624157000 3 connected 11189-16383
7f07f770ce5bd1b78cb1f5033be846c8c85d4b5b 127.0.0.1:7003@17003 slave 052d58c37eb64a73b3f4ca881bba2dc20706094e 0 1535624155000 1 connected
7006 --> 0-265 5461-5727 10923-11188
7000 --> 266-5460
7001 --> 5728-10922
7002 --> 11189-16383
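The ranges now held by 7006 add up to 266 + 267 + 266 = 799 slots, one fewer than the 800 requested. One explanation that reproduces these numbers is a proportional split across the sources, rounded up for the largest source and down for the others. That rounding rule is an assumption, not something stated in this guide, and reshard_shares is a hypothetical helper name.

```shell
#!/bin/sh
# reshard_shares MOVE COUNT... prints how many slots each source gives up.
# COUNT values are the sources' current slot counts, largest first.
reshard_shares() {
    move=$1
    shift
    printf '%s\n' "$@" | awk -v move="$move" '
        { count[NR] = $1; total += $1 }
        END {
            for (i = 1; i <= NR; i++) {
                share = move / total * count[i]
                n = int(share)                  # round down by default
                if (i == 1 && share > n) n++    # assumed: round up for the largest
                print n
            }
        }'
}

# 800 slots taken from 7001 (5462 slots), 7000 and 7002 (5461 slots each).
reshard_shares 800 5462 5461 5461
```

With these inputs the shares are 267, 266 and 266, which matches the ranges 5461-5727, 0-265 and 10923-11188 shown above.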
The master has now been added.
1. Start node 7007
lgj@lgj-Lenovo-G470:~/redis-4.0.9/7007$ ./redis-server redis.conf
2. Add the slave
Command format:
./redis-trib.rb add-node --slave --master-id <master node id> <new node ip:port> <existing node ip:port>
//Run the command
lgj@lgj-Lenovo-G470:~/redis-4.0.9$ ./redis-trib.rb add-node \
--slave --master-id be3e6b774fc7f3c394d444fef43d24437525953d \
127.0.0.1:7007 127.0.0.1:7000
Output
>>> Adding node 127.0.0.1:7007 to cluster 127.0.0.1:7000
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: 052d58c37eb64a73b3f4ca881bba2dc20706094e 127.0.0.1:7000
slots:266-5460 (5195 slots) master
1 additional replica(s)
S: 5be70c15ffbda0d2d45a8616b75655e330c84a92 127.0.0.1:7005
slots: (0 slots) slave
replicates 389e4349c62d4fa0548d31c087bc44306a4726e0
M: be3e6b774fc7f3c394d444fef43d24437525953d 127.0.0.1:7006
slots:0-265,5461-5727,10923-11188 (799 slots) master
0 additional replica(s)
M: 435aab857cd933296b501876a950408d25832f9a 127.0.0.1:7001
slots:5728-10922 (5195 slots) master
1 additional replica(s)
S: 7f07f770ce5bd1b78cb1f5033be846c8c85d4b5b 127.0.0.1:7003
slots: (0 slots) slave
replicates 052d58c37eb64a73b3f4ca881bba2dc20706094e
M: 389e4349c62d4fa0548d31c087bc44306a4726e0 127.0.0.1:7002
slots:11189-16383 (5195 slots) master
1 additional replica(s)
S: 29846d40aaf965794e50a62cbfd5598a0d9e5659 127.0.0.1:7004
slots: (0 slots) slave
replicates 435aab857cd933296b501876a950408d25832f9a
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:7007 to make it join the cluster.
Waiting for the cluster to join.
>>> Configure node as replica of 127.0.0.1:7006.
[OK] New node added correctly.
3. Check the node information
lgj@lgj-Lenovo-G470:~/redis-4.0.9$ ./7007/redis-cli -p 7006
127.0.0.1:7006> cluster nodes
be3e6b774fc7f3c394d444fef43d24437525953d 127.0.0.1:7006@17006 myself,master - 0 1535624889000 7 connected 0-265 5461-5727 10923-11188
5be70c15ffbda0d2d45a8616b75655e330c84a92 127.0.0.1:7005@17005 slave 389e4349c62d4fa0548d31c087bc44306a4726e0 0 1535624893330 3 connected
435aab857cd933296b501876a950408d25832f9a 127.0.0.1:7001@17001 master - 0 1535624893000 2 connected 5728-10922
29846d40aaf965794e50a62cbfd5598a0d9e5659 127.0.0.1:7004@17004 slave 435aab857cd933296b501876a950408d25832f9a 0 1535624894332 2 connected
052d58c37eb64a73b3f4ca881bba2dc20706094e 127.0.0.1:7000@17000 master - 0 1535624888000 1 connected 266-5460
//The newly created slave; the master id it lists (be3e6b774fc7f3c394d444fef43d24437525953d) is that of 7006
3cee88c47919c79c42a149fc03a3f61095062adf 127.0.0.1:7007@17007 slave be3e6b774fc7f3c394d444fef43d24437525953d 0 1535624892326 7 connected
389e4349c62d4fa0548d31c087bc44306a4726e0 127.0.0.1:7002@17002 master - 0 1535624892000 3 connected 11189-16383
7f07f770ce5bd1b78cb1f5033be846c8c85d4b5b 127.0.0.1:7003@17003 slave 052d58c37eb64a73b3f4ca881bba2dc20706094e 0 1535624893000 1 connected
./redis-trib.rb del-node 127.0.0.1:7002 389e4349c62d4fa0548d31c087bc44306a4726e0
Deleting a node that still holds hash slots fails with the following error:
[ERR] Node 127.0.0.1 is not empty! Reshard data away and try again.
The hash slots held by the node must be moved away first; see the reshard section above on reassigning hash slots.
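Whether a node is safe to delete can be checked from the cluster nodes output before del-node is called: a node that owns no slots has nothing after the link state field. A minimal sketch, with node_is_empty as a hypothetical helper name:

```shell
#!/bin/sh
# node_is_empty ID reads `cluster nodes` output on stdin and succeeds when
# the node with that ID is listed and has no slot fields after "connected".
node_is_empty() {
    awk -v id="$1" '
        $1 == id { found = 1; slots = NF - 8 }
        END { exit !(found && slots == 0) }'
}
```

Typical use would be ./7000/redis-cli -p 7000 cluster nodes | node_is_empty 389e4349c62d4fa0548d31c087bc44306a4726e0 && ./redis-trib.rb del-node 127.0.0.1:7002 389e4349c62d4fa0548d31c087bc44306a4726e0, so that del-node runs only once the slots have been resharded away.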
Since new nodes have been added, the scripts should be updated to match, so that later operations keep working:
start-all.sh
stop-all.sh
init-cluster.sh