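For the cluster architecture, see "Redis cluster architecture".
I. Preparation
The preparation stage is fairly long; if you already know Redis Cluster, you can skip it.
1. You need Ruby installed
Before you start building the Redis cluster, make sure you have Ruby (2.0 or later) locally, and that this Ruby has the redis gem installed.
Ruby official site
Ruby official installation guide
I usually just download and build it (the "Building from Source" route in the official guide). Once it is installed, remember to confirm that the installation succeeded.
//Check whether Ruby is installed locally, and which version
//Note the version 2.3.1 and the release date 2016-05-01 here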
$ ruby -v
ruby 2.3.1p481 (2016-05-01 revision 45883) [universal.x86_64-darwin13]
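2. You need the Ruby redis gem
//Check whether the Ruby redis gem is installed: [if there is a better way to check for a Ruby gem, please tell me; I don't know Ruby at all]
//If the redis gem is not installed, see "2.1 Installing the Ruby redis gem",
//If it is already installed, go straight to "II. Implementation"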
# find / -name "redis"
/Library/Ruby/Gems/2.3.1/gems/redis-3.0.6/lib/redis
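Scanning the whole disk with find is slow, and a directory named "redis" does not prove the gem is usable. A lighter check, sketched below, asks RubyGems itself through `gem list -i`, which prints true or false and sets its exit status to match. The helper name `has_gem` is my own invention for illustration, and the sketch assumes the `gem` command is on your PATH.

```shell
# Hypothetical helper: succeeds only if the named gem (optionally at an exact
# version) is installed. Relies on `gem list -i`, which prints true/false and
# sets its exit status accordingly.
has_gem() {
  name=$1
  version=${2:-}
  # Without a gem command the gem cannot be installed either.
  command -v gem >/dev/null 2>&1 || return 1
  if [ -n "$version" ]; then
    gem list -i "^${name}\$" -v "$version" >/dev/null 2>&1
  else
    gem list -i "^${name}\$" >/dev/null 2>&1
  fi
}

if has_gem redis 3.0.6; then
  echo "redis gem 3.0.6 is installed"
else
  echo "redis gem 3.0.6 is missing"
fi
```

The pattern is anchored with ^ and $ because `gem list` treats its argument as a regular expression, so a bare "redis" would also match gems such as redis-namespace.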
2.1 Installing the Ruby redis gem
//Check whether you have gem locally; if it is not installed, see "2.2 Installing gem", otherwise continue below
$ gem -v
2.0.14
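# gem install redis -v 3.0.6
//The official RubyGems server is hosted overseas, so connections from within China often drop. You need to point gem at a domestic mirror; I use https://ruby.taobao.org
//Show the current gem source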
$ gem sources -l
*** CURRENT SOURCES ***
https://rubygems.org/
/**
* Add https://ruby.taobao.org as the source and remove the official https://rubygems.org/ source
*/
$ gem sources --add https://ruby.taobao.org/ --remove https://rubygems.org/
//Show the current gem source again
$ gem sources -l
*** CURRENT SOURCES ***
https://ruby.taobao.org
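//Install version 3.0.6 of the Ruby redis gem [I keep this the same as the version of my local Redis package]
# gem install redis -v 3.0.6
//Check whether the redis gem is installed: [if there is a better way to check for a Ruby gem, please tell me; I don't know Ruby at all]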
# find / -name "redis"
/Library/Ruby/Gems/2.3.1/gems/redis-3.0.6/lib/redis
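2.2 Installing gem
RubyGems official download page
After downloading and unpacking the archive, run the bundled setup script from inside it
//RubyGems ships a setup.rb rather than a ./configure script; see "ruby setup.rb --help" for the install path and other options
$ sudo ruby setup.rb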
$ gem -v
2.0.14
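3. You need to understand the Redis architecture
Redis architecture notes: http://blog.csdn.net/naixiyi/article/details/51335059
4. The Redis Cluster settings in redis.conf
To start Redis in cluster mode rather than as an ordinary standalone instance, change the following fields in redis.conf and save the file as demo_redis.conf.
//The port is up to you, as long as that port (7000 here) and the port 10000 above it (17000) are both free. The former serves clients; the latter is the bus port the instances use to talk to each other, which defaults to the client port plus 10000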
port 7000
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
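Since every instance needs two free ports, it helps to list both before starting anything. The snippet below is only a small arithmetic sketch of the "client port plus 10000" rule described above; the function name `bus_port` is hypothetical and nothing here talks to Redis.

```shell
# Hypothetical helper: print the cluster bus port for a given client port.
# The offset of 10000 is the default described above.
bus_port() {
  echo $(( $1 + 10000 ))
}

for port in 7000 9001; do
  echo "client port $port -> bus port $(bus_port "$port")"
done
```

For port 7000 this prints bus port 17000, so both 7000 and 17000 need to be free on that host.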
5. The redis-trib.rb script
The script lives at ./../redis-3.0.6/src/redis-trib.rb. Run it without arguments to list its commands.
$ ./../redis-3.0.6/src/redis-trib.rb
Usage: redis-trib
create host1:port1 ... hostN:portN
--replicas
check host:port
info host:port
fix host:port
--timeout
reshard host:port
--from
--to
--slots
--yes
--timeout
--pipeline
rebalance host:port
--weight
--auto-weights
--threshold
--use-empty-masters
--timeout
--simulate
--pipeline
add-node new_host:new_port existing_host:existing_port
--slave
--master-id
del-node host:port node_id
set-timeout host:port milliseconds
call host:port command arg arg .. arg
import host:port
--from
--copy
--replace
help (show this help)
II. Implementation (using redis-trib.rb)
//Double-check
//Ruby is installed
$ ruby -v
ruby 2.3.1p481 (2016-05-01 revision 45883) [universal.x86_64-darwin13]
//The Ruby redis gem is installed
# find / -name "redis"
/Library/Ruby/Gems/2.3.1/gems/redis-3.0.6/lib/redis
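//The rough outline is
1. Create a directory for each instance
2. Start the instance in each directory
3. Join the instances into a cluster
4. Assign slots to the instances
5. Add an instance as a master
5.1 Start the instance
5.2 Add the instance as a master
5.3 Reshard for the new master, i.e. move some slots over from the other masters
6. Add an instance as a slave
6.1 Start the instance
6.2 Add the instance as a slave
7. Remove a master instance by resharding
7.1 Move the slots off that master instance
7.2 del-node
8. Remove a master instance by failover
8.1 Manually fail over a master
8.2 del-node
9. Remove a slave instance
1. Create a directory for each instance
//Directories are usually named after the port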
$ ls -al
total 0
drwxr-xr-x 10 admin 340 5 8 16:38 .
drwxrwxr-x+ 81 root admin 2754 5 1 23:40 ..
drwxr-xr-x 3 admin 102 5 8 16:37 9001
drwxr-xr-x 3 admin 102 5 8 16:38 9002
drwxr-xr-x 3 admin 102 5 8 16:38 9003
drwxr-xr-x 3 admin 102 5 8 16:39 9004
drwxr-xr-x 3 admin 102 5 8 16:39 9005
drwxr-xr-x 3 admin 102 5 8 16:39 9006
drwxr-xr-x@ 25 admin 850 4 26 21:53 redis-3.0.6
//Copy demo_redis.conf (see "4. The Redis Cluster settings in redis.conf" above) into each directory as redis.conf, for example into 9001
$ cp ./demo_redis.conf ./9001/redis.conf
$ ls -al ./9001/
drwxr-xr-x 6 admin 204 5 5 17:45 .
drwxr-xr-x 12 admin 408 5 6 12:23 ..
-rwxr-xr-x 1 admin 41611 5 1 23:50 redis.conf
//Change the port in that redis.conf to 9001
//Do the same for the other directories
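Editing six config files by hand is tedious, so the copy-and-edit step can be scripted. The sketch below assumes a minimal template holding only the five cluster settings from section 4 (my own simplification; the real redis.conf is much longer) and works in a scratch directory created by mktemp so that it touches nothing else.

```shell
# Work in a scratch directory so nothing outside it is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Minimal template with the settings from section 4; port 7000 is a placeholder.
cat > demo_redis.conf <<'EOF'
port 7000
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
EOF

# Create one directory per port and write a redis.conf with the port rewritten.
for port in 9001 9002 9003 9004 9005 9006; do
  mkdir -p "$port"
  sed "s/^port .*/port $port/" demo_redis.conf > "$port/redis.conf"
done

# Show the port line of every generated file.
grep '^port' 900*/redis.conf
```

Only the port line changes; every directory keeps the same cluster-config-file name, which is fine because each instance is started from inside its own directory.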
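2. Start the instance in each directory
/**
* [You must start the instance from inside its own port directory; starting it from outside does not work.
* Even if you pass redis-server the correct path to the config file, you will still get errors,
* because once started Redis writes the files belonging to that port
* (log, AOF, RDB, nodes.conf and so on) into the current directory]
*/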
$ cd 9001
$ pwd
/Applications/redis-cluster/9001
//Any path to your redis-server binary will do; mine happens to be inside the redis-3.0.6 package
$ ./../redis-3.0.6/src/redis-server ./redis.conf
//[Verify that the Redis 9001 instance has started]
$ ps -e|grep 9001
15962 ?? 0:00.01 ./../redis-3.0.6/src/redis-server *:9001 [cluster]
15965 ttys005 0:00.00 grep 9001
$ ./../redis-3.0.6/src/redis-cli -c -p 9001
127.0.0.1:9001> cluster nodes
c33b0b3eea734d962022be568344ba9ec64356a9 :9001 myself,master - 0 0 0 connected
//Another example: 9002
$ cd ./../9002
$ ./../redis-3.0.6/src/redis-server redis.conf
$ ps -e|grep 900
15962 ?? 0:00.10 ./../redis-3.0.6/src/redis-server *:9001 [cluster]
15974 ?? 0:00.01 ./../redis-3.0.6/src/redis-server *:9002 [cluster]
15976 ttys005 0:00.00 grep 900
$ ./../redis-3.0.6/src/redis-cli -c -p 9002
127.0.0.1:9002> cluster nodes
296cb822111108b5d67a4753339a936aee1be59a :9002 myself,master - 0 0 0 connected
//Likewise for the other directories
3. Join the instances into a cluster
$ ps -e|grep 900
15962 ?? 0:00.68 ./../redis-3.0.6/src/redis-server *:9001 [cluster]
15974 ?? 0:00.62 ./../redis-3.0.6/src/redis-server *:9002 [cluster]
15989 ?? 0:00.42 ./../redis-3.0.6/src/redis-server *:9003 [cluster]
15997 ?? 0:00.40 ./../redis-3.0.6/src/redis-server *:9004 [cluster]
16030 ?? 0:00.06 ./../redis-3.0.6/src/redis-server *:9005 [cluster]
16039 ?? 0:00.01 ./../redis-3.0.6/src/redis-server *:9006 [cluster]
16034 ttys004 0:00.00 src/redis-cli -c -p 9005
16041 ttys005 0:00.00 grep 900
/* Join them */
$ ./../redis-3.0.6/src/redis-trib.rb create --replicas 1 127.0.0.1:9001 127.0.0.1:9002 127.0.0.1:9003 127.0.0.1:9004 127.0.0.1:9005 127.0.0.1:9006
>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
127.0.0.1:9001
127.0.0.1:9002
127.0.0.1:9003
Adding replica 127.0.0.1:9004 to 127.0.0.1:9001
Adding replica 127.0.0.1:9005 to 127.0.0.1:9002
Adding replica 127.0.0.1:9006 to 127.0.0.1:9003
......
replicates 7bda0805a4172de778f1a7d30a5c9d851c563812
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
//[Verify] The cluster accepts writes
127.0.0.1:9001> set test test
-> Redirected to slot [6918] located at 127.0.0.1:9002
OK
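The "All 16384 slots covered" line means the 16384 hash slots were split across the 3 masters. The sketch below is not redis-trib.rb's own code; it is plain integer arithmetic, under the assumption of an even split, that reproduces the ranges which `cluster nodes` reports in step 4. The function name `slot_ranges` is hypothetical.

```shell
# Hypothetical helper: print the slot range of each of N masters when the
# 16384 slots are divided as evenly as possible.
slot_ranges() {
  masters=$1
  total=16384
  first=0
  i=1
  while [ "$i" -le "$masters" ]; do
    if [ "$i" -eq "$masters" ]; then
      last=$(( total - 1 ))   # the last master takes whatever remains
    else
      # Integer form of round(i * total / masters - 1)
      last=$(( (2 * i * total - masters) / (2 * masters) ))
    fi
    echo "$first-$last"
    first=$(( last + 1 ))
    i=$(( i + 1 ))
  done
}

slot_ranges 3
# prints:
# 0-5460
# 5461-10922
# 10923-16383
```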
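4. Assign slots to the instances
/**
1. Check the instance status from a client: there are 3 masters and 3 slaves
2. Each master also shows its slot range, e.g. 9003 has slots:10923-16383, which means the slots are already assigned and need no further assignment
*/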
$ src/redis-cli -c -p 9001
127.0.0.1:9001> cluster nodes
b1953bf7307e0d844ecf4cc34256aea791e00370 127.0.0.1:9005 slave 296cb822111108b5d67a4753339a936aee1be59a 0 1462698245682 5 connected
16e79f7ceb0bfd6fb128200ad8c12b4f72a36176 127.0.0.1:9004 slave c33b0b3eea734d962022be568344ba9ec64356a9 0 1462698244170 4 connected
296cb822111108b5d67a4753339a936aee1be59a 127.0.0.1:9002 master - 0 1462698247198 2 connected 5461-10922
7bda0805a4172de778f1a7d30a5c9d851c563812 127.0.0.1:9003 master - 0 1462698246185 3 connected 10923-16383
c33b0b3eea734d962022be568344ba9ec64356a9 127.0.0.1:9001 myself,master - 0 0 1 connected 0-5460
aa74ab2bb7a31f74a3b9b30efb4090d39472742d 127.0.0.1:9006 slave 7bda0805a4172de778f1a7d30a5c9d851c563812 0 1462698245178 6 connected
//Test the 9001 instance; as shown above, 9001 manages slots 0-5460
//The key a9001 maps to slot 2641, which is a slot managed by the 9001 node
127.0.0.1:9001> cluster keyslot a9001
(integer) 2641
127.0.0.1:9001> set a9001 9001
OK
127.0.0.1:9001> keys *
1) "a9001"
127.0.0.1:9001> cluster keyslot test
(integer) 6918
//Test the 9002 instance; as shown above, 9002 manages slots 5461-10922
127.0.0.1:9001> cluster keyslot test2
(integer) 8899
//This clearly hits slot 8899, managed by 9002, and the client is redirected to 9002
127.0.0.1:9001> set test2 9002
-> Redirected to slot [8899] located at 127.0.0.1:9002
OK
127.0.0.1:9002> keys *
1) "test"
2) "test2"
//Test 9003 the same way
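The slot numbers returned by `cluster keyslot` come from CRC16(key) mod 16384, where CRC16 is the XMODEM variant (polynomial 0x1021, initial value 0). The sketch below recomputes that in plain shell so you can predict which master a key lands on without a running cluster. The function name `keyslot` is my own, and the sketch deliberately ignores hash tags (the `{...}` syntax), so it only applies to keys without braces.

```shell
# Hypothetical helper: compute CRC16-XMODEM(key) mod 16384 bit by bit.
# Hash tags ({...}) are NOT handled in this sketch.
keyslot() {
  crc=0
  # od prints each byte of the key as an unsigned decimal number.
  for byte in $(printf '%s' "$1" | od -An -v -tu1); do
    crc=$(( crc ^ (byte << 8) ))
    bit=0
    while [ "$bit" -lt 8 ]; do
      if [ $(( crc & 0x8000 )) -ne 0 ]; then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
      bit=$(( bit + 1 ))
    done
  done
  echo $(( crc % 16384 ))
}

for key in a9001 test test2; do
  echo "$key -> slot $(keyslot "$key")"
done
```

The printed slots should agree with the `cluster keyslot` results in the session above (2641, 6918 and 8899), which is a quick way to confirm the sketch against a real node.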
5. Add an instance as a master
5.1 Start the instance
//Be sure to start it from inside the instance's own directory
$ cd ../9007
$ pwd
/Applications/redis-cluster/9007
//Start the instance
$ ./../redis-3.0.6/src/redis-server redis.conf
//[Verify the instance has started]
$ ps -e|grep 9007
16071 ?? 0:00.01 ./../redis-3.0.6/src/redis-server *:9007 [cluster]
$ ./../redis-3.0.6/src/redis-cli -c -p 9007
127.0.0.1:9007> cluster nodes
74b02117a30268842e0ce617b235ee48e3152d04 :9007 myself,master - 0 0 0 connected
5.2 Add the instance as a master
$ ./../redis-3.0.6/src/redis-trib.rb add-node 127.0.0.1:9007 127.0.0.1:9001
>>> Adding node 127.0.0.1:9007 to cluster 127.0.0.1:9001
>>> Performing Cluster Check (using node 127.0.0.1:9001)
M: c33b0b3eea734d962022be568344ba9ec64356a9 127.0.0.1:9001
......
[OK] New node added correctly.
5.3 Reshard for the new master, i.e. move some slots over from the other masters
//From a client you can see that the 9007 instance currently has no slots
127.0.0.1:9007> cluster nodes
b1953bf7307e0d844ecf4cc34256aea791e00370 127.0.0.1:9005 slave 296cb822111108b5d67a4753339a936aee1be59a 0 1462699050020 2 connected
aa74ab2bb7a31f74a3b9b30efb4090d39472742d 127.0.0.1:9006 slave 7bda0805a4172de778f1a7d30a5c9d851c563812 0 1462699049010 3 connected
74b02117a30268842e0ce617b235ee48e3152d04 127.0.0.1:9007 myself,master - 0 0 0 connected
c33b0b3eea734d962022be568344ba9ec64356a9 127.0.0.1:9001 master - 0 1462699045986 1 connected 0-5460
16e79f7ceb0bfd6fb128200ad8c12b4f72a36176 127.0.0.1:9004 slave c33b0b3eea734d962022be568344ba9ec64356a9 0 1462699046994 1 connected
296cb822111108b5d67a4753339a936aee1be59a 127.0.0.1:9002 master - 0 1462699048003 2 connected 5461-10922
7bda0805a4172de778f1a7d30a5c9d851c563812 127.0.0.1:9003 master - 0 1462699044980 3 connected 10923-16383
//We move 200 slots from 9001 to 9007
$ ./../redis-3.0.6/src/redis-trib.rb reshard --from c33b0b3eea734d962022be568344ba9ec64356a9 \
--to 74b02117a30268842e0ce617b235ee48e3152d04 \
--slots 200 --yes --timeout 5000 127.0.0.1:9001
>>> Performing Cluster Check (using node 127.0.0.1:9001)
......
Moving slot 199 from 127.0.0.1:9001 to 127.0.0.1:9007:
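The reshard command takes node IDs rather than ports, and copying 40-character IDs by hand is error-prone. A small awk filter can pull the ID out of `cluster nodes` output instead. In the sketch below the function name `node_id_for_port` is hypothetical, the sample lines are copied from the output above, and the parsing assumes the 3.0.x line format where the second field is plain ip:port.

```shell
# Hypothetical helper: read `cluster nodes` output on stdin and print the
# node ID (field 1) of the node whose address (field 2) ends in :PORT.
# Assumes the Redis 3.0.x format, where field 2 is plain ip:port.
node_id_for_port() {
  awk -v port="$1" '{ n = split($2, addr, ":"); if (addr[n] == port) print $1 }'
}

# Sample lines copied from the `cluster nodes` output above.
sample='74b02117a30268842e0ce617b235ee48e3152d04 127.0.0.1:9007 myself,master - 0 0 0 connected
c33b0b3eea734d962022be568344ba9ec64356a9 127.0.0.1:9001 master - 0 1462699045986 1 connected 0-5460'

printf '%s\n' "$sample" | node_id_for_port 9007
# prints 74b02117a30268842e0ce617b235ee48e3152d04
```

Against a live cluster the same filter could be fed from `redis-cli -p 9001 cluster nodes` instead of the sample text.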
6. Add an instance as a slave
6.1 Start the instance
$ cd ../9008
$ pwd
/Applications/redis-cluster/9008
$ ./../redis-3.0.6/src/redis-server redis.conf
//[Verify]
$ ps -e|grep 9008
16092 ?? 0:00.01 ./../redis-3.0.6/src/redis-server *:9008 [cluster]
16094 ttys005 0:00.00 grep 9008
$ ./../redis-3.0.6/src/redis-cli -c -p 9008
127.0.0.1:9008> cluster nodes
b88cea86d04a4a3d2fb60d6c97284b5bd06e6a5d :9008 myself,master - 0 0 0 connected
6.2 Add the instance as a slave
//Add the 9008 instance as a slave of the 9001 instance (node c33b...6a9)
$ ./../redis-3.0.6/src/redis-trib.rb add-node \
--slave \
--master-id c33b0b3eea734d962022be568344ba9ec64356a9 127.0.0.1:9008 \
127.0.0.1:9001
>>> Adding node 127.0.0.1:9008 to cluster 127.0.0.1:9001
...
[OK] New node added correctly.
7. Remove a master instance by resharding
Let's try removing the 9007 node.
7.1 Move the slots off that master instance
//Here "74b02..." is the 9007 node and "c33b0b..." is the 9001 node
$ ./../redis-3.0.6/src/redis-trib.rb reshard \
--from 74b02117a30268842e0ce617b235ee48e3152d04 \
--to c33b0b3eea734d962022be568344ba9ec64356a9 \
--slots 200 \
--yes \
127.0.0.1:9001
>>> Performing Cluster Check (using node 127.0.0.1:9001)
......
Moving slot 199 from 127.0.0.1:9007 to 127.0.0.1:9001:
7.2 del-node
//cluster nodes shows that 9007 no longer has any slots
127.0.0.1:9002> cluster nodes
74b02117a30268842e0ce617b235ee48e3152d04 127.0.0.1:9007 master - 0 1462700615149 7 connected
//Remove 9007, the master node with no slots
$ ./../redis-3.0.6/src/redis-trib.rb del-node 127.0.0.1:9001 74b02117a30268842e0ce617b235ee48e3152d04
>>> Removing node 74b02117a30268842e0ce617b235ee48e3152d04 from cluster 127.0.0.1:9001
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
//[Verify]
$ ps -e|grep 9007
16147 ttys005 0:00.00 grep 9007
//The nodes table no longer contains 9007
127.0.0.1:9002> cluster nodes
b1953bf7307e0d844ecf4cc34256aea791e00370 127.0.0.1:9005 slave 296cb822111108b5d67a4753339a936aee1be59a 0 1462700880271 5 connected
b88cea86d04a4a3d2fb60d6c97284b5bd06e6a5d 127.0.0.1:9008 slave c33b0b3eea734d962022be568344ba9ec64356a9 0 1462700881277 8 connected
7bda0805a4172de778f1a7d30a5c9d851c563812 127.0.0.1:9003 master - 0 1462700879261 3 connected 10923-16383
16e79f7ceb0bfd6fb128200ad8c12b4f72a36176 127.0.0.1:9004 slave c33b0b3eea734d962022be568344ba9ec64356a9 0 1462700876236 8 connected
aa74ab2bb7a31f74a3b9b30efb4090d39472742d 127.0.0.1:9006 slave 7bda0805a4172de778f1a7d30a5c9d851c563812 0 1462700876742 6 connected
296cb822111108b5d67a4753339a936aee1be59a 127.0.0.1:9002 myself,master - 0 0 2 connected 5461-10922
c33b0b3eea734d962022be568344ba9ec64356a9 127.0.0.1:9001 master - 0 1462700877749 8 connected 0-5460
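Step 7 only works because 9007 was emptied first, so it is worth checking for leftover slots before calling del-node. The sketch below does that by inspecting `cluster nodes` output: a master that still owns slots has extra fields after "connected". The function name `has_no_slots` is hypothetical, and the sample lines are taken from the outputs in this section.

```shell
# Hypothetical helper: read `cluster nodes` output on stdin and succeed only
# if the node with the given ID exists and lists no slot ranges (field 9 on).
has_no_slots() {
  awk -v id="$1" '
    ($1 "") == (id "") { found = 1; if (NF > 8) owns = 1 }
    END { exit (found && !owns) ? 0 : 1 }'
}

# Sample lines copied from the `cluster nodes` outputs in this section.
sample='74b02117a30268842e0ce617b235ee48e3152d04 127.0.0.1:9007 master - 0 1462700615149 7 connected
c33b0b3eea734d962022be568344ba9ec64356a9 127.0.0.1:9001 master - 0 1462700877749 8 connected 0-5460'

if printf '%s\n' "$sample" | has_no_slots 74b02117a30268842e0ce617b235ee48e3152d04; then
  echo "9007 holds no slots; it can be removed with del-node"
else
  echo "9007 still holds slots; reshard them away first"
fi
```

The helper also fails when the ID is not found at all, so a mistyped node ID is not mistaken for an empty node.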
8. Remove a master instance by failover
Here we remove 9003 by having its slave, 9006, fail it over.
8.1 Manually fail over a master
//Have 9006 take over from 9003 and become the new master
/**
* This command, that can only be sent to a Redis Cluster slave node, forces the slave to start a manual
* failover of its master instance.
* In other words, it must be run on the slave node, which then takes over its master's role and becomes the master itself
* */
$ ./../redis-3.0.6/src/redis-cli -c -p 9006
127.0.0.1:9006> cluster failover
OK
//[Verify]
127.0.0.1:9006> cluster nodes
b1953bf7307e0d844ecf4cc34256aea791e00370 127.0.0.1:9005 slave 296cb822111108b5d67a4753339a936aee1be59a 0 1462701334986 5 connected
16e79f7ceb0bfd6fb128200ad8c12b4f72a36176 127.0.0.1:9004 slave c33b0b3eea734d962022be568344ba9ec64356a9 0 1462701332974 8 connected
b88cea86d04a4a3d2fb60d6c97284b5bd06e6a5d 127.0.0.1:9008 slave c33b0b3eea734d962022be568344ba9ec64356a9 0 1462701335994 8 connected
c33b0b3eea734d962022be568344ba9ec64356a9 127.0.0.1:9001 master - 0 1462701331964 8 connected 0-5460
aa74ab2bb7a31f74a3b9b30efb4090d39472742d 127.0.0.1:9006 myself,master - 0 0 9 connected 10923-16383
7bda0805a4172de778f1a7d30a5c9d851c563812 127.0.0.1:9003 slave aa74ab2bb7a31f74a3b9b30efb4090d39472742d 0 1462701337004 9 connected
296cb822111108b5d67a4753339a936aee1be59a 127.0.0.1:9002 master - 0 1462701333978 2 connected 5461-10922
8.2 del-node
//Remove 9003
$ ./../redis-3.0.6/src/redis-trib.rb del-node 127.0.0.1:9001 7bda0805a4172de778f1a7d30a5c9d851c563812
>>> Removing node 7bda0805a4172de778f1a7d30a5c9d851c563812 from cluster 127.0.0.1:9001
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
//[Verify] The 9003 node has been removed
127.0.0.1:9006> cluster nodes
b1953bf7307e0d844ecf4cc34256aea791e00370 127.0.0.1:9005 slave 296cb822111108b5d67a4753339a936aee1be59a 0 1462701613370 5 connected
16e79f7ceb0bfd6fb128200ad8c12b4f72a36176 127.0.0.1:9004 slave c33b0b3eea734d962022be568344ba9ec64356a9 0 1462701609338 8 connected
b88cea86d04a4a3d2fb60d6c97284b5bd06e6a5d 127.0.0.1:9008 slave c33b0b3eea734d962022be568344ba9ec64356a9 0 1462701610349 8 connected
c33b0b3eea734d962022be568344ba9ec64356a9 127.0.0.1:9001 master - 0 1462701614375 8 connected 0-5460
aa74ab2bb7a31f74a3b9b30efb4090d39472742d 127.0.0.1:9006 myself,master - 0 0 9 connected 10923-16383
296cb822111108b5d67a4753339a936aee1be59a 127.0.0.1:9002 master - 0 1462701612363 2 connected 5461-10922
9. Remove a slave instance
//Remove the 9008 slave node, node-id: b88cea86d...
$ ./../redis-3.0.6/src/redis-trib.rb del-node 127.0.0.1:9001 b88cea86d04a4a3d2fb60d6c97284b5bd06e6a5d
>>> Removing node b88cea86d04a4a3d2fb60d6c97284b5bd06e6a5d from cluster 127.0.0.1:9001
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
//[Verify] The 9008 node has been removed
$ ps -e|grep 9008
16183 ttys005 0:00.00 grep 9008
127.0.0.1:9006> cluster nodes
b1953bf7307e0d844ecf4cc34256aea791e00370 127.0.0.1:9005 slave 296cb822111108b5d67a4753339a936aee1be59a 0 1462701683953 5 connected
16e79f7ceb0bfd6fb128200ad8c12b4f72a36176 127.0.0.1:9004 slave c33b0b3eea734d962022be568344ba9ec64356a9 0 1462701686980 8 connected
c33b0b3eea734d962022be568344ba9ec64356a9 127.0.0.1:9001 master - 0 1462701684962 8 connected 0-5460
aa74ab2bb7a31f74a3b9b30efb4090d39472742d 127.0.0.1:9006 myself,master - 0 0 9 connected 10923-16383
296cb822111108b5d67a4753339a936aee1be59a 127.0.0.1:9002 master - 0 1462701685972 2 connected 5461-10922