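3. Redis Cluster features: fault tolerance and data migration

Preface:

This article covers fault tolerance in Redis Cluster and data migration (horizontal scaling).


Redis cluster information

In the previous chapter, adding the nodes to the cluster printed the following log: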

[root@localhost src]# ./redis-trib.rb create --replicas 1 192.168.1.103:7000 192.168.1.103:7001 192.168.1.103:7002 192.168.1.103:7003 192.168.1.103:7004 192.168.1.103:7005 
>>> Creating cluster
Connecting to node 192.168.1.103:7000: OK
Connecting to node 192.168.1.103:7001: OK
Connecting to node 192.168.1.103:7002: OK
Connecting to node 192.168.1.103:7003: OK
Connecting to node 192.168.1.103:7004: OK
Connecting to node 192.168.1.103:7005: OK
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
192.168.1.103:7000
192.168.1.103:7001
192.168.1.103:7002
Adding replica 192.168.1.103:7003 to 192.168.1.103:7000
Adding replica 192.168.1.103:7004 to 192.168.1.103:7001
Adding replica 192.168.1.103:7005 to 192.168.1.103:7002
M: 4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000
   slots:0-5460 (5461 slots) master
M: f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001
   slots:5461-10922 (5462 slots) master
M: 7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002
   slots:10923-16383 (5461 slots) master
S: 778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003
   replicates 4bc092eb4731152d15172b065c74c7a795fe6304
S: 907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004
   replicates f37ec54101536425ce8798e041ad75a582d7e153
S: b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005
   replicates 7b0ca3978858454051ad572aa816eec450f31a53
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join...
>>> Performing Cluster Check (using node 192.168.1.103:7000)
M: 4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000
   slots:0-5460 (5461 slots) master
M: f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001
   slots:5461-10922 (5462 slots) master
M: 7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002
   slots:10923-16383 (5461 slots) master
M: 778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003
   slots: (0 slots) master
   replicates 4bc092eb4731152d15172b065c74c7a795fe6304
M: 907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004
   slots: (0 slots) master
   replicates f37ec54101536425ce8798e041ad75a582d7e153
M: b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005
   slots: (0 slots) master
   replicates 7b0ca3978858454051ad572aa816eec450f31a53
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

From this we get the following information:

Node  Node ID                                   Role    Master  Slots
7000  4bc092eb4731152d15172b065c74c7a795fe6304  master  -       0-5460
7001  f37ec54101536425ce8798e041ad75a582d7e153  master  -       5461-10922
7002  7b0ca3978858454051ad572aa816eec450f31a53  master  -       10923-16383
7003  778e649f47fa98f6d1f6b1f1043812c6685dc4a8  slave   7000    -
7004  907feee1b665554cadc64921c7fcb8c05b8a5ab6  slave   7001    -
7005  b2bea8ede402e2112cced7d7cea52127f18edef2  slave   7002    -

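A Redis cluster has 16384 hash slots. A slot can be thought of as a storage bucket: every key maps to exactly one slot, and one slot can hold many keys. In a cluster, all 16384 slots are assigned to the master nodes. This also explains why writes cannot be performed on a slave node: a slave owns no slots.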
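To make the slot idea concrete, here is a minimal Python sketch, not taken from the original article. `key_slot` follows the rule from the Redis Cluster specification (CRC16 of the key modulo 16384, hashing only a non-empty `{tag}` if one is present). `split_slots` is an assumed rounding scheme, chosen because it reproduces the three ranges in the log above; it is not a copy of what redis-trib does internally. Both function names are made up for this example.

```python
import binascii

SLOTS = 16384  # total number of hash slots in a Redis cluster


def key_slot(key: bytes) -> int:
    """Hash slot of a key: CRC16 (XMODEM variant) of the key, modulo 16384."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty {tag} narrows the part of the key that is hashed.
        if end != -1 and end > start + 1:
            key = key[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is the CRC16 variant Redis uses.
    return binascii.crc_hqx(key, 0) % SLOTS


def split_slots(masters: int) -> list:
    """Divide slots 0..16383 into contiguous ranges, one per master (assumed scheme)."""
    per_master = SLOTS / masters
    ranges, first, cursor = [], 0, 0.0
    for i in range(masters):
        last = int(cursor + per_master - 1 + 0.5)  # round to the nearest slot
        if i == masters - 1:
            last = SLOTS - 1  # the last master takes whatever remains
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


print(split_slots(3))   # [(0, 5460), (5461, 10922), (10923, 16383)]
print(key_slot(b"foo"))  # 12182, so "foo" lives on the master owning 10923-16383
```

With the layout above, a write to `foo` sent to 7000 or 7001 would be redirected to 7002, because slot 12182 falls in its range.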


Connect to any node and query the cluster information

192.168.1.103:7001> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:2
cluster_stats_messages_sent:10436
cluster_stats_messages_received:10106

Query the cluster node information

192.168.1.103:7001> cluster nodes
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 1439569459017 4 connected
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439569460528 3 connected 10923-16383
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439569461031 5 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master - 0 1439569460025 1 connected 0-5460
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439569459520 6 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 myself,master - 0 0 2 connected 5461-10922
192.168.1.103:7001> 

This command is a very handy way to inspect the state of the cluster.
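Since cluster nodes is used throughout the rest of this article, a small parser can help when reading its output from a script. The sketch below is an illustration only. The field order (id, address, flags, master id, ping-sent, pong-recv, config epoch, link state, slots) is the one visible in the output above, while the function name and dictionary keys are made up for this example.

```python
def parse_cluster_nodes(text: str) -> list:
    """Turn raw `cluster nodes` output into a list of dictionaries."""
    nodes = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        nodes.append({
            "id": fields[0],
            "addr": fields[1],
            "flags": fields[2].split(","),  # e.g. ["myself", "master"]
            "master_id": None if fields[3] == "-" else fields[3],
            "config_epoch": int(fields[6]),
            "link_state": fields[7],
            "slots": fields[8:],            # empty for slaves
        })
    return nodes


# Three lines copied from the output above.
SAMPLE = """
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 1439569459017 4 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master - 0 1439569460025 1 connected 0-5460
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 myself,master - 0 0 2 connected 5461-10922
"""

for node in parse_cluster_nodes(SAMPLE):
    role = "master" if "master" in node["flags"] else "slave"
    print(node["addr"], role, node["slots"])
```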


Cluster fault tolerance: election

Adding node node-7006

Start the node

[root@localhost my-redis-cluster]# ls
node-7000.conf  node-7001.conf  node-7002.conf  node-7003.conf  node-7004.conf  node-7005.conf  node-7006.conf
[root@localhost my-redis-cluster]# redis-server  node-7006.conf 
[root@localhost my-redis-cluster]# ps -ef | grep redis
root      2382     1  0 8月14 ?       00:00:05 redis-server *:7000 [cluster]
root      2394     1  0 8月14 ?       00:00:05 redis-server *:7001 [cluster]
root      2398     1  0 8月14 ?       00:00:05 redis-server *:7002 [cluster]
root      2402     1  0 8月14 ?       00:00:05 redis-server *:7003 [cluster]
root      2408     1  0 8月14 ?       00:00:05 redis-server *:7004 [cluster]
root      2414     1  0 8月14 ?       00:00:05 redis-server *:7005 [cluster]
root      2929     1  0 00:29 ?        00:00:00 redis-server *:7006 [cluster]
root      2933  2248  0 00:29 pts/0    00:00:00 grep --color=auto redis

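At this point the node is not yet part of the cluster. To add it, use cluster meet ip port. Note that this command is run from a client (redis-cli) connected to a node that is already in the cluster.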

 

192.168.1.103:7001> cluster meet 192.168.1.103 7006
OK

Query the cluster information again

192.168.1.103:7001> cluster nodes
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 1439570083805 4 connected
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439570085318 3 connected 10923-16383
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439570085822 5 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master - 0 1439570085822 1 connected 0-5460
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439570084814 6 connected
c75edec9024b2ef4397b70fde2d5227aa9135900 192.168.1.103:7006 master - 0 1439570084309 0 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 myself,master - 0 0 2 connected 5461-10922


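A newly added node joins as a master by default. Since this master has no slots assigned, it can be attached to another master and act as that master's slave.


Next, let's see how to attach 7006 to 7000.

Switch to a client connected to 7006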

mac:bin lkl$ ./redis-cli -c -h 192.168.1.103 -p 7006


Run cluster replicate master-node-id

192.168.1.103:7006> cluster replicate 4bc092eb4731152d15172b065c74c7a795fe6304
OK

Query cluster nodes again

192.168.1.103:7006> cluster nodes 
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439570605532 2 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master - 0 1439570604025 1 connected 0-5460
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 1439570603524 1 connected
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439570604528 3 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 master - 0 1439570603524 2 connected 5461-10922
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439570604025 3 connected 10923-16383
c75edec9024b2ef4397b70fde2d5227aa9135900 192.168.1.103:7006 myself,slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 0 0 connected

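Redis Cluster provides fault tolerance: when a master goes down, the slaves under that master hold an election among themselves. The winning slave is promoted to master and takes over the failed master's slots and data.


Next, let's walk through a Redis election.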


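In production it is generally recommended to use an odd number of nodes to improve fault tolerance. Strictly speaking, a slave is promoted by the votes of a majority of the masters, so it is the number of masters that matters for this rule.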
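One way to see why an odd count is recommended is the small calculation below. It assumes that a failover needs the votes of a strict majority of the masters, which is how Redis Cluster authorizes a slave promotion. The function names are made up for this illustration.

```python
def votes_needed(masters: int) -> int:
    """Smallest number of master votes that forms a strict majority."""
    return masters // 2 + 1


def tolerated_failures(masters: int) -> int:
    """How many masters can be down while a majority can still vote."""
    return masters - votes_needed(masters)


for n in (3, 4, 5):
    print(n, "masters:", votes_needed(n), "votes needed,",
          tolerated_failures(n), "failures tolerated")
# 3 masters: 2 votes needed, 1 failures tolerated
# 4 masters: 3 votes needed, 1 failures tolerated
# 5 masters: 3 votes needed, 2 failures tolerated
```

Going from 3 to 4 masters does not buy any extra tolerance, while going from 3 to 5 does.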

7000 now has two slaves, 7003 and 7006. Here we add one more, 7007, under 7000 as well. The steps are the same as before, so only the results are shown.

[root@localhost redis-3.0.3]# cd my-redis-cluster/
[root@localhost my-redis-cluster]# ls
node-7000.conf  node-7001.conf  node-7002.conf  node-7003.conf  node-7004.conf  node-7005.conf  node-7006.conf  node-7007.conf
[root@localhost my-redis-cluster]# redis-server node-7007.conf 
[root@localhost my-redis-cluster]# ps -ef | grep redis
root      2382     1  0 8月14 ?       00:00:07 redis-server *:7000 [cluster]
root      2394     1  0 8月14 ?       00:00:07 redis-server *:7001 [cluster]
root      2398     1  0 8月14 ?       00:00:07 redis-server *:7002 [cluster]
root      2402     1  0 8月14 ?       00:00:07 redis-server *:7003 [cluster]
root      2408     1  0 8月14 ?       00:00:07 redis-server *:7004 [cluster]
root      2414     1  0 8月14 ?       00:00:07 redis-server *:7005 [cluster]
root      2929     1  0 00:29 ?        00:00:01 redis-server *:7006 [cluster]
root      3128     1  0 00:47 ?        00:00:00 redis-server *:7007 [cluster]
root      3132  2248  0 00:48 pts/0    00:00:00 grep --color=auto redis

192.168.1.103:7006> cluster meet 192.168.1.103 7007
OK
192.168.1.103:7006> cluster nodes
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439571019922 2 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master - 0 1439571020927 1 connected 0-5460
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 1439571020424 1 connected
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439571020927 3 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 master - 0 1439571019416 2 connected 5461-10922
054b9e2efd44f8704d1aa6851f75f316911b7b4d 192.168.1.103:7007 master - 0 1439571019214 0 connected
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439571020927 3 connected 10923-16383
c75edec9024b2ef4397b70fde2d5227aa9135900 192.168.1.103:7006 myself,slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 0 0 connected

mac:bin lkl$ ./redis-cli -c -h 192.168.1.103 -p 7007 
192.168.1.103:7007> cluster replicate 4bc092eb4731152d15172b065c74c7a795fe6304
OK
192.168.1.103:7007> cluster nodes
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439571183241 3 connected
054b9e2efd44f8704d1aa6851f75f316911b7b4d 192.168.1.103:7007 myself,slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 0 0 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master - 0 1439571182232 1 connected 0-5460
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439571183241 3 connected 10923-16383
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439571182736 2 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 master - 0 1439571184249 2 connected 5461-10922
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 1439571182736 1 connected
c75edec9024b2ef4397b70fde2d5227aa9135900 192.168.1.103:7006 slave 4bc092eb4731152d15172b065c74c7a795fe6304 0 1439571183745 1 connected

After the steps above, 7007 is attached to 7000.

Simulate a crash of 7000 by killing the process directly

[root@localhost my-redis-cluster]# ps -ef | grep redis
root      2382     1  0 8月14 ?       00:00:07 redis-server *:7000 [cluster]
root      2394     1  0 8月14 ?       00:00:07 redis-server *:7001 [cluster]
root      2398     1  0 8月14 ?       00:00:07 redis-server *:7002 [cluster]
root      2402     1  0 8月14 ?       00:00:07 redis-server *:7003 [cluster]
root      2408     1  0 8月14 ?       00:00:07 redis-server *:7004 [cluster]
root      2414     1  0 8月14 ?       00:00:07 redis-server *:7005 [cluster]
root      2929     1  0 00:29 ?        00:00:01 redis-server *:7006 [cluster]
root      3128     1  0 00:47 ?        00:00:00 redis-server *:7007 [cluster]
root      3132  2248  0 00:48 pts/0    00:00:00 grep --color=auto redis
[root@localhost my-redis-cluster]# kill -9 2382


After waiting a while, connect to any node with a client and query the cluster state

mac:bin lkl$ ./redis-cli -c -h 192.168.1.103 -p 7003
192.168.1.103:7003> cluster nodes
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439571326184 3 connected 10923-16383
054b9e2efd44f8704d1aa6851f75f316911b7b4d 192.168.1.103:7007 slave c75edec9024b2ef4397b70fde2d5227aa9135900 0 1439571324671 7 connected
c75edec9024b2ef4397b70fde2d5227aa9135900 192.168.1.103:7006 master - 0 1439571324671 7 connected 0-5460
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 myself,slave c75edec9024b2ef4397b70fde2d5227aa9135900 0 0 4 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 master - 0 1439571324671 2 connected 5461-10922
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439571325177 6 connected
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439571326183 5 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master,fail - 1439571226713 1439571225807 1 disconnected


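As the output shows, the slaves held an election on their own: 7006 took over from 7000 and now owns slots 0-5460, 7003 and 7007 have become slaves of 7006, and 7000 is marked master,fail.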


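Data migration: moving slots

As the business grows, the load on the Redis nodes grows with it. A Redis cluster can be scaled out horizontally by adding new master-slave groups that share the load of the existing nodes. Because data in Redis Cluster is stored in slots, slots holding live Redis data can be migrated to the newly added master-slave group. The following shows how to operate on slots.


Move a given slot to the master with a given node ID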

192.168.1.103:7006> cluster setslot  6 node  7b0ca3978858454051ad572aa816eec450f31a53
OK
192.168.1.103:7006> cluster nodes
907feee1b665554cadc64921c7fcb8c05b8a5ab6 192.168.1.103:7004 slave f37ec54101536425ce8798e041ad75a582d7e153 0 1439574079852 2 connected
4bc092eb4731152d15172b065c74c7a795fe6304 192.168.1.103:7000 master,fail - 1439571226672 1439571224258 1 disconnected
778e649f47fa98f6d1f6b1f1043812c6685dc4a8 192.168.1.103:7003 slave c75edec9024b2ef4397b70fde2d5227aa9135900 0 1439574077836 7 connected
b2bea8ede402e2112cced7d7cea52127f18edef2 192.168.1.103:7005 slave 7b0ca3978858454051ad572aa816eec450f31a53 0 1439574078342 3 connected
f37ec54101536425ce8798e041ad75a582d7e153 192.168.1.103:7001 master - 0 1439574079852 2 connected 5461-10922
054b9e2efd44f8704d1aa6851f75f316911b7b4d 192.168.1.103:7007 slave c75edec9024b2ef4397b70fde2d5227aa9135900 0 1439574078342 7 connected
7b0ca3978858454051ad572aa816eec450f31a53 192.168.1.103:7002 master - 0 1439574078342 3 connected 6 10923-16383
c75edec9024b2ef4397b70fde2d5227aa9135900 192.168.1.103:7006 myself,master - 0 0 7 connected 0-5 7-5460

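This is only an example. Note that cluster setslot ... node only reassigns ownership of the slot and does not move any keys. It returned OK here presumably because slot 6 held no keys on 7006. Migrating a slot that contains data involves several other operations (cluster setslot ... importing, cluster setslot ... migrating, cluster getkeysinslot, migrate), which you can look up on your own.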
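For a slot that does hold keys, the usual order of operations can be sketched as a list of commands. The Python function below only builds that list and does not talk to Redis. The function name, the timeout value, and the key `example-key` are hypothetical; in practice the key names would come from running cluster getkeysinslot on the source node.

```python
def migration_plan(slot, src_id, dst_id, dst_host, dst_port, keys, timeout_ms=5000):
    """Return (node, command) pairs for moving one slot, in execution order."""
    plan = [
        # 1. The destination prepares to receive the slot.
        ("destination", f"CLUSTER SETSLOT {slot} IMPORTING {src_id}"),
        # 2. The source marks the slot as leaving.
        ("source", f"CLUSTER SETSLOT {slot} MIGRATING {dst_id}"),
    ]
    # 3. Each key in the slot is moved from the source to the destination.
    for key in keys:
        plan.append(("source", f"MIGRATE {dst_host} {dst_port} {key} 0 {timeout_ms}"))
    # 4. Both sides record the new owner once the slot is empty on the source.
    plan.append(("destination", f"CLUSTER SETSLOT {slot} NODE {dst_id}"))
    plan.append(("source", f"CLUSTER SETSLOT {slot} NODE {dst_id}"))
    return plan


# Moving slot 6 from 7006 to 7002, using the node IDs shown above.
for node, command in migration_plan(
    slot=6,
    src_id="c75edec9024b2ef4397b70fde2d5227aa9135900",
    dst_id="7b0ca3978858454051ad572aa816eec450f31a53",
    dst_host="192.168.1.103",
    dst_port=7002,
    keys=["example-key"],  # hypothetical key assumed to live in slot 6
):
    print(f"{node:12s} {command}")
```

Keys containing spaces would need quoting before being pasted into redis-cli; the sketch leaves that out to keep it short.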












