Redis Cluster experiment

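Redis version: 4.0.8 is used in the commands below (the redis-trib.rb workflow is the same on 3.x). Note that Redis 5 and later no longer need Ruby, because redis-trib.rb was replaced by redis-cli --cluster. Installing Ruby was the painful part of this setup.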

1. Install dependencies

Install libtool, gcc, and automake.

Install Ruby (version 2.2.2 or later is required; the version installed directly by yum is too old).

 

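1.1 Install Ruby

redis-trib.rb needs Ruby to run.

Installing Ruby caused a lot of trouble. Building from the downloaded Ruby source failed repeatedly (with errors mentioning openssl, LoadError, and zlib), and none of the attempted fixes worked.

Installing Ruby through rvm afterwards went smoothly.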

 

[root@redis1 ~]# gpg --keyserver hkp://keys.gnupg.net --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3

gpg: directory `/root/.gnupg' created

gpg: new configuration file `/root/.gnupg/gpg.conf' created

gpg: WARNING: options in `/root/.gnupg/gpg.conf' are not yet active during this run

gpg: keyring `/root/.gnupg/secring.gpg' created

gpg: keyring `/root/.gnupg/pubring.gpg' created

gpg: requesting key D39DC0E3 from hkp server keys.gnupg.net

gpg: /root/.gnupg/trustdb.gpg: trustdb created

gpg: key D39DC0E3: public key "Michal Papis (RVM signing) " imported

gpg: no ultimately trusted keys found

gpg: Total number processed: 1

gpg:               imported: 1  (RSA: 1)

 

[root@redis1 ~]# curl -sSL https://get.rvm.io | bash -s stable

Downloading https://github.com/rvm/rvm/archive/1.29.3.tar.gz

 Downloading https://github.com/rvm/rvm/releases/download/1.29.3/1.29.3.tar.gz.asc

curl: (28) Connection timed out after 30000 milliseconds

 

Could not download 'https://github.com/rvm/rvm/releases/download/1.29.3/1.29.3.tar.gz.asc'.

  curl returned status '28'.

 

Creating group 'rvm'

 

Installing RVM to /usr/local/rvm/

Installation of RVM in /usr/local/rvm/ is almost complete:

 

  * First you need to add all users that will be using rvm to 'rvm' group,

    and logout - login again, anyone using rvm will be operating with `umask u=rwx,g=rwx,o=rx`.

 

  * To start using RVM you need to run `source /etc/profile.d/rvm.sh`

    in all your open shell windows, in rare cases you need to reopen all shell windows

 

[root@redis1 ~]# find / -name rvm -print

/usr/local/rvm

/usr/local/rvm/src/rvm

/usr/local/rvm/src/rvm/bin/rvm

/usr/local/rvm/src/rvm/lib/rvm

/usr/local/rvm/src/rvm/scripts/rvm

/usr/local/rvm/bin/rvm

/usr/local/rvm/lib/rvm

/usr/local/rvm/scripts/rvm

 

[root@redis1 ~]# source /usr/local/rvm/scripts/rvm

[root@redis1 ruby-2.5.1]# rvm list known

# MRI Rubies

[ruby-]1.8.6[-p420]

[ruby-]1.8.7[-head] # security released on head

[ruby-]1.9.1[-p431]

[ruby-]1.9.2[-p330]

[ruby-]1.9.3[-p551]

[ruby-]2.0.0[-p648]

[ruby-]2.1[.10]

[ruby-]2.2[.7]

[ruby-]2.3[.4]

[ruby-]2.4[.1]

ruby-head

 

[root@redis1 ~]# rvm install 2.4.1

[root@redis1 ~]# rvm use 2.4.1

[root@redis1 ~]# rvm use 2.4.1 --default

[root@redis1 ~]# ruby --version

[root@redis1 ~]# gem install redis

 

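2. Prepare the nodes

After installing Redis, edit the configuration file of each node and start the instances.

Note: it is best not to set a password before the cluster is created; set it once the cluster is up.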

 

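# Listening port (one per instance, 7001 to 7006)
port 7001

# Enable cluster mode
cluster-enabled yes

# Cluster state file maintained by Redis itself
cluster-config-file nodes-7001.conf

# Node timeout, in milliseconds
cluster-node-timeout 15000

# All other settings are the same as for a standalone instance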
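Because the six instances differ only in their port and file names, the per-node files can be generated instead of edited by hand. The following is a minimal Python sketch, not part of the original setup: the helper names `render_node_config` and `write_configs` are made up for illustration, and a temporary directory stands in for /opt/redis so that running it changes nothing on the host.

```python
from pathlib import Path
import tempfile


def render_node_config(port: int) -> str:
    """Return the cluster-related redis.conf lines for one node."""
    lines = [
        f"port {port}",
        "cluster-enabled yes",
        f"cluster-config-file nodes-{port}.conf",
        "cluster-node-timeout 15000",
    ]
    return "\n".join(lines) + "\n"


def write_configs(base_dir: Path, ports) -> list:
    """Write <base_dir>/<port>/redis.conf for every port and return the paths."""
    paths = []
    for port in ports:
        node_dir = base_dir / str(port)
        node_dir.mkdir(parents=True, exist_ok=True)
        conf = node_dir / "redis.conf"
        conf.write_text(render_node_config(port))
        paths.append(conf)
    return paths


if __name__ == "__main__":
    # Assumption: a throwaway directory is used here instead of /opt/redis.
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_configs(Path(tmp), range(7001, 7007)):
            print(path)
```

Pointing `write_configs` at the real base directory would produce the /opt/redis/<port>/redis.conf layout that the start commands expect.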

 

redis-server  /opt/redis/7001/redis.conf

redis-server  /opt/redis/7002/redis.conf

redis-server  /opt/redis/7003/redis.conf

redis-server  /opt/redis/7004/redis.conf

redis-server  /opt/redis/7005/redis.conf

redis-server  /opt/redis/7006/redis.conf

 

3. Create the cluster

[root@redis1 redis]# cd /usr/local/src/redis-4.0.8/src/

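redis-trib.rb is given 6 nodes, and --replicas 1 assigns one replica to each master. With these 6 nodes, the first 3 become masters and the last 3 become replicas; which replica is paired with which master is not fixed.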

 

[root@redis1 src]# ./redis-trib.rb create --replicas 1 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006

>>> Creating cluster

>>> Performing hash slots allocation on 6 nodes...

Using 3 masters:

127.0.0.1:7001

127.0.0.1:7002

127.0.0.1:7003

Adding replica 127.0.0.1:7005 to 127.0.0.1:7001

Adding replica 127.0.0.1:7006 to 127.0.0.1:7002

Adding replica 127.0.0.1:7004 to 127.0.0.1:7003

>>> Trying to optimize slaves allocation for anti-affinity

[WARNING] Some slaves are in the same host as their master

M: 6a4393d5815659c50d5940d259d4ed30c29deaeb 127.0.0.1:7001

   slots:0-5460 (5461 slots) master

M: 00e22063a70187149a9c7cc7db2ebb5548f0c358 127.0.0.1:7002

   slots:5461-10922 (5462 slots) master

M: f64042bcc90635e040829c9dd84274c68d5467e3 127.0.0.1:7003

   slots:10923-16383 (5461 slots) master

S: fad5fb9e2a21e7f9e34b92389718acf59486ba01 127.0.0.1:7004

   replicates 00e22063a70187149a9c7cc7db2ebb5548f0c358

S: 28cff9d6b32aaf258bc481ee50cbc950e0f1fb05 127.0.0.1:7005

   replicates f64042bcc90635e040829c9dd84274c68d5467e3

S: 8ce18d5bd6f4be2595a42cb71040a750e41d4b01 127.0.0.1:7006

   replicates 6a4393d5815659c50d5940d259d4ed30c29deaeb

Can I set the above configuration? (type 'yes' to accept): yes

>>> Nodes configuration updated

>>> Assign a different config epoch to each node

>>> Sending CLUSTER MEET messages to join the cluster

Waiting for the cluster to join....

>>> Performing Cluster Check (using node 127.0.0.1:7001)

M: 6a4393d5815659c50d5940d259d4ed30c29deaeb 127.0.0.1:7001

   slots:0-5460 (5461 slots) master

   1 additional replica(s)

M: 00e22063a70187149a9c7cc7db2ebb5548f0c358 127.0.0.1:7002

   slots:5461-10922 (5462 slots) master

   1 additional replica(s)

S: 28cff9d6b32aaf258bc481ee50cbc950e0f1fb05 127.0.0.1:7005

   slots: (0 slots) slave

   replicates f64042bcc90635e040829c9dd84274c68d5467e3

M: f64042bcc90635e040829c9dd84274c68d5467e3 127.0.0.1:7003

   slots:10923-16383 (5461 slots) master

   1 additional replica(s)

S: fad5fb9e2a21e7f9e34b92389718acf59486ba01 127.0.0.1:7004

   slots: (0 slots) slave

   replicates 00e22063a70187149a9c7cc7db2ebb5548f0c358

S: 8ce18d5bd6f4be2595a42cb71040a750e41d4b01 127.0.0.1:7006

   slots: (0 slots) slave

   replicates 6a4393d5815659c50d5940d259d4ed30c29deaeb

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered.
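The master count and the slot ranges in the output above follow from simple arithmetic: 6 nodes with 1 replica each leave 3 masters, and the 16384 slots are cut into 3 contiguous ranges. The Python sketch below is one allocation rule that reproduces the ranges printed above; it is an illustration, not a copy of the redis-trib.rb source, and the function names are made up.

```python
TOTAL_SLOTS = 16384


def masters_for(node_count: int, replicas: int) -> int:
    """Number of masters when every master gets `replicas` replicas."""
    return node_count // (replicas + 1)


def split_slots(master_count: int):
    """Split slots 0..16383 into one contiguous (first, last) range per master."""
    per_master = TOTAL_SLOTS / master_count
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(master_count):
        # Round half up to choose the last slot of this range.
        last = int(cursor + per_master - 1 + 0.5)
        # The final master always takes whatever is left.
        if i == master_count - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


if __name__ == "__main__":
    for first, last in split_slots(masters_for(6, 1)):
        print(f"slots:{first}-{last} ({last - first + 1} slots)")
```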

4. Testing

4.1 Checking status

[root@redis1 ruby]# redis-cli -c -p 7003 -a 123456     ### -c connects in cluster mode

127.0.0.1:7003> cluster nodes

fad5fb9e2a21e7f9e34b92389718acf59486ba01 127.0.0.1:7004@17004 slave 00e22063a70187149a9c7cc7db2ebb5548f0c358 0 1525006603892 4 connected

28cff9d6b32aaf258bc481ee50cbc950e0f1fb05 127.0.0.1:7005@17005 slave f64042bcc90635e040829c9dd84274c68d5467e3 0 1525006604896 5 connected

00e22063a70187149a9c7cc7db2ebb5548f0c358 127.0.0.1:7002@17002 master - 0 1525006604000 2 connected 5461-10922

8ce18d5bd6f4be2595a42cb71040a750e41d4b01 127.0.0.1:7006@17006 slave 6a4393d5815659c50d5940d259d4ed30c29deaeb 0 1525006602883 6 connected

f64042bcc90635e040829c9dd84274c68d5467e3 127.0.0.1:7003@17003 myself,master - 0 1525006602000 3 connected 10923-16383

6a4393d5815659c50d5940d259d4ed30c29deaeb 127.0.0.1:7001@17001 master - 0 1525006605902 1 connected 0-5460

127.0.0.1:7003> cluster info

cluster_state:ok

cluster_slots_assigned:16384

cluster_slots_ok:16384

cluster_slots_pfail:0

cluster_slots_fail:0

cluster_known_nodes:6

cluster_size:3

cluster_current_epoch:6

cluster_my_epoch:3

cluster_stats_messages_ping_sent:1034

cluster_stats_messages_pong_sent:1090

cluster_stats_messages_meet_sent:3

cluster_stats_messages_sent:2127

cluster_stats_messages_ping_received:1086

cluster_stats_messages_pong_received:1037

cluster_stats_messages_meet_received:4

cluster_stats_messages_received:2127
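With -c, redis-cli follows redirections to whichever master owns a key's hash slot. Redis Cluster defines the slot as CRC16 of the key modulo 16384, hashing only the part between the first `{` and the next `}` when that part is non-empty (a hash tag). The Python sketch below computes it with the standard library; the function name `key_hash_slot` is illustrative.

```python
import binascii


def key_hash_slot(key: str) -> int:
    """Return the Redis Cluster hash slot (0..16383) for a key."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' is used.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with initial value 0 is the CRC16 variant
    # (polynomial 0x1021) that Redis Cluster uses.
    return binascii.crc_hqx(data, 0) % 16384


if __name__ == "__main__":
    for key in ("foo", "{user1000}.following", "{user1000}.followers"):
        print(key, key_hash_slot(key))
```

For example, `key_hash_slot("foo")` gives 12182, which falls in the 10923-16383 range owned by 127.0.0.1:7003 in the cluster above, and the two `{user1000}` keys land in the same slot.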

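4.2 Simulating a master crash

After killing the Redis instance on port 7001, the node first shows up as disconnected in the cluster state, and then a failover promotes its replica to master. When the 7001 instance is started again, it automatically rejoins the cluster as a replica.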
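One way to watch these role changes is to capture `cluster nodes` before and after the kill and compare the flags. The Python sketch below parses that output into dictionaries; the function name and the dictionary keys are illustrative, and the field order assumed is the one visible in section 4.1 (id, address, flags, master id, ping sent, pong received, config epoch, link state, slots).

```python
def parse_cluster_nodes(text: str):
    """Parse `cluster nodes` output into a list of dictionaries."""
    nodes = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) < 8:
            continue  # skip blank or truncated lines
        nodes.append({
            "id": parts[0],
            # Drop the "@17004" cluster-bus port suffix if present.
            "addr": parts[1].split("@")[0],
            "flags": parts[2].split(","),
            "master_id": None if parts[3] == "-" else parts[3],
            "link_state": parts[7],
            "slots": parts[8:],
        })
    return nodes


if __name__ == "__main__":
    sample = (
        "fad5fb9e2a21e7f9e34b92389718acf59486ba01 127.0.0.1:7004@17004 slave "
        "00e22063a70187149a9c7cc7db2ebb5548f0c358 0 1525006603892 4 connected\n"
        "f64042bcc90635e040829c9dd84274c68d5467e3 127.0.0.1:7003@17003 "
        "myself,master - 0 1525006602000 3 connected 10923-16383\n"
    )
    for node in parse_cluster_nodes(sample):
        role = "master" if "master" in node["flags"] else "replica"
        print(node["addr"], role, node["link_state"], node["slots"])
```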

 

 

5. Adding nodes

Add the master node first, then the replica.

 

5.1 Adding a master node

add-node new-node-IP:port existing-cluster-node-IP:port

[root@redis1 src]# ./redis-trib.rb add-node 127.0.0.1:7007  127.0.0.1:7001

>>> Adding node 127.0.0.1:7007 to cluster 127.0.0.1:7001

>>> Performing Cluster Check (using node 127.0.0.1:7001)

S: 6a4393d5815659c50d5940d259d4ed30c29deaeb 127.0.0.1:7001

   slots: (0 slots) slave

   replicates 8ce18d5bd6f4be2595a42cb71040a750e41d4b01

M: f64042bcc90635e040829c9dd84274c68d5467e3 127.0.0.1:7003

   slots:10923-16383 (5461 slots) master

   1 additional replica(s)

M: 00e22063a70187149a9c7cc7db2ebb5548f0c358 127.0.0.1:7002

   slots:5461-10922 (5462 slots) master

   1 additional replica(s)

S: 28cff9d6b32aaf258bc481ee50cbc950e0f1fb05 127.0.0.1:7005

   slots: (0 slots) slave

   replicates f64042bcc90635e040829c9dd84274c68d5467e3

S: fad5fb9e2a21e7f9e34b92389718acf59486ba01 127.0.0.1:7004

   slots: (0 slots) slave

   replicates 00e22063a70187149a9c7cc7db2ebb5548f0c358

M: 8ce18d5bd6f4be2595a42cb71040a750e41d4b01 127.0.0.1:7006

   slots:0-5460 (5461 slots) master

   1 additional replica(s)

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered.

>>> Send CLUSTER MEET to node 127.0.0.1:7007 to make it join the cluster.

[OK] New node added correctly.

5.2 Adding a replica node

[root@redis1 src]# ./redis-trib.rb add-node --slave 127.0.0.1:7008  127.0.0.1:7007

>>> Adding node 127.0.0.1:7008 to cluster 127.0.0.1:7007

>>> Performing Cluster Check (using node 127.0.0.1:7007)

M: 5091cdf86735dbf7688da813c5e5426bbeb1ce5e 127.0.0.1:7007

   slots: (0 slots) master

   0 additional replica(s)

M: 8ce18d5bd6f4be2595a42cb71040a750e41d4b01 127.0.0.1:7006

   slots:0-5460 (5461 slots) master

   1 additional replica(s)

M: 00e22063a70187149a9c7cc7db2ebb5548f0c358 127.0.0.1:7002

   slots:5461-10922 (5462 slots) master

   1 additional replica(s)

M: f64042bcc90635e040829c9dd84274c68d5467e3 127.0.0.1:7003

   slots:10923-16383 (5461 slots) master

   1 additional replica(s)

S: 28cff9d6b32aaf258bc481ee50cbc950e0f1fb05 127.0.0.1:7005

   slots: (0 slots) slave

   replicates f64042bcc90635e040829c9dd84274c68d5467e3

S: fad5fb9e2a21e7f9e34b92389718acf59486ba01 127.0.0.1:7004

   slots: (0 slots) slave

   replicates 00e22063a70187149a9c7cc7db2ebb5548f0c358

S: 6a4393d5815659c50d5940d259d4ed30c29deaeb 127.0.0.1:7001

   slots: (0 slots) slave

   replicates 8ce18d5bd6f4be2595a42cb71040a750e41d4b01

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered.

Automatically selected master 127.0.0.1:7007

>>> Send CLUSTER MEET to node 127.0.0.1:7008 to make it join the cluster.

Waiting for the cluster to join.

>>> Configure node as replica of 127.0.0.1:7007.

[OK] New node added correctly.

 

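5.3 Reallocating slots

A newly added node owns no slots by default; slots have to be moved to it manually by resharding.

./redis-trib.rb reshard 127.0.0.1:7001   #### this run failed with an error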
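The reshard command asks how many slots to move. For a roughly even layout, the new master should end up with its share of the 16384 slots. The small Python sketch below works out that number; the function name is made up for illustration.

```python
TOTAL_SLOTS = 16384


def slots_for_new_master(master_count_after: int) -> int:
    """Slots a newly added master should receive for a roughly even split."""
    if master_count_after < 1:
        raise ValueError("master_count_after must be at least 1")
    return TOTAL_SLOTS // master_count_after


if __name__ == "__main__":
    # Going from 3 masters to 4, as in this experiment.
    print(slots_for_new_master(4))
```

With four masters after the addition, this gives 4096 as the value to enter at the prompt.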

[root@redis1 src]#  ./redis-trib.rb reshard 127.0.0.1:7001

>>> Performing Cluster Check (using node 127.0.0.1:7001)

How many slots do you want to move (from 1 to 16384)? (enter the number of slots to migrate here)

What is the receiving node ID? 5091cdf86735dbf7688da813c5e5426bbeb1ce5e

Please enter all the source node IDs.

  Type 'all' to use all the nodes as source nodes for the hash slots.

  Type 'done' once you entered all the source nodes IDs.

Source node #1:all

 

Moving slot 5797 from 127.0.0.1:7002 to 127.0.0.1:7007:

Moving slot 5798 from 127.0.0.1:7002 to 127.0.0.1:7007:

[ERR] Calling MIGRATE: ERR Syntax error, try CLIENT (LIST | KILL | GETNAME | SETNAME | PAUSE | REPLY)

 

Running reshard again also fails:

[root@redis1 src]#  ./redis-trib.rb reshard 127.0.0.1:7007

 [WARNING] Node 127.0.0.1:7007 has slots in importing state (5798).

[WARNING] Node 127.0.0.1:7002 has slots in migrating state (5798).

[WARNING] The following slots are open: 5798

>>> Check slots coverage...

[OK] All 16384 slots covered.

*** Please fix your cluster problems before resharding

 

[root@redis1 src]#  ./redis-trib.rb fix 127.0.0.1:7007

>>> Performing Cluster Check (using node 127.0.0.1:7007)

M: 5091cdf86735dbf7688da813c5e5426bbeb1ce5e 127.0.0.1:7007

   slots:5461-5797 (337 slots) master

   0 additional replica(s)

M: 8ce18d5bd6f4be2595a42cb71040a750e41d4b01 127.0.0.1:7006

   slots:0-5460 (5461 slots) master

   1 additional replica(s)

M: 00e22063a70187149a9c7cc7db2ebb5548f0c358 127.0.0.1:7002

   slots:5798-10922 (5125 slots) master

   1 additional replica(s)

M: f64042bcc90635e040829c9dd84274c68d5467e3 127.0.0.1:7003

   slots:10923-16383 (5461 slots) master

   1 additional replica(s)

S: 28cff9d6b32aaf258bc481ee50cbc950e0f1fb05 127.0.0.1:7005

   slots: (0 slots) slave

   replicates f64042bcc90635e040829c9dd84274c68d5467e3

S: fad5fb9e2a21e7f9e34b92389718acf59486ba01 127.0.0.1:7004

   slots: (0 slots) slave

   replicates 00e22063a70187149a9c7cc7db2ebb5548f0c358

S: 6a4393d5815659c50d5940d259d4ed30c29deaeb 127.0.0.1:7001

   slots: (0 slots) slave

   replicates 8ce18d5bd6f4be2595a42cb71040a750e41d4b01

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

[WARNING] Node 127.0.0.1:7007 has slots in importing state (5798).

[WARNING] The following slots are open: 5798

>>> Fixing open slot 5798

Set as migrating in:

Set as importing in: 127.0.0.1:7007

>>> Moving all the 5798 slot keys to its owner 127.0.0.1:7002

Moving slot 5798 from 127.0.0.1:7007 to 127.0.0.1:7002:

>>> Setting 5798 as STABLE in 127.0.0.1:7007

>>> Check slots coverage...

[OK] All 16384 slots covered.

 

 

[root@redis1 src]#  ./redis-trib.rb check 127.0.0.1:7007

>>> Performing Cluster Check (using node 127.0.0.1:7007)

M: 5091cdf86735dbf7688da813c5e5426bbeb1ce5e 127.0.0.1:7007

   slots:5461-5797 (337 slots) master

   0 additional replica(s)

M: 8ce18d5bd6f4be2595a42cb71040a750e41d4b01 127.0.0.1:7006

   slots:0-5460 (5461 slots) master

   1 additional replica(s)

M: 00e22063a70187149a9c7cc7db2ebb5548f0c358 127.0.0.1:7002

   slots:5798-10922 (5125 slots) master

   1 additional replica(s)

M: f64042bcc90635e040829c9dd84274c68d5467e3 127.0.0.1:7003

   slots:10923-16383 (5461 slots) master

   1 additional replica(s)

S: 28cff9d6b32aaf258bc481ee50cbc950e0f1fb05 127.0.0.1:7005

   slots: (0 slots) slave

   replicates f64042bcc90635e040829c9dd84274c68d5467e3

S: fad5fb9e2a21e7f9e34b92389718acf59486ba01 127.0.0.1:7004

   slots: (0 slots) slave

   replicates 00e22063a70187149a9c7cc7db2ebb5548f0c358

S: 6a4393d5815659c50d5940d259d4ed30c29deaeb 127.0.0.1:7001

   slots: (0 slots) slave

   replicates 8ce18d5bd6f4be2595a42cb71040a750e41d4b01

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered

Resharding the slots again at this point completes without any error.

 

6. Removing nodes

6.1 Removing a replica node

A replica owns no slots, so it can be removed directly.

[root@redis1 src]# ./redis-trib.rb del-node 127.0.0.1:7008 79e368e25d010911fd0368332b41ecd2f3b62287

>>> Removing node 79e368e25d010911fd0368332b41ecd2f3b62287 from cluster 127.0.0.1:7008

>>> Sending CLUSTER FORGET messages to the cluster...

>>> SHUTDOWN the node.

6.2 Removing a master node

Before a master can be removed, all of its slots must first be migrated to other nodes.
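Whether a master is empty can be read off the `slots:` field printed by `redis-trib.rb check`. The Python sketch below counts the slots in such a range list, so a script could refuse to call del-node while the count is non-zero; the function names are illustrative.

```python
def count_slots(ranges: str) -> int:
    """Count slots in a comma-separated list such as '0-5460,7000'."""
    total = 0
    for part in ranges.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            total += int(high) - int(low) + 1
        else:
            total += 1  # a single slot number
    return total


def safe_to_delete(ranges: str) -> bool:
    """A master may be removed only once it owns no slots."""
    return count_slots(ranges) == 0


if __name__ == "__main__":
    # Range taken from the check output for 127.0.0.1:7007 in section 5.3.
    print(count_slots("5461-5797"), safe_to_delete("5461-5797"))
    print(count_slots(""), safe_to_delete(""))
```

The first call reports 337 slots, matching the check output for 7007, so that node still has to be emptied with reshard before del-node.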

[root@redis1 src]# ./redis-trib.rb reshard 127.0.0.1:7007

 

[root@redis1 src]# ./redis-trib.rb del-node 127.0.0.1:7007 5091cdf86735dbf7688da813c5e5426bbeb1ce5e

 

Problems encountered

1. can't connect to node

[root@redis1 src]# ./redis-trib.rb create --replicas 1 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006

>>> Creating cluster

[ERR] Sorry, can't connect to node 127.0.0.1:7001

 

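This happens when the Redis instances have a password set; it is best to create the cluster before setting a password.

 

Fix 1:

Edit the file /usr/local/rvm/gems/ruby-2.4.1/gems/redis-4.0.1/lib/redis/client.rb

Fix 2:

Remove the password from Redis, create the cluster, and set the password once the cluster exists.

--- After the cluster is created, set the password (on each node)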

config set masterauth abc 

config set requirepass abc 

config rewrite 
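CONFIG SET only affects the node it is sent to, so the three commands above have to be repeated on every instance. The Python sketch below just prints the redis-cli invocations for a list of ports; it does not run anything. The function name is made up, and passing the password with -a on the last command reflects the fact that a new connection must authenticate once requirepass is set.

```python
def password_commands(ports, password: str):
    """Build the redis-cli command lines that set a password on each node."""
    commands = []
    for port in ports:
        base = f"redis-cli -p {port}"
        # masterauth first, so replication keeps working once requirepass is on.
        commands.append(f"{base} config set masterauth {password}")
        commands.append(f"{base} config set requirepass {password}")
        # requirepass is now active, so this new connection needs -a.
        commands.append(f"{base} -a {password} config rewrite")
    return commands


if __name__ == "__main__":
    for command in password_commands(range(7001, 7007), "abc"):
        print(command)
```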

 

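2. redis-trib.rb fails to run because the Ruby environment is missing

Building Ruby from source kept failing; in the end Ruby was installed through rvm (section 1.1) and the redis library through gem.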

 

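Cluster parameter reference

1. cluster-enabled

Set this to yes to enable Redis Cluster support in a given Redis instance. Otherwise the instance starts as a standalone instance as usual.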

 

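2. cluster-config-file

Despite its name, this is not a user-editable configuration file. It is the file in which a Redis Cluster node automatically persists the cluster configuration (essentially its state) every time something changes, so that it can be read back at startup. The file lists the other nodes in the cluster, their state, persistent variables, and so on. It is often rewritten and flushed to disk as a result of receiving a message.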

 

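3. cluster-node-timeout

The maximum time a Redis Cluster node can be unavailable without being considered failed. If a master is unreachable for longer than this, it is failed over by its replicas. This parameter also controls other important behavior in Redis Cluster. Notably, any node that cannot reach the majority of masters for this amount of time stops accepting queries.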

 

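4. cluster-slave-validity-factor

If set to 0, a replica always tries to fail over its master, no matter how long the link between the two has been down. If the value is positive, a maximum disconnection time is computed as the node timeout multiplied by this factor, and a replica whose link to the master has been down for longer than that will not try to start a failover. For example, with a node timeout of 5 seconds and a validity factor of 10, a replica disconnected from its master for more than 50 seconds will not try to fail over that master. Note that any non-zero value can leave Redis Cluster unavailable after a master failure if no replica is able to fail it over. In that case the cluster becomes available again only when the original master rejoins.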
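The 5 seconds times 10 example can be written out as a small calculation. The Python sketch below is illustrative only: the function name is made up, and the optional `repl_ping_period_ms` argument reflects the redis.conf description of this option, which adds the replication ping period on top of the product; it defaults to 0 so the result matches the simplified figure in the text.

```python
def max_disconnect_ms(node_timeout_ms: int, validity_factor: int,
                      repl_ping_period_ms: int = 0):
    """Longest master-link outage after which a replica still attempts failover.

    Returns None when the factor is 0, meaning there is no limit.
    """
    if validity_factor == 0:
        return None
    return node_timeout_ms * validity_factor + repl_ping_period_ms


if __name__ == "__main__":
    # 5 second node timeout, factor 10, as in the example above.
    print(max_disconnect_ms(5000, 10))
    # Factor 0: the replica always tries to fail over.
    print(max_disconnect_ms(15000, 0))
```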

 

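5. cluster-migration-barrier

The minimum number of replicas a master must keep connected before one of its other replicas is allowed to migrate to a master that is no longer covered by any replica. For more information, see the section on replica migration in the official Redis Cluster tutorial.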

 

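6. cluster-require-full-coverage

If set to yes, which is the default, the cluster stops accepting writes when some part of the key space is not covered by any node. If set to no, the cluster keeps serving queries even though only requests for a subset of the keys can be processed.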

 

 

 

 
