Redis Sentinel and Master+Slave replication: implementation and how it works

Author QiuRiMangCao 秋日芒草


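Single node

server01
server02 } redis, single node
server03

Master/slave

server01
server02 } redis (master, slave) [data backup] [read/write splitting]
server03

The slave takes read load off the master. If the master goes down, the slave still serves reads but rejects writes, so the service is effectively unavailable.

Cluster

server01 }       { node2 (master, slave)
server02 } redis { node1 (master, slave)
server03 }       { node3 (master, slave)

Each node is a master/slave pair with the same data backup and read/write splitting. If a node's master goes down, its slave is read-only, so that whole node becomes unavailable.

Sentinel

server01
server02 } redis (master, slave, redis-sentinel)
server03

Sentinel switches the master and slave roles, so writes remain possible after a failure.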


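Master/slave configuration (master, slave)

Edit redis.conf to set up master/slave replication.

To run the server in the background, either start it with a trailing & as below, or set daemonize yes in redis.conf.

./redis-server &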

Start with a specific config file

./redis-server /etc/redis/6379.conf
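
As a rough sketch of what the replication part of such config files might contain, the snippet below writes a minimal master config and a minimal slave config. The ports 1000 and 1001 and the slaveof line are the ones that appear later in this walkthrough. The temporary directory and the file names master.conf and slave.conf are assumptions made for illustration, and a real redis.conf carries many more settings.

```shell
# Hypothetical working directory, used only for this illustration.
conf_dir="$(mktemp -d)"

# Master on port 1000 (the port seen in the logs of this walkthrough).
cat > "$conf_dir/master.conf" <<'EOF'
port 1000
daemonize yes
EOF

# Slave on port 1001, replicating from the master on port 1000.
cat > "$conf_dir/slave.conf" <<'EOF'
port 1001
daemonize yes
slaveof 127.0.0.1 1000
EOF

# Show the line that turns the second instance into a slave.
grep slaveof "$conf_dir/slave.conf"
```

Each file could then be passed to redis-server in the same way as the command above, for example ./redis-server "$conf_dir/slave.conf".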

Connect on a specific port

redis-cli -p 6380

Start

redis-server /etc/myredis/redis.config

Then check whether it started successfully

redis-cli ping

If Redis was not started by a script, stop it with the redis-cli shutdown command.

Command:

redis-cli -p 8888 shutdown

Show the Redis version

[root@localhost bin]# ./redis-server -v

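Jumping to a line in vim

Jump to the end of the file
Type a colon (:) to open the command line
Enter the command: $

Jump to the start of the file
Type a colon (:) to open the command line
Enter the command 1 (the digit one, not a lowercase L)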

Master log output

3311:M 20 Oct 15:37:58.872 * Ready to accept connections
3311:M 20 Oct 15:39:45.855 * Slave 127.0.0.1:1001 asks for synchronization
3311:M 20 Oct 15:39:45.855 * Full resync requested by slave 127.0.0.1:1001
3311:M 20 Oct 15:39:45.855 * Starting BGSAVE for SYNC with target: disk
3311:M 20 Oct 15:39:45.855 * Background saving started by pid 3325
3325:C 20 Oct 15:39:45.859 * DB saved on disk
3325:C 20 Oct 15:39:45.860 * RDB: 0 MB of memory used by copy-on-write
3311:M 20 Oct 15:39:45.932 * Background saving terminated with success
3311:M 20 Oct 15:39:45.933 * Synchronization with slave 127.0.0.1:1001 succeeded

Slave log output

3321:S 20 Oct 15:39:45.854 * Connecting to MASTER 127.0.0.1:1000
3321:S 20 Oct 15:39:45.854 * MASTER <-> SLAVE sync started
3321:S 20 Oct 15:39:45.855 * Non blocking connect for SYNC fired the event.
3321:S 20 Oct 15:39:45.855 * Master replied to PING, replication can continue...
3321:S 20 Oct 15:39:45.855 * Partial resynchronization not possible (no cached master)
3321:S 20 Oct 15:39:45.855 * Full resync from master: 1a326d8a3bc1af413789dfa9dca65954072418d5:0
3321:S 20 Oct 15:39:45.933 * MASTER <-> SLAVE sync: receiving 175 bytes from master
3321:S 20 Oct 15:39:45.933 * MASTER <-> SLAVE sync: Flushing old data
3321:S 20 Oct 15:39:45.933 * MASTER <-> SLAVE sync: Loading DB in memory
3321:S 20 Oct 15:39:45.933 * MASTER <-> SLAVE sync: Finished with success

List all keys in Redis

127.0.0.1:1000> keys *

Write a value to the master

[root@localhost redis]# ./bin/redis-cli -p 1000
127.0.0.1:1000> set password 123456

The slave has synced the data

[root@localhost redis]# ./bin/redis-cli -p 1001
127.0.0.1:1001> keys *
1) "password"

The slave rejects writes

127.0.0.1:1001> set user:password zhangsan:123456
(error) READONLY You can't write against a read only slave.

View server information

127.0.0.1:1000> info

Server

redis_version:4.0.1
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:92e72a18d61bfe4f
redis_mode:standalone
os:Linux 3.10.0-693.el7.x86_64 x86_64

Replication info: master

# Replication
role:master # role
connected_slaves:1
# slave0:ip=127.0.0.1,port=1001 - slave info
slave0:ip=127.0.0.1,port=1001,state=online,offset=1084,lag=1
master_replid:1a326d8a3bc1af413789dfa9dca65954072418d5
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1084
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:1084

Replication info: slave

# Replication
role:slave
master_host:127.0.0.1
master_port:1000
master_link_status:up
master_last_io_seconds_ago:5
master_sync_in_progress:0
slave_repl_offset:1602
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:1a326d8a3bc1af413789dfa9dca65954072418d5
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1602
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:1602

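Difference between memcached and Redis: Redis can persist its data to disk, whereas memcached cannot.

Communication process: when a slave starts, it sends a sync command to the master. The master sends a snapshot of its data to the slave as a file, and after that new writes are streamed to the slave incrementally, with a heartbeat keeping the link alive. During the sync the master stays non-blocking (it saves in the background, as the BGSAVE lines in the master log show), while the slave blocks. The purpose is to keep application servers from reading inconsistent data while the slave is still syncing.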
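
One way to see how far the incremental sync has progressed is to compare the offsets in the INFO output shown earlier. The helper below is a small sketch that reads INFO replication text from the master on stdin and prints, for each slave, its port and how many bytes it is behind master_repl_offset. The function name repl_lag_bytes is made up for this illustration, and the sample offset of 1000 is an invented value; only the line format comes from the output above.

```shell
# Print "<slave port> <bytes behind>" for every slaveN line in the
# master's INFO replication output read from stdin.
repl_lag_bytes() {
  awk -F: '
    /^master_repl_offset:/ { master = $2 + 0 }
    /^slave[0-9]+:/ {
      # $2 looks like: ip=127.0.0.1,port=1001,state=online,offset=1084,lag=1
      n = split($2, kv, ",")
      for (i = 1; i <= n; i++) {
        split(kv[i], pair, "=")
        if (pair[1] == "port")   port = pair[2]
        if (pair[1] == "offset") off[port] = pair[2] + 0
      }
    }
    END { for (p in off) print p, master - off[p] }
  '
}

# Sample input in the format shown above (offset 1000 is an invented value).
printf '%s\n' \
  'slave0:ip=127.0.0.1,port=1001,state=online,offset=1000,lag=1' \
  'master_repl_offset:1084' | repl_lag_bytes
# prints: 1001 84
```

Against a live master it could be fed with something like ./bin/redis-cli -p 1000 info replication | repl_lag_bytes.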


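Master/slave configuration (master, slave) + redis-sentinel for high availability

Redis replication by itself does not switch master and slave roles, so redis-sentinel is used to perform the master/slave switch automatically.

redis-sentinel can monitor several groups at once, that is, several (master-slave) sets.

A (master-slave) set can contain multiple slaves, which lets Sentinel fail over to any one of them.

The Sentinel config file is modified dynamically at runtime: when the master becomes unavailable, Sentinel rewrites the file to point at the promoted slave, and a restarted Sentinel restores its state from that file.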

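How does Sentinel know, over the network, whether the master is healthy? It decides based on a ping/pong exchange with the instance.

However, the network can be unstable, so one or several pings may fail even when the server is fine. Sentinel therefore applies a rule before declaring a failure: the Sentinels hold an election (a vote), and the required count must be reached before a new master can be chosen.

Run an odd number of Sentinels so that the vote can reach a majority. If more than half of them are down, the remaining Sentinels cannot reach a majority, no failover can happen, and the whole setup is effectively down.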
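
The counting rule can be sketched with two small shell helpers. This only illustrates the arithmetic and is not Sentinel's actual code; the function names is_odown and failover_majority are made up here. It assumes the behaviour described in the Redis Sentinel documentation: a master is treated as objectively down once at least quorum Sentinels report it as down, and a failover additionally needs the votes of a majority of all known Sentinels.

```shell
# Succeeds (exit status 0) when enough Sentinels agree the master is down.
# $1 = number of Sentinels reporting the master as down
# $2 = configured quorum
is_odown() {
  [ "$1" -ge "$2" ]
}

# Number of votes that form a majority among $1 known Sentinels.
failover_majority() {
  echo $(( $1 / 2 + 1 ))
}

# With a single Sentinel and quorum 1, one report is already enough.
if is_odown 1 1; then
  echo "objectively down"
fi

# With three Sentinels, two votes are needed to authorize a failover.
failover_majority 3
```

This is also why an odd count is convenient: three Sentinels tolerate one failure, while four still tolerate only one.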

Move the sentinel config file into its directory

mv ./sentinel.conf ./sentinel

Move that directory up one level

mv sentinel/ ../

Starting Sentinel

[root@localhost bin]# ./redis-sentinel ../redis-pub/sentinel/sentinel.conf
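
For reference, a minimal sentinel.conf matching this setup might look like the sketch below. The master name mymaster, the address 127.0.0.1 1000, the quorum of 1 and port 26379 are taken from the logs and process listing in this walkthrough, and 30000 ms corresponds to the 30-second default mentioned further down. The temporary file path is an assumption for illustration, and the author's real file is not shown in full.

```shell
# Hypothetical location for the sketch; the walkthrough's real file
# lives at ../redis-pub/sentinel/sentinel.conf.
sentinel_conf="$(mktemp)"

cat > "$sentinel_conf" <<'EOF'
port 26379
# sentinel monitor <master-name> <ip> <port> <quorum>
sentinel monitor mymaster 127.0.0.1 1000 1
# Consider the master down after 30 seconds without a valid reply.
sentinel down-after-milliseconds mymaster 30000
EOF

cat "$sentinel_conf"
```

Because Sentinel rewrites this file while it runs, it has to be writable by the Sentinel process.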

Check that Sentinel, the master, and the slave are running

[root@localhost bin]# ps -ef | grep redis
root 3311 1 0 15:37 ? 00:00:05 ./bin/redis-server 127.0.0.1:1000
root 3321 1 0 15:39 ? 00:00:05 ./bin/redis-server 127.0.0.1:1001
root 3336 1726 0 15:44 pts/1 00:00:00 ./bin/redis-cli -p 1000
root 3532 1802 0 16:55 pts/2 00:00:00 ./redis-sentinel *:26379 [sentinel]
root 3537 2678 0 16:56 pts/3 00:00:00 grep --color=auto redis

The master and slave that Sentinel monitors at startup

3532:X 20 Oct 16:55:58.458 # Sentinel ID is ddcf5dd45ac986e979558ce338948d3bc463a9d5
3532:X 20 Oct 16:55:58.458 # +monitor master mymaster 127.0.0.1 1000 quorum 1
3532:X 20 Oct 16:55:58.459 * +slave slave 127.0.0.1:1001 127.0.0.1 1001 @ mymaster 127.0.0.1 1000

With Sentinel configured, writing to the slave still fails

[root@localhost bin]# ./redis-cli -p 1001
127.0.0.1:1001> get password
"123456"
127.0.0.1:1001> set username zhaojian
(error) READONLY You can't write against a read only slave.

Stop the master on port 1000, then check Sentinel and the slave

[root@localhost redis]# ./bin/redis-cli -p 1000 shutdown
[root@localhost redis]# ps -ef | grep redis
root 3321 1 0 15:39 ? 00:00:05 ./bin/redis-server 127.0.0.1:1001
root 3532 1802 0 16:55 pts/2 00:00:00 ./redis-sentinel *:26379 [sentinel]
root 3538 1373 0 17:00 pts/0 00:00:00 ./redis-cli -p 1001
root 3578 1726 0 17:02 pts/1 00:00:00 grep --color=auto redis

Once the 30 seconds set in the config file have passed, the terminal where Sentinel was started prints the following log

3532:X 20 Oct 16:55:58.455 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
3532:X 20 Oct 16:55:58.458 # Sentinel ID is ddcf5dd45ac986e979558ce338948d3bc463a9d5
3532:X 20 Oct 16:55:58.458 # +monitor master mymaster 127.0.0.1 1000 quorum 1
3532:X 20 Oct 16:55:58.459 * +slave slave 127.0.0.1:1001 127.0.0.1 1001 @ mymaster 127.0.0.1 1000
3532:X 20 Oct 17:03:11.847 # +sdown master mymaster 127.0.0.1 1000
3532:X 20 Oct 17:03:11.847 # +odown master mymaster 127.0.0.1 1000 #quorum 1/1
3532:X 20 Oct 17:03:11.847 # +new-epoch 1
3532:X 20 Oct 17:03:11.847 # +try-failover master mymaster 127.0.0.1 1000
3532:X 20 Oct 17:03:11.850 # +vote-for-leader ddcf5dd45ac986e979558ce338948d3bc463a9d5 1
3532:X 20 Oct 17:03:11.850 # +elected-leader master mymaster 127.0.0.1 1000
3532:X 20 Oct 17:03:11.850 # +failover-state-select-slave master mymaster 127.0.0.1 1000
3532:X 20 Oct 17:03:11.927 # +selected-slave slave 127.0.0.1:1001 127.0.0.1 1001 @ mymaster 127.0.0.1 1000
3532:X 20 Oct 17:03:11.927 * +failover-state-send-slaveof-noone slave 127.0.0.1:1001 127.0.0.1 1001 @ mymaster 127.0.0.1 1000
3532:X 20 Oct 17:03:11.999 * +failover-state-wait-promotion slave 127.0.0.1:1001 127.0.0.1 1001 @ mymaster 127.0.0.1 1000
3532:X 20 Oct 17:03:12.006 # +promoted-slave slave 127.0.0.1:1001 127.0.0.1 1001 @ mymaster 127.0.0.1 1000
3532:X 20 Oct 17:03:12.007 # +failover-state-reconf-slaves master mymaster 127.0.0.1 1000
3532:X 20 Oct 17:03:12.065 # +failover-end master mymaster 127.0.0.1 1000

The master has moved from 1000 to 1001, so the master/slave switch succeeded

3532:X 20 Oct 17:03:12.066 # +switch-master mymaster 127.0.0.1 1000 127.0.0.1 1001
3532:X 20 Oct 17:03:12.066 * +slave slave 127.0.0.1:1000 127.0.0.1 1000 @ mymaster 127.0.0.1 1001
3532:X 20 Oct 17:03:42.085 # +sdown slave 127.0.0.1:1000 127.0.0.1 1000 @ mymaster 127.0.0.1 1001

The former slave now has the master role. This is the Sentinel mechanism at work: it was monitoring the instances and has promoted the slave to master.

# Replication
role:master
connected_slaves:0
master_replid:41997ee7b0607cda24f609821af975cd5b3c802f
master_replid2:1a326d8a3bc1af413789dfa9dca65954072418d5
master_repl_offset:32899
second_repl_offset:32900
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:32899

Writing a value to 1001 (the former slave) now succeeds

127.0.0.1:1001> set username zhaojian
OK

Connecting to 1000 now fails, because it has been shut down

Start Redis on 1000 again

[root@localhost bin]# ./redis-server ../redis-pub/master/redis.conf

1000 has now become a slave

# Replication
role:slave
master_host:127.0.0.1
master_port:1001
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:51801
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:7ad3eb737bc808a362da9b06c2943f4b35711de5
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:51801
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:32900
repl_backlog_histlen:18902

Because it is a slave, it cannot be written to

127.0.0.1:1000> keys *
1) "password"
2) "username"
127.0.0.1:1000> set age 11
(error) READONLY You can't write against a read only slave.

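Summary: the master and slave config files were not edited by hand during all of this; it is the Sentinel config file that keeps changing dynamically.

View the Sentinel config file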

[root@localhost sentinel]# cat sentinel.conf

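The earlier master connection info has changed to the following

sentinel myid ddcf5dd45ac986e979558ce338948d3bc463a9d5

The originally configured port 1000 has been changed dynamically to 1001

# Default is 30 seconds.
sentinel monitor mymaster 127.0.0.1 1001 1

The following lines were appended dynamically at the end; they describe the current master and slave state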

sentinel known-slave mymaster 127.0.0.1 1000
sentinel current-epoch 1

Stop 1000 and 1001 but keep Sentinel running. Note that 1001 was originally configured with slaveof 127.0.0.1 1000.

[root@localhost redis]# ./bin/redis-cli -p 1000 shutdown
[root@localhost redis]# ./bin/redis-cli -p 1001 shutdown

Restart both instances

[root@localhost bin]# ./redis-server ../redis-pub/master/redis.conf
[root@localhost bin]# ./redis-server ../redis-pub/slave/redis.conf
[root@localhost bin]# ps -ef | grep redis
root 3532 1802 0 16:55 pts/2 00:00:05 ./redis-sentinel *:26379 [sentinel]
root 3665 1 0 17:37 ? 00:00:00 ./redis-server 127.0.0.1:1000
root 3670 1 0 17:37 ? 00:00:00 ./redis-server 127.0.0.1:1001
root 3676 1373 0 17:37 pts/0 00:00:00 grep --color=auto redis

Connect a client to 1000 and view its info

[root@localhost redis]# ./bin/redis-cli -p 1000
127.0.0.1:1000> info

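After the restart, 1000 is still a slave and has not gone back to being the master. Sentinel was never stopped, which shows that Sentinel is what keeps the roles in place.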

# Replication
role:slave
master_host:127.0.0.1
master_port:1001
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:96505
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:8c54c599d796c83615067b1b665502a81061285a
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:96505
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576

How would you switch the master and slave back? You can edit the Sentinel config directly, or kill the Sentinel process and then restart master + slave + sentinel.
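
Another option, not used in this walkthrough, is to ask Sentinel itself to switch roles through its SENTINEL FAILOVER command. The sketch below wraps that call in small helpers. The function names and the REDIS_CLI variable are assumptions for illustration, while mymaster and port 26379 come from the logs above.

```shell
# Path to redis-cli; override it if the binary is not on PATH
# (for example REDIS_CLI=./bin/redis-cli).
REDIS_CLI="${REDIS_CLI:-redis-cli}"

# Ask the Sentinel on port $2 (default 26379) to fail over master group $1.
force_failover() {
  "$REDIS_CLI" -p "${2:-26379}" sentinel failover "$1"
}

# Ask the same Sentinel which address it currently treats as the master.
current_master() {
  "$REDIS_CLI" -p "${2:-26379}" sentinel get-master-addr-by-name "$1"
}
```

With Sentinel and both instances running, force_failover mymaster followed by current_master mymaster would be one way to check whether 1000 has been promoted again. This was not tested as part of the original walkthrough.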

