A Brief Analysis of Redis Sentinel Startup and Failover

Overview of How Sentinel Works

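Sentinel is a key part of Redis high availability (HA). In a Redis master/slave deployment it monitors the state of replication and, when the master fails, automatically promotes a slave to master so that service continues.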

The figure below shows an example of a Sentinel cluster monitoring a Redis master/slave deployment:

[Figure 1: A Sentinel cluster monitoring a Redis master and its slaves]

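1. The Sentinel cluster consists of three Sentinel nodes: sentinel1, sentinel2 and sentinel3. The Sentinel nodes also monitor one another's state.

2. Each Sentinel node sends PING to the Redis master to check whether it is alive.

3. If the Sentinel cluster detects that the Redis master is down and it does not recover within the configured time, the cluster performs a failover.

3.1 First, the Sentinel cluster picks the slave with the highest promotion priority (the lowest non-zero slave-priority value) and promotes it to master.

3.2 Next, the Sentinel leader sends SLAVEOF to the remaining slaves of the old master so that they replicate from the new master. Once they have synchronized with it, the failover is complete.

3.3 Finally, the Sentinel cluster keeps monitoring the old master. When the old master comes back online, Sentinel reconfigures it as a slave of the new master.

3.4 The topology after failover is shown below. In the figure, slave-1 has been elected as the new master.

[Figure 2: Topology after failover, with slave-1 promoted to master]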

I. Sentinel Node Initialization

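After a Sentinel node has collected runtime data from the Redis master, its slaves and the other Sentinel nodes, it holds an in-memory structure like the one shown below. The initialization process is explained with reference to this figure.

[Figure 3: In-memory structures of a Sentinel node]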

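After a Sentinel node starts, it performs the following steps in order:

1. Load the Sentinel-specific code

A Sentinel server is essentially an ordinary Redis server with no persistence and a restricted feature set.

  • No persistence: Sentinel does not use Redis's RDB or AOF, so its in-memory runtime data is not written to a data file.
  • Restricted features: at startup Sentinel uses Sentinel-specific code, such as its own default configuration values and command table, which can be found in the source file src/sentinel.c.

Default configuration values that are loaded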

/* A Sentinel Redis Instance object is monitoring. */
#define SRI_MASTER  (1<<0)
#define SRI_SLAVE   (1<<1)
#define SRI_SENTINEL (1<<2)
#define SRI_S_DOWN (1<<3)   /* Subjectively down (no quorum). */
#define SRI_O_DOWN (1<<4)   /* Objectively down (confirmed by others). */
#define SRI_MASTER_DOWN (1<<5) /* A Sentinel with this flag set thinks that its master is down. */
#define SRI_FAILOVER_IN_PROGRESS (1<<6) /* Failover is in progress for this master. */
#define SRI_PROMOTED (1<<7)            /* Slave selected for promotion. */
#define SRI_RECONF_SENT (1<<8)     /* SLAVEOF <master> sent. */
#define SRI_RECONF_INPROG (1<<9)   /* Slave synchronization in progress. */
#define SRI_RECONF_DONE (1<<10)     /* Slave synchronized with new master. */
#define SRI_FORCE_FAILOVER (1<<11)  /* Force failover with master up. */
#define SRI_SCRIPT_KILL_SENT (1<<12) /* SCRIPT KILL already sent on -BUSY */

/* Note: times are in milliseconds. */
#define SENTINEL_INFO_PERIOD 10000
#define SENTINEL_PING_PERIOD 1000
#define SENTINEL_ASK_PERIOD 1000
#define SENTINEL_PUBLISH_PERIOD 2000
#define SENTINEL_DEFAULT_DOWN_AFTER 30000
#define SENTINEL_HELLO_CHANNEL "__sentinel__:hello"
#define SENTINEL_TILT_TRIGGER 2000
#define SENTINEL_TILT_PERIOD (SENTINEL_PING_PERIOD*30)
#define SENTINEL_DEFAULT_SLAVE_PRIORITY 100
#define SENTINEL_SLAVE_RECONF_TIMEOUT 10000
#define SENTINEL_DEFAULT_PARALLEL_SYNCS 1
#define SENTINEL_MIN_LINK_RECONNECT_PERIOD 15000
#define SENTINEL_DEFAULT_FAILOVER_TIMEOUT (60*3*1000)
#define SENTINEL_MAX_PENDING_COMMANDS 100
#define SENTINEL_ELECTION_TIMEOUT 10000
#define SENTINEL_MAX_DESYNC 1000
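
The SRI_* values above are bit flags that are OR-ed together in an instance's flags field. The sketch below shows the same idea in Python; the class name SRI is ours, only a subset of the flags is reproduced, and the bit positions are copied from the defines above.

```python
from enum import IntFlag

class SRI(IntFlag):
    """Subset of the SRI_* bit flags listed above (same bit positions)."""
    MASTER = 1 << 0
    SLAVE = 1 << 1
    SENTINEL = 1 << 2
    S_DOWN = 1 << 3
    O_DOWN = 1 << 4

# A master that this Sentinel considers subjectively down.
flags = SRI.MASTER | SRI.S_DOWN

# Setting and clearing a bit mirrors `flags |= X` and `flags &= ~X` in C.
flags |= SRI.O_DOWN
is_odown = bool(flags & SRI.O_DOWN)
flags &= ~SRI.S_DOWN
```

Because every state is a single bit, one integer can record that an instance is, for example, a master and objectively down at the same time.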

Default command table that is loaded

struct redisCommand sentinelcmds[] = {
    {"ping",pingCommand,1,"",0,NULL,0,0,0,0,0},
    {"sentinel",sentinelCommand,-2,"",0,NULL,0,0,0,0,0},
    {"subscribe",subscribeCommand,-2,"",0,NULL,0,0,0,0,0},
    {"unsubscribe",unsubscribeCommand,-1,"",0,NULL,0,0,0,0,0},
    {"psubscribe",psubscribeCommand,-2,"",0,NULL,0,0,0,0,0},
    {"punsubscribe",punsubscribeCommand,-1,"",0,NULL,0,0,0,0,0},
    {"publish",sentinelPublishCommand,3,"",0,NULL,0,0,0,0,0},
    {"info",sentinelInfoCommand,-1,"",0,NULL,0,0,0,0,0},
    {"role",sentinelRoleCommand,1,"l",0,NULL,0,0,0,0,0},
    {"client",clientCommand,-2,"rs",0,NULL,0,0,0,0,0},
    {"shutdown",shutdownCommand,-1,"",0,NULL,0,0,0,0,0}
};
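
Because Sentinel installs this table in place of the normal Redis command table, ordinary data commands such as SET are not available on a Sentinel. A minimal sketch of that lookup follows; the function name is_supported is hypothetical and the command names are copied from the table above.

```python
# Command names copied from the sentinelcmds[] table above.
SENTINEL_COMMANDS = {
    "ping", "sentinel", "subscribe", "unsubscribe", "psubscribe",
    "punsubscribe", "publish", "info", "role", "client", "shutdown",
}

def is_supported(command_line: str) -> bool:
    """Return True if the first word is in the Sentinel command table."""
    parts = command_line.split()
    return bool(parts) and parts[0].lower() in SENTINEL_COMMANDS
```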

2. Initialize the Sentinel state structure sentinelState

The sentinelState structure holds all Sentinel-related state of the server.

struct sentinelState {
    char myid[CONFIG_RUN_ID_SIZE+1]; /* This sentinel ID. */
    uint64_t current_epoch;         /* Current epoch. */
    dict *masters;      /* Dictionary of master sentinelRedisInstances.
                           Key is the instance name, value is the
                           sentinelRedisInstance structure pointer. */
    int tilt;           /* Are we in TILT mode? */
    int running_scripts;    /* Number of scripts in execution right now. */
    mstime_t tilt_start_time;       /* When TITL started. */
    mstime_t previous_time;         /* Last time we ran the time handler. */
    list *scripts_queue;            /* Queue of user scripts to execute. */
    char *announce_ip;  /* IP addr that is gossiped to other sentinels if
                           not NULL. */
    int announce_port;  /* Port that is gossiped to other sentinels if
                           non zero. */
    unsigned long simfailure_flags; /* Failures simulation. */
} sentinel;

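The masters field is a dictionary. Each key is the name of a monitored master, and each value is a pointer to a sentinelRedisInstance structure. The same structure is used for masters, slaves and Sentinels; which of its fields are meaningful depends on the instance type recorded in flags.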

typedef struct sentinelRedisInstance {
    int flags;      /* See SRI_... defines */
    char *name;     /* Master name from the point of view of this sentinel. */
    char *runid;    /* Run ID of this instance, or unique ID if is a Sentinel.*/
    uint64_t config_epoch;  /* Configuration epoch. */
    sentinelAddr *addr; /* Master host. */
    instanceLink *link; /* Link to the instance, may be shared for Sentinels. */
    mstime_t last_pub_time;   /* Last time we sent hello via Pub/Sub. */
    mstime_t last_hello_time; /* Only used if SRI_SENTINEL is set. Last time
                                 we received a hello from this Sentinel
                                 via Pub/Sub. */
    mstime_t last_master_down_reply_time; /* Time of last reply to
                                             SENTINEL is-master-down command. */
    mstime_t s_down_since_time; /* Subjectively down since time. */
    mstime_t o_down_since_time; /* Objectively down since time. */
    mstime_t down_after_period; /* Consider it down after that period. */
    mstime_t info_refresh;  /* Time at which we received INFO output from it. */

    /* Role and the first time we observed it.
     * This is useful in order to delay replacing what the instance reports
     * with our own configuration. We need to always wait some time in order
     * to give a chance to the leader to report the new configuration before
     * we do silly things. */
    int role_reported;
    mstime_t role_reported_time;
    mstime_t slave_conf_change_time; /* Last time slave master addr changed. */

    /* Master specific. */
    dict *sentinels;    /* Other sentinels monitoring the same master. */
    dict *slaves;       /* Slaves for this master instance. */
    unsigned int quorum;/* Number of sentinels that need to agree on failure. */
    int parallel_syncs; /* How many slaves to reconfigure at same time. */
    char *auth_pass;    /* Password to use for AUTH against master & slaves. */

    /* Slave specific. */
    mstime_t master_link_down_time; /* Slave replication link down time. */
    int slave_priority; /* Slave priority according to its INFO output. */
    mstime_t slave_reconf_sent_time; /* Time at which we sent SLAVE OF <new> */
    struct sentinelRedisInstance *master; /* Master instance if it's slave. */
    char *slave_master_host;    /* Master host as reported by INFO */
    int slave_master_port;      /* Master port as reported by INFO */
    int slave_master_link_status; /* Master link status as reported by INFO */
    unsigned long long slave_repl_offset; /* Slave replication offset. */
    /* Failover */
    char *leader;       /* If this is a master instance, this is the runid of
                           the Sentinel that should perform the failover. If
                           this is a Sentinel, this is the runid of the Sentinel
                           that this Sentinel voted as leader. */
    uint64_t leader_epoch; /* Epoch of the 'leader' field. */
    uint64_t failover_epoch; /* Epoch of the currently started failover. */
    int failover_state; /* See SENTINEL_FAILOVER_STATE_* defines. */
    mstime_t failover_state_change_time;
    mstime_t failover_start_time;   /* Last failover attempt start time. */
    mstime_t failover_timeout;      /* Max time to refresh failover state. */
    mstime_t failover_delay_logged; /* For what failover_start_time value we
                                       logged the failover delay. */
    struct sentinelRedisInstance *promoted_slave; /* Promoted slave instance. */
    /* Scripts executed to notify admin or reconfigure clients: when they
     * are set to NULL no script is executed. */
    char *notification_script;
    char *client_reconfig_script;
    sds info; /* cached INFO output */
} sentinelRedisInstance;

The addr field of sentinelRedisInstance is a pointer to a sentinelAddr structure:

/* Address object, used to describe an ip:port pair. */
typedef struct sentinelAddr {
    char *ip;
    int port;
} sentinelAddr;

For example, suppose the Sentinel configuration file contains the following parameters:

sentinel monitor mymaster 172.16.101.58 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000

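The entry in Sentinel's masters dictionary is then initialized as follows (runid is only known after the first INFO reply from the master):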

sentinelRedisInstance
  flags               SRI_MASTER
  name                mymaster
  runid               1462343b52cd8bb8e10f2786ee14438f35907b88
  config_epoch        0
  addr ---> sentinelAddr
              ip      172.16.101.58
              port    6379
  down_after_period   5000
  failover_timeout    60000

...........................
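
To make the mapping from configuration directives to structure fields concrete, here is a rough Python sketch. MasterConfig and parse_sentinel_config are hypothetical names, the structure is heavily simplified compared with sentinelRedisInstance, and the sketch assumes that the monitor directive for a master appears before its other directives.

```python
from dataclasses import dataclass

@dataclass
class MasterConfig:
    """Hypothetical, simplified stand-in for sentinelRedisInstance."""
    name: str
    ip: str
    port: int
    quorum: int
    down_after_period: int = 30000   # SENTINEL_DEFAULT_DOWN_AFTER
    failover_timeout: int = 180000   # SENTINEL_DEFAULT_FAILOVER_TIMEOUT

def parse_sentinel_config(text: str) -> dict:
    """Build {master_name: MasterConfig} from 'sentinel ...' directives."""
    masters = {}
    for line in text.splitlines():
        words = line.split()
        if len(words) < 2 or words[0] != "sentinel":
            continue
        if words[1] == "monitor" and len(words) == 6:
            name, ip, port, quorum = words[2:6]
            masters[name] = MasterConfig(name, ip, int(port), int(quorum))
        elif words[1] == "down-after-milliseconds" and len(words) == 4:
            # Assumes the master was already declared with 'monitor'.
            masters[words[2]].down_after_period = int(words[3])
        elif words[1] == "failover-timeout" and len(words) == 4:
            masters[words[2]].failover_timeout = int(words[3])
    return masters

config = """\
sentinel monitor mymaster 172.16.101.58 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
"""
masters = parse_sentinel_config(config)
```

Directives that are absent keep the defaults taken from the constants listed earlier.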

 

 

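3. Create network connections to the master

Once the Sentinel instance is initialized, it opens two network connections to the master: a subscription connection and a command connection.

Subscription connection: used only to subscribe to the master's __sentinel__:hello channel.

Command connection: used only to send commands to the master and receive the replies.

Running CLIENT LIST on the master shows the two connections from one Sentinel node: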

127.0.0.1:6379> client list
id=16 addr=172.16.101.58:61106 fd=10 name=sentinel-50523f14-cmd age=67041 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping
id=17 addr=172.16.101.58:61108 fd=11 name=sentinel-50523f14-pubsub age=67041 idle=1 flags=N db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=subscribe
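
As an illustration, the Sentinel connections can be picked out of CLIENT LIST output by their name field. The sketch below parses a shortened copy of the two lines above; parse_client_list is a hypothetical helper, and the "sentinel-" name prefix is taken from this sample output only.

```python
def parse_client_list(output: str) -> list:
    """Turn each CLIENT LIST line into a dict of its key=value fields."""
    clients = []
    for line in output.splitlines():
        fields = dict(item.split("=", 1) for item in line.split() if "=" in item)
        if fields:
            clients.append(fields)
    return clients

# Shortened copy of the sample output (some fields omitted).
sample = (
    "id=16 addr=172.16.101.58:61106 fd=10 name=sentinel-50523f14-cmd age=67041 "
    "idle=0 flags=N db=0 sub=0 psub=0 cmd=ping\n"
    "id=17 addr=172.16.101.58:61108 fd=11 name=sentinel-50523f14-pubsub age=67041 "
    "idle=1 flags=N db=0 sub=1 psub=0 cmd=subscribe\n"
)

# Map each Sentinel connection name to the last command it ran.
sentinel_links = {
    c["name"]: c["cmd"]
    for c in parse_client_list(sample)
    if c.get("name", "").startswith("sentinel-")
}
```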

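II. Obtaining Master and Slave Information

1. Obtaining master information

After startup, a Sentinel node sends INFO to each monitored master every 10 seconds by default to obtain its current state. The key fields returned are: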

run_id:1462343b52cd8bb8e10f2786ee14438f35907b88
role:master
slave0:ip=172.16.101.59,port=6379,state=online,offset=14496948,lag=0
slave1:ip=172.16.101.60,port=6379,state=online,offset=14496934,lag=1

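As this output shows, Sentinel learns about the slaves from the master's INFO reply, without having to connect to a slave first.

Sentinel uses the run_id, role and other fields from the INFO reply to update the master entry in its instance structure, for example the run_id, which is regenerated whenever the master restarts.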
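
The slave discovery described here amounts to parsing the slaveN lines of the master's INFO reply. A minimal sketch, with parse_master_info as a hypothetical helper and the sample text copied from the output above:

```python
import re

def parse_master_info(info: str) -> dict:
    """Extract run_id, role and the slaveN entries from INFO output."""
    result = {"slaves": []}
    for line in info.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        if re.fullmatch(r"slave\d+", key):
            # e.g. ip=172.16.101.59,port=6379,state=online,offset=...,lag=0
            slave = dict(pair.split("=", 1) for pair in value.split(","))
            slave["port"] = int(slave["port"])
            result["slaves"].append(slave)
        elif key in ("run_id", "role"):
            result[key] = value
    return result

info = """\
run_id:1462343b52cd8bb8e10f2786ee14438f35907b88
role:master
slave0:ip=172.16.101.59,port=6379,state=online,offset=14496948,lag=0
slave1:ip=172.16.101.60,port=6379,state=online,offset=14496934,lag=1
"""
parsed = parse_master_info(info)
```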

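2. Obtaining slave information

For each slave reported by the master, Sentinel creates a slave instance structure in memory and opens a command connection and a subscription connection to that slave. Once the command connection is established, Sentinel sends INFO to the slave every 10 seconds by default and uses the key fields below to update the slave instance structure.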

run_id:64b742b55c58a99d2f4acab86dcadb9de5ee6a3b
# Replication
role:slave
master_host:172.16.101.58
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:16146680
slave_priority:100
slave_read_only:1
connected_slaves:0
min_slaves_good_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
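
The following sketch suggests how a few of these INFO keys could map onto the "Slave specific" fields of sentinelRedisInstance shown earlier. The helper name and the dictionary layout are ours, and the real structure stores the link status as an integer rather than a boolean.

```python
def slave_fields_from_info(info: str) -> dict:
    """Pick the INFO keys that map onto the 'Slave specific' struct fields."""
    raw = {}
    for line in info.splitlines():
        line = line.strip()
        if ":" in line and not line.startswith("#"):
            key, value = line.split(":", 1)
            raw[key] = value
    return {
        "runid": raw["run_id"],
        "slave_master_host": raw["master_host"],
        "slave_master_port": int(raw["master_port"]),
        "slave_master_link_up": raw["master_link_status"] == "up",
        "slave_priority": int(raw["slave_priority"]),
        "slave_repl_offset": int(raw["slave_repl_offset"]),
    }

# Shortened copy of the slave INFO output above.
info = """\
run_id:64b742b55c58a99d2f4acab86dcadb9de5ee6a3b
# Replication
role:slave
master_host:172.16.101.58
master_port:6379
master_link_status:up
slave_repl_offset:16146680
slave_priority:100
"""
fields = slave_fields_from_info(info)
```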

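3. Sentinel publishes and subscribes on the __sentinel__:hello channel of masters and slaves and keeps its instance structures up to date

By default, every 2 seconds (SENTINEL_PUBLISH_PERIOD) Sentinel uses the command connection to publish a message of the following form to the __sentinel__:hello channel of each monitored master and slave:

PUBLISH __sentinel__:hello "<sentinel_ip>,<sentinel_port>,<sentinel_run_id>,<current_epoch>,<master_name>,<master_ip>,<master_port>,<master_config_epoch>"

At the same time, Sentinel subscribes to the __sentinel__:hello channel on every master and slave and keeps that subscription open. Messages received on the channel look like this: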

1) "message"
2) "__sentinel__:hello"
3) "172.16.101.59,26379,7d120b6abd66f89f3ea7ed01e4d6371913831fda,85,mymaster,172.16.101.58,6379,85"
1) "message"
2) "__sentinel__:hello"
3) "172.16.101.58,26379,50523f147d20b661896d13a4d51637736c475ef1,85,mymaster,172.16.101.58,6379,85"
1) "message"
2) "__sentinel__:hello"
3) "172.16.101.58,26379,50523f147d20b661896d13a4d51637736c475ef1,85,mymaster,172.16.101.58,6379,85"
1) "message"
2) "__sentinel__:hello"
3) "172.16.101.59,26379,7d120b6abd66f89f3ea7ed01e4d6371913831fda,85,mymaster,172.16.101.58,6379,85"
1) "message"
2) "__sentinel__:hello"
3) "172.16.101.60,26379,bfecd21a116f8b8d79607d0d58acd3f2008196fd,85,mymaster,172.16.101.58,6379,85"
1) "message"
2) "__sentinel__:hello"
3) "172.16.101.60,26379,bfecd21a116f8b8d79607d0d58acd3f2008196fd,85,mymaster,172.16.101.58,6379,85"

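Sentinel compares the sentinel_run_id field of each message with its own run ID. If they match, the message is one that it published itself and is ignored; otherwise Sentinel uses the fields of the message to update the sentinels dictionary in the master's sentinelRedisInstance structure.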
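
A rough sketch of this filtering step is shown below. It splits a hello payload into the eight comma-separated fields seen in the sample messages; the field names in HELLO_FIELDS are our labels, and handle_hello and the shape of the sentinels dictionary are hypothetical simplifications.

```python
HELLO_FIELDS = (
    "sentinel_ip", "sentinel_port", "sentinel_run_id", "current_epoch",
    "master_name", "master_ip", "master_port", "master_config_epoch",
)

def handle_hello(payload: str, my_run_id: str, sentinels: dict) -> bool:
    """Record the sender in `sentinels`; return False for ignored messages."""
    values = payload.split(",")
    if len(values) != len(HELLO_FIELDS):
        return False  # malformed message, ignore it
    msg = dict(zip(HELLO_FIELDS, values))
    if msg["sentinel_run_id"] == my_run_id:
        return False  # our own publication echoed back
    sentinels[msg["sentinel_run_id"]] = (msg["sentinel_ip"], int(msg["sentinel_port"]))
    return True

MY_ID = "50523f147d20b661896d13a4d51637736c475ef1"
known = {}
messages = [
    "172.16.101.59,26379,7d120b6abd66f89f3ea7ed01e4d6371913831fda,85,mymaster,172.16.101.58,6379,85",
    "172.16.101.58,26379,50523f147d20b661896d13a4d51637736c475ef1,85,mymaster,172.16.101.58,6379,85",
    "172.16.101.60,26379,bfecd21a116f8b8d79607d0d58acd3f2008196fd,85,mymaster,172.16.101.58,6379,85",
]
results = [handle_hello(m, MY_ID, known) for m in messages]
```

Run from the point of view of the Sentinel on 172.16.101.58, the second message is its own and is skipped, while the other two add entries for the Sentinels on 172.16.101.59 and 172.16.101.60.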

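4. Sentinel creates command connections to other Sentinel nodes based on the sentinels dictionary

When Sentinel discovers a new Sentinel node through the channel and records it in memory, it also opens a command connection to that node. After one discovery cycle, all Sentinel nodes monitoring the same master know about one another and form a fully connected Sentinel network.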

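III. Subjective Down, Objective Down, Election and Failover

1. Subjectively Down

By default, Sentinel sends PING once per second to every instance it is connected to (masters, slaves and other Sentinels). If an instance returns no valid reply within the configured time, that Sentinel considers it subjectively down.

The time limit is set by the down-after-milliseconds parameter and defaults to 30 seconds: if the target returns no valid PING reply for 30 seconds, the Sentinel marks it subjectively down. "Subjectively" means that this is only the opinion of the current Sentinel and other Sentinels may disagree, because the cause may be network latency or jitter between this Sentinel and the target that does not affect the others.

A valid reply is one of the following three: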

  • PING replied with +PONG.
  • PING replied with -LOADING error.
  • PING replied with -MASTERDOWN error.

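If the target returns no reply, or only replies other than these, for longer than down-after-milliseconds, the Sentinel considers it subjectively down and writes a log entry like the following: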

16889:X 04 Apr 22:48:57.193 # +sdown master mymaster 172.16.101.58 6379
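
The subjective-down rule can be sketched as a check on the time of the last valid PING reply. The function names and the sample timeline below are hypothetical; the three accepted reply prefixes come from the list above.

```python
VALID_PING_REPLIES = ("+PONG", "-LOADING", "-MASTERDOWN")

def is_valid_reply(reply: str) -> bool:
    """A reply counts only if it starts with one of the three accepted prefixes."""
    return reply.startswith(VALID_PING_REPLIES)

def is_subjectively_down(now_ms: int, last_valid_reply_ms: int,
                         down_after_ms: int = 30000) -> bool:
    """True once no valid reply has been seen for longer than down_after_ms."""
    return now_ms - last_valid_reply_ms > down_after_ms

# Replay a hypothetical timeline of (time_ms, reply) pairs.
last_valid = 0
for t, reply in [(1000, "+PONG"), (2000, "-ERR unknown"), (3000, "-LOADING loading")]:
    if is_valid_reply(reply):
        last_valid = t
```

With down-after-milliseconds set to 5000, as in the earlier configuration example, an instance whose last valid reply arrived at 3000 ms would be flagged once the clock passes 8000 ms.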

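2. Objectively Down

After marking a master subjectively down, the Sentinel sets the SRI_S_DOWN bit in the flags field of its sentinelRedisInstance structure and sends "SENTINEL is-master-down-by-addr <ip> <port> <current_epoch> <runid>" to the other Sentinels to ask whether they also see the target as down.

For example, once node A considers the target subjectively down, it asks the other Sentinels with a command like this: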

sentinel is-master-down-by-addr 172.16.101.60 6379 85 *

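A run-id of * means the command is only a query about whether the target is down.

When the other Sentinels (B, C and so on) receive this command, they look up the master named in it and check whether they consider it down. Each replies to A with three values:

1) 1   # down_state: 0 = not down, 1 = down
2) *   # leader_runid: * when the command is only a down-state query
3) 0   # leader_epoch: always 0 for a down-state query

On receiving the replies, A counts how many Sentinels, itself included, consider the target down. If the count reaches the quorum set in the configuration file, A sets the SRI_O_DOWN bit in the flags field of the sentinelRedisInstance structure and writes the following log entry: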

5494:X 04 Apr 22:48:57.305 # +odown master mymaster 172.16.101.58 6379 #quorum 3/2

The minimum number of Sentinels required to declare a master objectively down is the quorum argument of the sentinel monitor directive:

sentinel monitor mymaster master_ip master_port quorum
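
The quorum check can be sketched as follows. The function is hypothetical; it assumes that the asking Sentinel counts its own subjective-down view as one vote, which is consistent with the "#quorum 3/2" entry in the log above (three Sentinels agreeing against a quorum of 2).

```python
def is_objectively_down(self_sdown: bool, peer_down_states: list, quorum: int) -> bool:
    """Count this Sentinel's own view plus peers whose down_state reply was 1."""
    if not self_sdown:
        return False
    votes = 1 + sum(1 for state in peer_down_states if state == 1)
    return votes >= quorum
```

For example, with quorum 2 a Sentinel that sees the master as down needs only one other Sentinel to reply with down_state 1.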

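3. Electing the Sentinel leader

Once the master is objectively down, each Sentinel sends "SENTINEL is-master-down-by-addr <ip> <port> <current_epoch> <runid>" to all other Sentinels again, this time with its own run ID in place of *, to ask for their vote: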

sentinel is-master-down-by-addr 172.16.101.60 6379 86 50523f147d20b661896d13a4d51637736c475ef1

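Suppose there are three Sentinels: A, B and C.

The run-id that A sends in its election request is A's own run-id. Votes are granted on a first-come, first-served basis: if A's request is the first to reach B, B takes A as the leader. A later election request from C is refused by B, which instead returns the run-id of the leader it has already chosen (A), so C learns that B has voted for A. A then holds 2 of the 3 votes, which is n/2+1 and therefore more than half of the Sentinels, so A becomes the leader.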
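
The first-come, first-served rule can be modelled with one recorded vote per epoch. The Voter class below is a hypothetical simplification that ignores the Sentinel's own current_epoch bookkeeping.

```python
class Voter:
    """Hypothetical model of one Sentinel's vote: one leader per epoch."""

    def __init__(self, run_id: str):
        self.run_id = run_id
        self.leader = None
        self.leader_epoch = 0

    def request_vote(self, candidate_run_id: str, epoch: int) -> str:
        """Grant the vote if none was cast in this epoch; reply with our leader."""
        if epoch > self.leader_epoch:
            self.leader = candidate_run_id
            self.leader_epoch = epoch
        return self.leader

b = Voter("B")
first = b.request_vote("A", 86)   # A asks first and gets B's vote
second = b.request_vote("C", 86)  # C asks later in the same epoch
```

Both calls return "A": the second request does not change B's vote, and C is simply told whom B voted for. A request in a later epoch would be treated as a new election.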

Vote of node A: A voted for the Sentinel with run-id 7d120b6abd66f89f3ea7ed01e4d6371913831fda.

16889:X 04 Apr 22:48:57.418 # +vote-for-leader 7d120b6abd66f89f3ea7ed01e4d6371913831fda 86

Vote of node B: B voted for the Sentinel with run-id 7d120b6abd66f89f3ea7ed01e4d6371913831fda.

5494:X 04 Apr 22:48:57.327 # +vote-for-leader 7d120b6abd66f89f3ea7ed01e4d6371913831fda 86

Vote of node C: C voted for the Sentinel with run-id bfecd21a116f8b8d79607d0d58acd3f2008196fd.

16523:X 04 Apr 22:48:57.359 # +vote-for-leader bfecd21a116f8b8d79607d0d58acd3f2008196fd 86

The run-ids of the three Sentinels are:

sentinel known-sentinel mymaster 172.16.101.58 26379 50523f147d20b661896d13a4d51637736c475ef1
sentinel known-sentinel mymaster 172.16.101.59 26379 7d120b6abd66f89f3ea7ed01e4d6371913831fda
sentinel known-sentinel mymaster 172.16.101.60 26379 bfecd21a116f8b8d79607d0d58acd3f2008196fd

So the Sentinel with run-id 7d120b6abd66f89f3ea7ed01e4d6371913831fda, on 172.16.101.59, won 2 of the 3 votes and became the Sentinel leader.

5494:X 04 Apr 22:48:57.450 # +elected-leader master mymaster 172.16.101.58 6379
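
Tallying the three votes above can be sketched like this. elect_leader is a hypothetical helper; it assumes every Sentinel casts a vote, and it additionally requires at least quorum votes, which goes beyond the simple majority rule described in this article. Run IDs are shortened to eight characters for readability.

```python
from collections import Counter

def elect_leader(votes: dict, quorum: int):
    """Return the winning run ID, or None if nobody has enough votes.

    The winner needs a strict majority of all voters and, as an added
    assumption, at least `quorum` votes.
    """
    if not votes:
        return None
    winner, count = Counter(votes.values()).most_common(1)[0]
    needed = max(len(votes) // 2 + 1, quorum)
    return winner if count >= needed else None

# voter -> run ID it voted for, as in the three log lines above.
votes = {
    "A": "7d120b6a",
    "B": "7d120b6a",
    "C": "bfecd21a",
}
leader = elect_leader(votes, quorum=2)
```

If every Sentinel voted for a different candidate, no one would reach the threshold and the function would return None, which corresponds to a failed election that has to be retried.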

4. Failover

Once a Sentinel leader has been elected, it carries out the failover:

5494:X 04 Apr 22:48:57.450 # +failover-state-select-slave master mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:57.541 # +selected-slave slave 172.16.101.60:6379 172.16.101.60 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:57.541 * +failover-state-send-slaveof-noone slave 172.16.101.60:6379 172.16.101.60 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:57.631 * +failover-state-wait-promotion slave 172.16.101.60:6379 172.16.101.60 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:58.415 # +promoted-slave slave 172.16.101.60:6379 172.16.101.60 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:58.420 # +failover-state-reconf-slaves master mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:58.465 * +slave-reconf-sent slave 172.16.101.59:6379 172.16.101.59 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:59.461 * +slave-reconf-inprog slave 172.16.101.59:6379 172.16.101.59 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:59.461 * +slave-reconf-done slave 172.16.101.59:6379 172.16.101.59 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:59.517 # +failover-end master mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:59.517 # +switch-master mymaster 172.16.101.58 6379 172.16.101.60 6379
5494:X 04 Apr 22:48:59.517 * +slave slave 172.16.101.59:6379 172.16.101.59 6379 @ mymaster 172.16.101.60 6379
5494:X 04 Apr 22:48:59.517 * +slave slave 172.16.101.58:6379 172.16.101.58 6379 @ mymaster 172.16.101.60 6379
5494:X 04 Apr 22:49:04.562 # +sdown slave 172.16.101.58:6379 172.16.101.58 6379 @ mymaster 172.16.101.60 6379
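
Log lines like these are regular enough to parse mechanically, for instance to find out which address became the new master. The regular expression and helper names below are ours and are based only on the line layout visible in this log excerpt.

```python
import re

# Layout: pid:role day month time level event details
LOG_RE = re.compile(r"^(\d+):(\w) (\d+ \w+ [\d:.]+) ([#*.-]) ([+-][\w-]+) ?(.*)$")

def parse_event(line: str):
    """Split a Sentinel log line into (pid, timestamp, event, details)."""
    m = LOG_RE.match(line.strip())
    if not m:
        return None
    pid, _role, timestamp, _level, event, details = m.groups()
    return int(pid), timestamp, event, details.split()

def new_master(lines):
    """Return (ip, port) announced by the last +switch-master event, if any."""
    result = None
    for line in lines:
        parsed = parse_event(line)
        if parsed and parsed[2] == "+switch-master":
            # details: <master name> <old ip> <old port> <new ip> <new port>
            result = (parsed[3][3], int(parsed[3][4]))
    return result

log = [
    "5494:X 04 Apr 22:48:59.517 # +failover-end master mymaster 172.16.101.58 6379",
    "5494:X 04 Apr 22:48:59.517 # +switch-master mymaster 172.16.101.58 6379 172.16.101.60 6379",
]
```

Calling new_master(log) on the two lines above yields the address 172.16.101.60 with port 6379, matching the +switch-master entry.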

1. The Sentinel leader first selects a slave and promotes it to master

5494:X 04 Apr 22:48:57.541 # +selected-slave slave 172.16.101.60:6379 172.16.101.60 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:57.541 * +failover-state-send-slaveof-noone slave 172.16.101.60:6379 172.16.101.60 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:57.631 * +failover-state-wait-promotion slave 172.16.101.60:6379 172.16.101.60 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:58.415 # +promoted-slave slave 172.16.101.60:6379 172.16.101.60 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:58.420 # +failover-state-reconf-slaves master mymaster 172.16.101.58 6379
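
This article only mentions priority as the selection criterion. The sketch below uses the ordering described in the Redis Sentinel documentation as we understand it: lower non-zero priority first, then larger replication offset, then the lexicographically smaller run ID. The function, the dictionary keys and the sample slaves are all hypothetical, and the additional health checks Sentinel applies are reduced to a single "online" flag.

```python
def select_slave(slaves: list):
    """Pick a promotion candidate from a list of slave dicts.

    Ordering (an assumption, not taken from this article): lower non-zero
    priority first, then larger replication offset, then smaller run ID.
    """
    candidates = [
        s for s in slaves
        if s["online"] and s["priority"] != 0  # priority 0 means "never promote"
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda s: (s["priority"], -s["offset"], s["runid"]))

# Hypothetical slaves, not the ones from the log above.
slaves = [
    {"name": "s1", "priority": 100, "offset": 900, "runid": "bbb", "online": True},
    {"name": "s2", "priority": 100, "offset": 950, "runid": "ccc", "online": True},
    {"name": "s3", "priority": 0, "offset": 999, "runid": "aaa", "online": True},
]
chosen = select_slave(slaves)
```

Here s3 is excluded because its priority is 0, and s2 is preferred over s1 because it has replicated more data at the same priority.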

2. The Sentinel leader reconfigures the remaining slaves to replicate from the new master

5494:X 04 Apr 22:48:58.465 * +slave-reconf-sent slave 172.16.101.59:6379 172.16.101.59 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:59.461 * +slave-reconf-inprog slave 172.16.101.59:6379 172.16.101.59 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:59.461 * +slave-reconf-done slave 172.16.101.59:6379 172.16.101.59 6379 @ mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:59.517 # +failover-end master mymaster 172.16.101.58 6379
5494:X 04 Apr 22:48:59.517 # +switch-master mymaster 172.16.101.58 6379 172.16.101.60 6379
5494:X 04 Apr 22:48:59.517 * +slave slave 172.16.101.59:6379 172.16.101.59 6379 @ mymaster 172.16.101.60 6379

3. The Sentinel leader records the old master as a slave of the new master. Because the old master is still offline, Sentinel then marks it subjectively down (+sdown)

5494:X 04 Apr 22:48:59.517 * +slave slave 172.16.101.58:6379 172.16.101.58 6379 @ mymaster 172.16.101.60 6379
5494:X 04 Apr 22:49:04.562 # +sdown slave 172.16.101.58:6379 172.16.101.58 6379 @ mymaster 172.16.101.60 6379

4. Sentinel keeps monitoring the old master and, once it comes back online, converts it into a slave of the new master

16523:X 04 Apr 23:23:59.346 # -sdown slave 172.16.101.58:6379 172.16.101.58 6379 @ mymaster 172.16.101.60 6379
16523:X 04 Apr 23:24:09.322 * +convert-to-slave slave 172.16.101.58:6379 172.16.101.58 6379 @ mymaster 172.16.101.60 6379

 

That concludes this brief introduction to Sentinel.

 
