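1. Redis client and server
1.1 Client
1.1.1 Client types
Redis clients fall into three main types: normal clients, pub/sub clients, and slave clients.
Normal clients need little explanation; they are the ones we use most.
A Redis client can subscribe to any number of channels. Once a client subscribes to a channel on the server it also becomes a pub/sub client, and the server pushes each new message published on that channel to it.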
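When Redis is deployed with master and slave nodes, the master treats every slave server as a slave client. A slave connects to the master like any other client, and the master keeps it up to date by sending it the stream of write commands.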
1.1.2 Client fields
Let's look directly at how Redis defines a client:
typedef struct client {
uint64_t id; /* Client incremental unique ID. */
connection *conn;
int resp; /* RESP protocol version. Can be 2 or 3. */
redisDb *db; /* Pointer to currently SELECTed DB. */
robj *name; /* As set by CLIENT SETNAME. */
sds querybuf; /* Buffer we use to accumulate client queries. */
size_t qb_pos; /* The position we have read in querybuf. */
sds pending_querybuf; /* If this client is flagged as master, this buffer
represents the yet not applied portion of the
replication stream that we are receiving from
the master. */
size_t querybuf_peak; /* Recent (100ms or more) peak of querybuf size. */
int argc; /* Num of arguments of current command. */
robj **argv; /* Arguments of current command. */
struct redisCommand *cmd, *lastcmd; /* Last command executed. */
user *user; /* User associated with this connection. If the
user is set to NULL the connection can do
anything (admin). */
int reqtype; /* Request protocol type: PROTO_REQ_* */
int multibulklen; /* Number of multi bulk arguments left to read. */
long bulklen; /* Length of bulk argument in multi bulk request. */
list *reply; /* List of reply objects to send to the client. */
unsigned long long reply_bytes; /* Tot bytes of objects in reply list. */
size_t sentlen; /* Amount of bytes already sent in the current
buffer or object being sent. */
time_t ctime; /* Client creation time. */
time_t lastinteraction; /* Time of the last interaction, used for timeout */
time_t obuf_soft_limit_reached_time;
uint64_t flags; /* Client flags: CLIENT_* macros. */
int authenticated; /* Needed when the default user requires auth. */
int replstate; /* Replication state if this is a slave. */
int repl_put_online_on_ack; /* Install slave write handler on first ACK. */
int repldbfd; /* Replication DB file descriptor. */
off_t repldboff; /* Replication DB file offset. */
off_t repldbsize; /* Replication DB file size. */
sds replpreamble; /* Replication DB preamble. */
long long read_reploff; /* Read replication offset if this is a master. */
long long reploff; /* Applied replication offset if this is a master. */
long long repl_ack_off; /* Replication ack offset, if this is a slave. */
long long repl_ack_time;/* Replication ack time, if this is a slave. */
long long psync_initial_offset; /* FULLRESYNC reply offset other slaves
copying this slave output buffer
should use. */
char replid[CONFIG_RUN_ID_SIZE+1]; /* Master replication ID (if master). */
int slave_listening_port; /* As configured with: SLAVECONF listening-port */
char slave_ip[NET_IP_STR_LEN]; /* Optionally given by REPLCONF ip-address */
int slave_capa; /* Slave capabilities: SLAVE_CAPA_* bitwise OR. */
multiState mstate; /* MULTI/EXEC state */
int btype; /* Type of blocking op if CLIENT_BLOCKED. */
blockingState bpop; /* blocking state */
long long woff; /* Last write global replication offset. */
list *watched_keys; /* Keys WATCHED for MULTI/EXEC CAS */
dict *pubsub_channels; /* channels a client is interested in (SUBSCRIBE) */
list *pubsub_patterns; /* patterns a client is interested in (SUBSCRIBE) */
sds peerid; /* Cached peer ID. */
listNode *client_list_node; /* list node in client list */
RedisModuleUserChangedFunc auth_callback; /* Module callback to execute
* when the authenticated user
* changes. */
void *auth_callback_privdata; /* Private data that is passed when the auth
* changed callback is executed. Opaque for
* Redis Core. */
void *auth_module; /* The module that owns the callback, which is used
* to disconnect the client if the module is
* unloaded for cleanup. Opaque for Redis Core.*/
/* If this client is in tracking mode and this field is non zero,
* invalidation messages for keys fetched by this client will be send to
* the specified client ID. */
uint64_t client_tracking_redirection;
rax *client_tracking_prefixes; /* A dictionary of prefixes we are already
subscribed to in BCAST mode, in the
context of client side caching. */
/* In clientsCronTrackClientsMemUsage() we track the memory usage of
* each client and add it to the sum of all the clients of a given type,
* however we need to remember what was the old contribution of each
* client, and in which categoty the client was, in order to remove it
* before adding it the new value. */
uint64_t client_cron_last_memory_usage;
int client_cron_last_memory_type;
/* Response buffer */
int bufpos;
char buf[PROTO_REPLY_CHUNK_BYTES];
} client;
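authenticated: authentication state, relevant when the default user requires a password. When it is 0 the client may only issue the AUTH command; when it is 1 the client can be used normally.
Values obtained by running "client list" from a terminal: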
127.0.0.1:6379> client list
id=308 addr=127.0.0.1:51338 fd=7 name= age=4752555 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=client
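id: auto-incrementing unique ID
addr: client address and port
fd: socket file descriptor (-1 for pseudo clients, a positive integer otherwise)
name: client name, empty by default; it can be set with "client setname"
age: lifetime of the connection, in seconds
idle: idle time, in seconds
flags: client flags
db: ID of the database the client is currently using
sub: number of subscribed channels
psub: number of subscribed patterns
multi: number of commands queued in a transaction (-1 when the client is not in a transaction)
qbuf: input buffer, space in use
qbuf-free: input buffer, free space
obl: output buffer, length of the fixed buffer
oll: output buffer, length of the dynamic buffer (the reply list)
omem: total number of bytes used by the fixed and dynamic buffers
events: file descriptor events
cmd: most recently executed command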
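Each line of CLIENT LIST output is a space-separated sequence of key=value fields, so a single field can be pulled out with plain string handling. The following is a minimal sketch; the helper name client_info_field is hypothetical and not part of Redis or any client library.

```c
#include <assert.h>
#include <string.h>

/* Copy the value of `key` from one CLIENT LIST line into `out`.
 * Returns 1 if the key was found, 0 otherwise.
 * Hypothetical helper for illustration, not a Redis API. */
int client_info_field(const char *line, const char *key,
                      char *out, size_t outlen) {
    assert(line && key && out && outlen > 0);
    size_t keylen = strlen(key);
    const char *p = line;
    while (*p) {
        /* Fields look like key=value and are separated by one space. */
        const char *end = strchr(p, ' ');
        size_t fieldlen = end ? (size_t)(end - p) : strlen(p);
        if (fieldlen > keylen && strncmp(p, key, keylen) == 0 &&
            p[keylen] == '=') {
            size_t vallen = fieldlen - keylen - 1;
            if (vallen >= outlen) vallen = outlen - 1; /* truncate to fit */
            memcpy(out, p + keylen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }
        if (!end) break;
        p = end + 1;
    }
    return 0;
}
```

Requiring '=' directly after the key keeps a lookup for id from matching the idle field, and an empty value such as name= yields an empty string.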
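A client's flags can be made up of the following:
- O: the client is a slave in MONITOR mode
- S: the client is a slave in normal mode
- M: the client is a master
- x: the client is executing a transaction
- b: the client is waiting in a blocking operation
- i: the client is waiting for a VM I/O operation (deprecated)
- d: a watched key has been modified, so the EXEC command will fail
- c: close the connection after the reply has been written out in full
- u: the client is unblocked
- A: close the connection as soon as possible
- N: no flags are set
File descriptor events can be:
- r: the client socket is readable (in the event loop)
- w: the client socket is writable (in the event loop)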
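The flags field is a string of single characters, so it can be decoded one character at a time. This is a small sketch built only from the list above; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Map one CLIENT LIST flag character to a short description.
 * Returns NULL for characters that are not in the list above. */
const char *client_flag_name(char flag) {
    switch (flag) {
    case 'O': return "slave in MONITOR mode";
    case 'S': return "slave in normal mode";
    case 'M': return "master";
    case 'x': return "executing a transaction";
    case 'b': return "waiting in a blocking operation";
    case 'i': return "waiting for VM I/O (deprecated)";
    case 'd': return "watched key modified, EXEC will fail";
    case 'c': return "close after the reply is fully written";
    case 'u': return "unblocked";
    case 'A': return "close as soon as possible";
    case 'N': return "no flags set";
    default:  return NULL;
    }
}

/* Count how many characters of a flags string are recognized. */
size_t client_flags_known(const char *flags) {
    assert(flags != NULL);
    size_t n = 0;
    for (; *flags; flags++)
        if (client_flag_name(*flags)) n++;
    return n;
}
```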
1.2 Server
Redis: under the hood
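1.2.1 Redis server startup flow
Overview
Redis starts by initializing a global server state variable and reading an optional configuration file that overrides the defaults. It builds a global command table that connects each command name to the function that implements the command. It creates an event loop using the best event/readiness notification library available on the underlying system, and registers a handler function to run when a new client socket connection is ready to be accepted. It also registers a periodic (time-based) event handler for cron-like tasks such as key expiry, which need to be handled outside the regular client-handling path. Once a client has connected, a function is registered with the event loop so that Redis is notified when the client has data to be read (that is, a command query). The client's query is parsed, and a command handler is called to execute the command and write the response back to the client (writing data to the client is also handled by the event notification loop). The client object is then reset and the server is ready to process more queries.
Initializing the global server state
First, initServerConfig() is called to initialize the variable server, of type struct redisServer, which holds the global server state.
The source is as follows: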
struct redisServer {
/* General */
pid_t pid; /* Main process pid. */
char *configfile; /* Absolute config file path, or NULL */
char *executable; /* Absolute executable file path. */
char **exec_argv; /* Executable argv vector (copy). */
int dynamic_hz; /* Change hz value depending on # of clients. */
int config_hz; /* Configured HZ value. May be different than
the actual 'hz' field value if dynamic-hz
is enabled. */
int hz; /* serverCron() calls frequency in hertz */
redisDb *db;
dict *commands; /* Command table */
dict *orig_commands; /* Command table before command renaming. */
aeEventLoop *el;
_Atomic unsigned int lruclock; /* Clock for LRU eviction */
int shutdown_asap; /* SHUTDOWN needed ASAP */
int activerehashing; /* Incremental rehash in serverCron() */
int active_defrag_running; /* Active defragmentation running (holds current scan aggressiveness) */
char *pidfile; /* PID file path */
int arch_bits; /* 32 or 64 depending on sizeof(long) */
int cronloops; /* Number of times the cron function run */
char runid[CONFIG_RUN_ID_SIZE+1]; /* ID always different at every exec. */
int sentinel_mode; /* True if this instance is a Sentinel. */
size_t initial_memory_usage; /* Bytes used after initialization. */
int always_show_logo; /* Show logo even for non-stdout logging. */
/* Modules */
dict *moduleapi; /* Exported core APIs dictionary for modules. */
dict *sharedapi; /* Like moduleapi but containing the APIs that
modules share with each other. */
list *loadmodule_queue; /* List of modules to load at startup. */
int module_blocked_pipe[2]; /* Pipe used to awake the event loop if a
client blocked on a module command needs
to be processed. */
pid_t module_child_pid; /* PID of module child */
/* Networking */
int port; /* TCP listening port */
int tls_port; /* TLS listening port */
int tcp_backlog; /* TCP listen() backlog */
char *bindaddr[CONFIG_BINDADDR_MAX]; /* Addresses we should bind to */
int bindaddr_count; /* Number of addresses in server.bindaddr[] */
char *unixsocket; /* UNIX socket path */
mode_t unixsocketperm; /* UNIX socket permission */
int ipfd[CONFIG_BINDADDR_MAX]; /* TCP socket file descriptors */
int ipfd_count; /* Used slots in ipfd[] */
int tlsfd[CONFIG_BINDADDR_MAX]; /* TLS socket file descriptors */
int tlsfd_count; /* Used slots in tlsfd[] */
int sofd; /* Unix socket file descriptor */
int cfd[CONFIG_BINDADDR_MAX];/* Cluster bus listening socket */
int cfd_count; /* Used slots in cfd[] */
list *clients; /* List of active clients */
list *clients_to_close; /* Clients to close asynchronously */
list *clients_pending_write; /* There is to write or install handler. */
list *clients_pending_read; /* Client has pending read socket buffers. */
list *slaves, *monitors; /* List of slaves and MONITORs */
client *current_client; /* Current client executing the command. */
rax *clients_timeout_table; /* Radix tree for blocked clients timeouts. */
long fixed_time_expire; /* If > 0, expire keys against server.mstime. */
rax *clients_index; /* Active clients dictionary by client ID. */
int clients_paused; /* True if clients are currently paused */
mstime_t clients_pause_end_time; /* Time when we undo clients_paused */
char neterr[ANET_ERR_LEN]; /* Error buffer for anet.c */
dict *migrate_cached_sockets;/* MIGRATE cached sockets */
_Atomic uint64_t next_client_id; /* Next client unique ID. Incremental. */
int protected_mode; /* Don't accept external connections. */
int gopher_enabled; /* If true the server will reply to gopher
queries. Will still serve RESP2 queries. */
int io_threads_num; /* Number of IO threads to use. */
int io_threads_do_reads; /* Read and parse from IO threads? */
int io_threads_active; /* Is IO threads currently active? */
long long events_processed_while_blocked; /* processEventsWhileBlocked() */
/* RDB / AOF loading information */
int loading; /* We are loading data from disk if true */
off_t loading_total_bytes;
off_t loading_loaded_bytes;
time_t loading_start_time;
off_t loading_process_events_interval_bytes;
/* Fast pointers to often looked up command */
struct redisCommand *delCommand, *multiCommand, *lpushCommand,
*lpopCommand, *rpopCommand, *zpopminCommand,
*zpopmaxCommand, *sremCommand, *execCommand,
*expireCommand, *pexpireCommand, *xclaimCommand,
*xgroupCommand, *rpoplpushCommand;
/* Fields used only for stats */
time_t stat_starttime; /* Server start time */
long long stat_numcommands; /* Number of processed commands */
long long stat_numconnections; /* Number of connections received */
long long stat_expiredkeys; /* Number of expired keys */
double stat_expired_stale_perc; /* Percentage of keys probably expired */
long long stat_expired_time_cap_reached_count; /* Early expire cylce stops.*/
long long stat_expire_cycle_time_used; /* Cumulative microseconds used. */
long long stat_evictedkeys; /* Number of evicted keys (maxmemory) */
long long stat_keyspace_hits; /* Number of successful lookups of keys */
long long stat_keyspace_misses; /* Number of failed lookups of keys */
long long stat_active_defrag_hits; /* number of allocations moved */
long long stat_active_defrag_misses; /* number of allocations scanned but not moved */
long long stat_active_defrag_key_hits; /* number of keys with moved allocations */
long long stat_active_defrag_key_misses;/* number of keys scanned and not moved */
long long stat_active_defrag_scanned; /* number of dictEntries scanned */
size_t stat_peak_memory; /* Max used memory record */
long long stat_fork_time; /* Time needed to perform latest fork() */
double stat_fork_rate; /* Fork rate in GB/sec. */
long long stat_rejected_conn; /* Clients rejected because of maxclients */
long long stat_sync_full; /* Number of full resyncs with slaves. */
long long stat_sync_partial_ok; /* Number of accepted PSYNC requests. */
long long stat_sync_partial_err;/* Number of unaccepted PSYNC requests. */
list *slowlog; /* SLOWLOG list of commands */
long long slowlog_entry_id; /* SLOWLOG current entry ID */
long long slowlog_log_slower_than; /* SLOWLOG time limit (to get logged) */
unsigned long slowlog_max_len; /* SLOWLOG max number of items logged */
struct malloc_stats cron_malloc_stats; /* sampled in serverCron(). */
_Atomic long long stat_net_input_bytes; /* Bytes read from network. */
_Atomic long long stat_net_output_bytes; /* Bytes written to network. */
size_t stat_rdb_cow_bytes; /* Copy on write bytes during RDB saving. */
size_t stat_aof_cow_bytes; /* Copy on write bytes during AOF rewrite. */
size_t stat_module_cow_bytes; /* Copy on write bytes during module fork. */
uint64_t stat_clients_type_memory[CLIENT_TYPE_COUNT];/* Mem usage by type */
long long stat_unexpected_error_replies; /* Number of unexpected (aof-loading, replica to master, etc.) error replies */
long long stat_io_reads_processed; /* Number of read events processed by IO / Main threads */
long long stat_io_writes_processed; /* Number of write events processed by IO / Main threads */
_Atomic long long stat_total_reads_processed; /* Total number of read events processed */
_Atomic long long stat_total_writes_processed; /* Total number of write events processed */
/* The following two are used to track instantaneous metrics, like
* number of operations per second, network traffic. */
struct {
long long last_sample_time; /* Timestamp of last sample in ms */
long long last_sample_count;/* Count in last sample */
long long samples[STATS_METRIC_SAMPLES];
int idx;
} inst_metric[STATS_METRIC_COUNT];
/* Configuration */
int verbosity; /* Loglevel in redis.conf */
int maxidletime; /* Client timeout in seconds */
int tcpkeepalive; /* Set SO_KEEPALIVE if non-zero. */
int active_expire_enabled; /* Can be disabled for testing purposes. */
int active_expire_effort; /* From 1 (default) to 10, active effort. */
int active_defrag_enabled;
int jemalloc_bg_thread; /* Enable jemalloc background thread */
size_t active_defrag_ignore_bytes; /* minimum amount of fragmentation waste to start active defrag */
int active_defrag_threshold_lower; /* minimum percentage of fragmentation to start active defrag */
int active_defrag_threshold_upper; /* maximum percentage of fragmentation at which we use maximum effort */
int active_defrag_cycle_min; /* minimal effort for defrag in CPU percentage */
int active_defrag_cycle_max; /* maximal effort for defrag in CPU percentage */
unsigned long active_defrag_max_scan_fields; /* maximum number of fields of set/hash/zset/list to process from within the main dict scan */
_Atomic size_t client_max_querybuf_len; /* Limit for client query buffer length */
int dbnum; /* Total number of configured DBs */
int supervised; /* 1 if supervised, 0 otherwise. */
int supervised_mode; /* See SUPERVISED_* */
int daemonize; /* True if running as a daemon */
clientBufferLimitsConfig client_obuf_limits[CLIENT_TYPE_OBUF_COUNT];
/* AOF persistence */
int aof_enabled; /* AOF configuration */
int aof_state; /* AOF_(ON|OFF|WAIT_REWRITE) */
int aof_fsync; /* Kind of fsync() policy */
char *aof_filename; /* Name of the AOF file */
int aof_no_fsync_on_rewrite; /* Don't fsync if a rewrite is in prog. */
int aof_rewrite_perc; /* Rewrite AOF if % growth is > M and... */
off_t aof_rewrite_min_size; /* the AOF file is at least N bytes. */
off_t aof_rewrite_base_size; /* AOF size on latest startup or rewrite. */
off_t aof_current_size; /* AOF current size. */
off_t aof_fsync_offset; /* AOF offset which is already synced to disk. */
int aof_flush_sleep; /* Micros to sleep before flush. (used by tests) */
int aof_rewrite_scheduled; /* Rewrite once BGSAVE terminates. */
pid_t aof_child_pid; /* PID if rewriting process */
list *aof_rewrite_buf_blocks; /* Hold changes during an AOF rewrite. */
sds aof_buf; /* AOF buffer, written before entering the event loop */
int aof_fd; /* File descriptor of currently selected AOF file */
int aof_selected_db; /* Currently selected DB in AOF */
time_t aof_flush_postponed_start; /* UNIX time of postponed AOF flush */
time_t aof_last_fsync; /* UNIX time of last fsync() */
time_t aof_rewrite_time_last; /* Time used by last AOF rewrite run. */
time_t aof_rewrite_time_start; /* Current AOF rewrite start time. */
int aof_lastbgrewrite_status; /* C_OK or C_ERR */
unsigned long aof_delayed_fsync; /* delayed AOF fsync() counter */
int aof_rewrite_incremental_fsync;/* fsync incrementally while aof rewriting? */
int rdb_save_incremental_fsync; /* fsync incrementally while rdb saving? */
int aof_last_write_status; /* C_OK or C_ERR */
int aof_last_write_errno; /* Valid if aof_last_write_status is ERR */
int aof_load_truncated; /* Don't stop on unexpected AOF EOF. */
int aof_use_rdb_preamble; /* Use RDB preamble on AOF rewrites. */
/* AOF pipes used to communicate between parent and child during rewrite. */
int aof_pipe_write_data_to_child;
int aof_pipe_read_data_from_parent;
int aof_pipe_write_ack_to_parent;
int aof_pipe_read_ack_from_child;
int aof_pipe_write_ack_to_child;
int aof_pipe_read_ack_from_parent;
int aof_stop_sending_diff; /* If true stop sending accumulated diffs
to child process. */
sds aof_child_diff; /* AOF diff accumulator child side. */
/* RDB persistence */
long long dirty; /* Changes to DB from the last save */
long long dirty_before_bgsave; /* Used to restore dirty on failed BGSAVE */
pid_t rdb_child_pid; /* PID of RDB saving child */
struct saveparam *saveparams; /* Save points array for RDB */
int saveparamslen; /* Number of saving points */
char *rdb_filename; /* Name of RDB file */
int rdb_compression; /* Use compression in RDB? */
int rdb_checksum; /* Use RDB checksum? */
int rdb_del_sync_files; /* Remove RDB files used only for SYNC if
the instance does not use persistence. */
time_t lastsave; /* Unix time of last successful save */
time_t lastbgsave_try; /* Unix time of last attempted bgsave */
time_t rdb_save_time_last; /* Time used by last RDB save run. */
time_t rdb_save_time_start; /* Current RDB save start time. */
int rdb_bgsave_scheduled; /* BGSAVE when possible if true. */
int rdb_child_type; /* Type of save by active child. */
int lastbgsave_status; /* C_OK or C_ERR */
int stop_writes_on_bgsave_err; /* Don't allow writes if can't BGSAVE */
int rdb_pipe_write; /* RDB pipes used to transfer the rdb */
int rdb_pipe_read; /* data to the parent process in diskless repl. */
connection **rdb_pipe_conns; /* Connections which are currently the */
int rdb_pipe_numconns; /* target of diskless rdb fork child. */
int rdb_pipe_numconns_writing; /* Number of rdb conns with pending writes. */
char *rdb_pipe_buff; /* In diskless replication, this buffer holds data */
int rdb_pipe_bufflen; /* that was read from the the rdb pipe. */
int rdb_key_save_delay; /* Delay in microseconds between keys while
* writing the RDB. (for testings). negative
* value means fractions of microsecons (on average). */
int key_load_delay; /* Delay in microseconds between keys while
* loading aof or rdb. (for testings). negative
* value means fractions of microsecons (on average). */
/* Pipe and data structures for child -> parent info sharing. */
int child_info_pipe[2]; /* Pipe used to write the child_info_data. */
struct {
int process_type; /* AOF or RDB child? */
size_t cow_size; /* Copy on write size. */
unsigned long long magic; /* Magic value to make sure data is valid. */
} child_info_data;
/* Propagation of commands in AOF / replication */
redisOpArray also_propagate; /* Additional command to propagate. */
/* Logging */
char *logfile; /* Path of log file */
int syslog_enabled; /* Is syslog enabled? */
char *syslog_ident; /* Syslog ident */
int syslog_facility; /* Syslog facility */
int crashlog_enabled; /* Enable signal handler for crashlog.
* disable for clean core dumps. */
int memcheck_enabled; /* Enable memory check on crash. */
int use_exit_on_panic; /* Use exit() on panic and assert rather than
* abort(). useful for Valgrind. */
/* Replication (master) */
char replid[CONFIG_RUN_ID_SIZE+1]; /* My current replication ID. */
char replid2[CONFIG_RUN_ID_SIZE+1]; /* replid inherited from master*/
long long master_repl_offset; /* My current replication offset */
long long second_replid_offset; /* Accept offsets up to this for replid2. */
int slaveseldb; /* Last SELECTed DB in replication output */
int repl_ping_slave_period; /* Master pings the slave every N seconds */
char *repl_backlog; /* Replication backlog for partial syncs */
long long repl_backlog_size; /* Backlog circular buffer size */
long long repl_backlog_histlen; /* Backlog actual data length */
long long repl_backlog_idx; /* Backlog circular buffer current offset,
that is the next byte will'll write to.*/
long long repl_backlog_off; /* Replication "master offset" of first
byte in the replication backlog buffer.*/
time_t repl_backlog_time_limit; /* Time without slaves after the backlog
gets released. */
time_t repl_no_slaves_since; /* We have no slaves since that time.
Only valid if server.slaves len is 0. */
int repl_min_slaves_to_write; /* Min number of slaves to write. */
int repl_min_slaves_max_lag; /* Max lag of slaves to write. */
int repl_good_slaves_count; /* Number of slaves with lag <= max_lag. */
int repl_diskless_sync; /* Master send RDB to slaves sockets directly. */
int repl_diskless_load; /* Slave parse RDB directly from the socket.
* see REPL_DISKLESS_LOAD_* enum */
int repl_diskless_sync_delay; /* Delay to start a diskless repl BGSAVE. */
/* Replication (slave) */
char *masteruser; /* AUTH with this user and masterauth with master */
char *masterauth; /* AUTH with this password with master */
char *masterhost; /* Hostname of master */
int masterport; /* Port of master */
int repl_timeout; /* Timeout after N seconds of master idle */
client *master; /* Client that is master for this slave */
client *cached_master; /* Cached master to be reused for PSYNC. */
int repl_syncio_timeout; /* Timeout for synchronous I/O calls */
int repl_state; /* Replication status if the instance is a slave */
off_t repl_transfer_size; /* Size of RDB to read from master during sync. */
off_t repl_transfer_read; /* Amount of RDB read from master during sync. */
off_t repl_transfer_last_fsync_off; /* Offset when we fsync-ed last time. */
connection *repl_transfer_s; /* Slave -> Master SYNC connection */
int repl_transfer_fd; /* Slave -> Master SYNC temp file descriptor */
char *repl_transfer_tmpfile; /* Slave-> master SYNC temp file name */
time_t repl_transfer_lastio; /* Unix time of the latest read, for timeout */
int repl_serve_stale_data; /* Serve stale data when link is down? */
int repl_slave_ro; /* Slave is read only? */
int repl_slave_ignore_maxmemory; /* If true slaves do not evict. */
time_t repl_down_since; /* Unix time at which link with master went down */
int repl_disable_tcp_nodelay; /* Disable TCP_NODELAY after SYNC? */
int slave_priority; /* Reported in INFO and used by Sentinel. */
int slave_announce_port; /* Give the master this listening port. */
char *slave_announce_ip; /* Give the master this ip address. */
/* The following two fields is where we store master PSYNC replid/offset
* while the PSYNC is in progress. At the end we'll copy the fields into
* the server->master client structure. */
char master_replid[CONFIG_RUN_ID_SIZE+1]; /* Master PSYNC runid. */
long long master_initial_offset; /* Master PSYNC offset. */
int repl_slave_lazy_flush; /* Lazy FLUSHALL before loading DB? */
/* Replication script cache. */
dict *repl_scriptcache_dict; /* SHA1 all slaves are aware of. */
list *repl_scriptcache_fifo; /* First in, first out LRU eviction. */
unsigned int repl_scriptcache_size; /* Max number of elements. */
/* Synchronous replication. */
list *clients_waiting_acks; /* Clients waiting in WAIT command. */
int get_ack_from_slaves; /* If true we send REPLCONF GETACK. */
/* Limits */
unsigned int maxclients; /* Max number of simultaneous clients */
unsigned long long maxmemory; /* Max number of memory bytes to use */
int maxmemory_policy; /* Policy for key eviction */
int maxmemory_samples; /* Pricision of random sampling */
int lfu_log_factor; /* LFU logarithmic counter factor. */
int lfu_decay_time; /* LFU counter decay factor. */
long long proto_max_bulk_len; /* Protocol bulk length maximum size. */
int oom_score_adj_base; /* Base oom_score_adj value, as observed on startup */
int oom_score_adj_values[CONFIG_OOM_COUNT]; /* Linux oom_score_adj configuration */
int oom_score_adj; /* If true, oom_score_adj is managed */
/* Blocked clients */
unsigned int blocked_clients; /* # of clients executing a blocking cmd.*/
unsigned int blocked_clients_by_type[BLOCKED_NUM];
list *unblocked_clients; /* list of clients to unblock before next loop */
list *ready_keys; /* List of readyList structures for BLPOP & co */
/* Client side caching. */
unsigned int tracking_clients; /* # of clients with tracking enabled.*/
size_t tracking_table_max_keys; /* Max number of keys in tracking table. */
/* Sort parameters - qsort_r() is only available under BSD so we
* have to take this state global, in order to pass it to sortCompare() */
int sort_desc;
int sort_alpha;
int sort_bypattern;
int sort_store;
/* Zip structure config, see redis.conf for more information */
size_t hash_max_ziplist_entries;
size_t hash_max_ziplist_value;
size_t set_max_intset_entries;
size_t zset_max_ziplist_entries;
size_t zset_max_ziplist_value;
size_t hll_sparse_max_bytes;
size_t stream_node_max_bytes;
long long stream_node_max_entries;
/* List parameters */
int list_max_ziplist_size;
int list_compress_depth;
/* time cache */
_Atomic time_t unixtime; /* Unix time sampled every cron cycle. */
time_t timezone; /* Cached timezone. As set by tzset(). */
int daylight_active; /* Currently in daylight saving time. */
mstime_t mstime; /* 'unixtime' in milliseconds. */
ustime_t ustime; /* 'unixtime' in microseconds. */
long long blocked_last_cron; /* Indicate the mstime of the last time we did cron jobs from a blocking operation */
/* Pubsub */
dict *pubsub_channels; /* Map channels to list of subscribed clients */
list *pubsub_patterns; /* A list of pubsub_patterns */
dict *pubsub_patterns_dict; /* A dict of pubsub_patterns */
int notify_keyspace_events; /* Events to propagate via Pub/Sub. This is an
xor of NOTIFY_... flags. */
/* Cluster */
int cluster_enabled; /* Is cluster enabled? */
mstime_t cluster_node_timeout; /* Cluster node timeout. */
char *cluster_configfile; /* Cluster auto-generated config file name. */
struct clusterState *cluster; /* State of the cluster */
int cluster_migration_barrier; /* Cluster replicas migration barrier. */
int cluster_slave_validity_factor; /* Slave max data age for failover. */
int cluster_require_full_coverage; /* If true, put the cluster down if
there is at least an uncovered slot.*/
int cluster_slave_no_failover; /* Prevent slave from starting a failover
if the master is in failure state. */
char *cluster_announce_ip; /* IP address to announce on cluster bus. */
int cluster_announce_port; /* base port to announce on cluster bus. */
int cluster_announce_bus_port; /* bus port to announce on cluster bus. */
int cluster_module_flags; /* Set of flags that Redis modules are able
to set in order to suppress certain
native Redis Cluster features. Check the
REDISMODULE_CLUSTER_FLAG_*. */
int cluster_allow_reads_when_down; /* Are reads allowed when the cluster
is down? */
int cluster_config_file_lock_fd; /* cluster config fd, will be flock */
/* Scripting */
lua_State *lua; /* The Lua interpreter. We use just one for all clients */
client *lua_client; /* The "fake client" to query Redis from Lua */
client *lua_caller; /* The client running EVAL right now, or NULL */
char* lua_cur_script; /* SHA1 of the script currently running, or NULL */
dict *lua_scripts; /* A dictionary of SHA1 -> Lua scripts */
unsigned long long lua_scripts_mem; /* Cached scripts' memory + oh */
mstime_t lua_time_limit; /* Script timeout in milliseconds */
mstime_t lua_time_start; /* Start time of script, milliseconds time */
int lua_write_dirty; /* True if a write command was called during the
execution of the current script. */
int lua_random_dirty; /* True if a random command was called during the
execution of the current script. */
int lua_replicate_commands; /* True if we are doing single commands repl. */
int lua_multi_emitted;/* True if we already proagated MULTI. */
int lua_repl; /* Script replication flags for redis.set_repl(). */
int lua_timedout; /* True if we reached the time limit for script
execution. */
int lua_kill; /* Kill the script if true. */
int lua_always_replicate_commands; /* Default replication type. */
int lua_oom; /* OOM detected when script start? */
/* Lazy free */
int lazyfree_lazy_eviction;
int lazyfree_lazy_expire;
int lazyfree_lazy_server_del;
int lazyfree_lazy_user_del;
/* Latency monitor */
long long latency_monitor_threshold;
dict *latency_events;
/* ACLs */
char *acl_filename; /* ACL Users file. NULL if not configured. */
unsigned long acllog_max_len; /* Maximum length of the ACL LOG list. */
sds requirepass; /* Remember the cleartext password set with the
old "requirepass" directive for backward
compatibility with Redis <= 5. */
/* Assert & bug reporting */
int watchdog_period; /* Software watchdog period in ms. 0 = off */
/* System hardware info */
size_t system_memory_size; /* Total memory in system as reported by OS */
/* TLS Configuration */
int tls_cluster;
int tls_replication;
int tls_auth_clients;
redisTLSContextConfig tls_ctx_config;
/* cpu affinity */
char *server_cpulist; /* cpu affinity list of redis server main/io thread. */
char *bio_cpulist; /* cpu affinity list of bio thread. */
char *aof_rewrite_cpulist; /* cpu affinity list of aof rewrite process. */
char *bgsave_cpulist; /* cpu affinity list of bgsave process. */
};
README.md gives a brief explanation of this type:
All the server configuration and in general all the shared state is
defined in a global structure called `server`, of type `struct redisServer`.
A few important fields in this structure are:
* `server.db` is an array of Redis databases, where data is stored.
* `server.commands` is the command table.
* `server.clients` is a linked list of clients connected to the server.
* `server.master` is a special client, the master, if the instance is a replica.
There are tons of other fields. Most fields are commented directly inside
the structure definition.
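Setting up the command table
The next step is to sort the table of Redis commands. The commands are defined in the global variable redisCommandTable, an array of struct redisCommand.
In the source code the read-only table is ordered so that commands are grouped by category (string commands, list commands, set commands, and so on), which makes it easier for a programmer to scan the table for similar commands. The sorted command table is pointed to by the global variable commandTable and is used to look up Redis commands with a standard binary search (lookupCommand(), which returns a pointer to a redisCommand). Note that this describes the older source that this walkthrough follows; in the struct redisServer listed above, the command table is the dict server.commands.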
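The sort-then-binary-search scheme described above can be sketched with the standard library's qsort() and bsearch(). The struct layout, the four sample commands, their arity values, and the empty handlers are assumptions made for illustration; this is not the Redis source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void commandProc(void);   /* hypothetical handler signature */

struct command {
    const char *name;
    commandProc *proc;
    int arity;                    /* illustrative values only */
};

static void getCommand(void) {}
static void setCommand(void) {}
static void lpushCommand(void) {}
static void delCommand(void) {}

/* Grouped by category in the source, so not sorted by name. */
static struct command commandTable[] = {
    {"get", getCommand, 2},
    {"set", setCommand, 3},
    {"lpush", lpushCommand, 3},
    {"del", delCommand, -2},
};
#define NUM_COMMANDS (sizeof(commandTable) / sizeof(commandTable[0]))

static int compareCommands(const void *a, const void *b) {
    const struct command *ca = a, *cb = b;
    return strcmp(ca->name, cb->name);
}

/* Sort once at startup so that lookups can use binary search. */
void sortCommandTable(void) {
    qsort(commandTable, NUM_COMMANDS, sizeof(struct command),
          compareCommands);
}

/* Binary search by name; returns NULL for an unknown command. */
struct command *lookupCommand(const char *name) {
    assert(name != NULL);
    struct command key = {name, NULL, 0};
    return bsearch(&key, commandTable, NUM_COMMANDS,
                   sizeof(struct command), compareCommands);
}
```

The lookup here is case-sensitive, and sortCommandTable() must run before the first lookupCommand() call, because bsearch() requires sorted input.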
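Loading the configuration file
The previous step initialized a default configuration globally; this step overrides those defaults with the values the user set in redis.conf. Redis loads the configuration file by calling loadServerConfig(), which overrides any defaults already set by initServerConfig(). The function is very simple: it walks each line of the configuration file and converts the value that matches a directive name into the appropriate type for the matching member of the server structure. At this point Redis daemonizes and detaches from the controlling terminal, if it is configured to do so.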
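The defaults-then-override pattern can be sketched as follows. The struct miniServer and both function names are hypothetical, and only three directives are handled; the real loadServerConfig() supports far more and reports errors for bad lines.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct miniServer {            /* hypothetical subset of the server state */
    int port;
    int dbnum;
    int daemonize;
};

/* Step 1: set defaults, in the spirit of initServerConfig(). */
void miniInitServerConfig(struct miniServer *s) {
    assert(s != NULL);
    s->port = 6379;
    s->dbnum = 16;
    s->daemonize = 0;
}

/* Step 2: apply one "directive value" line, in the spirit of
 * loadServerConfig(). Returns 1 if applied, 0 if skipped or unknown. */
int miniApplyConfigLine(struct miniServer *s, const char *line) {
    char name[64], value[64];
    assert(s != NULL && line != NULL);
    /* Skip comments, blank lines, and lines without a value. */
    if (line[0] == '#' || sscanf(line, "%63s %63s", name, value) != 2)
        return 0;
    if (strcmp(name, "port") == 0)
        s->port = atoi(value);
    else if (strcmp(name, "databases") == 0)
        s->dbnum = atoi(value);
    else if (strcmp(name, "daemonize") == 0)
        s->daemonize = (strcmp(value, "yes") == 0);
    else
        return 0;
    return 1;
}
```

Calling miniInitServerConfig() first and then feeding each line of the file to miniApplyConfigLine() leaves every directive that the file does not mention at its default value.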
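initServer()
initServer() finishes the work of initializing the server structure that initServerConfig() began. First it sets up signal handling: the SIGHUP and SIGPIPE signals are ignored (there is an opportunity to improve Redis here by reloading its configuration file on SIGHUP, as other daemons do), and a handler is installed that prints a stack trace when the server receives SIGSEGV (and other related signals); see segvHandler().
It creates a number of doubly linked lists (see adlist.h) to keep track of clients, slaves, monitors (clients that have sent the MONITOR command), and an object free list.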
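Shared objects
As many people know, Redis maintains an internal pool of integer objects for the range [0-9999]. Creating a large number of integer redisObject instances costs memory: each redisObject takes at least 16 bytes internally, which is more than the integer itself needs. Redis therefore keeps the [0-9999] integer object pool to save memory. Besides standalone integer values, the elements inside other types such as list, hash, set and zset can also use the integer object pool.
Beyond the integer pool, Redis also shares the common objects needed by many different commands, response strings, and error messages, so that they do not have to be allocated each time, which saves memory. For example: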
shared.crlf = createObject(REDIS_STRING,sdsnew("\r\n"));
shared.ok = createObject(REDIS_STRING,sdsnew("+OK\r\n"));
shared.err = createObject(REDIS_STRING,sdsnew("-ERR\r\n"));
shared.emptybulk = createObject(REDIS_STRING,sdsnew("$0\r\n\r\n"));
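The integer pool idea can be sketched independently of the Redis object system. The intObject type below is a hypothetical and much simpler stand-in for redisObject, and the function names are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

#define SHARED_INTEGERS 10000   /* the pool covers 0..9999, as above */

typedef struct intObject {      /* hypothetical, far simpler than robj */
    long value;
    int refcount;
    int shared;                 /* 1 if the object lives in the pool */
} intObject;

static intObject sharedIntegers[SHARED_INTEGERS];

/* Fill the pool once at startup. */
void createSharedIntegers(void) {
    for (long i = 0; i < SHARED_INTEGERS; i++) {
        sharedIntegers[i].value = i;
        sharedIntegers[i].refcount = 1;
        sharedIntegers[i].shared = 1;
    }
}

/* Return the pooled object when the value is in range, else allocate. */
intObject *getIntObject(long value) {
    if (value >= 0 && value < SHARED_INTEGERS) {
        sharedIntegers[value].refcount++;
        return &sharedIntegers[value];
    }
    intObject *o = malloc(sizeof(*o));
    assert(o != NULL);
    o->value = value;
    o->refcount = 1;
    o->shared = 0;
    return o;
}

/* Drop one reference; only non-pooled objects are ever freed. */
void releaseIntObject(intObject *o) {
    assert(o != NULL);
    if (--o->refcount == 0 && !o->shared) free(o);
}
```

Two requests for the same in-range value return the same pointer, so a thousand keys holding the value 42 would cost one object instead of a thousand.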
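Event loop
initServer() creates the core event loop by calling aeCreateEventLoop() (see ae.c).
ae.h provides a platform-independent wrapper for setting up an I/O event notification loop. It uses epoll on Linux and kqueue on BSD, and falls back to select when the preferred mechanism is not available. Redis's event loop polls for new connections and I/O events (reading requests from a socket and writing responses to it) and is triggered when a new event arrives. This is what makes Redis so responsive: it can serve thousands of clients at once without blocking while individual requests are processed and answered.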
Databases
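initServer() initializes a number of redisDb objects. Each one encapsulates the details of a particular Redis database, including tracking keys that are due to expire, keys that clients are blocking on (from a B{L,R}POP command or from I/O), and keys that are being watched for check-and-set. (By default there are 16 separate databases, which can be thought of as namespaces within a Redis server.)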
TCP socket
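initServer() is where Redis sets up the socket on which it listens for connections (bound to port 6379 by default). Another Redis-local wrapper, anet.h, defines anetTcpServer() and a number of other functions that simplify the usual complexity of setting up a new socket, binding it, and listening on a port.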
serverCron
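initServer() further allocates various dicts and lists for the databases and for pub/sub, resets statistics and various flags, and notes the UNIX timestamp of the server start time. serverCron() is registered with the event loop as a time event that runs once every 100 milliseconds. (That is not entirely accurate: serverCron() is initially scheduled to run after 1 millisecond so that it starts right away at server startup, and only after that does it run every 100 milliseconds.)
serverCron() performs a number of periodic tasks for Redis: verbose logging of the database size (number of keys and memory used) and of connected clients, resizing hash tables, closing idle or timed-out client connections, cleaning up after a finished background save or AOF rewrite, and starting a background save if the configured save conditions have been met (so many keys changed in so many seconds).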
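The "first after 1 millisecond, then every 100 milliseconds" scheduling can be sketched as a time event whose handler returns the delay until its next run. The names and the explicit nowMs parameter are assumptions made so the sketch can be exercised with a simulated clock; this is not the ae.c implementation.

```c
#include <assert.h>
#include <stddef.h>

/* A time event handler returns the number of milliseconds
 * after which it wants to run again. */
typedef int demoTimeProc(void *clientData);

typedef struct demoTimeEvent {
    long long whenMs;      /* absolute time of the next run */
    demoTimeProc *proc;
    void *clientData;
} demoTimeEvent;

/* Schedule proc to run delayMs milliseconds after nowMs. */
void demoCreateTimeEvent(demoTimeEvent *te, long long nowMs, long long delayMs,
                         demoTimeProc *proc, void *clientData) {
    assert(te != NULL && proc != NULL);
    te->whenMs = nowMs + delayMs;
    te->proc = proc;
    te->clientData = clientData;
}

/* Run the event if it is due and reschedule it using the handler's
 * return value. Returns 1 if the handler ran, 0 otherwise. */
int demoProcessTimeEvent(demoTimeEvent *te, long long nowMs) {
    assert(te != NULL);
    if (nowMs < te->whenMs) return 0;
    int nextDelayMs = te->proc(te->clientData);
    te->whenMs = nowMs + nextDelayMs;
    return 1;
}

/* Stand-in for serverCron(): count the run and ask to be called
 * again in 100 milliseconds. */
int demoServerCron(void *clientData) {
    if (clientData != NULL) (*(int *)clientData)++;
    return 100;
}
```

Registering demoServerCron with a delay of 1 gives the behaviour described above: the first run happens almost immediately, and each run then asks for the next one 100 milliseconds later.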
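Registering the connection handler with the event loop
initServer() hooks the event loop up to the server's TCP socket by registering the socket's descriptor together with acceptHandler(), the function to call when a new connection is accepted.
Opening the AOF
initServer() creates the AOF file, or simply opens it if the file already exists.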
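Saving the main process ID
If the server is configured to run as a daemon, Redis now tries to write out a pid file (its path is configurable and defaults to /var/run/redis.pid).
At this point the server has started, and Redis records that fact in its log file.
Restoring AOF/RDB data
If an AOF or a database dump file (for example dump.rdb) exists, it is loaded, which restores the server's data from the previous session. If both exist, the AOF takes precedence.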
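Event loop setup
Finally, Redis registers a function, beforeSleep(), to be called each time it enters the event loop (since the process effectively goes to sleep while it waits to be notified of events).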
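Entering the event loop
Redis enters the main event loop by calling aeMain() with server.el as its argument (remember that this member holds a pointer to an aeEventLoop). If there are any time events or file events to process on a pass through the loop, their respective handler functions are called. aeProcessEvents() encapsulates this logic: time events are handled by custom logic, while file events are handled by the underlying epoll, kqueue, or select I/O event notification system.
Because Redis has to respond to time events as well as file I/O events, it implements its own event/polling loop, aeMain(). By checking whether any time events need processing and by relying on file event notification, the event loop can sleep efficiently until there is work to do, without spinning the CPU in a tight while loop.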
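Handling a new connection
Redis registered acceptHandler() to be called when there is an I/O event associated with the file descriptor of the socket the server is listening on (that is, the socket has data waiting to be read or written). acceptHandler() creates a client object, a pointer to a redisClient struct, to represent the new client connection.
createClient() is called to allocate and initialize the client object. By default it selects database 0 (every server must have at least one Redis db) and associates the client file descriptor produced in acceptHandler() with the client object. Finally, the client is appended to the global list of tracked clients, server.clients. A handler, readQueryFromClient(), is registered with the event loop to be called whenever there is data to read from the client connection.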
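Reading a command from a client
readQueryFromClient() is called by the main event loop when a client sends a command request. It reads as much of the command as it can (up to 1024 bytes) into a temporary buffer and then appends that to a client-specific buffer, the query buffer. This lets Redis handle commands whose payload is larger than 1024 bytes, as well as commands that were split across several read events for I/O reasons. It then calls processInputBuffer(), passing the client object as the argument.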
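The "read up to 1024 bytes, then append to the per-client buffer" step might look roughly like the following. demoClient and demoReadQuery() are hypothetical, and a plain realloc()-grown char buffer stands in for the sds string that Redis actually uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DEMO_IOBUF_LEN 1024  /* size of the temporary read buffer */

/* Hypothetical, minimal client: a descriptor plus its query buffer. */
typedef struct demoClient {
    int fd;
    char *querybuf;
    size_t querylen;
} demoClient;

/* Read at most DEMO_IOBUF_LEN bytes and append them to the query buffer.
 * Returns the number of bytes read, 0 on end of file, -1 on error. */
ssize_t demoReadQuery(demoClient *c) {
    char buf[DEMO_IOBUF_LEN];
    assert(c != NULL);
    ssize_t nread = read(c->fd, buf, sizeof(buf));
    if (nread <= 0) return nread;
    /* One extra byte keeps the accumulated query NUL-terminated. */
    char *grown = realloc(c->querybuf, c->querylen + (size_t)nread + 1);
    if (grown == NULL) return -1;
    memcpy(grown + c->querylen, buf, (size_t)nread);
    c->querylen += (size_t)nread;
    grown[c->querylen] = '\0';
    c->querybuf = grown;
    return nread;
}
```

A request larger than the temporary buffer simply arrives over several calls, each of which extends querybuf, and the parser can then work on the accumulated bytes.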
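processInputBuffer() parses the raw query from the client into the arguments used to execute a Redis command. It first has to deal with the possibility that the client is blocked in a B{L,R}POP command, and it bails out early in that case. The function then parses the raw query buffer into arguments, creating a Redis string object for each one and storing them in an array on the client object. The query is in the form of the Redis protocol, so processInputBuffer() is really a protocol parser; it hands over to processCommand() to finish interpreting the query.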
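As an illustration of what parsing the protocol into arguments involves, the sketch below handles the multi-bulk request form, in which GET foo is sent as "*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n". The function names, the 1 MiB length cap, and the use of plain C strings instead of Redis string objects are all assumptions of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_ARGS 16

/* Parse "<prefix><digits>\r\n" at *p and advance *p past the CRLF.
 * Returns 0 on success, -1 if the input is incomplete or malformed. */
static int demoParseLength(const char **p, const char *end, char prefix,
                           long *out) {
    const char *s = *p;
    long v = 0;
    if (s >= end || *s != prefix) return -1;
    s++;
    if (s >= end || *s < '0' || *s > '9') return -1;  /* need one digit */
    while (s < end && *s >= '0' && *s <= '9') {
        v = v * 10 + (*s - '0');
        if (v > 1024 * 1024) return -1;  /* arbitrary cap for this sketch */
        s++;
    }
    if (end - s < 2 || s[0] != '\r' || s[1] != '\n') return -1;
    *out = v;
    *p = s + 2;
    return 0;
}

/* Parse one multi-bulk request into freshly allocated C strings.
 * Returns the argument count, or -1 if the request is incomplete or
 * malformed (the sketch does not tell these two cases apart). */
int demoParseMultibulk(const char *buf, size_t len, char *argv[], int maxArgs) {
    const char *p = buf, *end = buf + len;
    long count, bulklen;
    int argc = 0;
    assert(buf != NULL && argv != NULL);
    if (demoParseLength(&p, end, '*', &count) != 0) return -1;
    if (count <= 0 || count > maxArgs) return -1;
    for (long i = 0; i < count; i++) {
        if (demoParseLength(&p, end, '$', &bulklen) != 0) goto fail;
        if (end - p < bulklen + 2) goto fail;
        if (p[bulklen] != '\r' || p[bulklen + 1] != '\n') goto fail;
        argv[argc] = malloc((size_t)bulklen + 1);
        if (argv[argc] == NULL) goto fail;
        memcpy(argv[argc], p, (size_t)bulklen);
        argv[argc][bulklen] = '\0';
        argc++;
        p += bulklen + 2;
    }
    return argc;
fail:
    while (argc > 0) free(argv[--argc]);
    return -1;
}
```

The caller owns the returned strings and frees each argv[i] once the command has been handled.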
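processCommand() takes the arguments of a command from the client and executes it. Before actually executing the command it performs a number of checks; if any check fails, it appends an error message to the client object's reply list and returns to its caller, processInputBuffer(). After handling the QUIT command as a special case (so that the client can be shut down safely), processCommand() looks the command name up in commandTable, which was set up earlier during Redis's startup cycle. It is an error if the command is unknown or if the client got the number of arguments wrong. Although it is not commonly used, Redis can be configured to require a password before it accepts commands from a client; this is the stage at which Redis checks whether the client is authenticated, and it sets an error if not. If Redis is configured with a maximum amount of memory, it tries at this point to free memory if it can (freeing objects from the free list and removing expired keys); if the server is still over the limit, it refuses commands that have the REDIS_CMD_DENYOOM flag set (mainly writes such as SET, INCR, RPUSH, and ZADD), which is again an error. The last check is that a client with outstanding channel subscriptions may issue only SUBSCRIBE or UNSUBSCRIBE commands; anything else is an error. If all checks pass, the command is executed by calling call() with the client object and the command object as arguments.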
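The order of these checks can be sketched as a single function that returns the first failure. The types and names are hypothetical and each check is heavily simplified (for example, a real server has to let the AUTH command through before authentication). The arity rule used here is the Redis convention: a positive arity means exactly that many arguments, command name included, and a negative arity means at least that many.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    DEMO_OK,
    DEMO_ERR_UNKNOWN,  /* command not found in the table */
    DEMO_ERR_ARITY,    /* wrong number of arguments */
    DEMO_ERR_NOAUTH,   /* password required but client not authenticated */
    DEMO_ERR_OOM       /* over the memory limit and command may grow memory */
} demoCheckResult;

/* Hypothetical subset of a command table entry. */
typedef struct demoCommandInfo {
    int arity;
    int denyoom;  /* plays the role of the REDIS_CMD_DENYOOM flag */
} demoCommandInfo;

/* Hypothetical snapshot of the server and client state that matters here. */
typedef struct demoCheckState {
    int requirepass;
    int authenticated;
    int overMemoryLimit;
} demoCheckState;

/* Positive arity: exact count. Negative arity: at least |arity|. */
int demoArityOk(int arity, int argc) {
    assert(argc > 0);
    if (arity > 0) return argc == arity;
    return argc >= -arity;
}

/* Run the checks in order and report the first one that fails.
 * cmd is the lookup result and may be NULL for an unknown command. */
demoCheckResult demoCheckCommand(const demoCommandInfo *cmd, int argc,
                                 const demoCheckState *st) {
    assert(st != NULL);
    if (cmd == NULL) return DEMO_ERR_UNKNOWN;
    if (!demoArityOk(cmd->arity, argc)) return DEMO_ERR_ARITY;
    if (st->requirepass && !st->authenticated) return DEMO_ERR_NOAUTH;
    if (st->overMemoryLimit && cmd->denyoom) return DEMO_ERR_OOM;
    return DEMO_OK;
}
```

Only when the result is DEMO_OK would the command go on to be executed.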
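Executing the command and responding
call() obtains a pointer to a function of type redisCommandProc from the proc member of the command object and calls it with the client object.
Write commands such as SET and ZADD make the server "dirty"; in other words, the server is flagged as having pages in memory that have changed. This matters for the automatic save process, which tracks how many keys have changed within a given period, and for writing to the AOF. If use of the AOF is enabled, the function calls feedAppendOnlyFile(), which writes the command buffer from the client out to the AOF so that the command can be replayed. (It translates commands that set a relative key expiration into commands that set an absolute expiration, but otherwise it basically copies the command as it came in from the client; see catAppendOnlyGenericCommand().) If any slaves are connected, call() sends the command to each of them so that they execute it locally; see replicationFeedSlaves(). Likewise, if any clients are connected and have issued the MONITOR command, Redis sends them a representation of the command prefixed with a timestamp; see replicationFeedMonitors().
Control returns to the caller, processCommand(), which resets the client object for subsequent commands.
As mentioned earlier, each Redis command procedure is itself responsible for setting the response to be sent to the client. After readQueryFromClient() exits and Redis returns to the event loop in aeMain(), aeProcessEvents() picks up the response waiting in the write buffer and copies it to the socket of the client connection.
The client and the server are both back in a state where they can, respectively, issue and process more Redis commands.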
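The way call() detects a write and decides whether to propagate it can be sketched by comparing the dirty counter before and after the command procedure runs. All names are hypothetical, and the AOF and replication steps are reduced to counters that stand in for feedAppendOnlyFile() and replicationFeedSlaves().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical server state: only what this sketch needs. */
typedef struct demoCallServer {
    long long dirty;  /* number of changes made so far */
    int aofEnabled;
    int aofWrites;    /* stands in for feedAppendOnlyFile() calls */
    int slaveCount;
    int slaveWrites;  /* stands in for commands sent to slaves */
} demoCallServer;

typedef struct demoCallClient {
    const char *reply;  /* the procedure sets the reply itself */
} demoCallClient;

typedef void demoCommandProc(demoCallServer *srv, demoCallClient *c);

typedef struct demoCallCommand {
    const char *name;
    demoCommandProc *proc;
} demoCallCommand;

/* A write command: it changes data, so it bumps the dirty counter. */
void demoSetProc(demoCallServer *srv, demoCallClient *c) {
    srv->dirty++;
    c->reply = "+OK\r\n";
}

/* A read command: it leaves the dirty counter alone. */
void demoGetProc(demoCallServer *srv, demoCallClient *c) {
    (void)srv;
    c->reply = "$-1\r\n";  /* nil bulk reply: key not found */
}

/* Run the procedure, then propagate only if it changed something. */
void demoCall(demoCallServer *srv, demoCallClient *c,
              const demoCallCommand *cmd) {
    assert(srv != NULL && c != NULL && cmd != NULL && cmd->proc != NULL);
    long long before = srv->dirty;
    cmd->proc(srv, c);
    if (srv->dirty - before > 0) {
        if (srv->aofEnabled) srv->aofWrites++;
        srv->slaveWrites += srv->slaveCount;
    }
}
```

Read commands therefore never reach the AOF or the slaves, since they leave the dirty counter untouched.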
1.3 The INFO command
Passing the optional section argument makes the command return only one part of the information:
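- server : general information about the Redis server, with the following fields:
  - redis_version : version of the Redis server
  - redis_git_sha1 : Git SHA1
  - redis_git_dirty : Git dirty flag
  - os : operating system hosting the Redis server
  - arch_bits : architecture (32 or 64 bits)
  - multiplexing_api : event handling mechanism used by Redis
  - gcc_version : version of the GCC compiler used to compile Redis
  - process_id : PID of the server process
  - run_id : random value identifying the Redis server (used by Sentinel and Cluster)
  - tcp_port : TCP/IP listening port
  - uptime_in_seconds : number of seconds since the Redis server started
  - uptime_in_days : number of days since the Redis server started
  - lru_clock : clock incremented every minute, used for LRU management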
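- clients : information about connected clients, with the following fields:
  - connected_clients : number of connected clients (excluding connections from slaves)
  - client_longest_output_list : longest output list among the currently connected clients
  - client_longest_input_buf : largest input buffer among the currently connected clients
  - blocked_clients : number of clients waiting on a blocking command (BLPOP, BRPOP, BRPOPLPUSH)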
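- memory : memory information, with the following fields:
  - used_memory : total amount of memory allocated by the Redis allocator, in bytes
  - used_memory_human : total amount of memory allocated by Redis, in human-readable form
  - used_memory_rss : total amount of memory Redis has allocated as seen by the operating system (the resident set size). This value matches the output of tools such as top and ps.
  - used_memory_peak : peak memory consumed by Redis, in bytes
  - used_memory_peak_human : peak memory consumed by Redis, in human-readable form
  - used_memory_lua : amount of memory used by the Lua engine, in bytes
  - mem_fragmentation_ratio : ratio between used_memory_rss and used_memory
  - mem_allocator : memory allocator used by Redis, chosen at compile time. It can be libc, jemalloc, or tcmalloc.
Ideally, used_memory_rss should be only slightly higher than used_memory.
When rss > used and the difference is large, there is (internal or external) memory fragmentation.
The degree of fragmentation can be read from the value of mem_fragmentation_ratio.
When used > rss, part of Redis's memory has been swapped out by the operating system, and operations may then incur noticeable latency.
When Redis frees memory, the allocator may or may not return that memory to the operating system.
If Redis has freed memory without returning it to the operating system, the value of used_memory may not match the Redis memory consumption reported by the operating system.
Checking the value of used_memory_peak can confirm whether this has happened.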
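- persistence : RDB and AOF related information
- stats : general statistics
- replication : master/slave replication information
- cpu : CPU consumption statistics
- commandstats : Redis command statistics
- cluster : Redis Cluster information
- keyspace : database related statistics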
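To tie the memory fields together, here is a small sketch that pulls numeric fields out of INFO output and computes the ratio of used_memory_rss to used_memory. It assumes that INFO replies consist of field:value lines separated by "\r\n"; demoInfoField() and demoFragmentationRatio() are hypothetical helpers, and in practice the input text would come from a real server.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Find a line that starts with "<field>:" and parse the number after
 * the colon. Returns 0 on success, -1 if the field is not present. */
int demoInfoField(const char *info, const char *field, long long *out) {
    assert(info != NULL && field != NULL && out != NULL);
    size_t flen = strlen(field);
    const char *line = info;
    while (line != NULL && *line != '\0') {
        /* Requiring ':' right after the name keeps "used_memory"
         * from matching "used_memory_rss". */
        if (strncmp(line, field, flen) == 0 && line[flen] == ':') {
            *out = strtoll(line + flen + 1, NULL, 10);
            return 0;
        }
        line = strchr(line, '\n');
        if (line != NULL) line++;  /* step to the start of the next line */
    }
    return -1;
}

/* Ratio of resident set size to allocator-reported usage.
 * Returns 0.0 when used is not positive. */
double demoFragmentationRatio(long long rss, long long used) {
    return used > 0 ? (double)rss / (double)used : 0.0;
}
```

A ratio well above 1 points to fragmentation, while a ratio below 1 suggests that part of the memory has been swapped out, as described in the memory notes above.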