The previous chapter looked at how the client side is represented as an object that records its identity and current state; this chapter turns to the server.
The Redis server is responsible for establishing network connections with multiple clients, handling the command requests they send, saving in the database the data produced by executing those commands, and keeping itself running through proper resource management.
Taking a client sending a command to the server as an example, this is how client and server interact.
When the client issues
SET KEY VALUE
and the server replies
OK
the steps in between are:
1): The client sends the command request SET KEY VALUE to the server.
2): The server receives and processes the request SET KEY VALUE, and produces the reply OK.
3): The server returns the reply to the client.
4): The client receives the reply and prints it for the user.
The rest of this chapter walks through these steps in detail.
When the client sends SET KEY VALUE, it first converts the string into the protocol format and then sends it to the server:
SET KEY VALUE
*3\r\n$3\r\nSET\r\n$3\r\nKEY\r\n$5\r\nVALUE\r\n
As the converted form shows, the request protocol is very straightforward: first the number of arguments, then the byte length of each argument followed by its content, all delimited by \r\n.
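To make the conversion concrete, here is a minimal, self-contained sketch (not Redis code; the helper name build_resp_request is made up) that builds the same multibulk request from an argument array:
#include <stdio.h>
#include <string.h>

/* Encode argv[0..argc-1] as a RESP multibulk request: the argument count, then
 * each argument's byte length and payload, all terminated by CRLF. */
static int build_resp_request(char *out, size_t outlen, int argc, const char **argv) {
    size_t n = snprintf(out, outlen, "*%d\r\n", argc);
    for (int i = 0; i < argc && n < outlen; i++)
        n += snprintf(out + n, outlen - n, "$%zu\r\n%s\r\n",
                      strlen(argv[i]), argv[i]);
    return n < outlen ? (int)n : -1;  /* -1 if the buffer was too small */
}

int main(void) {
    const char *argv[] = {"SET", "KEY", "VALUE"};
    char buf[128];
    if (build_resp_request(buf, sizeof(buf), 3, argv) > 0)
        printf("%s", buf);  /* *3\r\n$3\r\nSET\r\n$3\r\nKEY\r\n$5\r\nVALUE\r\n, with real CRLFs */
    return 0;
}
In practice the client library (or redis-cli) performs exactly this kind of encoding before writing to the socket.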
Once the command has been sent, the socket that the server has associated with this client becomes readable, and the server invokes the command request handler to deal with it.
The event handling behind this was covered in the earlier chapters.
When the readable event fires, the handler first reads the command into querybuf, then parses it into argv and argc, looks the command up in the command table using the string in argv[0], points the cmd field of redisClient at the entry it finds, and finally invokes the command executor, which runs the client's command from cmd, argv and argc.
Looking up the command is simply a dictionary lookup: the key is the command name string, and the value is a redisCommand structure, defined as follows.
struct redisCommand {
    // command name
    char *name;
    // implementation function
    redisCommandProc *proc;
    // number of arguments
    int arity;
    // flags as a string
    char *sflags; /* Flags as string representation, one char per flag. */
    // the actual flags
    int flags;    /* The actual flags, obtained from the 'sflags' field. */
    /* Use a function to determine keys arguments in a command line.
     * Used for Redis Cluster redirect. */
    // optional function that extracts the key arguments from a command line,
    // used for Redis Cluster redirection
    redisGetKeysProc *getkeys_proc;
    /* What keys should be loaded in background when calling this command? */
    // which arguments are keys
    int firstkey; /* The first argument that's a key (0 = no keys) */
    int lastkey;  /* The last argument that's a key */
    int keystep;  /* The step between first and last key */
    // statistics:
    // microseconds is the total number of microseconds spent executing the command
    // calls is the total number of times the command has been executed
    long long microseconds, calls;
};
The following code snippet shows how the string form in sflags is converted into the actual flags:
char *f = c->sflags;
int retval1, retval2;
// turn the string FLAGs into the actual FLAGs
while(*f != '\0') {
    switch(*f) {
    case 'w': c->flags |= REDIS_CMD_WRITE; break;
    case 'r': c->flags |= REDIS_CMD_READONLY; break;
    case 'm': c->flags |= REDIS_CMD_DENYOOM; break;
    case 'a': c->flags |= REDIS_CMD_ADMIN; break;
    case 'p': c->flags |= REDIS_CMD_PUBSUB; break;
    case 's': c->flags |= REDIS_CMD_NOSCRIPT; break;
    case 'R': c->flags |= REDIS_CMD_RANDOM; break;
    case 'S': c->flags |= REDIS_CMD_SORT_FOR_SCRIPT; break;
    case 'l': c->flags |= REDIS_CMD_LOADING; break;
    case 't': c->flags |= REDIS_CMD_STALE; break;
    case 'M': c->flags |= REDIS_CMD_SKIP_MONITOR; break;
    case 'k': c->flags |= REDIS_CMD_ASKING; break;
    default: redisPanic("Unsupported command flag"); break;
    }
    f++;
}
#define REDIS_CMD_WRITE 1 /* "w" flag */
#define REDIS_CMD_READONLY 2 /* "r" flag */
#define REDIS_CMD_DENYOOM 4 /* "m" flag */
#define REDIS_CMD_NOT_USED_1 8 /* no longer used flag */
#define REDIS_CMD_ADMIN 16 /* "a" flag */
#define REDIS_CMD_PUBSUB 32 /* "p" flag */
#define REDIS_CMD_NOSCRIPT 64 /* "s" flag */
#define REDIS_CMD_RANDOM 128 /* "R" flag */
#define REDIS_CMD_SORT_FOR_SCRIPT 256 /* "S" flag */
#define REDIS_CMD_LOADING 512 /* "l" flag */
#define REDIS_CMD_STALE 1024 /* "t" flag */
#define REDIS_CMD_SKIP_MONITOR 2048 /* "M" flag */
#define REDIS_CMD_ASKING 4096 /* "k" flag */
Flag | Meaning | Commands carrying the flag |
---|---|---|
w | Write command, may modify the database | SET, RPUSH, DEL, ... |
r | Read-only command, does not modify the database | GET, STRLEN, EXISTS, ... |
m | May use a lot of memory; memory usage is checked before execution | SET, RPUSH, APPEND, LPUSH, SADD, ... |
a | Administration command | SAVE, BGSAVE, BGREWRITEAOF, ... |
p | Pub/Sub related command | PUBLISH, SUBSCRIBE, PUBSUB, ... |
s | Cannot be used inside Lua scripts | BRPOP, BLPOP, BRPOPLPUSH, SPOP, ... |
R | Random command: the same dataset may produce different results | SPOP, SRANDMEMBER, SSCAN, RANDOMKEY, ... |
S | When called from a Lua script, the result is sorted so the output is deterministic | SINTER, SUNION, SDIFF, SMEMBERS, KEYS, ... |
l | Allowed while the server is loading data | INFO, SHUTDOWN, PUBLISH, ... |
t | Allowed on a slave that only has stale data | SLAVEOF, PING, INFO, ... |
M | Not automatically propagated to MONITOR clients | EXEC |
Take the SET command's entry in the command table as an example:
{"set",setCommand,-3,"wm",0,NULL,1,1,1,0,0},
According to the table above, the "wm" flags mean it is a write command that may modify the database and may use a lot of memory, so memory usage is checked before it runs.
/* SET key value [NX] [XX] [EX ] [PX ] */
void setCommand(redisClient *c) {
int j;
robj *expire = NULL;
int unit = UNIT_SECONDS;
int flags = REDIS_SET_NO_FLAGS;
// 设置选项参数
for (j = 3; j < c->argc; j++) {
char *a = c->argv[j]->ptr;
robj *next = (j == c->argc-1) ? NULL : c->argv[j+1];
if ((a[0] == 'n' || a[0] == 'N') &&
(a[1] == 'x' || a[1] == 'X') && a[2] == '\0') {
flags |= REDIS_SET_NX;
} else if ((a[0] == 'x' || a[0] == 'X') &&
(a[1] == 'x' || a[1] == 'X') && a[2] == '\0') {
flags |= REDIS_SET_XX;
} else if ((a[0] == 'e' || a[0] == 'E') &&
(a[1] == 'x' || a[1] == 'X') && a[2] == '\0' && next) {
unit = UNIT_SECONDS;
expire = next;
j++;
} else if ((a[0] == 'p' || a[0] == 'P') &&
(a[1] == 'x' || a[1] == 'X') && a[2] == '\0' && next) {
unit = UNIT_MILLISECONDS;
expire = next;
j++;
} else {
addReply(c,shared.syntaxerr);
return;
}
}
// 尝试对值对象进行编码
c->argv[2] = tryObjectEncoding(c->argv[2]);
setGenericCommand(c,flags,c->argv[1],c->argv[2],expire,unit,NULL,NULL);
}
That is the function behind the SET entry: starting from the third argument it parses the option flags, and the final two lines perform the actual store; we will not go deeper into it here.
Now back to the command executor.
One more thing worth noting: argv[0] is case-insensitive, so set, SET, SEt and sEt all behave exactly the same.
unsigned int dictGenCaseHashFunction(const unsigned char *buf, int len) {
unsigned int hash = (unsigned int)dict_hash_function_seed;
while (len--)
hash = ((hash << 5) + hash) + (tolower(*buf++)); /* hash * 33 + c */
return hash;
}
The hash calculation runs every byte through tolower(), so different casings of the same name produce the same hash value.
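The command table's dict type pairs this case-insensitive hash with a case-insensitive key comparison (strcasecmp style), which is why every casing of a name finds the same entry. A small standalone illustration, with the hash seed fixed to the classic djb2 value instead of Redis' configurable seed:
#include <stdio.h>
#include <strings.h>  /* strcasecmp */
#include <ctype.h>

/* Same idea as dictGenCaseHashFunction: times-33 hash over lowercased bytes. */
static unsigned int case_insensitive_hash(const char *buf) {
    unsigned int hash = 5381;  /* stand-in for dict_hash_function_seed */
    while (*buf)
        hash = ((hash << 5) + hash) + tolower((unsigned char)*buf++);
    return hash;
}

int main(void) {
    printf("%u %u\n", case_insensitive_hash("set"), case_insensitive_hash("SET"));  /* identical */
    printf("%d\n", strcasecmp("set", "SET") == 0);  /* 1: the keys also compare equal */
    return 0;
}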
Once the command implementation has been found, a series of pre-checks is performed.
Here is a brief summary of those checks, following the code in processCommand():
c->cmd = c->lastcmd = lookupCommand(c->argv[0]->ptr);
if (!c->cmd) {
// 没找到指定的命令
flagTransaction(c);
addReplyErrorFormat(c,"unknown command '%s'",
(char*)c->argv[0]->ptr);
return REDIS_OK;
}
1): Check whether cmd was resolved to a command implementation.
else if ((c->cmd->arity > 0 && c->cmd->arity != c->argc) ||
(c->argc < -c->cmd->arity)) {
// 参数个数错误
flagTransaction(c);
addReplyErrorFormat(c,"wrong number of arguments for '%s' command",
c->cmd->name);
return REDIS_OK;
}
2): Check that the number of arguments is correct.
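The arity field works in two modes: a positive value demands an exact argument count, while a negative value means at least that many arguments. SET's entry uses -3, so both SET KEY VALUE and SET KEY VALUE EX 10 pass. A tiny sketch of the rule used above (my own helper name, not a Redis function):
#include <stdio.h>

/* Mirrors the arity test in processCommand(): positive = exact, negative = minimum. */
static int arity_ok(int arity, int argc) {
    if (arity > 0) return argc == arity;
    return argc >= -arity;
}

int main(void) {
    printf("%d\n", arity_ok(-3, 3));  /* SET KEY VALUE        -> 1 */
    printf("%d\n", arity_ok(-3, 5));  /* SET KEY VALUE EX 10  -> 1 */
    printf("%d\n", arity_ok(-3, 2));  /* SET KEY              -> 0, wrong number of arguments */
    return 0;
}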
if (server.requirepass && !c->authenticated && c->cmd->proc != authCommand)
{
flagTransaction(c);
addReply(c,shared.noautherr);
return REDIS_OK;
}
3): Check authentication, if a password is required.
if (server.maxmemory) {
// 如果内存已超过限制,那么尝试通过删除过期键来释放内存
int retval = freeMemoryIfNeeded();
// 如果即将要执行的命令可能占用大量内存(REDIS_CMD_DENYOOM)
// 并且前面的内存释放失败的话
// 那么向客户端返回内存错误
if ((c->cmd->flags & REDIS_CMD_DENYOOM) && retval == REDIS_ERR) {
flagTransaction(c);
addReply(c, shared.oomerr);
return REDIS_OK;
}
}
4): Check memory usage, freeing memory first if a limit is configured.
if (((server.stop_writes_on_bgsave_err &&
server.saveparamslen > 0 &&
server.lastbgsave_status == REDIS_ERR) ||
server.aof_last_write_status == REDIS_ERR) &&
server.masterhost == NULL &&
(c->cmd->flags & REDIS_CMD_WRITE ||
c->cmd->proc == pingCommand))
{
flagTransaction(c);
if (server.aof_last_write_status == REDIS_OK)
addReply(c, shared.bgsaveerr);
else
addReplySds(c,
sdscatprintf(sdsempty(),
"-MISCONF Errors writing to the AOF file: %s\r\n",
strerror(server.aof_last_write_errno)));
return REDIS_OK;
}
5): If the last BGSAVE failed, the stop_writes_on_bgsave_err option is enabled, and the command is a write command (REDIS_CMD_WRITE), return an error.
/* Only allow SUBSCRIBE and UNSUBSCRIBE in the context of Pub/Sub */
// 在订阅于发布模式的上下文中,只能执行订阅和退订相关的命令
if ((dictSize(c->pubsub_channels) > 0 || listLength(c->pubsub_patterns) > 0)
&&
c->cmd->proc != subscribeCommand &&
c->cmd->proc != unsubscribeCommand &&
c->cmd->proc != psubscribeCommand &&
c->cmd->proc != punsubscribeCommand) {
addReplyError(c,"only (P)SUBSCRIBE / (P)UNSUBSCRIBE / QUIT allowed in this context");
return REDIS_OK;
}
6): If the client is in a Pub/Sub subscription context, reject every command other than the subscribe/unsubscribe family.
if (server.loading && !(c->cmd->flags & REDIS_CMD_LOADING)) {
addReply(c, shared.loadingerr);
return REDIS_OK;
}
7): While the server is loading data, only commands carrying the REDIS_CMD_LOADING flag are allowed.
if (server.lua_timedout &&
c->cmd->proc != authCommand &&
c->cmd->proc != replconfCommand &&
!(c->cmd->proc == shutdownCommand &&
c->argc == 2 &&
tolower(((char*)c->argv[1]->ptr)[0]) == 'n') &&
!(c->cmd->proc == scriptCommand &&
c->argc == 2 &&
tolower(((char*)c->argv[1]->ptr)[0]) == 'k'))
{
flagTransaction(c);
addReply(c, shared.slowscripterr);
return REDIS_OK;
}
8): If a Lua script has timed out, only a limited set of commands is allowed.
if (c->flags & REDIS_MULTI &&
c->cmd->proc != execCommand && c->cmd->proc != discardCommand &&
c->cmd->proc != multiCommand && c->cmd->proc != watchCommand)
{
// 在事务上下文中
// 除 EXEC 、 DISCARD 、 MULTI 和 WATCH 命令之外
// 其他所有命令都会被入队到事务队列中
queueMultiCommand(c);
addReply(c,shared.queued);
}
9): If the client is inside a transaction, queue the command instead of executing it (EXEC, DISCARD, MULTI and WATCH excepted).
/* If cluster is enabled perform the cluster redirection here.
*
* 如果开启了集群模式,那么在这里进行转向操作。
*
* However we don't perform the redirection if:
*
* 不过,如果有以下情况出现,那么节点不进行转向:
*
* 1) The sender of this command is our master.
* 命令的发送者是本节点的主节点
*
* 2) The command has no key arguments.
* 命令没有 key 参数
*/
if (server.cluster_enabled &&
!(c->flags & REDIS_MASTER) &&
!(c->cmd->getkeys_proc == NULL && c->cmd->firstkey == 0))
{
int hashslot;
// 集群已下线
if (server.cluster->state != REDIS_CLUSTER_OK) {
flagTransaction(c);
addReplySds(c,sdsnew("-CLUSTERDOWN The cluster is down. Use CLUSTER INFO for more information\r\n"));
return REDIS_OK;
// 集群运作正常
} else {
int error_code;
clusterNode *n = getNodeByQuery(c,c->cmd,c->argv,c->argc,&hashslot,&error_code);
// 不能执行多键处理命令
if (n == NULL) {
flagTransaction(c);
if (error_code == REDIS_CLUSTER_REDIR_CROSS_SLOT) {
addReplySds(c,sdsnew("-CROSSSLOT Keys in request don't hash to the same slot\r\n"));
} else if (error_code == REDIS_CLUSTER_REDIR_UNSTABLE) {
/* The request spawns mutliple keys in the same slot,
* but the slot is not "stable" currently as there is
* a migration or import in progress. */
addReplySds(c,sdsnew("-TRYAGAIN Multiple keys request during rehashing of slot\r\n"));
} else {
redisPanic("getNodeByQuery() unknown error.");
}
return REDIS_OK;
// 命令针对的槽和键不是本节点处理的,进行转向
} else if (n != server.cluster->myself) {
flagTransaction(c);
// - :
// 例如 -ASK 10086 127.0.0.1:12345
addReplySds(c,sdscatprintf(sdsempty(),
"-%s %d %s:%d\r\n",
(error_code == REDIS_CLUSTER_REDIR_ASK) ? "ASK" : "MOVED",
hashslot,n->ip,n->port));
return REDIS_OK;
}
// 如果执行到这里,说明键 key 所在的槽由本节点处理
// 或者客户端执行的是无参数命令
}
}
/* Don't accept write commands if there are not enough good slaves and
* user configured the min-slaves-to-write option. */
// 如果服务器没有足够多的状态良好服务器
// 并且 min-slaves-to-write 选项已打开
if (server.repl_min_slaves_to_write &&
server.repl_min_slaves_max_lag &&
c->cmd->flags & REDIS_CMD_WRITE &&
server.repl_good_slaves_count < server.repl_min_slaves_to_write)
{
flagTransaction(c);
addReply(c, shared.noreplicaserr);
return REDIS_OK;
}
/* Don't accept write commands if this is a read only slave. But
* accept write commands if this is our master. */
// 如果这个服务器是一个只读 slave 的话,那么拒绝执行写命令
if (server.masterhost && server.repl_slave_ro &&
!(c->flags & REDIS_MASTER) &&
c->cmd->flags & REDIS_CMD_WRITE)
{
addReply(c, shared.roslaveerr);
return REDIS_OK;
}
10): Checks that the book does not cover but that exist in the code: the cluster check (in cluster mode the request may be redirected to another node), the min-slaves-to-write check, and the check that rejects write commands on a read-only slave.
Only after all of these checks pass is the command actually invoked:
call(c,REDIS_CALL_FULL);
/* Call() is the core of Redis execution of a command */
// 调用命令的实现函数,执行命令
void call(redisClient *c, int flags) {
// start 记录命令开始执行的时间
long long dirty, start, duration;
// 记录命令开始执行前的 FLAG
int client_old_flags = c->flags;
/* Sent the command to clients in MONITOR mode, only if the commands are
* not generated from reading an AOF. */
// 如果可以的话,将命令发送到 MONITOR
if (listLength(server.monitors) &&
!server.loading &&
!(c->cmd->flags & REDIS_CMD_SKIP_MONITOR))
{
replicationFeedMonitors(c,server.monitors,c->db->id,c->argv,c->argc);
}
/* Call the command. */
c->flags &= ~(REDIS_FORCE_AOF|REDIS_FORCE_REPL);
redisOpArrayInit(&server.also_propagate);
// 保留旧 dirty 计数器值
dirty = server.dirty;
// 计算命令开始执行的时间
start = ustime();
// 执行实现函数
c->cmd->proc(c);
// 计算命令执行耗费的时间
duration = ustime()-start;
// 计算命令执行之后的 dirty 值
dirty = server.dirty-dirty;
/* When EVAL is called loading the AOF we don't want commands called
* from Lua to go into the slowlog or to populate statistics. */
// 不将从 Lua 中发出的命令放入 SLOWLOG ,也不进行统计
if (server.loading && c->flags & REDIS_LUA_CLIENT)
flags &= ~(REDIS_CALL_SLOWLOG | REDIS_CALL_STATS);
/* If the caller is Lua, we want to force the EVAL caller to propagate
* the script if the command flag or client flag are forcing the
* propagation. */
// 如果调用者是 Lua ,那么根据命令 FLAG 和客户端 FLAG
// 打开传播(propagate)标志
if (c->flags & REDIS_LUA_CLIENT && server.lua_caller) {
if (c->flags & REDIS_FORCE_REPL)
server.lua_caller->flags |= REDIS_FORCE_REPL;
if (c->flags & REDIS_FORCE_AOF)
server.lua_caller->flags |= REDIS_FORCE_AOF;
}
/* Log the command into the Slow log if needed, and populate the
* per-command statistics that we show in INFO commandstats. */
// 如果有需要,将命令放到 SLOWLOG 里面
if (flags & REDIS_CALL_SLOWLOG && c->cmd->proc != execCommand)
slowlogPushEntryIfNeeded(c->argv,c->argc,duration);
// 更新命令的统计信息
if (flags & REDIS_CALL_STATS) {
c->cmd->microseconds += duration;
c->cmd->calls++;
}
/* Propagate the command into the AOF and replication link */
// 将命令复制到 AOF 和 slave 节点
if (flags & REDIS_CALL_PROPAGATE) {
int flags = REDIS_PROPAGATE_NONE;
// 强制 REPL 传播
if (c->flags & REDIS_FORCE_REPL) flags |= REDIS_PROPAGATE_REPL;
// 强制 AOF 传播
if (c->flags & REDIS_FORCE_AOF) flags |= REDIS_PROPAGATE_AOF;
// 如果数据库有被修改,那么启用 REPL 和 AOF 传播
if (dirty)
flags |= (REDIS_PROPAGATE_REPL | REDIS_PROPAGATE_AOF);
if (flags != REDIS_PROPAGATE_NONE)
propagate(c->cmd,c->db->id,c->argv,c->argc,flags);
}
/* Restore the old FORCE_AOF/REPL flags, since call can be executed
* recursively. */
// 将客户端的 FLAG 恢复到命令执行之前
// 因为 call 可能会递归执行
c->flags &= ~(REDIS_FORCE_AOF|REDIS_FORCE_REPL);
c->flags |= client_old_flags & (REDIS_FORCE_AOF|REDIS_FORCE_REPL);
/* Handle the alsoPropagate() API to handle commands that want to propagate
* multiple separated commands. */
// 传播额外的命令
if (server.also_propagate.numops) {
int j;
redisOp *rop;
for (j = 0; j < server.also_propagate.numops; j++) {
rop = &server.also_propagate.ops[j];
propagate(rop->cmd, rop->dbid, rop->argv, rop->argc, rop->target);
}
redisOpArrayFree(&server.also_propagate);
}
server.stat_numcommands++;
}
The first half of call() runs the command's implementation function; the second half does the follow-up work.
// execute the implementation function
c->cmd->proc(c);
This is where the command itself runs: for SET, the command-table entry already pointed proc at setCommand.
The follow-up work in the second half is bookkeeping: adding a slow-log entry if needed, updating the redisCommand call counters, appending the command to the AOF buffer when AOF persistence is on, and propagating it to any slaves.
Once that is done the command is finished, and the server moves on to the next one.
Taking setCommand as an example, the last line of
void setGenericCommand(redisClient *c, int flags, robj *key, robj *val, robj *expire, int unit, robj *ok_reply, robj *abort_reply)
is where the reply string is set:
// on success, send a reply to the client;
// its content is decided by ok_reply
addReply(c, ok_reply ? ok_reply : shared.ok);
Following addReply shows the whole reply-buffering flow:
void addReply(redisClient *c, robj *obj) {
// 为客户端安装写处理器到事件循环
if (prepareClientToWrite(c) != REDIS_OK) return;
/* This is an important place where we can avoid copy-on-write
* when there is a saving child running, avoiding touching the
* refcount field of the object if it's not needed.
*
* 如果在使用子进程,那么尽可能地避免修改对象的 refcount 域。
*
* If the encoding is RAW and there is room in the static buffer
* we'll be able to send the object to the client without
* messing with its page.
*
* 如果对象的编码为 RAW ,并且静态缓冲区中有空间
* 那么就可以在不弄乱内存页的情况下,将对象发送给客户端。
*/
if (sdsEncodedObject(obj)) {
// 首先尝试复制内容到 c->buf 中,这样可以避免内存分配
if (_addReplyToBuffer(c,obj->ptr,sdslen(obj->ptr)) != REDIS_OK)
// 如果 c->buf 中的空间不够,就复制到 c->reply 链表中
// 可能会引起内存分配
_addReplyObjectToList(c,obj);
} else if (obj->encoding == REDIS_ENCODING_INT) {
/* Optimization: if there is room in the static buffer for 32 bytes
* (more than the max chars a 64 bit integer can take as string) we
* avoid decoding the object and go for the lower level approach. */
// 优化,如果 c->buf 中有等于或多于 32 个字节的空间
// 那么将整数直接以字符串的形式复制到 c->buf 中
if (listLength(c->reply) == 0 && (sizeof(c->buf) - c->bufpos) >= 32) {
char buf[32];
int len;
len = ll2string(buf,sizeof(buf),(long)obj->ptr);
if (_addReplyToBuffer(c,buf,len) == REDIS_OK)
return;
/* else... continue with the normal code path, but should never
* happen actually since we verified there is room. */
}
// 执行到这里,代表对象是整数,并且长度大于 32 位
// 将它转换为字符串
obj = getDecodedObject(obj);
// 保存到缓存中
if (_addReplyToBuffer(c,obj->ptr,sdslen(obj->ptr)) != REDIS_OK)
_addReplyObjectToList(c,obj);
decrRefCount(obj);
} else {
redisPanic("Wrong obj->encoding in addReply()");
}
}
The first line, prepareClientToWrite, installs the write handler for the client, so on a later event-loop iteration the buffered result is written back to the client process.
When the server's reply is +OK\r\n, the client parses it and prints OK to the terminal.
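On the client side, a status reply is recognized by its leading '+' and terminated by CRLF. A minimal sketch of that parsing step (illustrative only, not redis-cli's actual code):
#include <stdio.h>
#include <string.h>

/* Print a RESP status reply such as "+OK\r\n" the way a client would display it. */
static void print_status_reply(const char *reply) {
    const char *end;
    if (reply[0] != '+') return;            /* not a status reply */
    end = strstr(reply, "\r\n");
    if (end == NULL) return;                /* incomplete reply */
    printf("%.*s\n", (int)(end - (reply + 1)), reply + 1);  /* drop the '+' and the CRLF */
}

int main(void) {
    print_status_reply("+OK\r\n");  /* prints OK */
    return 0;
}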
That completes the life of a single command.
serverCron is registered to run through aeCreateTimeEvent:
if(aeCreateTimeEvent(server.el, 1, serverCron, NULL, NULL) == AE_ERR) {
redisPanic("Can't create the serverCron time event.");
exit(1);
}
With the default settings it then runs every 100 milliseconds, that is, ten times per second; the work serverCron does is described in detail below.
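The 1 passed to aeCreateTimeEvent is only the initial delay in milliseconds; after that the event re-arms itself. As far as I recall from the 3.0 sources, a time-event callback's return value is reused as its next firing interval (returning AE_NOMORE would delete the event), and serverCron returns 1000/server.hz, which with the default hz of 10 gives the 100 ms period:
#include <stdio.h>

/* Assumed contract: serverCron returns the number of milliseconds until its next run. */
static int next_interval_ms(int hz) {
    return 1000 / hz;
}

int main(void) {
    printf("hz = 10  -> serverCron period %d ms\n", next_interval_ms(10));   /* 100 ms */
    printf("hz = 100 -> serverCron period %d ms\n", next_interval_ms(100));  /* 10 ms  */
    return 0;
}
Raising hz in the configuration therefore makes all of the periodic work described below run more often.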
redisServer has two fields that cache the current server time:
time_t unixtime;
long long mstime;
The first few lines of serverCron update these two values:
/* We take a cached value of the unix time in the global state because with
* virtual memory and aging there is to store the current time in objects at
* every object access, and accuracy is not needed. To access a global var is
* a lot faster than calling time(NULL) */
void updateCachedTime(void) {
server.unixtime = time(NULL);
server.mstime = mstime();
}
Because they are refreshed only once per serverCron run (every 100 ms by default), their precision is limited, so they are used only where high accuracy is not needed.
Features that do need accurate timestamps, such as setting key expiration times or recording slow-log entries, call the time functions directly instead of reading these cached fields.
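Those precise code paths rely on thin wrappers around gettimeofday() rather than the cached fields; a self-contained version of such a helper, in the spirit of Redis' mstime()/ustime():
#include <stdio.h>
#include <sys/time.h>

/* Millisecond-precision timestamp, the same idea as Redis' mstime() helper. */
static long long mstime_now(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return ((long long)tv.tv_sec) * 1000 + tv.tv_usec / 1000;
}

int main(void) {
    printf("now = %lld ms since the epoch\n", mstime_now());
    return 0;
}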
The server keeps an LRU clock, and every key object carries one as well; subtracting the object's clock from the server's gives the object's idle time.
The server's LRU clock is obtained as follows:
#define REDIS_LRU_CLOCK_RESOLUTION 1000 /* LRU clock resolution in ms */
#define LRU_CLOCK() ((1000/server.hz <= REDIS_LRU_CLOCK_RESOLUTION) ? server.lruclock : getLRUClock())
As the code shows, the cached server.lruclock is used whenever serverCron runs at least once per REDIS_LRU_CLOCK_RESOLUTION (1000 ms); since that cache is only refreshed on each serverCron pass, an object's computed idle time can be slightly off.
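The idle time itself is obtained by subtracting the object's lru stamp from the server clock and scaling by the resolution, taking wrap-around of the lru field into account. A hedged, self-contained sketch of that calculation (constants restated locally; the real function in object.c is estimateObjectIdleTime, if memory serves):
#include <stdio.h>

#define LRU_CLOCK_RESOLUTION 1000        /* ms per LRU tick, as in the macro above */
#define LRU_CLOCK_MAX ((1 << 24) - 1)    /* the lru field is assumed to be 24 bits wide */

/* Idle time in ms, derived from the server's LRU clock and the object's lru stamp. */
static unsigned long long idle_time_ms(unsigned int server_lru, unsigned int obj_lru) {
    unsigned long long ticks;
    if (server_lru >= obj_lru)
        ticks = server_lru - obj_lru;
    else                                  /* the clock wrapped around since the object was touched */
        ticks = server_lru + (LRU_CLOCK_MAX - obj_lru);
    return ticks * LRU_CLOCK_RESOLUTION;
}

int main(void) {
    printf("%llu ms idle\n", idle_time_ms(1000, 400));  /* 600 ticks -> 600000 ms */
    return 0;
}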
run_with_period(100) trackOperationsPerSecond();
This line runs every 100 milliseconds and estimates, by sampling, how many commands the server processed over the last second.
The value can be inspected with the INFO stats command.
/* Using the following macro you can run code inside serverCron() with the
* specified period, specified in milliseconds.
* The actual resolution depends on server.hz. */
#define run_with_period(_ms_) if ((_ms_ <= 1000/server.hz) || !(server.cronloops%((_ms_)/(1000/server.hz))))
/* Add a sample to the operations per second array of samples. */
// 将服务器的命令执行次数记录到抽样数组中
void trackOperationsPerSecond(void) {
// 计算两次抽样之间的时间长度,毫秒格式
long long t = mstime() - server.ops_sec_last_sample_time;
// 计算两次抽样之间,执行了多少个命令
long long ops = server.stat_numcommands - server.ops_sec_last_sample_ops;
long long ops_sec;
// 计算距离上一次抽样之后,每秒执行命令的数量
ops_sec = t > 0 ? (ops*1000/t) : 0;
// 将计算出的执行命令数量保存到抽样数组
server.ops_sec_samples[server.ops_sec_idx] = ops_sec;
// 更新抽样数组的索引
server.ops_sec_idx = (server.ops_sec_idx+1) % REDIS_OPS_SEC_SAMPLES;
// 更新最后一次抽样的时间
server.ops_sec_last_sample_time = mstime();
// 更新最后一次抽样时的执行命令数量
server.ops_sec_last_sample_ops = server.stat_numcommands;
}
That is the whole implementation: straightforward, with no complicated logic.
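What INFO stats reports is the average over this ring of samples; a simplified, self-contained sketch of that aggregation (the sample count of 16 is my recollection of REDIS_OPS_SEC_SAMPLES):
#include <stdio.h>

#define OPS_SEC_SAMPLES 16  /* assumed value of REDIS_OPS_SEC_SAMPLES */

/* Average of the periodic samples = estimated commands processed per second. */
static long long ops_per_second(const long long samples[OPS_SEC_SAMPLES]) {
    long long sum = 0;
    for (int j = 0; j < OPS_SEC_SAMPLES; j++)
        sum += samples[j];
    return sum / OPS_SEC_SAMPLES;
}

int main(void) {
    long long samples[OPS_SEC_SAMPLES] = {100, 120, 90, 110};  /* remaining slots are 0 */
    printf("instantaneous_ops_per_sec ~= %lld\n", ops_per_second(samples));
    return 0;
}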
size_t stat_peak_memory;
records the server's peak memory usage, which serverCron keeps up to date:
/* Record the max memory used since the server was started. */
// 记录服务器的内存峰值
if (zmalloc_used_memory() > server.stat_peak_memory)
server.stat_peak_memory = zmalloc_used_memory();
It can be queried with the INFO memory command.
/* We received a SIGTERM, shutting down here in a safe way, as it is
* not ok doing so inside the signal handler. */
// 服务器进程收到 SIGTERM 信号,关闭服务器
if (server.shutdown_asap) {
// 尝试关闭服务器
if (prepareForShutdown(0) == REDIS_OK) exit(0);
// 如果关闭失败,那么打印 LOG ,并移除关闭标识
redisLog(REDIS_WARNING,"SIGTERM received but errors trying to shut down the server, check the logs for more information");
server.shutdown_asap = 0;
}
When a SIGTERM has been received, this check triggers a clean server shutdown.
/* We need to do a few operations on clients asynchronously. */
// 检查客户端,关闭超时客户端,并释放客户端多余的缓冲区
clientsCron();
void clientsCron(void) {
/* Make sure to process at least 1/(server.hz*10) of clients per call.
*
* 这个函数每次执行都会处理至少 1/server.hz*10 个客户端。
*
* Since this function is called server.hz times per second we are sure that
* in the worst case we process all the clients in 10 seconds.
*
* 因为这个函数每秒钟会调用 server.hz 次,
* 所以在最坏情况下,服务器需要使用 10 秒钟来遍历所有客户端。
*
* In normal conditions (a reasonable number of clients) we process
* all the clients in a shorter time.
*
* 在一般情况下,遍历所有客户端所需的时间会比实际中短很多。
*/
// 客户端数量
int numclients = listLength(server.clients);
// 要处理的客户端数量
int iterations = numclients/(server.hz*10);
// 至少要处理 50 个客户端
if (iterations < 50)
iterations = (numclients < 50) ? numclients : 50;
while(listLength(server.clients) && iterations--) {
redisClient *c;
listNode *head;
/* Rotate the list, take the current head, process.
* This way if the client must be removed from the list it's the
* first element and we don't incur into O(N) computation. */
// 翻转列表,然后取出表头元素,这样一来上一个被处理的客户端会被放到表头
// 另外,如果程序要删除当前客户端,那么只要删除表头元素就可以了
listRotate(server.clients);
head = listFirst(server.clients);
c = listNodeValue(head);
/* The following functions do different service checks on the client.
* The protocol is that they return non-zero if the client was
* terminated. */
// 检查客户端,并在客户端超时时关闭它
if (clientsCronHandleTimeout(c)) continue;
// 根据情况,缩小客户端查询缓冲区的大小
if (clientsCronResizeQueryBuffer(c)) continue;
}
}
clientsCron checks whether each client's connection has timed out.
It also checks the client's input buffer: if it has grown well beyond what recent commands actually needed, the buffer is freed and recreated at the default size.
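The shrink decision in clientsCronResizeQueryBuffer is roughly: only consider buffers that have grown well beyond their normal size, and shrink them when the client has been idle for a while or its recent peak usage is far below the allocation. A rough, hypothetical sketch of such a policy (the thresholds here are invented for illustration, not the real constants):
#include <stddef.h>
#include <stdio.h>

/* Hypothetical shrink policy, loosely modeled on clientsCronResizeQueryBuffer. */
static int should_shrink_querybuf(size_t allocated, size_t recent_peak, long idle_seconds) {
    const size_t big = 32 * 1024;        /* "unusually large buffer" threshold (invented) */
    if (allocated < big) return 0;       /* small buffers are left alone */
    return idle_seconds > 2 ||           /* the client has been quiet for a while */
           recent_peak < allocated / 2;  /* or it never uses most of the space */
}

int main(void) {
    printf("%d\n", should_shrink_querybuf(64 * 1024, 1024, 0));     /* 1: mostly unused */
    printf("%d\n", should_shrink_querybuf(4 * 1024, 4 * 1024, 0));  /* 0: small and busy */
    return 0;
}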
/* Handle background operations on Redis databases. */
// 对数据库执行各种操作
databasesCron();
/* This function handles 'background' operations we are required to do
* incrementally in Redis databases, such as active key expiring, resizing,
* rehashing. */
// 对数据库执行删除过期键,调整大小,以及主动和渐进式 rehash
void databasesCron(void) {
// 函数先从数据库中删除过期键,然后再对数据库的大小进行修改
/* Expire keys by random sampling. Not required for slaves
* as master will synthesize DELs for us. */
// 如果服务器不是从服务器,那么执行主动过期键清除
if (server.active_expire_enabled && server.masterhost == NULL)
// 清除模式为 CYCLE_SLOW ,这个模式会尽量多清除过期键
activeExpireCycle(ACTIVE_EXPIRE_CYCLE_SLOW);
/* Perform hash tables rehashing if needed, but only if there are no
* other processes saving the DB on disk. Otherwise rehashing is bad
* as will cause a lot of copy-on-write of memory pages. */
// 在没有 BGSAVE 或者 BGREWRITEAOF 执行时,对哈希表进行 rehash
if (server.rdb_child_pid == -1 && server.aof_child_pid == -1) {
/* We use global counters so if we stop the computation at a given
* DB we'll be able to start from the successive in the next
* cron loop iteration. */
static unsigned int resize_db = 0;
static unsigned int rehash_db = 0;
unsigned int dbs_per_call = REDIS_DBCRON_DBS_PER_CALL;
unsigned int j;
/* Don't test more DBs than we have. */
// 设定要测试的数据库数量
if (dbs_per_call > server.dbnum) dbs_per_call = server.dbnum;
/* Resize */
// 调整字典的大小
for (j = 0; j < dbs_per_call; j++) {
tryResizeHashTables(resize_db % server.dbnum);
resize_db++;
}
/* Rehash */
// 对字典进行渐进式 rehash
if (server.activerehashing) {
for (j = 0; j < dbs_per_call; j++) {
int work_done = incrementallyRehash(rehash_db % server.dbnum);
rehash_db++;
if (work_done) {
/* If the function did some work, stop here, we'll do
* more at the next cron loop. */
break;
}
}
}
}
}
databasesCron performs active expiration of keys and, when conditions allow, resizes and incrementally rehashes the database dictionaries.
/* Start a scheduled AOF rewrite if this was requested by the user while
* a BGSAVE was in progress. */
// 如果 BGSAVE 和 BGREWRITEAOF 都没有在执行
// 并且有一个 BGREWRITEAOF 在等待,那么执行 BGREWRITEAOF
if (server.rdb_child_pid == -1 && server.aof_child_pid == -1 &&
server.aof_rewrite_scheduled)
{
rewriteAppendOnlyFileBackground();
}
It first checks that no RDB save or AOF rewrite child is currently running.
If none is, and a BGREWRITEAOF was postponed earlier, the delayed rewrite is started now.
struct redisServer {
    pid_t rdb_child_pid;
    pid_t aof_child_pid;
};
The server object keeps these two fields to track the PIDs of the persistence child processes.
When either of them is not -1, wait3() is called (non-blocking) to collect a finished child's exit status:
if (server.rdb_child_pid != -1 || server.aof_child_pid != -1) {
int statloc;
pid_t pid;
// 接收子进程发来的信号,非阻塞
if ((pid = wait3(&statloc,WNOHANG,NULL)) != 0) {
int exitcode = WEXITSTATUS(statloc);
int bysignal = 0;
if (WIFSIGNALED(statloc)) bysignal = WTERMSIG(statloc);
// BGSAVE 执行完毕
if (pid == server.rdb_child_pid) {
backgroundSaveDoneHandler(exitcode,bysignal);
// BGREWRITEAOF 执行完毕
} else if (pid == server.aof_child_pid) {
backgroundRewriteDoneHandler(exitcode,bysignal);
} else {
redisLog(REDIS_WARNING,
"Warning, detected child with unmatched pid: %ld",
(long)pid);
}
updateDictResizePolicy();
}
}
The matching completion handler then finishes the job for whichever persistence scheme was running: when the RDB child is done, the newly written RDB file replaces the old one; when the AOF rewrite child is done, the new AOF file replaces the old AOF file.
// 根据 AOF 政策,
// 考虑是否需要将 AOF 缓冲区中的内容写入到 AOF 文件中
/* AOF postponed flush: Try at every cron cycle if the slow fsync
* completed. */
if (server.aof_flush_postponed_start) flushAppendOnlyFile(0);
/* AOF write errors: in this case we have a buffer to flush as well and
* clear the AOF error in case of success to make the DB writable again,
* however to try every second is enough in case of 'hz' is set to
* an higher frequency. */
run_with_period(1000) {
if (server.aof_last_write_status == REDIS_ERR)
flushAppendOnlyFile(0);
}
AOF persistence works by appending commands, so serverCron also checks here whether the AOF buffer needs to be flushed to the AOF file.
/* Close clients that need to be closed asynchronous */
// 关闭那些需要异步关闭的客户端
freeClientsInAsyncFreeQueue();
struct redisServer {
    int cronloops;
};
Each time serverCron runs, cronloops is incremented by one:
// increment the loop counter
server.cronloops++;
At the moment its only use is to let code run "once every N serverCron invocations".
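Putting cronloops together with the run_with_period macro shown earlier: with server.hz = 10 the cron period is 100 ms, so run_with_period(1000) reduces to checking that cronloops is a multiple of 10, which means the guarded block runs once per second. A small simulation of that gating:
#include <stdio.h>

int main(void) {
    int hz = 10;           /* serverCron runs every 1000/hz = 100 ms */
    int period_ms = 1000;  /* the argument given to run_with_period() */
    for (int cronloops = 0; cronloops < 25; cronloops++) {
        /* same condition as the macro: _ms_ <= 1000/hz, or cronloops is a multiple of _ms_/(1000/hz) */
        if (period_ms <= 1000 / hz || cronloops % (period_ms / (1000 / hz)) == 0)
            printf("cronloops=%d: run the 1000 ms task\n", cronloops);  /* fires at 0, 10, 20 */
    }
    return 0;
}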
A Redis server goes through a series of initialization and setup steps between being started and being able to accept client command requests.
void initServerConfig() {
int j;
// 服务器状态
// 设置服务器的运行 ID
getRandomHexChars(server.runid,REDIS_RUN_ID_SIZE);
// 设置默认配置文件路径
server.configfile = NULL;
// 设置默认服务器频率
server.hz = REDIS_DEFAULT_HZ;
// 为运行 ID 加上结尾字符
server.runid[REDIS_RUN_ID_SIZE] = '\0';
// 设置服务器的运行架构
server.arch_bits = (sizeof(long) == 8) ? 64 : 32;
// 设置默认服务器端口号
server.port = REDIS_SERVERPORT;
server.tcp_backlog = REDIS_TCP_BACKLOG;
server.bindaddr_count = 0;
server.unixsocket = NULL;
server.unixsocketperm = REDIS_DEFAULT_UNIX_SOCKET_PERM;
server.ipfd_count = 0;
server.sofd = -1;
server.dbnum = REDIS_DEFAULT_DBNUM;
server.verbosity = REDIS_DEFAULT_VERBOSITY;
server.maxidletime = REDIS_MAXIDLETIME;
server.tcpkeepalive = REDIS_DEFAULT_TCP_KEEPALIVE;
server.active_expire_enabled = 1;
server.client_max_querybuf_len = REDIS_MAX_QUERYBUF_LEN;
server.saveparams = NULL;
server.loading = 0;
server.logfile = zstrdup(REDIS_DEFAULT_LOGFILE);
server.syslog_enabled = REDIS_DEFAULT_SYSLOG_ENABLED;
server.syslog_ident = zstrdup(REDIS_DEFAULT_SYSLOG_IDENT);
server.syslog_facility = LOG_LOCAL0;
server.daemonize = REDIS_DEFAULT_DAEMONIZE;
server.aof_state = REDIS_AOF_OFF;
server.aof_fsync = REDIS_DEFAULT_AOF_FSYNC;
server.aof_no_fsync_on_rewrite = REDIS_DEFAULT_AOF_NO_FSYNC_ON_REWRITE;
server.aof_rewrite_perc = REDIS_AOF_REWRITE_PERC;
server.aof_rewrite_min_size = REDIS_AOF_REWRITE_MIN_SIZE;
server.aof_rewrite_base_size = 0;
server.aof_rewrite_scheduled = 0;
server.aof_last_fsync = time(NULL);
server.aof_rewrite_time_last = -1;
server.aof_rewrite_time_start = -1;
server.aof_lastbgrewrite_status = REDIS_OK;
server.aof_delayed_fsync = 0;
server.aof_fd = -1;
server.aof_selected_db = -1; /* Make sure the first time will not match */
server.aof_flush_postponed_start = 0;
server.aof_rewrite_incremental_fsync = REDIS_DEFAULT_AOF_REWRITE_INCREMENTAL_FSYNC;
server.pidfile = zstrdup(REDIS_DEFAULT_PID_FILE);
server.rdb_filename = zstrdup(REDIS_DEFAULT_RDB_FILENAME);
server.aof_filename = zstrdup(REDIS_DEFAULT_AOF_FILENAME);
server.requirepass = NULL;
server.rdb_compression = REDIS_DEFAULT_RDB_COMPRESSION;
server.rdb_checksum = REDIS_DEFAULT_RDB_CHECKSUM;
server.stop_writes_on_bgsave_err = REDIS_DEFAULT_STOP_WRITES_ON_BGSAVE_ERROR;
server.activerehashing = REDIS_DEFAULT_ACTIVE_REHASHING;
server.notify_keyspace_events = 0;
server.maxclients = REDIS_MAX_CLIENTS;
server.bpop_blocked_clients = 0;
server.maxmemory = REDIS_DEFAULT_MAXMEMORY;
server.maxmemory_policy = REDIS_DEFAULT_MAXMEMORY_POLICY;
server.maxmemory_samples = REDIS_DEFAULT_MAXMEMORY_SAMPLES;
server.hash_max_ziplist_entries = REDIS_HASH_MAX_ZIPLIST_ENTRIES;
server.hash_max_ziplist_value = REDIS_HASH_MAX_ZIPLIST_VALUE;
server.list_max_ziplist_entries = REDIS_LIST_MAX_ZIPLIST_ENTRIES;
server.list_max_ziplist_value = REDIS_LIST_MAX_ZIPLIST_VALUE;
server.set_max_intset_entries = REDIS_SET_MAX_INTSET_ENTRIES;
server.zset_max_ziplist_entries = REDIS_ZSET_MAX_ZIPLIST_ENTRIES;
server.zset_max_ziplist_value = REDIS_ZSET_MAX_ZIPLIST_VALUE;
server.hll_sparse_max_bytes = REDIS_DEFAULT_HLL_SPARSE_MAX_BYTES;
server.shutdown_asap = 0;
server.repl_ping_slave_period = REDIS_REPL_PING_SLAVE_PERIOD;
server.repl_timeout = REDIS_REPL_TIMEOUT;
server.repl_min_slaves_to_write = REDIS_DEFAULT_MIN_SLAVES_TO_WRITE;
server.repl_min_slaves_max_lag = REDIS_DEFAULT_MIN_SLAVES_MAX_LAG;
server.cluster_enabled = 0;
server.cluster_node_timeout = REDIS_CLUSTER_DEFAULT_NODE_TIMEOUT;
server.cluster_migration_barrier = REDIS_CLUSTER_DEFAULT_MIGRATION_BARRIER;
server.cluster_configfile = zstrdup(REDIS_DEFAULT_CLUSTER_CONFIG_FILE);
server.lua_caller = NULL;
server.lua_time_limit = REDIS_LUA_TIME_LIMIT;
server.lua_client = NULL;
server.lua_timedout = 0;
server.migrate_cached_sockets = dictCreate(&migrateCacheDictType,NULL);
server.loading_process_events_interval_bytes = (1024*1024*2);
// 初始化 LRU 时间
server.lruclock = getLRUClock();
// 初始化并设置保存条件
resetServerSaveParams();
appendServerSaveParams(60*60,1); /* save after 1 hour and 1 change */
appendServerSaveParams(300,100); /* save after 5 minutes and 100 changes */
appendServerSaveParams(60,10000); /* save after 1 minute and 10000 changes */
/* Replication related */
// 初始化和复制相关的状态
server.masterauth = NULL;
server.masterhost = NULL;
server.masterport = 6379;
server.master = NULL;
server.cached_master = NULL;
server.repl_master_initial_offset = -1;
server.repl_state = REDIS_REPL_NONE;
server.repl_syncio_timeout = REDIS_REPL_SYNCIO_TIMEOUT;
server.repl_serve_stale_data = REDIS_DEFAULT_SLAVE_SERVE_STALE_DATA;
server.repl_slave_ro = REDIS_DEFAULT_SLAVE_READ_ONLY;
server.repl_down_since = 0; /* Never connected, repl is down since EVER. */
server.repl_disable_tcp_nodelay = REDIS_DEFAULT_REPL_DISABLE_TCP_NODELAY;
server.slave_priority = REDIS_DEFAULT_SLAVE_PRIORITY;
server.master_repl_offset = 0;
/* Replication partial resync backlog */
// 初始化 PSYNC 命令所使用的 backlog
server.repl_backlog = NULL;
server.repl_backlog_size = REDIS_DEFAULT_REPL_BACKLOG_SIZE;
server.repl_backlog_histlen = 0;
server.repl_backlog_idx = 0;
server.repl_backlog_off = 0;
server.repl_backlog_time_limit = REDIS_DEFAULT_REPL_BACKLOG_TIME_LIMIT;
server.repl_no_slaves_since = time(NULL);
/* Client output buffer limits */
// 设置客户端的输出缓冲区限制
for (j = 0; j < REDIS_CLIENT_LIMIT_NUM_CLASSES; j++)
server.client_obuf_limits[j] = clientBufferLimitsDefaults[j];
/* Double constants initialization */
// 初始化浮点常量
R_Zero = 0.0;
R_PosInf = 1.0/R_Zero;
R_NegInf = -1.0/R_Zero;
R_Nan = R_Zero/R_Zero;
/* Command table -- we initiialize it here as it is part of the
* initial configuration, since command names may be changed via
* redis.conf using the rename-command directive. */
// 初始化命令表
// 在这里初始化是因为接下来读取 .conf 文件时可能会用到这些命令
server.commands = dictCreate(&commandTableDictType,NULL);
server.orig_commands = dictCreate(&commandTableDictType,NULL);
populateCommandTable();
server.delCommand = lookupCommandByCString("del");
server.multiCommand = lookupCommandByCString("multi");
server.lpushCommand = lookupCommandByCString("lpush");
server.lpopCommand = lookupCommandByCString("lpop");
server.rpopCommand = lookupCommandByCString("rpop");
/* Slow log */
// 初始化慢查询日志
server.slowlog_log_slower_than = REDIS_SLOWLOG_LOG_SLOWER_THAN;
server.slowlog_max_len = REDIS_SLOWLOG_MAX_LEN;
/* Debugging */
// 初始化调试项
server.assert_failed = "";
server.assert_file = "";
server.assert_line = 0;
server.bug_report_start = 0;
server.watchdog_period = 0;
}
This mainly initializes the server's run ID, the default frequency, file paths, the architecture word size, the port, the RDB and AOF save conditions, the LRU clock and the command table.
Once that is done, the next stage parses the command line and loads the configuration:
// 检查用户是否指定了配置文件,或者配置选项
if (argc >= 2) {
int j = 1; /* First option to parse in argv[] */
sds options = sdsempty();
char *configfile = NULL;
/* Handle special options --help and --version */
// 处理特殊选项 -h 、-v 和 --test-memory
if (strcmp(argv[1], "-v") == 0 ||
strcmp(argv[1], "--version") == 0) version();
if (strcmp(argv[1], "--help") == 0 ||
strcmp(argv[1], "-h") == 0) usage();
if (strcmp(argv[1], "--test-memory") == 0) {
if (argc == 3) {
memtest(atoi(argv[2]),50);
exit(0);
} else {
fprintf(stderr,"Please specify the amount of memory to test in megabytes.\n");
fprintf(stderr,"Example: ./redis-server --test-memory 4096\n\n");
exit(1);
}
}
/* First argument is the config file name? */
// 如果第一个参数(argv[1])不是以 "--" 开头
// 那么它应该是一个配置文件
if (argv[j][0] != '-' || argv[j][1] != '-')
configfile = argv[j++];
/* All the other options are parsed and conceptually appended to the
* configuration file. For instance --port 6380 will generate the
* string "port 6380\n" to be parsed after the actual file name
* is parsed, if any. */
// 对用户给定的其余选项进行分析,并将分析所得的字符串追加稍后载入的配置文件的内容之后
// 比如 --port 6380 会被分析为 "port 6380\n"
while(j != argc) {
if (argv[j][0] == '-' && argv[j][1] == '-') {
/* Option name */
if (sdslen(options)) options = sdscat(options,"\n");
options = sdscat(options,argv[j]+2);
options = sdscat(options," ");
} else {
/* Option argument */
options = sdscatrepr(options,argv[j],strlen(argv[j]));
options = sdscat(options," ");
}
j++;
}
if (configfile) server.configfile = getAbsolutePath(configfile);
// 重置保存条件
resetServerSaveParams();
// 载入配置文件, options 是前面分析出的给定选项
loadServerConfig(configfile,options);
sdsfree(options);
// 获取配置文件的绝对路径
if (configfile) server.configfile = getAbsolutePath(configfile);
} else {
redisLog(REDIS_WARNING, "Warning: no config file specified, using the default config. In order to specify a config file use %s /path/to/%s.conf", argv[0], server.sentinel_mode ? "sentinel" : "redis");
}
The redis.conf file contains a long list of configuration options, which are loaded one by one.
If extra options are supplied on the command line at startup, they take precedence over the file: for example, redis.conf defaults the port to 6379, but starting the server with --port 10086 makes it listen on 10086.
The hard-coded defaults, in turn, have lower priority than both the configuration file and the command line, i.e. defaults < config file < command line, which is the usual rule for this kind of design.
// 创建并初始化服务器数据结构
initServer();
void initServer() {
int j;
// 设置信号处理函数
signal(SIGHUP, SIG_IGN);
signal(SIGPIPE, SIG_IGN);
setupSignalHandlers();
// 设置 syslog
if (server.syslog_enabled) {
openlog(server.syslog_ident, LOG_PID | LOG_NDELAY | LOG_NOWAIT,
server.syslog_facility);
}
// 初始化并创建数据结构
server.current_client = NULL;
server.clients = listCreate();
server.clients_to_close = listCreate();
server.slaves = listCreate();
server.monitors = listCreate();
server.slaveseldb = -1; /* Force to emit the first SELECT command. */
server.unblocked_clients = listCreate();
server.ready_keys = listCreate();
server.clients_waiting_acks = listCreate();
server.get_ack_from_slaves = 0;
server.clients_paused = 0;
// 创建共享对象
createSharedObjects();
adjustOpenFilesLimit();
server.el = aeCreateEventLoop(server.maxclients+REDIS_EVENTLOOP_FDSET_INCR);
server.db = zmalloc(sizeof(redisDb)*server.dbnum);
/* Open the TCP listening socket for the user commands. */
// 打开 TCP 监听端口,用于等待客户端的命令请求
if (server.port != 0 &&
listenToPort(server.port,server.ipfd,&server.ipfd_count) == REDIS_ERR)
exit(1);
/* Open the listening Unix domain socket. */
// 打开 UNIX 本地端口
if (server.unixsocket != NULL) {
unlink(server.unixsocket); /* don't care if this fails */
server.sofd = anetUnixServer(server.neterr,server.unixsocket,
server.unixsocketperm, server.tcp_backlog);
if (server.sofd == ANET_ERR) {
redisLog(REDIS_WARNING, "Opening socket: %s", server.neterr);
exit(1);
}
anetNonBlock(NULL,server.sofd);
}
/* Abort if there are no listening sockets at all. */
if (server.ipfd_count == 0 && server.sofd < 0) {
redisLog(REDIS_WARNING, "Configured to not listen anywhere, exiting.");
exit(1);
}
/* Create the Redis databases, and initialize other internal state. */
// 创建并初始化数据库结构
for (j = 0; j < server.dbnum; j++) {
server.db[j].dict = dictCreate(&dbDictType,NULL);
server.db[j].expires = dictCreate(&keyptrDictType,NULL);
server.db[j].blocking_keys = dictCreate(&keylistDictType,NULL);
server.db[j].ready_keys = dictCreate(&setDictType,NULL);
server.db[j].watched_keys = dictCreate(&keylistDictType,NULL);
server.db[j].eviction_pool = evictionPoolAlloc();
server.db[j].id = j;
server.db[j].avg_ttl = 0;
}
// 创建 PUBSUB 相关结构
server.pubsub_channels = dictCreate(&keylistDictType,NULL);
server.pubsub_patterns = listCreate();
listSetFreeMethod(server.pubsub_patterns,freePubsubPattern);
listSetMatchMethod(server.pubsub_patterns,listMatchPubsubPattern);
server.cronloops = 0;
server.rdb_child_pid = -1;
server.aof_child_pid = -1;
aofRewriteBufferReset();
server.aof_buf = sdsempty();
server.lastsave = time(NULL); /* At startup we consider the DB saved. */
server.lastbgsave_try = 0; /* At startup we never tried to BGSAVE. */
server.rdb_save_time_last = -1;
server.rdb_save_time_start = -1;
server.dirty = 0;
resetServerStats();
/* A few stats we don't want to reset: server startup time, and peak mem. */
server.stat_starttime = time(NULL);
server.stat_peak_memory = 0;
server.resident_set_size = 0;
server.lastbgsave_status = REDIS_OK;
server.aof_last_write_status = REDIS_OK;
server.aof_last_write_errno = 0;
server.repl_good_slaves_count = 0;
updateCachedTime();
/* Create the serverCron() time event, that's our main way to process
* background operations. */
// 为 serverCron() 创建时间事件
if(aeCreateTimeEvent(server.el, 1, serverCron, NULL, NULL) == AE_ERR) {
redisPanic("Can't create the serverCron time event.");
exit(1);
}
/* Create an event handler for accepting new connections in TCP and Unix
* domain sockets. */
// 为 TCP 连接关联连接应答(accept)处理器
// 用于接受并应答客户端的 connect() 调用
for (j = 0; j < server.ipfd_count; j++) {
if (aeCreateFileEvent(server.el, server.ipfd[j], AE_READABLE,
acceptTcpHandler,NULL) == AE_ERR)
{
redisPanic(
"Unrecoverable error creating server.ipfd file event.");
}
}
// 为本地套接字关联应答处理器
if (server.sofd > 0 && aeCreateFileEvent(server.el,server.sofd,AE_READABLE,
acceptUnixHandler,NULL) == AE_ERR) redisPanic("Unrecoverable error creating server.sofd file event.");
/* Open the AOF file if needed. */
// 如果 AOF 持久化功能已经打开,那么打开或创建一个 AOF 文件
if (server.aof_state == REDIS_AOF_ON) {
server.aof_fd = open(server.aof_filename,
O_WRONLY|O_APPEND|O_CREAT,0644);
if (server.aof_fd == -1) {
redisLog(REDIS_WARNING, "Can't open the append-only file: %s",
strerror(errno));
exit(1);
}
}
/* 32 bit instances are limited to 4GB of address space, so if there is
* no explicit limit in the user provided configuration we set a limit
* at 3 GB using maxmemory with 'noeviction' policy'. This avoids
* useless crashes of the Redis instance for out of memory. */
// 对于 32 位实例来说,默认将最大可用内存限制在 3 GB
if (server.arch_bits == 32 && server.maxmemory == 0) {
redisLog(REDIS_WARNING,"Warning: 32 bit instance detected but no memory limit set. Setting 3 GB maxmemory limit with 'noeviction' policy now.");
server.maxmemory = 3072LL*(1024*1024); /* 3 GB */
server.maxmemory_policy = REDIS_MAXMEMORY_NO_EVICTION;
}
// 如果服务器以 cluster 模式打开,那么初始化 cluster
if (server.cluster_enabled) clusterInit();
// 初始化复制功能有关的脚本缓存
replicationScriptCacheInit();
// 初始化脚本系统
scriptingInit();
// 初始化慢查询功能
slowlogInit();
// 初始化 BIO 系统
bioInit();
}
/* Function called at startup to load RDB or AOF file in memory. */
void loadDataFromDisk(void) {
// 记录开始时间
long long start = ustime();
// AOF 持久化已打开?
if (server.aof_state == REDIS_AOF_ON) {
// 尝试载入 AOF 文件
if (loadAppendOnlyFile(server.aof_filename) == REDIS_OK)
// 打印载入信息,并计算载入耗时长度
redisLog(REDIS_NOTICE,"DB loaded from append only file: %.3f seconds",(float)(ustime()-start)/1000000);
// AOF 持久化未打开
} else {
// 尝试载入 RDB 文件
if (rdbLoad(server.rdb_filename) == REDIS_OK) {
// 打印载入信息,并计算载入耗时长度
redisLog(REDIS_NOTICE,"DB loaded from disk: %.3f seconds",
(float)(ustime()-start)/1000000);
} else if (errno != ENOENT) {
redisLog(REDIS_WARNING,"Fatal error loading the DB: %s. Exiting.",strerror(errno));
exit(1);
}
}
}
Depending on which persistence scheme is configured, the persisted data is loaded back into memory.
// 运行事件处理器,一直到服务器关闭为止
aeSetBeforeSleepProc(server.el,beforeSleep);
aeMain(server.el);
/*
* 事件处理器的主循环
*/
void aeMain(aeEventLoop *eventLoop) {
eventLoop->stop = 0;
while (!eventLoop->stop) {
// 如果有需要在事件处理前执行的函数,那么运行它
if (eventLoop->beforesleep != NULL)
eventLoop->beforesleep(eventLoop);
// 开始处理事件
aeProcessEvents(eventLoop, AE_ALL_EVENTS);
}
}
Initialization ends here, and everything else begins here.
This chapter followed the execution of a single command, starting from SET KEY VALUE.
It also covered serverCron and the initialization of the whole server.
Initialization consists of: 1) initializing the server state, 2) loading the configuration, 3) initializing the server data structures, 4) restoring the database state, and 5) running the event loop.
With this, the second part of the book, the implementation of the standalone database, is complete, so let me add some personal impressions.
If I had to sum up how it feels at this stage, I would put it this way:
nothing can be accomplished without norms and standards.
You feel this rule from the very first command: when a command travels from the client process to the server process, it is not parsed and executed directly, but passes through layer after layer of buffering. Between receiving the command and sending the reply, the server goes through several loop iterations, countless intermediate buffers and pointers, and all kinds of handlers and command processors. Is all of that necessary?
The answer is yes, absolutely; it is the result of weighing many concerns against each other when the architecture was designed.
I ran into a similar situation in my own work. Even though the immediate requirements were clear, more extensions were certain to come, so I ended up with a design much like Redis': an intermediate layer to accommodate possible future extensions, and it solved the problems that came later quite elegantly. At the time, though, I did not have that much confidence, because I had never studied mature open-source code this deeply, and at one point I wanted to change the approach; fortunately I stuck with it. I wrote mine in C++ using std::function, but the essence is the same: Redis uses function pointers, I used function templates, and virtual functions bound at runtime would also work. Dig far enough and all three come down to function pointers; only the wrapping and the intent differ.
Once you understand Redis' overall design, the code has nothing left that drives you crazy. Perhaps it is also because I read the code alongside the book, and the code carries thorough, precise annotations; that really lets you spend your attention on the overall architecture instead of getting bogged down in details.