A Brief Look at the Redis Source Code (9): RDB Persistence

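Environment: Redis source version 5.0.3. I added annotations while reading the source; the annotated repository is at https://gitee.com/xiaoangg/redis_annotation
Reference book: Redis Design and Implementation (《Redis设计与实现》)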

Contents

I. Creating and loading RDB files

II. Automatic interval-based saving

III. RDB file structure


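What RDB is, what problem it solves, and how it is implemented

Redis is an in-memory database, so if the server process exits unexpectedly, the data in the database is lost.

RDB is the persistence feature that addresses this problem.

RDB saves the state of the database at a point in time to an RDB file. An RDB file is a compact binary file (compressed when the rdbcompression option is on) from which the database state can be restored.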

 

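I. Creating and loading RDB files

1. Creating

Two commands generate an RDB file: SAVE and BGSAVE.

SAVE blocks the Redis server process until the RDB file has been written. On an instance with a large memory footprint this can block for a long time, so it is not recommended in production.

BGSAVE forks a child process that writes the RDB file while the parent process keeps serving command requests.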
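The difference between the two commands comes down to which process does the writing. Below is a minimal sketch of the fork-based approach, not the actual Redis code: `write_snapshot`, `snapshot_in_background` and `wait_for_snapshot` are hypothetical names, and the "database" is just a string passed in by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: write the snapshot data to path.
 * Returns 0 on success, -1 on any I/O error. */
int write_snapshot(const char *path, const char *data) {
    assert(path != NULL && data != NULL);
    FILE *fp = fopen(path, "w");
    if (fp == NULL) return -1;
    int ok = (fputs(data, fp) != EOF);
    if (fclose(fp) != 0) ok = 0;
    return ok ? 0 : -1;
}

/* BGSAVE-style save: the child writes the file and exits, the parent
 * returns immediately. Returns the child's PID, or -1 if fork failed. */
pid_t snapshot_in_background(const char *path, const char *data) {
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: the exit status reports whether the write succeeded. */
        _exit(write_snapshot(path, data) == 0 ? 0 : 1);
    }
    return pid;
}

/* Collect the child's result. Returns 0 if the child exited with status 0. */
int wait_for_snapshot(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) == -1) return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A SAVE-style save would simply call `write_snapshot` in the calling process and block until it returns. The blocking `waitpid` here only keeps the sketch short; a server that wants to keep serving requests would poll for the child's exit instead.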

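2. Loading

Redis has no dedicated command for loading an RDB file. If the server finds an RDB file at startup, it loads it automatically.

Tip: an AOF file is usually updated more frequently than an RDB file, so if AOF persistence is enabled, the server uses the AOF file to restore the data instead.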


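II. Automatic interval-based saving

You can set the save option in the server configuration to have the server run BGSAVE automatically at intervals.

For example, with the following lines in the server configuration:
save 900 1
save 300 10
save 60 10000
BGSAVE is executed as soon as any one of these three conditions is met:
at least 1 change to the database within 900 seconds;
at least 10 changes to the database within 300 seconds;
at least 10000 changes to the database within 60 seconds.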
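Read literally against the serverCron check shown later in this post, a save point fires when at least `changes` modifications have accumulated and more than `seconds` seconds have passed since the last successful save. The sketch below isolates that rule so it can be tried out by hand. `first_satisfied_savepoint` and `example_savepoints` are hypothetical names, and the BGSAVE retry-delay part of the real check is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Same shape as the saveparam struct in the Redis source. */
struct saveparam {
    time_t seconds;
    int changes;
};

/* The three save points from the example configuration. */
const struct saveparam example_savepoints[] = {
    {900, 1}, {300, 10}, {60, 10000}
};

/* Hypothetical helper: return the index of the first satisfied save
 * point, or -1 if none is satisfied.
 * dirty:   number of changes since the last successful save
 * elapsed: seconds since the last successful save */
int first_satisfied_savepoint(const struct saveparam *sp, size_t n,
                              long long dirty, time_t elapsed) {
    assert(sp != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* Both parts must hold: enough changes AND enough time. */
        if (dirty >= sp[i].changes && elapsed > sp[i].seconds)
            return (int)i;
    }
    return -1;
}
```

For instance, 5 changes after 301 seconds match none of the three save points: the 300-second rule needs 10 changes, and the 900-second rule has not waited long enough yet.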

The RDB-related fields in redisServer are:

struct redisServer {
    //.......
    /* RDB persistence */
    // Fields related to RDB persistence
    long long dirty;                /* Changes to DB from the last save */ // counter: number of modifications since the last SAVE/BGSAVE
    long long dirty_before_bgsave;  /* Used to restore dirty on failed BGSAVE */
    pid_t rdb_child_pid;            /* PID of RDB saving child */ // PID of the RDB persistence child process
    struct saveparam *saveparams;   /* Save points array for RDB */ // array of conditions that trigger an RDB save
    int saveparamslen;              /* Number of saving points */
    char *rdb_filename;             /* Name of RDB file */
    int rdb_compression;            /* Use compression in RDB? */
    int rdb_checksum;               /* Use RDB checksum? */
    time_t lastsave;                /* Unix time of last successful save */ // Unix timestamp of the last successful SAVE/BGSAVE
    time_t lastbgsave_try;          /* Unix time of last attempted bgsave */
    time_t rdb_save_time_last;      /* Time used by last RDB save run. */
    time_t rdb_save_time_start;     /* Current RDB save start time. */
    int rdb_bgsave_scheduled;       /* BGSAVE when possible if true. */
    int rdb_child_type;             /* Type of save by active child. */
    int lastbgsave_status;          /* C_OK or C_ERR */
    int stop_writes_on_bgsave_err;  /* Don't allow writes if can't BGSAVE */
    int rdb_pipe_write_result_to_parent; /* RDB pipes used to return the state */
    int rdb_pipe_read_result_from_child; /* of each slave in diskless SYNC. */

    //......
};

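/*
A condition that triggers RDB persistence.
For example:
seconds: 900
changes: 1
means the data was modified at least once within 900 seconds.
*/
struct saveparam {
    time_t seconds;   // seconds
    int changes;      // number of changes
};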

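1. Setting the save conditions

The save option can be set in the configuration file or passed as a startup argument. If it is not set, the server uses the default conditions:
save 900 1
save 300 10
save 60 10000

The configured conditions are stored in the saveparams field of the redisServer struct (saveparams is an array, and saveparamslen holds its length).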
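To make the mapping from a configuration line to a saveparam entry concrete, here is a small sketch. `parse_save_line` is a hypothetical function, not the parser Redis uses, and the validity rule (seconds at least 1, changes not negative) is an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Same shape as the saveparam struct in the Redis source. */
struct saveparam {
    time_t seconds;
    int changes;
};

/* Hypothetical parser for one "save <seconds> <changes>" line.
 * Returns 0 and fills *out on success, -1 on malformed or invalid input. */
int parse_save_line(const char *line, struct saveparam *out) {
    long seconds;
    int changes;

    assert(line != NULL && out != NULL);
    /* Both numbers must be present after the "save" keyword. */
    if (sscanf(line, "save %ld %d", &seconds, &changes) != 2) return -1;
    /* Assumed validity rule: a positive interval, a non-negative count. */
    if (seconds < 1 || changes < 0) return -1;

    out->seconds = (time_t)seconds;
    out->changes = changes;
    return 0;
}
```

Each successfully parsed line would then be appended to an array like saveparams, with a length counter like saveparamslen kept alongside it.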

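2. The dirty counter and the lastsave field

The dirty counter records how many modifications (inserts, deletes, updates) the database has seen since the last successful SAVE/BGSAVE.

lastsave is a Unix timestamp recording when the last SAVE or BGSAVE succeeded.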
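The bookkeeping around these two fields can be sketched as follows. This is a simplified model under my own assumptions, not code from Redis: `struct save_state` and the four functions are hypothetical, and the idea that a successful background save only accounts for the changes made before the fork is my reading of the dirty_before_bgsave field.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical, stripped-down stand-in for the fields discussed above. */
struct save_state {
    long long dirty;               /* changes since the last save */
    long long dirty_before_bgsave; /* value of dirty when BGSAVE started */
    time_t lastsave;               /* time of the last successful save */
};

/* Called once for every modification of the database. */
void note_change(struct save_state *s) {
    assert(s != NULL);
    s->dirty++;
}

/* A blocking save covers every change seen so far. */
void on_save_done(struct save_state *s, time_t now) {
    assert(s != NULL);
    s->dirty = 0;
    s->lastsave = now;
}

/* Remember how many changes the background snapshot will contain. */
void on_bgsave_start(struct save_state *s) {
    assert(s != NULL);
    s->dirty_before_bgsave = s->dirty;
}

/* Changes made while the child was writing are not in the file,
 * so they stay counted in dirty. */
void on_bgsave_success(struct save_state *s, time_t now) {
    assert(s != NULL);
    s->dirty -= s->dirty_before_bgsave;
    s->lastsave = now;
}
```

With three changes before the background save starts and one more while it runs, dirty is 1 after the save succeeds rather than 0, so the late change still counts toward the next save point.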

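3. Checking whether a save condition is met

The check is performed in the serverCron function in server.c.

By default serverCron runs once every 100 ms.
The code that checks the save conditions is: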
 

 /* If there is not a background saving/rewrite in progress check if
         * we have to save/rewrite now. */
        for (j = 0; j < server.saveparamslen; j++) {
            struct saveparam *sp = server.saveparams+j;

            /* Save if we reached the given amount of changes,
             * the given amount of seconds, and if the latest bgsave was
             * successful or if, in case of an error, at least
             * CONFIG_BGSAVE_RETRY_DELAY seconds already elapsed. */
            if (server.dirty >= sp->changes &&
                server.unixtime-server.lastsave > sp->seconds &&
                (server.unixtime-server.lastbgsave_try >
                 CONFIG_BGSAVE_RETRY_DELAY ||
                 server.lastbgsave_status == C_OK))
            {
                serverLog(LL_NOTICE,"%d changes in %d seconds. Saving...",
                    sp->changes, (int)sp->seconds);
                rdbSaveInfo rsi, *rsiptr;
                rsiptr = rdbPopulateSaveInfo(&rsi);
                rdbSaveBackground(server.rdb_filename,rsiptr);
                break;
            }
        }


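III. RDB file structure

The figure below shows the file structure for RDB version 9 (corrections are welcome).

[Image 1: RDB file structure]

An RDB file begins with the five characters REDIS, which makes it quick to check whether the file being loaded is an RDB file.

Next comes RDB_VERSION, the RDB file version (9 in Redis 5.0.3). It is stored as 4 bytes holding the version number as ASCII digits, so the value is 0009.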
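The header layout is easy to reproduce in isolation. In the sketch below, `rdb_make_magic` builds the 9-byte header the same way the snprintf call in rdbSaveRio does, and `rdb_parse_magic` checks it. Both function names and `RDB_MAGIC_LEN` are hypothetical, and the parser is only an illustration, not the loader Redis uses.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define RDB_MAGIC_LEN 9  /* "REDIS" plus 4 version digits */

/* Build the header into buf, which needs room for the trailing NUL.
 * Returns 0 on success, -1 if the buffer or version is unsuitable. */
int rdb_make_magic(char *buf, size_t buflen, int version) {
    assert(buf != NULL);
    if (buflen < RDB_MAGIC_LEN + 1 || version < 0 || version > 9999)
        return -1;
    snprintf(buf, buflen, "REDIS%04d", version);
    return 0;
}

/* Return the version stored in the header, or -1 if buf does not
 * start with a well-formed RDB header. */
int rdb_parse_magic(const char *buf, size_t len) {
    int version = 0;

    assert(buf != NULL);
    if (len < RDB_MAGIC_LEN || memcmp(buf, "REDIS", 5) != 0) return -1;
    for (int i = 5; i < RDB_MAGIC_LEN; i++) {
        if (!isdigit((unsigned char)buf[i])) return -1;
        version = version * 10 + (buf[i] - '0');
    }
    return version;
}
```

Note that only 9 bytes go into the file: the NUL terminator that snprintf adds is never written, which is why rdbSaveRio passes 9 as the length.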

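The data section holds every non-empty database. For each one it records the database ID, the database size (number of keys), the number of keys that have an expiration time, and then the key-value pairs themselves.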
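As a rough illustration of what the per-database header looks like on disk, the sketch below writes the SELECTDB opcode, the database ID, the RESIZEDB opcode and the two sizes into a byte buffer. The variable-length integer encoding in `encode_len` reflects my understanding of how rdbSaveLen works and should be checked against rdb.c; `encode_len`, `write_db_header` and the `OPCODE_*` macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPCODE_RESIZEDB 251
#define OPCODE_SELECTDB 254

/* Encode len into out and return the number of bytes used (1, 2, 5 or 9).
 * The top two bits of the first byte select the format. */
size_t encode_len(uint64_t len, unsigned char *out) {
    assert(out != NULL);
    if (len < (1u << 6)) {                       /* 00xxxxxx: 6-bit value */
        out[0] = (unsigned char)len;
        return 1;
    }
    if (len < (1u << 14)) {                      /* 01xxxxxx: 14-bit value */
        out[0] = (unsigned char)((len >> 8) | 0x40);
        out[1] = (unsigned char)(len & 0xFF);
        return 2;
    }
    /* Marker byte, then the value as 4 or 8 big-endian bytes. */
    int nbytes = (len <= UINT32_MAX) ? 4 : 8;
    out[0] = (nbytes == 4) ? 0x80 : 0x81;
    for (int i = 0; i < nbytes; i++)
        out[1 + i] = (unsigned char)(len >> (8 * (nbytes - 1 - i)));
    return (size_t)nbytes + 1;
}

/* Write the header that precedes one database's key-value pairs.
 * out must have room for at least 29 bytes. Returns the bytes written. */
size_t write_db_header(unsigned char *out, uint64_t dbid,
                       uint64_t db_size, uint64_t expires_size) {
    size_t n = 0;

    assert(out != NULL);
    out[n++] = OPCODE_SELECTDB;
    n += encode_len(dbid, out + n);
    out[n++] = OPCODE_RESIZEDB;
    n += encode_len(db_size, out + n);
    n += encode_len(expires_size, out + n);
    return n;
}
```

Under these assumptions, database 0 with 3 keys, 1 of which has an expiration time, produces the five bytes FE 00 FB 03 01.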
 

The RDB save logic is implemented in rdbSaveRio in rdb.c; read the source for the full details.


// Produce a dump of the database in RDB format and send it to the given I/O stream
/* Produces a dump of the database in RDB format sending it to the specified
 * Redis I/O channel. On success C_OK is returned, otherwise C_ERR
 * is returned and part of the output, or all the output, can be
 * missing because of I/O errors.
 *
 * When the function returns C_ERR and if 'error' is not NULL, the
 * integer pointed by 'error' is set to the value of errno just after the I/O
 * error. */
int rdbSaveRio(rio *rdb, int *error, int flags, rdbSaveInfo *rsi) {
    dictIterator *di = NULL;
    dictEntry *de;
    char magic[10];
    int j;
    uint64_t cksum;
    size_t processed = 0;

    if (server.rdb_checksum)
        rdb->update_cksum = rioGenericUpdateChecksum;

    // The file starts with "REDIS" followed by the 4-digit RDB version (currently 9), so magic is REDIS0009
    snprintf(magic,sizeof(magic),"REDIS%04d",RDB_VERSION);
    if (rdbWriteRaw(rdb,magic,9) == -1) goto werr;
    
    // Write the auxiliary fields (metadata such as redis-ver, redis-bits, ctime, used-mem)
    if (rdbSaveInfoAuxFields(rdb,flags,rsi) == -1) goto werr;

    for (j = 0; j < server.dbnum; j++) {
        redisDb *db = server.db+j;
        dict *d = db->dict;
        if (dictSize(d) == 0) continue;
        di = dictGetSafeIterator(d);

        // Write the opcode (RDB_OPCODE_SELECTDB = 254), followed by the number of the selected database
        /* Write the SELECT DB opcode */
        if (rdbSaveType(rdb,RDB_OPCODE_SELECTDB) == -1) goto werr;
        if (rdbSaveLen(rdb,j) == -1) goto werr;

        /* Write the RESIZE DB opcode. We trim the size to UINT32_MAX, which
         * is currently the largest type we are able to represent in RDB sizes.
         * However this does not limit the actual size of the DB to load since
         * these sizes are just hints to resize the hash tables. */
        uint64_t db_size, expires_size;
        db_size = dictSize(db->dict);
        expires_size = dictSize(db->expires);
        if (rdbSaveType(rdb,RDB_OPCODE_RESIZEDB) == -1) goto werr;
        if (rdbSaveLen(rdb,db_size) == -1) goto werr;
        if (rdbSaveLen(rdb,expires_size) == -1) goto werr;

        // Iterate over every key in the database
        /* Iterate this DB writing every entry */
        while((de = dictNext(di)) != NULL) {
            sds keystr = dictGetKey(de);
            robj key, *o = dictGetVal(de);
            long long expire;

            initStaticStringObject(key,keystr);
            expire = getExpire(db,&key);
            if (rdbSaveKeyValuePair(rdb,&key,o,expire) == -1) goto werr;

            /* When this RDB is produced as part of an AOF rewrite, move
             * accumulated diff from parent to child while rewriting in
             * order to have a smaller final write. */
            if (flags & RDB_SAVE_AOF_PREAMBLE &&
                rdb->processed_bytes > processed+AOF_READ_DIFF_INTERVAL_BYTES)
            {
                processed = rdb->processed_bytes;
                aofReadDiffFromParent();
            }
        }
        dictReleaseIterator(di);
        di = NULL; /* So that we don't release it again on error. */
    }

    /* If we are storing the replication information on disk, persist
     * the script cache as well: on successful PSYNC after a restart, we need
     * to be able to process any EVALSHA inside the replication backlog the
     * master will send us. */
    if (rsi && dictSize(server.lua_scripts)) {
        di = dictGetIterator(server.lua_scripts);
        while((de = dictNext(di)) != NULL) {
            robj *body = dictGetVal(de);
            if (rdbSaveAuxField(rdb,"lua",3,body->ptr,sdslen(body->ptr)) == -1)
                goto werr;
        }
        dictReleaseIterator(di);
        di = NULL; /* So that we don't release it again on error. */
    }

    /* EOF opcode */
    if (rdbSaveType(rdb,RDB_OPCODE_EOF) == -1) goto werr;

    /* CRC64 checksum. It will be zero if checksum computation is disabled, the
     * loading code skips the check in this case. */
    cksum = rdb->cksum;
    memrev64ifbe(&cksum);
    if (rioWrite(rdb,&cksum,8) == 0) goto werr;
    return C_OK;

werr:
    if (error) *error = errno;
    if (di) dictReleaseIterator(di);
    return C_ERR;
}
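The last few lines of rdbSaveRio write the EOF opcode and then the 8-byte checksum. As far as I can tell, memrev64ifbe reverses the bytes only on big-endian hosts, so the checksum always lands on disk in little-endian order. The sketch below shows one portable way to get that layout without testing the host's endianness. `put_u64_le`, `get_u64_le`, `write_trailer` and `OPCODE_EOF` are hypothetical names, and the checksum value is simply passed in rather than computed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPCODE_EOF 255

/* Store v in little-endian byte order regardless of host endianness. */
void put_u64_le(uint64_t v, unsigned char out[8]) {
    assert(out != NULL);
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)(v >> (8 * i));
}

/* Read back a value stored by put_u64_le. */
uint64_t get_u64_le(const unsigned char in[8]) {
    uint64_t v = 0;

    assert(in != NULL);
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)in[i] << (8 * i);
    return v;
}

/* Write the EOF opcode followed by the checksum.
 * out must have room for 9 bytes. Returns the bytes written. */
size_t write_trailer(unsigned char *out, uint64_t cksum) {
    assert(out != NULL);
    out[0] = OPCODE_EOF;
    put_u64_le(cksum, out + 1);
    return 9;
}
```

When checksum computation is disabled the value passed in would be zero, which matches the comment in rdbSaveRio that the loading code skips the check in that case.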

 
