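Redis AOF Persistence: Workflow Analysis and Configuration Parameters Explained

The previous article introduced RDB persistence; this one goes on to analyze the other persistence mechanism, AOF.

AOF in Redis

AOF is disabled by default in Redis. Unlike RDB snapshots, AOF records the database state by saving the write commands the server executes; in other words, the AOF file holds those operations one after another.

How to enable AOF

In redis.conf, set appendonly yes (the default is no).

The summary of AOF in the configuration file's comments is well written, so it is reproduced here and worth reading.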


# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.

appendonly no

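AOF file readability: the RESP protocol

One advantage of AOF over RDB is that the AOF file is readable: it is stored in the data format defined by the RESP protocol.

The client writes set a 77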

127.0.0.1:6379> set a 77
OK

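Content generated in the AOF file

*3   * gives the number of arguments that follow (3 here).
$3   $ gives the length of the next argument; "set" has length 3.
set  the command being executed
$1   the length of "a" is 1
a
$2   the length of "77" is 2
77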

For comparison, here is mset:

127.0.0.1:6379> mset b 11 d 22
OK

*5
$4
mset
$1
b
$2
11
$1
d
$2
22

These two operations show that the AOF file is far more readable than RDB's binary format.
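The encoding shown in the two examples above can be reproduced with a few lines of Python. This is only a minimal sketch of the array-of-bulk-strings layout; the function name encode_resp is made up for illustration and is not part of Redis or any client library.

```python
def encode_resp(*args: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    # "*<n>" announces how many arguments follow.
    parts = [b"*" + str(len(args)).encode() + b"\r\n"]
    for arg in args:
        data = arg.encode("utf-8")
        # "$<len>" announces the byte length of the next argument.
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


if __name__ == "__main__":
    # Print each protocol line on its own row, like the listings above.
    for line in encode_resp("set", "a", "77").split(b"\r\n"):
        if line:
            print(line.decode())
```

Each protocol line ends with \r\n in the file, which is why the listings above show one token per line.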

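How AOF persistence is implemented

The AOF workflow can be roughly divided into three steps: command append, file write, and data sync.

Command append

After the server executes a write command, it appends that command, in protocol format, to the end of the aof_buf buffer in the server state.

File write

The server runs a continuous event loop, and on each pass it calls the flushAppendOnlyFile function to write the contents of the aof_buf buffer to the file.

Data sync

When a program calls write to put data into a file, the operating system usually keeps the data in an in-memory buffer for performance reasons, and only writes it to disk once the buffer is full or a time limit has passed.

This improves efficiency but creates a data safety problem: if the server fails, the data still held in the buffer is lost. For this reason the kernel provides functions such as fsync, which force the system to write the buffered data to disk and so keep the data safe.

For an introduction to operating system buffers, see my other article, "Understanding the design and role of the page cache".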
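The difference between write and fsync can be seen in a small Python sketch. It is an illustration only, not Redis code: append_durably is a hypothetical helper, and the file name in the usage part is arbitrary.

```python
import os
import tempfile


def append_durably(path: str, payload: bytes) -> int:
    """Append payload to path, then ask the kernel to flush it to disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        written = 0
        while written < len(payload):
            # write() normally only copies the data into the OS buffer.
            written += os.write(fd, payload[written:])
        # fsync() blocks until the OS reports the data as written out.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "toy.aof")
        print(append_durably(target, b"*1\r\n$4\r\nping\r\n"), "bytes appended")
```

Without the os.fsync call the function would return sooner, but a power failure shortly afterwards could lose the appended bytes.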

AOF data sync policies

Through the appendfsync option in redis.conf, Redis offers three sync policies.

# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".

# The default is appendfsync everysec
# appendfsync always
appendfsync everysec
# appendfsync no

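appendfsync everysec

This is the default. Redis calls fsync() once per second so that the data written out from the aof_buf buffer reaches the disk. It is a compromise between speed and data safety: even if the server crashes, only about one second of operations is lost.

appendfsync always

This is the slowest of the three policies but also the safest: every call to flushAppendOnlyFile performs an fsync. If the server crashes, only the commands produced during a single event-loop iteration are lost.

appendfsync no

This is the least safe of the three policies: when the buffered data reaches the disk is left entirely to the operating system. flushAppendOnlyFile still calls write, but because Redis never calls fsync itself in this mode, it is also the fastest of the three.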
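To compare the three policies side by side, the sketch below counts how many fsync calls each one would issue for the same sequence of writes. It is a simplified model under stated assumptions: should_fsync and count_fsyncs are hypothetical names, the timestamps are invented, and the model ignores how Redis actually schedules the sync.

```python
def should_fsync(policy: str, now: float, last_fsync: float) -> bool:
    """Decide whether a write at time `now` is followed by an fsync."""
    if policy == "always":
        return True                      # sync after every write
    if policy == "everysec":
        return now - last_fsync >= 1.0   # at most one sync per second
    if policy == "no":
        return False                     # leave flushing to the OS
    raise ValueError(f"unknown appendfsync policy: {policy}")


def count_fsyncs(policy: str, write_times: list) -> int:
    """Count fsync calls for writes happening at the given timestamps."""
    last_fsync = 0.0
    count = 0
    for now in write_times:
        if should_fsync(policy, now, last_fsync):
            count += 1
            last_fsync = now
    return count


if __name__ == "__main__":
    times = [0.1, 0.4, 0.9, 1.0, 1.3, 2.5]  # made-up write timestamps (seconds)
    for policy in ("always", "everysec", "no"):
        print(policy, count_fsyncs(policy, times))
```

The fewer fsync calls a policy makes, the faster it is and the more recent writes it can lose in a crash.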

A diagram makes this clearer:

[Image 1]

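Loading the AOF file

When the server starts with both RDB and AOF enabled, it restores data from the AOF file in preference to the RDB file. The server only has to execute the commands recorded in the AOF file once, from beginning to end.

[Image 2]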
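The replay idea can be sketched as a tiny parser that reads RESP commands and applies them to a dictionary. This is a toy under stated assumptions: parse_aof and replay are hypothetical names, only set, mset, del and select are handled, and a real AOF file may contain content this parser does not understand.

```python
def parse_aof(data: bytes):
    """Yield each command in a RESP byte string as a list of strings."""
    pos = 0
    while pos < len(data):
        end = data.index(b"\r\n", pos)
        if data[pos:pos + 1] != b"*":
            raise ValueError(f"expected '*' at offset {pos}")
        argc = int(data[pos + 1:end])
        pos = end + 2
        args = []
        for _ in range(argc):
            end = data.index(b"\r\n", pos)
            if data[pos:pos + 1] != b"$":
                raise ValueError(f"expected '$' at offset {pos}")
            length = int(data[pos + 1:end])
            start = end + 2
            args.append(data[start:start + length].decode("utf-8"))
            pos = start + length + 2  # skip the argument and its trailing \r\n
        yield args


def replay(data: bytes) -> dict:
    """Rebuild a toy key-value state by executing the commands in order."""
    db = {}
    for name, *args in parse_aof(data):
        name = name.lower()
        if name == "set":
            db[args[0]] = args[1]
        elif name == "mset":
            db.update(zip(args[0::2], args[1::2]))
        elif name == "del":
            for key in args:
                db.pop(key, None)
        elif name == "select":
            pass  # this toy keeps a single database
        else:
            raise NotImplementedError(name)
    return db


if __name__ == "__main__":
    log = (b"*3\r\n$3\r\nset\r\n$1\r\na\r\n$2\r\n77\r\n"
           b"*5\r\n$4\r\nmset\r\n$1\r\nb\r\n$2\r\n11\r\n$1\r\nd\r\n$2\r\n22\r\n")
    print(replay(log))
```

Because every command is executed again, loading time grows with the length of the log, which is one reason the rewrite feature described next exists.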

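AOF rewrite

AOF persistence records the current database state by saving the write commands that were executed, so the longer the server runs, the more content accumulates in the AOF file and the larger the file becomes. To solve this problem, Redis provides an AOF rewrite feature.

How to invoke it

Run the BGREWRITEAOF command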

127.0.0.1:6379> BGREWRITEAOF
Background append only file rewriting started
127.0.0.1:6379>


Server log

1572:M 03 Nov 2020 19:27:33.134 * Background saving terminated with success
1572:M 03 Nov 2020 19:35:55.208 * Background append only file rewriting started by pid 1583
1572:M 03 Nov 2020 19:35:55.259 * AOF rewrite child asks to stop sending diffs.
1583:C 03 Nov 2020 19:35:55.259 * Parent agreed to stop sending diffs. Finalizing AOF...
1583:C 03 Nov 2020 19:35:55.259 * Concatenating 0.00 MB of AOF diff received from parent.
1583:C 03 Nov 2020 19:35:55.259 * SYNC append only file rewrite performed
1583:C 03 Nov 2020 19:35:55.259 * AOF rewrite: 4 MB of memory used by copy-on-write
1572:M 03 Nov 2020 19:35:55.292 * Background AOF rewrite terminated with success
1572:M 03 Nov 2020 19:35:55.292 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1572:M 03 Nov 2020 19:35:55.292 * Background AOF rewrite finished successfully

Related configuration parameters

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

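When the conditions set by the following parameters are met, the server automatically calls BGREWRITEAOF to rewrite the AOF.

auto-aof-rewrite-percentage 100

auto-aof-rewrite-min-size 64mb

Together these two parameters mean: the AOF file must have reached the minimum size of 64mb before it is rewritten, and the current file size is compared with the file size after the last rewrite; once it has grown by the configured 100% (roughly doubled), a rewrite is triggered.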
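The trigger condition can be written down as a small predicate. This is a sketch based on the configuration comments quoted above, not a copy of the server's code: the function name should_auto_rewrite is hypothetical, and the exact comparison operators at the boundaries are an assumption.

```python
MB = 1024 * 1024


def should_auto_rewrite(current_size: int, base_size: int,
                        percentage: int = 100,
                        min_size: int = 64 * MB) -> bool:
    """Return True when an automatic rewrite would be triggered.

    current_size: size of the AOF file now, in bytes
    base_size:    size of the AOF file after the last rewrite (or at startup)
    """
    if percentage == 0:
        return False                 # zero disables automatic rewrites
    if current_size <= min_size:
        return False                 # file is still too small to bother
    base = base_size if base_size > 0 else 1
    growth = current_size * 100 // base - 100
    return growth >= percentage


if __name__ == "__main__":
    print(should_auto_rewrite(100 * MB, 40 * MB))  # grew by 150%
    print(should_auto_rewrite(70 * MB, 40 * MB))   # grew by 75%
    print(should_auto_rewrite(32 * MB, 1 * MB))    # below the minimum size
```

The minimum size guards against constant rewrites of a small file whose size doubles easily.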

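How AOF rewrite works

Now let's look at what the rewrite actually does to the data in the file.

Does it need to read the whole AOF file?

A file that needs rewriting is usually already very large, so reading all of its data would be inefficient. In fact, Redis does not read the AOF file at all during a rewrite; it reads the server's current database state directly, and while doing so it can skip keys that have already expired.

Rewrite demonstration

1. The client writes commands

[Image]
2. Inspect the contents of the AOF file

[Image 3]

3. The client triggers a rewrite manually

[Image 4]
4. Inspect the contents of the AOF file after the rewrite

[Image 5]

It is clear that the AOF file only needs to keep the last set operation for the key. This is the principle behind the rewrite.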
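The same principle can be shown in a few lines: build the state from a command history, then generate the rewritten log from the state rather than from the history. This is a toy sketch with string keys only; apply_set and rewrite_commands are hypothetical names, and the expires mapping is a stand-in for key expiry times.

```python
def apply_set(db: dict, key: str, value: str) -> None:
    """Execute a set command against the toy database."""
    db[key] = value


def rewrite_commands(db: dict, expires: dict = None, now: float = 0.0) -> list:
    """Produce one set command per live key from the current state."""
    expires = expires or {}
    commands = []
    for key, value in db.items():
        deadline = expires.get(key)
        if deadline is not None and deadline <= now:
            continue  # expired keys are skipped during the rewrite
        commands.append(["set", key, value])
    return commands


if __name__ == "__main__":
    history = [("k", "1"), ("k", "2"), ("k", "3")]
    db = {}
    for key, value in history:
        apply_set(db, key, value)
    print(len(history), "commands in the old log")
    print(rewrite_commands(db))
```

The old log is never opened here, which mirrors the point made above: the rewrite depends only on the current state.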

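The AOF rewrite process

Although the rewrite does not read the existing AOF file, with a large amount of data it would still block for a long time if run in the main process, leaving the server unable to handle client commands.

To solve this, just as RDB has bgsave, the rewrite is performed by bgrewriteaof: the rewrite runs in a child process, so the server can keep handling client commands. The child process holds a copy of the parent's data, which keeps the data safe without using locks.

This introduces a new problem, however. AOF can be thought of as a real-time log. An RDB snapshot only captures the data at one moment and does not care about new data produced while the snapshot is being generated, but AOF must handle the new commands that clients send while the new AOF file is being generated.

So Redis adds the concept of an AOF rewrite buffer: while the child process is rewriting, commands sent by clients are also written by the server into the rewrite buffer. When the child finishes, it sends a signal to the parent to say the rewrite is complete, and the parent then writes the contents of the rewrite buffer into the new AOF file.

After receiving the signal, the parent process roughly does the following:

1. Writes everything in the AOF rewrite buffer to the new AOF file, at which point the database state represented by the new AOF file matches the server's current database state.
2. Renames the new AOF file, atomically overwriting the existing AOF file, which completes the swap of the old and new files.

The AOF rewrite log shown earlier tells roughly the same story:

1. The AOF rewrite completes.
2. The data differences between parent and child are flushed to the AOF.
3. The whole AOF rewrite is finished.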

1572:M 03 Nov 2020 19:35:55.292 * Background AOF rewrite terminated with success
1572:M 03 Nov 2020 19:35:55.292 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1572:M 03 Nov 2020 19:35:55.292 * Background AOF rewrite finished successfully

Note that while the parent process is handling the signal, it blocks client requests until the handler has finished.
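The whole flow (snapshot, rewrite buffer, append, atomic rename) can be simulated in a single process. This is a deliberately simplified sketch: ToyServer and its methods are hypothetical, commands are stored as plain text lines instead of RESP, and a dictionary copy stands in for the child process's view of the data.

```python
import os
import tempfile


class ToyServer:
    def __init__(self, aof_path: str):
        self.db = {}
        self.aof_path = aof_path
        self.rewrite_buf = None  # a list only while a rewrite is running

    def set(self, key: str, value: str) -> None:
        self.db[key] = value
        line = f"set {key} {value}\n"
        with open(self.aof_path, "a") as f:
            f.write(line)                     # normal AOF append
        if self.rewrite_buf is not None:
            self.rewrite_buf.append(line)     # also remembered for the new file

    def start_rewrite(self) -> dict:
        self.rewrite_buf = []
        return dict(self.db)                  # stands in for the child's data copy

    @staticmethod
    def child_rewrite(snapshot: dict, tmp_path: str) -> None:
        with open(tmp_path, "w") as f:
            for key, value in snapshot.items():
                f.write(f"set {key} {value}\n")

    def finish_rewrite(self, tmp_path: str) -> None:
        with open(tmp_path, "a") as f:
            f.writelines(self.rewrite_buf)    # step 1: append the buffered diff
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.aof_path)   # step 2: atomic rename
        self.rewrite_buf = None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        server = ToyServer(os.path.join(tmp, "toy.aof"))
        for value in ("1", "2", "3"):
            server.set("k", value)
        snapshot = server.start_rewrite()
        server.set("j", "9")                  # arrives during the rewrite
        new_file = os.path.join(tmp, "temp-rewrite.aof")
        server.child_rewrite(snapshot, new_file)
        server.finish_rewrite(new_file)
        with open(server.aof_path) as f:
            print(f.read(), end="")
```

In this model finish_rewrite runs in the caller's thread, which corresponds to the blocking step noted above.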

[Image 6]

Related parameters

The basic AOF parameters have been covered above; one more is worth adding: no-appendfsync-on-rewrite.

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.

no-appendfsync-on-rewrite no

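Translated literally, the name means: do not fsync appends during a rewrite.

The default is no, a double negative, which means fsync may still be called during an AOF rewrite. What does that mean in practice?

At bottom this is a disk I/O problem. The Redis author's reasoning is that with AOF configured as everysec, or even always, there are already many disk writes (calls to fsync). If a bgsave or bgrewriteaof is running at the same time, the main process can block for even longer, so this option was provided.

When it is no, Redis ignores the disk contention and still calls fsync to flush to disk. Data reliability is preserved, but blocking latency may be high.

If you cannot accept the blocking, set it to yes. Then, during a rewrite, even with everysec or always configured, Redis only calls write to put the data into the operating system cache, and the operating system decides when to flush it (30 seconds by default on Linux). This effectively reduces blocking, but data reliability is no longer guaranteed: in the worst case 30 seconds of data are lost.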
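The interaction between appendfsync and no-appendfsync-on-rewrite can be summarised as a decision table. The sketch below is an illustration of the behaviour described above rather than Redis code; main_process_may_fsync is a hypothetical name, and it ignores the once-per-second timing of everysec.

```python
def main_process_may_fsync(appendfsync: str,
                           no_appendfsync_on_rewrite: bool,
                           child_running: bool) -> bool:
    """Return True if the main process is allowed to fsync the AOF."""
    if appendfsync == "no":
        return False   # Redis never fsyncs in this mode
    if no_appendfsync_on_rewrite and child_running:
        return False   # fsync suppressed while BGSAVE/BGREWRITEAOF runs
    return True        # "always" or "everysec" behave as configured


if __name__ == "__main__":
    for policy in ("always", "everysec", "no"):
        for option in (False, True):
            for child in (False, True):
                allowed = main_process_may_fsync(policy, option, child)
                print(f"{policy:9} option={'yes' if option else 'no':3} "
                      f"child_running={child!s:5} -> fsync allowed: {allowed}")
```

Only the rows with the option set to yes and a child running differ from the configured policy, which is exactly the window in which durability is traded for lower latency.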

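Pros and cons of AOF

Pros

  • For scenarios with high durability requirements, AOF is a better fit than RDB.
  • The AOF file is generated in RESP format, so it is readable and can be edited.

Cons

  • The file is larger, and recovery takes much longer than with RDB.
  • Enabling AOF adds the work of writing aof_buf to the log and syncing it to disk; in always mode the performance cost is higher still.
  • Client requests are blocked while the server process handles the data in the rewrite buffer.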
