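Because Redis keeps all of its data in memory, everything is lost when the process exits. To protect against that, Redis supports two persistence mechanisms, RDB and AOF. RDB can be seen as a snapshot of Redis at a point in time and is well suited to disaster recovery. AOF is a log of write operations. This article covers RDB, AOF, and using the two together in hybrid mode.
RDB works like a camera pointed at Redis's in-memory data: it takes a snapshot and saves it to disk. RDB persistence can be triggered manually or automatically. Redis loads an RDB file quickly on restart, but RDB cannot persist data in real time, so it is mainly used for cold backups and for transferring data during replication.
1. Blocking: the save command. It writes the snapshot synchronously on Redis's main thread, blocking the server and making it unavailable until the RDB is complete. Whatever the size of the dataset, do not use it in production.
2. Non-blocking: the bgsave command. It calls fork() to create a child process, which writes the snapshot in the background. Redis is blocked only during the fork itself, not for the whole RDB run. The fork is usually short because the parent's physical memory is not copied; only the page tables that map its virtual memory to physical memory are copied into the child. For this reason every RDB operation that Redis triggers internally uses bgsave. The same holds for AOF rewrites: thanks to copy-on-write, the forked child does not copy the parent's physical memory, but it does copy the parent's page tables.
In practice you rarely generate an RDB file by running a command by hand. Redis can trigger RDB persistence automatically, and all of the settings live in redis.conf: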
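The snapshot behaviour of fork() described above can be tried outside Redis. The following is a minimal sketch, not Redis code: the function name snapshot_via_fork and the sample key are made up for illustration, and it assumes a Unix-like system where os.fork is available. The child serializes the dictionary as it looked at fork time, while the parent keeps modifying its own copy.

```python
import os


def snapshot_via_fork(data: dict) -> str:
    """Return `data` as a forked child saw it, while the parent keeps writing."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: works on the memory image frozen at fork time.
        try:
            os.close(read_fd)
            os.write(write_fd, repr(sorted(data.items())).encode())
            os.close(write_fd)
        finally:
            os._exit(0)  # leave immediately, skipping the parent's cleanup handlers

    # Parent: not blocked by the child's work. This write lands in the parent's
    # private copy of the page and is invisible to the child.
    os.close(write_fd)
    data["counter"] = 999
    with os.fdopen(read_fd, "rb") as reader:
        snapshot = reader.read()
    os.waitpid(pid, 0)
    return snapshot.decode()
```

Calling snapshot_via_fork({"counter": 1}) returns "[('counter', 1)]" even though the parent's dictionary holds 999 afterwards, which is the same point-in-time guarantee bgsave relies on.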
################################ SNAPSHOTTING ################################
# Save the DB to disk.
#
# save <seconds> <changes>
#
# Redis will save the DB if both the given number of seconds and the given
# number of write operations against the DB occurred.
#
# Snapshotting can be completely disabled with a single empty string argument
# as in following example:
#
# save ""
#
# Unless specified otherwise, by default Redis will save the DB:
# * After 3600 seconds (an hour) if at least 1 key changed
# * After 300 seconds (5 minutes) if at least 100 keys changed
# * After 60 seconds if at least 10000 keys changed
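After 3600 seconds (1 hour), if at least 1 key changed, the database is saved (persisted).
After 300 seconds (5 minutes), if at least 100 keys changed, the database is saved.
After 60 seconds (1 minute), if at least 10000 keys changed, the database is saved.
Meeting any one of these conditions triggers an RDB save.
save "" disables RDB.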
# You can set these explicitly by uncommenting the three following lines.
#
save 3600 1
save 300 100
save 60 10000
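The save points above can be read as a simple rule: a snapshot is due when, for at least one save point, both the elapsed time and the number of changes have been reached. Here is a minimal sketch of that rule. The function name should_snapshot and the constant DEFAULT_SAVE_POINTS are hypothetical, and the real server applies further checks (for example after a failed background save) that are left out here.

```python
# (seconds, changes) pairs matching the three save lines above.
DEFAULT_SAVE_POINTS = [(3600, 1), (300, 100), (60, 10000)]


def should_snapshot(save_points, seconds_since_last_save, changes_since_last_save):
    """Return True if any save point has both of its thresholds met."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

With an empty list, which corresponds to save "", the function never returns True, so automatic snapshots are off.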
# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
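When set to yes, Redis stops accepting writes once a bgsave fails, which tells the user that data is not being written to disk correctly.
Otherwise nobody may notice the problem, and a disaster could follow.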
stop-writes-on-bgsave-error yes
# Compress string objects using LZF when dump .rdb databases?
# By default compression is enabled as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
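Whether to compress the RDB file with LZF. Compression costs some extra CPU in the saving child but usually produces a smaller file.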
rdbcompression yes
# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
Whether to write and verify a CRC64 checksum for the RDB file.
rdbchecksum yes
# Enables or disables full sanitation checks for ziplist and listpack etc when
# loading an RDB or RESTORE payload. This reduces the chances of a assertion or
# crash later on while processing commands.
# Options:
# no - Never perform full sanitation
# yes - Always perform full sanitation
# clients - Perform full sanitation only for user connections.
# Excludes: RDB files, RESTORE commands received from the master
# connection, and client connections which have the
# skip-sanitize-payload ACL flag.
# The default should be 'clients' but since it currently affects cluster
# resharding via MIGRATE, it is temporarily set to 'no' by default.
#
# sanitize-dump-payload no
# The filename where to dump the DB
The name of the RDB file; the default is dump.rdb.
dbfilename dump.rdb
# Remove RDB files used by replication in instances without persistence
# enabled. By default this option is disabled, however there are environments
# where for regulations or other security concerns, RDB files persisted on
# disk by masters in order to feed replicas, or stored on disk by replicas
# in order to load them for the initial synchronization, should be deleted
# ASAP. Note that this option ONLY WORKS in instances that have both AOF
# and RDB persistence disabled, otherwise is completely ignored.
#
# An alternative (and sometimes better) way to obtain the same effect is
# to use diskless replication on both master and replicas instances. However
# in the case of replicas, diskless is not always an option.
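Whether to delete RDB files that were created only for replication. It takes effect only on instances with both RDB and AOF persistence disabled; the default is no.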
rdb-del-sync-files no
# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
The directory where the RDB file is stored (the AOF file is created here too). This setting matters.
dir /var/lib/redis/6379
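When Redis needs to save the dump.rdb file, the server does the following: it forks a child process, the child writes the dataset to a temporary RDB file, and once the child has finished, the temporary file replaces the old dump.rdb.
To restore from a backup, copy dump.rdb into the directory Redis loads it from, which is the one set by dir (if dir is left at ./, this is the directory Redis is started from, for example the bin directory of the installation), then restart the Redis service. In practice, because a physical machine's disk can fail, dump.rdb is usually backed up as well.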
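RDB cannot guarantee durability for recent writes: if the Redis process crashes, everything written since the last snapshot is lost. AOF addresses this by persisting in near real time. It records every write command in a separate log and, on restart, replays the commands in the AOF file to rebuild the data. Each command is first appended to the AOF buffer and then written to disk according to the configured policy (appendfsync), whose options are covered below. First, the AOF rewrite command.
The bgrewriteaof command: the Redis main process forks a child to perform the AOF rewrite. The child writes the result to a new AOF file so that the old one is not affected. Because fork uses copy-on-write, the child cannot see data written after it was created. Redis keeps those new writes in the AOF rewrite buffer; when the child finishes, the parent appends the buffer's contents to the new AOF file and then replaces the old file with the new one.
As with RDB, these settings live in redis.conf, and you can also change them at runtime with CONFIG SET. Here is the AOF section of the configuration: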
############################## APPEND ONLY MODE ###############################
# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check https://redis.io/topics/persistence for more information.
appendonly no
# The name of the append only file (default: "appendonly.aof")
appendfilename "appendonly.aof"
# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".
# appendfsync always
appendfsync everysec
# appendfsync no
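To make the three appendfsync modes concrete, here is a small sketch of an append-only log that encodes commands in the Redis protocol format and decides when to call fsync. It is an illustration only: the names encode_resp and TinyAof are hypothetical, and the everysec branch simply checks the clock on the next append, whereas Redis performs that fsync in a background thread.

```python
import os
import time


def encode_resp(*args: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        raw = arg.encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(parts)


class TinyAof:
    """Append-only log with an fsync policy of 'always', 'everysec' or 'no'."""

    def __init__(self, path: str, policy: str = "everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
        self.last_fsync = time.monotonic()
        self.fsync_calls = 0

    def append(self, *command: str) -> None:
        # write() hands the bytes to the OS; only fsync() forces them to disk.
        os.write(self.fd, encode_resp(*command))
        now = time.monotonic()
        due = self.policy == "always" or (
            self.policy == "everysec" and now - self.last_fsync >= 1.0
        )
        if due:
            os.fsync(self.fd)
            self.last_fsync = now
            self.fsync_calls += 1
        # With 'no', flushing is left entirely to the operating system.

    def close(self) -> None:
        os.close(self.fd)
```

With policy "always" every append pays for an fsync, which is why that mode is the slowest and the safest of the three.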
# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.
no-appendfsync-on-rewrite no
# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# An AOF file may be found to be truncated at the end during the Redis
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
aof-load-truncated yes
# When rewriting the AOF file, Redis is able to use an RDB preamble in the
# AOF file for faster rewrites and recoveries. When this option is turned
# on the rewritten AOF file is composed of two different stanzas:
#
# [RDB file][AOF tail]
#
# When loading, Redis recognizes that the AOF file starts with the "REDIS"
# string and loads the prefixed RDB file, then continues loading the AOF
# tail.
aof-use-rdb-preamble yes
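appendonly no: whether to log every write operation. By default Redis writes data to disk asynchronously; with AOF off, a power failure can lose the writes from a recent period, because the data file is only saved according to the save conditions above, so some data exists only in memory for a while. The default is no.
appendfilename appendonly.aof: the name of the append-only log file; the default is appendonly.aof.
appendfsync always: after a command is written to the AOF buffer, every write is synced to disk. Slow, but the safest.
appendfsync everysec: after a command is written to the AOF buffer, a dedicated thread syncs to disk once per second. Relatively fast, but 1 to 2 seconds of data may be lost. Recommended, and the Redis default.
appendfsync no: after a command is written to the AOF buffer, the operating system decides when to flush it to disk.
no-appendfsync-on-rewrite no: whether to skip fsync while a background save or AOF rewrite is running. The default, no, means fsync is still called even while a child process is writing to disk. While Redis writes an RDB file or rewrites the AOF file in the background there is heavy disk I/O, and on some Linux systems the fsync call may block during that time.
auto-aof-rewrite-percentage 100: Redis remembers the size of the AOF file after the most recent rewrite; when the current AOF file has grown beyond that size by this percentage, a rewrite is triggered. The default is 100.
auto-aof-rewrite-min-size 64mb: the minimum file size for an automatic rewrite (files smaller than 64mb are not rewritten automatically).
aof-use-rdb-preamble yes: enables hybrid persistence, which gives faster AOF rewrites and faster data recovery at startup.
aof-load-truncated yes: whether to load an AOF file that is truncated at the end or exit with an error. yes: the truncated AOF file is loaded and a log message informs the user. no: the server reports an error and refuses to start; the user must repair the file with the redis-check-aof tool before restarting.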
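Note: when aof-use-rdb-preamble is yes, an AOF rewrite no longer generates write commands from the current contents. Instead it first writes an RDB snapshot at the beginning of the file and then appends the incremental write commands that arrived while the RDB was being generated.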
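The two auto-aof-rewrite settings combine into a single check: the file must be larger than the minimum size, and it must have grown by at least the configured percentage since the last rewrite. A minimal sketch of that check follows. The function name should_auto_rewrite and its parameter names are hypothetical, and the exact boundary handling is an assumption rather than a statement about the Redis source.

```python
MB = 1024 * 1024


def should_auto_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether the AOF has grown enough to be rewritten automatically."""
    if percentage == 0:
        return False  # a percentage of zero disables automatic rewrites
    if current_size <= min_size:
        return False  # still too small to be worth rewriting
    base = base_size or 1  # avoid dividing by zero for an empty base file
    growth = current_size * 100 // base - 100
    return growth >= percentage
```

With the defaults, a file that was 64mb after the last rewrite is rewritten again once it reaches 128mb, that is, 100 percent growth.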
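As you can guess, the AOF file keeps growing as write operations are performed. For example, if you increment a counter 100 times, your dataset ends up with a single key holding the final value, but the AOF contains 100 entries. 99 of them are not needed to rebuild the current state.
So Redis supports an interesting feature: it can rebuild the AOF in the background without interrupting service to clients. Whenever you issue BGREWRITEAOF, Redis writes a new AOF file containing the shortest sequence of commands needed to rebuild the current in-memory dataset. If you use AOF with Redis 2.2, you need to run BGREWRITEAOF from time to time. Redis 2.4 can trigger log rewriting automatically (see the example configuration file shipped with Redis 2.4 for more information).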
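The counter example can be reproduced with a toy model. The sketch below supports only SET, INCR and DEL, and the helper names apply_command, replay and rewrite are hypothetical. One simplification to keep in mind: here the current state is obtained by replaying the old log, whereas a real rewrite is generated from the dataset already in memory.

```python
def apply_command(state: dict, command: tuple) -> None:
    """Apply one write command to an in-memory string store."""
    name, key, *rest = command
    name = name.upper()
    if name == "SET":
        state[key] = rest[0]
    elif name == "INCR":
        state[key] = str(int(state.get(key, "0")) + 1)
    elif name == "DEL":
        state.pop(key, None)
    else:
        raise ValueError(f"unsupported command: {name}")


def replay(log: list) -> dict:
    """Rebuild the dataset by executing every logged command in order."""
    state = {}
    for command in log:
        apply_command(state, command)
    return state


def rewrite(log: list) -> list:
    """Return a short log that rebuilds the same dataset: one SET per key."""
    return [("SET", key, value) for key, value in replay(log).items()]
```

For a log of 100 INCR counter entries, rewrite returns the single entry ("SET", "counter", "100"), and replaying either log yields the same dataset.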
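Enabling AOF, with RDB and AOF used together (hybrid mode)
The top of the AOF file is the RDB data; below it is the command log
AOF rewrite
The AOF file after the rewrite: the RDB data on top, followed by the command log written after the rewrite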
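As the configuration comments say, Redis recognizes a hybrid file because it starts with the string "REDIS", the magic prefix of the RDB format. The same test is easy to run by hand. This is a minimal sketch with a hypothetical function name, has_rdb_preamble, and it only looks at the first five bytes without validating the rest of the file.

```python
def has_rdb_preamble(path: str) -> bool:
    """Return True if the file begins with the RDB magic string 'REDIS'."""
    with open(path, "rb") as aof_file:
        return aof_file.read(5) == b"REDIS"
```

A plain AOF file starts with a protocol array marker such as *2 instead, so the function returns False for it.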
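On Linux, when the fork system call creates a child process, it does not copy all of the memory pages the parent occupies. The child shares the same pages with the parent, and a page is copied only when the child or the parent modifies it.
A process's memory can be divided into virtual memory and physical memory.
Copy-on-write works roughly as follows: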