Redis trouble21 -- AOF persistence blocks Redis commands

Contents

1. Error log

2. Problem analysis

3. Root cause

4. Solution

5. appendfsync everysec is not really 1s


1. Error log

 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
 * Starting automatic rewriting of AOF on 107914% growth
 * Background append only file rewriting started by pid 4143
 * AOF rewrite child asks to stop sending diffs.
 * Parent agreed to stop sending diffs. Finalizing AOF...
 * Concatenating 0.00 MB of AOF diff received from parent.
 * SYNC append only file rewrite performed
 * AOF rewrite: 2 MB of memory used by copy-on-write
 * Background AOF rewrite terminated with success
 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
 * Background AOF rewrite finished successfully
 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.

2. Problem analysis

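'Configuration file settings'
appendonly yes # enable AOF
appendfsync everysec # AOF policy: fsync to disk once per second
aof-use-rdb-preamble yes # enable the mixed RDB + AOF format
aof-load-truncated yes # if the AOF turns out to be truncated at the end when Redis loads it at startup, load as much valid data as possible instead of refusing to start
aof-rewrite-incremental-fsync yes # fsync the rewritten AOF in batches, which makes good use of sequential I/O
no-appendfsync-on-rewrite no # keeps data loss as small as possible: with no, at most 2s of data is lost; with yes, up to 30s of data can be lost
auto-aof-rewrite-min-size 67108864 # minimum AOF file size for an automatic rewrite: 64 MB
auto-aof-rewrite-percentage 100 # (aof_current_size - aof_base_size) / aof_base_size is compared against 100%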

'A rewrite is triggered when both of the following conditions hold'
1. The current AOF file size (aof_current_size) > 64MB
2. (aof_current_size - aof_base_size) / aof_base_size > 100%
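
The two conditions can be written as a small predicate. This is a simplified sketch in Python, not Redis source code: the function name should_auto_rewrite and its parameters are made up for illustration, the defaults mirror the configuration above, and the comparison operators follow the text as written.

```python
def should_auto_rewrite(aof_current_size: int,
                        aof_base_size: int,
                        min_size: int = 67108864,
                        percentage: int = 100) -> bool:
    """Return True when both auto-rewrite conditions from the text hold."""
    # Condition 1: the AOF must be larger than auto-aof-rewrite-min-size.
    if aof_current_size <= min_size:
        return False
    # Assumption: treat a zero base size as 1 byte to avoid dividing by zero.
    base = aof_base_size if aof_base_size > 0 else 1
    # Condition 2: growth relative to the base size, in percent.
    growth = (aof_current_size - base) * 100 / base
    return growth > percentage
```

For example, a 70 MB AOF on top of a 60 KB base satisfies both conditions, while a 100 MB AOF on top of an 80 MB base does not, because it has only grown by 25%.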

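Analysis combined with monitoring
On the right, aof_delayed_fsync keeps increasing, which means AOF fsync is repeatedly blocking
On the left, the AOF rewrite conditions described above have already been met, and the AOF is being rewritten frequently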
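
If no dashboard is available, the same signal can be read from two snapshots of INFO persistence output taken some time apart. The sketch below only parses text, so it runs without a Redis connection. The helper names are hypothetical and the sample values are made up for illustration, not taken from this incident.

```python
def parse_info(text: str) -> dict:
    """Parse 'key:value' lines from INFO output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as '# Persistence'.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def delayed_fsync_delta(before: str, after: str) -> int:
    """Number of delayed fsyncs that happened between two INFO samples."""
    a = int(parse_info(before).get("aof_delayed_fsync", 0))
    b = int(parse_info(after).get("aof_delayed_fsync", 0))
    return b - a


# Illustrative samples only; the numbers are invented.
SAMPLE_1 = """# Persistence
aof_enabled:1
aof_rewrite_in_progress:0
aof_delayed_fsync:10
"""
SAMPLE_2 = """# Persistence
aof_enabled:1
aof_rewrite_in_progress:1
aof_delayed_fsync:17
"""

print(delayed_fsync_delta(SAMPLE_1, SAMPLE_2))  # 7
```

A delta that stays above zero from sample to sample matches the steadily rising curve described above.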

[Figure 1: monitoring charts referenced above]

3. Root cause

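After reviewing the monitored commands and the commands recorded in the AOF file, the causes can be summarized as follows
1. The client uses Redis as a queue and, to avoid losing data, chose AOF for persistence. The values stored in the queue are also large, mostly around 30 KB each, even though memory usage in monitoring does not look high
2. A large number of these big commands pile up in the AOF file, so the file quickly reaches the rewrite trigger conditions and Redis ends up rewriting the AOF over and over
3. Because no-appendfsync-on-rewrite is set to no, Redis keeps issuing fsync on the AOF while a rewrite is running. That fsync has to compete for disk I/O with the rewrite child process and takes too long, and combined with the frequent rewrites this is what causes the AOF blocking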
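
A rough calculation shows how quickly values of this size reach the rewrite threshold. The sketch assumes every queued write adds about 30 KB to the AOF and ignores protocol overhead. The function name is hypothetical, and the real write rate of this workload is not given in the source.

```python
MIN_SIZE = 67108864        # auto-aof-rewrite-min-size from the config above
VALUE_BYTES = 30 * 1024    # about 30 KB per queued value, as described in the text


def writes_until_rewrite(base_size: int) -> int:
    """Approximate number of ~30 KB writes before both rewrite conditions hold."""
    # Size must exceed MIN_SIZE, and growth must exceed 100% (size > 2 * base).
    threshold = max(MIN_SIZE, 2 * base_size)
    remaining = threshold - base_size
    # Add one write so the size ends up strictly above the threshold.
    return remaining // VALUE_BYTES + 1


print(writes_until_rewrite(0))  # 2185
```

So roughly 2,185 writes are enough when the base file is nearly empty. The "107914% growth" line in the log is consistent with a very small base size after each rewrite, which would be expected if the queue had mostly been drained by then. At a hypothetical 1,000 writes per second, the threshold would be reached again in a little over two seconds.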

4. Solution

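Redis is best used as a cache. Using it as a queue and also persisting that queue with AOF is not recommended, and the case above is a good example of why. The recommendation is to move the queue workload to dedicated messaging middleware such as kafka/rabbitmq/rocketmq. If you want to keep using Redis for this, turn off AOF persistence and reduce the size of the values to avoid blocking Redis. As for data loss, add a compensation mechanism so that the data can be re-pushed if Redis crashes or another unexpected failure occurs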

5. appendfsync everysec is not really 1s

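no-appendfsync-on-rewrite no  /  appendfsync everysec
The AOF is flushed to disk once per second, but the actual bound is not 1s. As the logic diagram below shows, the main thread compares against 2s when it checks the time, so up to 2s of data can be lost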

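no-appendfsync-on-rewrite yes  /  appendfsync everysec  is equivalent to  appendfsync no  while a rewrite is in progress
The data in the buffer then only reaches the disk when Linux performs its own sync, which by default happens every 30s, so up to 30s of data can be lost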

[Figure 2: logic diagram of the appendfsync everysec flush]
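
The 2s check can also be modelled in a few lines. This is a toy Python model of the main thread's decision, written from the behaviour described above. It is not the Redis source, and the class name, method name and return strings are invented for illustration.

```python
class EverysecModel:
    """Toy model of the main thread's check before writing the AOF buffer."""

    def __init__(self) -> None:
        self.postponed_start = 0   # time the flush was first postponed (0 = not postponed)
        self.delayed_fsync = 0     # plays the role of the aof_delayed_fsync counter

    def flush(self, now: int, fsync_in_progress: bool) -> str:
        if fsync_in_progress:
            if self.postponed_start == 0:
                # First time the background fsync is seen still running: wait.
                self.postponed_start = now
                return "postponed"
            if now - self.postponed_start < 2:
                # Less than 2s since the first postponement: keep waiting.
                return "postponed"
            # Waited 2s already: write anyway, even though the write may block.
            self.delayed_fsync += 1
            self.postponed_start = 0
            return "write-may-block"
        # No fsync running in the background: write the buffer right away.
        self.postponed_start = 0
        return "write"


m = EverysecModel()
print([m.flush(t, True) for t in (100, 101, 102)])
# ['postponed', 'postponed', 'write-may-block']
```

In this model the third call is the point at which the "Asynchronous AOF fsync is taking too long" message from section 1 would be logged and the counter incremented.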
