Below is an article by the Redis author explaining why Redis does not use a compact-and-merge approach on AOF files.
Article source: A few key problems in Redis persistence, Saturday, 02 October 10
Recommended reading first: http://redis.io/topics/persistence
Redis: the strength is the data model, and the deficiency is the persistence.
We want two things that are hard to play well together:
Why does snapshotting use the copy-on-write (COW) approach rather than the alternative described below?
To address the data loss that snapshotting can cause, the other option is AOF, but using it comes at a cost:
So how does Redis solve the problem of the AOF growing without bound? Again, Redis relies on COW.
What Redis does in order to compact the AOF is rewriting it from scratch. This means doing something very similar to writing the point in time snapshot, but in a format that happens to be a valid sequence of Redis commands. So Redis does not read the old AOF to rebuild it. It instead reads what we have in memory to write a perfect (as small as possible) AOF from scratch. When the new AOF is in place, we do an atomic rename syscall swapping the old with the new.
This is done in the child process, again, it's basically exactly the same problem of dealing with the point in time snapshot, but with the additional problem (that is easy to fix) of accumulating the new queries while the AOF rewrite is in progress, so that before swapping the old file with the new one, we also append all the new operations accumulated in the meantime.
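The rewrite flow described above can be sketched as follows. This is a minimal illustration, not Redis source: `rewrite_aof` is a hypothetical helper, the dataset is a plain dict, and everything is serialized as SET commands. The key points it shows are the three steps antirez describes: serialize memory (not the old file), append the commands accumulated during the rewrite, then swap atomically with rename().

```python
import os
import tempfile

def rewrite_aof(dataset, rewrite_buffer, aof_path):
    """Sketch of a Redis-style AOF rewrite (hypothetical helper).

    The old AOF is never read. Instead, the in-memory dataset is
    serialized as a minimal command stream, the commands that arrived
    while the rewrite was in progress are appended, and the new file
    is atomically swapped into place with rename().
    """
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(aof_path) or ".")
    with os.fdopen(fd, "w") as f:
        for key, value in dataset.items():
            f.write(f"SET {key} {value}\n")   # one command per live key
        for cmd in rewrite_buffer:
            f.write(cmd + "\n")               # operations accumulated mid-rewrite
        f.flush()
        os.fsync(f.fileno())                  # data on disk before the swap
    os.rename(tmp_path, aof_path)             # atomic: readers see old or new, never a mix

dataset = {"a": "1", "b": "2"}
rewrite_aof(dataset, ["SET c 3"], "appendonly.aof")
print(open("appendonly.aof").read().splitlines())
# ['SET a 1', 'SET b 2', 'SET c 3']
```

In Redis the serialization happens in a forked child (so COW pages keep memory cost bounded) while the parent keeps the rewrite buffer; the sketch collapses both roles into one process for brevity.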
The worst case of this approach is 2x memory usage, so why not compact the AOF files instead?
Segment it into small pieces, in different physical files: AOF.1, AOF.2, AOF.3 ... Every time the AOF is big enough we open a new file and continue, letting a background process merge the old ones.
For plain key/value data where the only operations are SET and DEL, the following kind of compaction (deduplication) would work:
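As a minimal sketch of that SET/DEL-only case (hypothetical `compact` helper, textual command format assumed for illustration): replay the log into a dict, then emit one SET per surviving key. The point is that this only works because SET is idempotent and DEL cancels prior writes.

```python
def compact(ops):
    """Naive AOF compaction, valid ONLY when the log contains SET and DEL:
    replay into a dict, then emit one SET per key that survives."""
    state = {}
    for op in ops:
        parts = op.split()
        if parts[0] == "SET":
            state[parts[1]] = parts[2]     # later SET overwrites earlier ones
        elif parts[0] == "DEL":
            state.pop(parts[1], None)      # DEL erases the key entirely
    return [f"SET {k} {v}" for k, v in state.items()]

log = ["SET x 1", "SET x 2", "SET y 9", "DEL y", "SET x 3"]
print(compact(log))
# ['SET x 3']
```

Five operations collapse into one. As the article argues next, this breaks down as soon as the log contains non-idempotent or inter-key operations.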
This does not work when you have complex operations against aggregate data types.
To start, in order to even parse the AOF with the command line tool, this tool needs to be Redis-complete. All the operations should be implemented, for instance intersections between sets, otherwise what would I do when I encounter a SUNIONSTORE operation?
Also, in Redis most operations are not idempotent. Think about LPUSH for instance, the simplest of our list write operations. The only way to turn list operations into idempotent operations is to turn all of them into an MKLIST <key> ... all the elements ... operation. Not viable at all.
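The non-idempotency point can be demonstrated in a few lines. Below is a toy in-memory LPUSH (a hypothetical stand-in for the real command, not Redis code): replaying the same operation twice yields a different result than replaying it once, which is exactly why a compaction tool cannot simply deduplicate list operations the way it could deduplicate SETs.

```python
def lpush(store, key, *values):
    """Minimal LPUSH: push each value onto the head of the list, Redis-style."""
    lst = store.setdefault(key, [])
    for v in values:
        lst.insert(0, v)
    return len(lst)   # Redis returns the new list length

store = {}
lpush(store, "mylist", "a")
lpush(store, "mylist", "a")   # replaying the identical command changes the state
print(store["mylist"])
# ['a', 'a'] -- not idempotent; SET applied twice would leave one value
```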
Our command line tool at best will be able to exploit SET and DEL operations in order to reduce the size of the file, but it will simply lose against an always-updated sorted set.
So the author believes that paying the 2x memory cost is worth it; at least for now, this is the best available solution.
To pay 2x of the memory in the worst case may not be so bad at this stage, because it is not trivial to find a really better solution with the Redis data model. Does this mean we'll never improve this part of Redis? Absolutely not! We'll try hard to make it better, for instance possibly using a binary AOF format that is faster to write and to load, and more compact. We may write a command line tool that is able to process the AOF multiple times, only dealing with a subset of the keys at every pass in order to use less memory (but we have inter-key operations on aggregate data types, so this may only work well if such operations are not used). For sure we'll keep investigating.
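The multi-pass idea mentioned above can be sketched like this (hypothetical `multipass_compact` helper, SET/DEL-only log assumed, keys bucketed by hash): each pass materializes only the keys in its bucket, so peak memory is roughly 1/num_passes of the dataset, at the cost of re-reading the log once per pass. As the article notes, this breaks down if any operation touches keys across buckets.

```python
def multipass_compact(ops, num_passes=2):
    """Sketch of multi-pass AOF compaction: each pass rebuilds only the keys
    whose hash falls in that pass's bucket, bounding peak memory to roughly
    one bucket's worth of state. Valid only without inter-key operations."""
    out = []
    for p in range(num_passes):
        state = {}                          # holds only this pass's bucket
        for op in ops:                      # the log is re-read every pass
            cmd, key, *rest = op.split()
            if hash(key) % num_passes != p:
                continue                    # key belongs to another pass
            if cmd == "SET":
                state[key] = rest[0]
            elif cmd == "DEL":
                state.pop(key, None)
        out.extend(f"SET {k} {v}" for k, v in state.items())
    return out

print(sorted(multipass_compact(["SET a 1", "SET b 2", "DEL a"])))
# ['SET b 2']
```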
Possibly new ideas will emerge. For instance currently I'm working on Redis Cluster. With the cluster it is possible to run many instances of Redis in the same computer in a transparent way. Every instance will save just its subset of data, which can be much smaller than a single giant instance. Both snapshotting and AOF log rewrite will be simpler to perform.
Redis is young and there are open problems, but this is an interesting challenge I want to take ;)