Article source: http://perfectmarket.com/blog/not_only_nosql_review_solution_evaluation_guide_chart
Below the original article, the author of Redis also added his own comments; they are worth a careful read!
You may think this is yet another blog on NoSQL (Not Only SQL) hype.
Yes, it is.
But if at this moment you are still struggling to find a NoSQL solution that works, read through to the end and you may find it easier to decide what to do. (I will keep the answer to the end just for fun.) For those of you who can't wait for the answer, you can skip to the chart below.
When I was involved in developing Perfect Market's content processing platform, I desperately tried to find an extremely fast — in terms of both latency and processing time — and scalable NoSQL database solution to support simple key-value (KV) lookup.
I had pre-determined requirements for the 'solution-to-be' before I started looking:
I started looking without any bias in mind since I had never seriously used any of the NoSQL solutions. With some recommendations from fellow co-workers, and after reading a bunch of blogs (yes, blogs), the journey of evaluation started with Tokyo Cabinet, then Berkeley DB library, MemcacheDB, Project Voldemort, Redis, and finally MongoDB.
There are other very popular alternatives, like Cassandra, HBase, CouchDB … you name it, but we haven't needed to try them yet because the one we selected worked so well. The result turned out to be pretty amazing and this blog post shares some details of my testing.
To explain which one was picked and why, I took a suggestion from my co-worker Jay Budzik (CTO) and compiled a comparison chart of all the solutions I evaluated (below). Although this chart is an after-the-fact thing, it is still helpful for showing the rationale behind the decision, and it should be helpful to people who are still in the decision-making process.
Please note that the chart is not 100% objective and scientific. It is a combination of the testing results and my gut feelings. It is funny that I started the evaluation process without any bias, but after testing all of them I may well be biased (especially toward my own use cases).
Another thing to note is that disk access is by far the slowest operation in these I/O-intensive workloads: a random disk access takes milliseconds, while a memory access takes nanoseconds. To handle a data set containing hundreds of millions of rows, you had better give your computer enough memory. If your computer only has 4GB of memory and you try to handle a 50GB data set while expecting ultimate speed, you need to either toss your computer and use a better one, or toss out all of the following solutions, because none of them will work.
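To make the disk-versus-memory point concrete, here is a back-of-envelope sketch. The latency figures are rough, commonly cited orders of magnitude, not measurements from my tests:

```python
# Rough arithmetic behind the "give it enough memory" advice.
disk_seek_s = 5e-3        # ~5 ms per random seek on a spinning disk (assumption)
ram_access_s = 100e-9     # ~100 ns per random memory access (assumption)

print(f"disk is ~{disk_seek_s / ram_access_s:,.0f}x slower per random access")

ram_gb, data_gb = 4, 50
cached_fraction = ram_gb / data_gb
print(f"with {ram_gb} GB RAM and a {data_gb} GB data set, "
      f"at most ~{cached_fraction:.0%} of random reads can be served from memory")
```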
Looking at this chart, you may start to guess which solution I picked. No rush, let me tell you more about each of them.
Tokyo Cabinet (TC) is a very nice piece of work and was the first one I evaluated. I still like it very much, although it was not ultimately selected for our application. The quality of the work was amazing. The hash table database is extremely fast on small data sets (below 20 million rows, I would say) and horizontal scalability is fairly good. The problem with TC is that when the data size increases, performance degradation is significant, for both reads and writes. Even worse, with large data sets performance is not consistent when accessing different parts of the data set. Accessing data inserted earlier appears to be faster than accessing data inserted later. I’m not an expert on TC, and do not have an explanation for this behavior, but the behavior made it impossible to use TC for our application. Using the TC B-Tree database option did not exhibit the same problem but overall performance was much slower.
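For what it's worth, the kind of probe that exposes this skew is simple: time random reads over keys that were inserted early versus keys that were inserted late. The sketch below is hypothetical (not my original test code) and assumes `db` is any client object exposing a get(key) method, such as a Tokyo Cabinet binding or another key-value store:

```python
# Hypothetical probe: compare average read latency for early vs. late keys.
import random
import time

def avg_read_latency(db, keys, samples=10_000):
    """Average seconds per get() over a random sample of the given keys."""
    sample = [random.choice(keys) for _ in range(samples)]
    start = time.perf_counter()
    for key in sample:
        db.get(key)
    return (time.perf_counter() - start) / samples

def compare_early_vs_late(db, keys_in_insert_order, slice_size=1_000_000):
    early = keys_in_insert_order[:slice_size]
    late = keys_in_insert_order[-slice_size:]
    print("early keys:", avg_read_latency(db, early), "s/read")
    print("late keys: ", avg_read_latency(db, late), "s/read")
```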
Berkeley DB (BDB) and MemcacheDB (a remote interface to BDB) are a pretty old combination. If you are familiar with BDB and are not so demanding on speed and feature set, e.g., you are willing to wait a couple of days to load a large data set into the database and you are happy with an OK but not excellent read speed, you can still use it. For us, the fact that it took so long to load the initial data set made it a poor fit.
Project Voldemort was the only Java-based and “cloud” style solution I evaluated. I had very high expectations before I started due to all the hype, but the result turned out to be a little disappointing, and here is why:
Because the data was bloated so much and crashes happened from time to time, the data loading process never even finished. With only a quarter of the data set populated, read speed was OK but not excellent. At that point I thought I had better give up on it; otherwise, on top of the tuning listed above, the JVM might have turned more of my hair gray, even though I worked at Sun for five years.
Redis is an excellent caching solution and we almost adopted it in our system. Redis stores the whole hash table in memory and has a background save process that writes a snapshot of the hash table to disk at a preset interval. If the system is rebooted, it can load the snapshot from disk into memory and have the cache warmed at startup. It takes a couple of minutes to restore 20GB of data, depending on your disk speed. This is a great idea and Redis is a decent implementation of it.
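As a minimal sketch of that snapshot behavior (assuming a local Redis server and the redis-py client; the key names and values here are made up for illustration), the whole working set stays in memory while BGSAVE writes the dump file in the background:

```python
import time

import redis

r = redis.Redis(host="localhost", port=6379)

# Simple key-value usage, which is all our lookup workload needed.
r.set("doc:42", "some serialized payload")
print(r.get("doc:42"))

# Ask for a background snapshot and wait until LASTSAVE advances.
before = r.lastsave()
r.bgsave()
while r.lastsave() == before:
    time.sleep(0.5)
print("snapshot written; it will be reloaded into memory on the next restart")
```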
But for our use cases it did not fit well. The background saving process still bothered me, especially as the hash table got bigger; I feared it might hurt read speed. Using logging-style persistence instead of saving whole snapshots could mitigate the impact of these big dumps, but then the data size bloats if the log is written frequently, which may eventually hurt restore time. The single-threaded model does not sound that scalable either, although, in my testing, it held up pretty well with a few hundred concurrent reads.
Another thing that bothered me about Redis was that the whole data set must fit into physical memory. That would not be easy to manage in our diversified environment across different phases of the product lifecycle. Redis's recently released virtual memory feature might mitigate this problem, though.
MongoDB is by far the solution I love the most among all the solutions I evaluated. It was the winner of the evaluation process and is currently used in our platform.
MongoDB provides distinctly superior insertion speed, probably due to deferred writes and fast file extension with its multiple-files-per-collection structure. As long as you give your box enough memory, hundreds of millions of rows can be inserted in hours, not days. I would post exact numbers here but they would be too specific to be useful. But trust me — MongoDB offers very fast bulk inserts.
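A minimal bulk-load sketch with pymongo looks something like the following. The database name, collection name, and document shape are assumptions for illustration, and insert_many() comes from a newer pymongo than the driver available when this evaluation was done:

```python
from pymongo import MongoClient

client = MongoClient("localhost", 27017)
coll = client["kvstore"]["entries"]

def load_in_batches(rows, batch_size=10_000):
    """rows yields (key, value) pairs; insert them in large unordered batches."""
    batch = []
    for key, value in rows:
        batch.append({"_id": key, "v": value})
        if len(batch) >= batch_size:
            coll.insert_many(batch, ordered=False)
            batch = []
    if batch:
        coll.insert_many(batch, ordered=False)
```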
MongoDB uses memory-mapped files, and it usually takes only nanoseconds to resolve the minor page faults that bring file-system-cached pages into MongoDB's memory space. Unlike other solutions, MongoDB does not compete with the page cache, since the mapped pages and the page cache are the same memory for read-only blocks. With other solutions, if you allocate too much memory to the tool itself, the box may fall short on page cache, and usually there is no easy or efficient way to have the tool's cache fully pre-warmed (you definitely don't want to read every row beforehand!).
For MongoDB, it’s very easy to do some simple tricks (copy, cat or whatever) to have all data loaded in page cache. Once in that state, MongoDB is just like Redis, which performs super well on random reads.
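If you prefer to do the warming from Python rather than with cat, a sketch like this reads the data files sequentially so the OS pulls them into the page cache (/data/db is MongoDB's default dbpath; adjust it for your installation):

```python
import glob
import os

def warm_page_cache(dbpath="/data/db", chunk=8 * 1024 * 1024):
    """Sequentially read every data file so it lands in the OS page cache."""
    total = 0
    for path in sorted(glob.glob(os.path.join(dbpath, "*"))):
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            while True:
                data = f.read(chunk)
                if not data:
                    break
                total += len(data)
    print(f"touched roughly {total / 2**30:.1f} GiB of data files")
```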
In one of the tests I did, MongoDB showed an overall 400,000 QPS with 200 concurrent clients doing constant random reads on a large data set (hundreds of millions of rows). In that test the data was pre-warmed in the page cache. In later tests, MongoDB also showed great random read speed under moderate write load. For relatively big payloads, we compress them and then save them in MongoDB to further reduce data size, so that more stuff can fit into memory.
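For reference, the shape of such a concurrent random-read test looks roughly like the sketch below. This is a hypothetical reconstruction, not the benchmark code we actually ran; the collection name and key format are assumptions, and a single Python process will not reach numbers like the above because of the GIL. The point is the shape of the test, not the absolute result.

```python
import random
import threading
import time

from pymongo import MongoClient

def reader(coll, keys, duration_s, counter, lock):
    # Each thread does constant random _id lookups until the deadline.
    deadline = time.time() + duration_s
    done = 0
    while time.time() < deadline:
        coll.find_one({"_id": random.choice(keys)})
        done += 1
    with lock:
        counter[0] += done

def run_benchmark(keys, clients=200, duration_s=30):
    coll = MongoClient("localhost", 27017)["kvstore"]["entries"]
    counter, lock = [0], threading.Lock()
    threads = [
        threading.Thread(target=reader, args=(coll, keys, duration_s, counter, lock))
        for _ in range(clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(f"~{counter[0] / duration_s:,.0f} reads/sec with {clients} clients")
```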
MongoDB provides a handy client (similar to MySQL’s) which is very easy to use. It also provides advanced query features, and features for handling big documents, but we don’t use any of them. MongoDB is very stable and almost zero maintenance, except you may need to monitor memory usage when data grows. MongoDB has rich client support in different languages, which makes it very easy to use. I will not go through the laundry list here but I think you get the point.
Although MongoDB is the solution for most NoSQL use cases, it is not the only solution for all NoSQL needs. If you only need to handle small data sets, Tokyo Cabinet is pretty neat. If you need to handle huge data sets (petabytes) across a lot of machines, and latency is not an issue because you are not pursuing ultimate response time, Cassandra or HBase might be a good fit.
Lastly, if you still need to deal with transactions, don’t bother with NoSQL, use Oracle.
— Jun Xu
Principal Software Engineer