This talk goes over performance tuning techniques drawn from real-world examples of our MongoDB implementations at Shutterfly. We will cover usage of the profiler, query tuning, monitoring for performance, data modeling, and data locality. I will also discuss our implementation of Facebook Flashcache for MongoDB.
Presented by Kenny Gorman
We’re live-blogging from MongoSV today. Here’s a link to the entire series of posts.
Kenny is getting started, talking about performance tuning based on experience at Shutterfly. They have 8 MongoDB clusters in production with ~30 servers. Not cloud based: all own hardware and datacenters.
MongoDB performance tuning is similar to traditional RDBMS tuning. Looking at queries, indexes, etc. If performance isn’t good on a single server then don’t look to sharding, reading from replicas, etc. Single server performance is critical.
Modeling is key. Schema design can be really important for performance (recommends talks later on by Eliot & Kyle).
Know when to stop tuning: prioritize what is important/adequate for the business/application. What needs to be fast? Build tuning into dev. lifecycle, don’t wait until there’s an issue. Tuning is “personal”: need to know your problem/domain.
MongoDB is really fast when read only; writes start to impact performance. Important consideration during design phase.
The profiler. Writes to the db.system.profile collection. Recommendation is to turn it on and leave it on: low overhead. Look for full scans (nreturned vs nscanned) and at updates: ideally you want fastmod (in-place updates), so look for moved & key updates.
Should graph response times over time (from the system.profile collection). Shows performance over time of db. To look at the profiling data just do `show profile` from the shell.
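As a rough sketch of the graphing idea, the function below buckets profiler documents by minute and averages their `millis`. It assumes the documents have already been pulled out of `system.profile` into an array and that each one carries the `ts` and `millis` fields. The function name, the one-minute bucket size and the sample data are illustrative choices, not something shown in the talk.

```javascript
// Sketch: average response time per minute from profiler documents.
// Assumes each doc has `ts` (a Date) and `millis` (a number).
function averageMillisPerMinute(profileDocs) {
  const buckets = new Map();
  for (const doc of profileDocs) {
    // Truncate the timestamp to the start of its minute (epoch ms).
    const minute = Math.floor(doc.ts.getTime() / 60000) * 60000;
    const bucket = buckets.get(minute) || { total: 0, count: 0 };
    bucket.total += doc.millis;
    bucket.count += 1;
    buckets.set(minute, bucket);
  }
  // Return points sorted by time, ready to hand to a graphing tool.
  return [...buckets.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([minute, b]) => ({ minute: new Date(minute), avgMillis: b.total / b.count }));
}

// Hypothetical sample data standing in for documents read from system.profile.
const sampleProfile = [
  { ts: new Date("2011-12-09T10:00:05Z"), millis: 10 },
  { ts: new Date("2011-12-09T10:00:40Z"), millis: 30 },
  { ts: new Date("2011-12-09T10:01:10Z"), millis: 100 },
];
console.log(averageMillisPerMinute(sampleProfile));
```

Feeding the resulting points to whatever graphing system you already use gives the "performance over time" picture.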
Showing examples of data from the profiler: here’s an example where nscanned is 10000 and nreturned is 1: we need an index! Another example where the document had to be moved due to an update (keyword “moved” in the profile doc.). Now showing an example using $inc - you’ll see “fastmod” in the profile document - that’s good!
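The three profiler patterns above can be checked mechanically. This is a minimal sketch that works on plain objects shaped like profiler documents, using the field names mentioned in the talk (`nscanned`, `nreturned`, `moved`, `fastmod`). The function name, the flag wording and the scan-ratio threshold of 100 are assumptions made for illustration.

```javascript
// Sketch: flag profiler documents that match the patterns described above.
// The ratio threshold is an arbitrary assumption; tune it for your workload.
function flagProfileDoc(doc, scanRatioThreshold = 100) {
  const flags = [];
  const returned = Math.max(doc.nreturned || 0, 1); // avoid dividing by zero
  if ((doc.nscanned || 0) / returned >= scanRatioThreshold) {
    flags.push("possible missing index");
  }
  if (doc.moved) flags.push("document moved on update");
  if (doc.op === "update" && !doc.fastmod) flags.push("update without fastmod");
  return flags;
}

// Hypothetical profiler documents mirroring the three examples from the talk.
console.log(flagProfileDoc({ op: "query", nscanned: 10000, nreturned: 1 }));
console.log(flagProfileDoc({ op: "update", nscanned: 1, moved: true }));
console.log(flagProfileDoc({ op: "update", nscanned: 1, fastmod: true }));
```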
Now talking about explain(). Use during development, don’t wait. This actually runs the query when you call it. When you find a bad op using the profiler, run explain on it to get more info: shows index usage, yields, covered indexes, nscanned vs nreturned. Another recommendation: run explain() twice to see difference when data is in memory. Showing the difference between a query w/ and w/o an index in terms of explain.
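To make the with/without index comparison concrete, here is a small sketch that summarizes an explain document. It assumes the explain output format of MongoDB releases from around the time of the talk (fields such as `cursor`, `nscanned`, `n`, `millis`, `indexOnly`); newer releases report a different structure. The two sample documents and the function name are hypothetical.

```javascript
// Sketch: summarize an explain() document in the legacy (2.x-era) format.
function summarizeExplain(plan) {
  return {
    // "BasicCursor" means a full collection scan; "BtreeCursor ..." means an index was used.
    usesIndex: plan.cursor.startsWith("BtreeCursor"),
    covered: plan.indexOnly === true,
    scannedPerReturned: plan.n > 0 ? plan.nscanned / plan.n : plan.nscanned,
    millis: plan.millis,
  };
}

// Hypothetical explain() results for the same query without and with an index.
const withoutIndex = { cursor: "BasicCursor", nscanned: 10000, n: 1, millis: 12, indexOnly: false };
const withIndex = { cursor: "BtreeCursor userid_1", nscanned: 1, n: 1, millis: 0, indexOnly: false };
console.log(summarizeExplain(withoutIndex));
console.log(summarizeExplain(withIndex));
```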
Now talking about covered indexes: need to do a projection that says we don’t need _id: `db.test.find({userid: 10}, {_id: 0, userid: 1})`. When you don’t need _id it’s possible to respond to the query using the index only.
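The rule of thumb for covered queries can be written down as a simple check. This is a simplified sketch, not how the server decides: it only looks at top-level field names and ignores cases such as multikey indexes. The function name and argument shapes are assumptions for illustration.

```javascript
// Sketch: could this query be answered from the index alone?
// indexKeys and projection are plain objects such as {userid: 1};
// queryFields is the list of field names used in the query document.
function canBeCovered(indexKeys, queryFields, projection) {
  const indexed = new Set(Object.keys(indexKeys));
  // _id is returned by default, so it must be excluded explicitly
  // unless it is itself part of the index.
  const idExcluded = projection._id === 0 || indexed.has("_id");
  const projected = Object.keys(projection).filter((f) => projection[f] === 1);
  const allProjectedIndexed = projected.length > 0 && projected.every((f) => indexed.has(f));
  const allQueriedIndexed = queryFields.every((f) => indexed.has(f));
  return idExcluded && allProjectedIndexed && allQueriedIndexed;
}

// The query from the talk, assuming an index on {userid: 1}.
console.log(canBeCovered({ userid: 1 }, ["userid"], { _id: 0, userid: 1 }));
// Same query without excluding _id: the documents themselves must be read.
console.log(canBeCovered({ userid: 1 }, ["userid"], { userid: 1 }));
```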
Architecture tips: split on functional areas first to different replica set clusters, then worry about sharding those (possibly). Do reads off of slaves when you can, but be sure your app can handle inconsistent reads first. Also, use slaves for maintenance (index compaction, etc.). Move reports & backups to slaves, too. One mongod instance per machine: keeps things simple for introspection.
Emphasizing the importance of minimizing writes.
Now we’re talking about data locality. When you’re doing a query it’s best if the results are as dense as possible (spread across as few blocks on disk as possible). How do you maintain this? Here’s an example of how to see this: need to include `$diskLoc` in your query document, and finish with a `.showDiskLoc()` (analogous to `.explain()`).
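One way to turn disk locations into a density number is to count how many distinct blocks a result set touches. The sketch below assumes each returned document carries a `$diskLoc` of the form `{file, offset}` and treats the block size (4096 bytes here) as a guess; it also ignores documents that span more than one block. The function name and sample data are hypothetical.

```javascript
// Sketch: estimate how many disk blocks a result set touches.
// blockSize is an assumption; documents spanning blocks are counted once.
function countBlocksTouched(docs, blockSize = 4096) {
  const blocks = new Set();
  for (const doc of docs) {
    const loc = doc.$diskLoc;
    blocks.add(loc.file + ":" + Math.floor(loc.offset / blockSize));
  }
  return blocks.size;
}

// Hypothetical results: three documents packed together vs. scattered.
const dense = [
  { $diskLoc: { file: 0, offset: 0 } },
  { $diskLoc: { file: 0, offset: 512 } },
  { $diskLoc: { file: 0, offset: 1024 } },
];
const scattered = [
  { $diskLoc: { file: 0, offset: 0 } },
  { $diskLoc: { file: 0, offset: 8192 } },
  { $diskLoc: { file: 1, offset: 0 } },
];
console.log(countBlocksTouched(dense), countBlocksTouched(scattered));
```

The fewer blocks per query, the less physical I/O is needed when the data is not already in memory.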
Total performance is a function of write performance. Keep an eye on lock % and queue size: how much time the DB spends waiting on writes. A trick (for pre 2.0 when data > RAM) is to do a read before the write: spend more time in the read lock rather than the write lock. Tune for fastmods: reduce moves (maybe by pre-padding documents). Evaluate indexes for key changes, minimize the # of indexes by dropping unused ones. Look for places to do inserts instead of updates.
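The talk did not show how to pre-pad, so the following is only a sketch of one approach commonly used at the time: insert the document with a throwaway field of the expected growth size, then remove that field so the record keeps its allocated space. The helper name, the `_padding` field name and the 1024-byte size are all assumptions.

```javascript
// Sketch: build a document with a throwaway padding field.
// The original document is not modified.
function withPadding(doc, paddingBytes, paddingField = "_padding") {
  return Object.assign({}, doc, { [paddingField]: "x".repeat(paddingBytes) });
}

const padded = withPadding({ userid: 10, photos: [] }, 1024);
// Hypothetical follow-up in the mongo shell (not run here):
//   db.test.insert(padded)
//   db.test.update({userid: 10}, {$unset: {_padding: 1}})
console.log(padded._padding.length);
```

Whether this actually reduces moves depends on how your documents grow, so check the profiler for “moved” before and after.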
What about scaling reads? They scale easily if writes are tuned. Identify reads that can be performed on slaves. Make sure you have enough RAM for indexes - can check the mongostat “faults” column for cache misses. Minimize I/O per query (back to data locality).
Tools: mongostat (look for faults & lock % / queue length). currentOp() to see what’s waiting. mtop to get a picture of current session level information. iostat to see how much physical I/O is going on. Do load testing before going live. Use MMS (or some other monitoring system).
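For the currentOp() step, a small filter makes “what’s waiting” easier to read. This sketch assumes the result has an `inprog` array whose entries carry `opid`, `op`, `ns`, `secs_running` and `waitingForLock`; the function name and the sample result are hypothetical.

```javascript
// Sketch: pick out operations that are waiting on a lock from currentOp() output.
function waitingOps(currentOpResult) {
  return currentOpResult.inprog
    .filter((op) => op.waitingForLock)
    .map((op) => ({ opid: op.opid, op: op.op, ns: op.ns, secs: op.secs_running || 0 }));
}

// Hypothetical result standing in for db.currentOp() in the shell.
const sampleCurrentOp = {
  inprog: [
    { opid: 101, op: "update", ns: "photos.albums", secs_running: 4, waitingForLock: true },
    { opid: 102, op: "query", ns: "photos.albums", secs_running: 0, waitingForLock: false },
    { opid: 103, op: "insert", ns: "photos.users", waitingForLock: true },
  ],
};
console.log(waitingOps(sampleCurrentOp));
```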
What if you still need more performance after doing all of this tuning? One option is to use SSDs. Shutterfly uses Facebook’s flashcache: kernel module to cache data on SSD. Designed for MySQL/InnoDB. SSD in front of a disk, but exposed as a single mount point. This only makes sense when you have lots of physical I/O. Shutterfly saw a speedup of 500% w/ flashcache. A benefit is that you can delay sharding: less complexity.
http://www.10gen.com/presentations/mongosv-2011/performance-tuning-and-scalability
Further Reading: http://www.mongodb.org/display/DOCS/Database+Profiler
http://www.mongodb.org/display/DOCS/Explain
http://www.mongodb.org/display/DOCS/Optimization