MongoDB Tutorial: MapReduce

I don’t consider myself the right person to write detailed tutorials, as I usually tend to omit a lot of details. But I’d like to try a different approach: I’ll share with you the best materials I have found and used myself to learn about a specific feature. Please do let me know if you find this approach useful.

Today we will take a look at MongoDB MapReduce. As usual (if only to head off any future RTFM advice), we will start with the ☞ official documentation. In the case of MongoDB MapReduce, the official documentation provides details about:

  • the complete command syntax
  • specs for map and reduce functions
  • as a bonus, a couple of basic examples

There are also a couple of important aspects that you’ll have to keep in mind while implementing your own MongoDB MapReduce functions:

  1. The MapReduce engine may invoke reduce functions iteratively; thus, these functions must be idempotent. That is, the following must hold for your reduce function:

    for all k,vals : reduce( k, [reduce(k,vals)] ) == reduce(k,vals)

  2. Currently, the return value from a reduce function cannot be an array (it’s typically an object or a number).
  3. If you need to perform an operation only once, use a finalize function.
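
The three points above can be illustrated with a small, self-contained sketch in plain JavaScript. It does not talk to MongoDB: the `runMapReduce` helper is a hypothetical in-memory stand-in for the engine, and the sample documents and field names (`page`, `ms`) are made up for illustration. The helper deliberately calls reduce twice per key to mimic iterative invocation, the reduce function returns an object rather than an array, and the one-time step (computing an average) lives in finalize.

```javascript
// Hypothetical sample documents: one per page view, with a response time in ms.
const docs = [
  { page: "/home", ms: 120 },
  { page: "/home", ms: 80 },
  { page: "/about", ms: 200 },
  { page: "/home", ms: 100 },
];

// map: called with `this` bound to each document.
// It emits an object (not an array) with the same shape that reduce returns.
function map() {
  emit(this.page, { count: 1, totalMs: this.ms });
}

// reduce: only sums, so feeding its own output back in gives the same result.
function reduce(key, values) {
  const out = { count: 0, totalMs: 0 };
  values.forEach(function (v) {
    out.count += v.count;
    out.totalMs += v.totalMs;
  });
  return out;
}

// finalize: runs once per key, so the step that cannot be re-reduced
// (dividing to get the average) goes here.
function finalize(key, value) {
  return { count: value.count, avgMs: value.totalMs / value.count };
}

// Hypothetical in-memory stand-in for the engine (not MongoDB's implementation).
function runMapReduce(input, mapFn, reduceFn, finalizeFn) {
  const groups = new Map();
  // Install a global `emit` for this sketch so that map can call it.
  globalThis.emit = function (key, value) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  input.forEach(function (doc) { mapFn.call(doc); });

  const results = {};
  groups.forEach(function (values, key) {
    // Two passes to mimic iterative invocation: reduce the first half,
    // then reduce that partial result together with the remaining values.
    const half = Math.ceil(values.length / 2);
    const partial = reduceFn(key, values.slice(0, half));
    const reduced = reduceFn(key, [partial].concat(values.slice(half)));
    results[key] = finalizeFn(key, reduced);
  });
  return results;
}

const results = runMapReduce(docs, map, reduce, finalize);
console.log(results);
```

In the mongo shell, the same three functions would typically be handed to `db.collection.mapReduce(map, reduce, { out: ..., finalize: finalize })`; check the official documentation for the options supported by your server version.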

Knowing the basics, what worked well for me was to look at a simple but close-to-real-life example. In this case I chose the ☞ following piece of code, which implements a basic text search.

I have also found it very useful to look at how SQL translates to MapReduce in MongoDB.
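
As a rough illustration of that translation (not taken from the linked material), a query such as `SELECT cust_id, SUM(price) FROM orders WHERE status = 'A' GROUP BY cust_id` might be split into a query filter, a map function, and a reduce function as sketched below. The rows, the field names, and the `runQuery` helper are all assumptions made for this example, and everything runs in memory in plain JavaScript rather than against a database.

```javascript
// Hypothetical "orders" rows; field names are made up for this sketch.
const orders = [
  { cust_id: "A123", price: 25, status: "A" },
  { cust_id: "A123", price: 50, status: "A" },
  { cust_id: "B212", price: 10, status: "A" },
  { cust_id: "B212", price: 99, status: "D" },
];

// WHERE status = 'A'  ->  a query filter applied before map runs.
const query = { status: "A" };

// GROUP BY cust_id    ->  the key passed to emit.
// The summed column   ->  the emitted value.
function mapOrders() {
  emit(this.cust_id, this.price);
}

// SUM(price)          ->  the aggregation performed in reduce.
function reduceTotals(key, values) {
  return values.reduce(function (sum, v) { return sum + v; }, 0);
}

// Hypothetical in-memory driver standing in for the database.
function runQuery(rows, filter, mapFn, reduceFn) {
  const groups = {};
  // Install a global `emit` for this sketch so that map can call it.
  globalThis.emit = function (key, value) {
    (groups[key] = groups[key] || []).push(value);
  };
  rows
    .filter(function (row) {
      // Simple equality match on every field named in the filter.
      return Object.keys(filter).every(function (f) {
        return row[f] === filter[f];
      });
    })
    .forEach(function (row) { mapFn.call(row); });

  const totals = {};
  Object.keys(groups).forEach(function (key) {
    totals[key] = reduceFn(key, groups[key]);
  });
  return totals;
}

const totals = runQuery(orders, query, mapOrders, reduceTotals);
console.log(totals); // { A123: 75, B212: 10 }
```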

Just to make sure that I had things straight by this point, I went through the 3rd part of Kyle Banker’s MongoDB aggregation tutorial: MapReduce basics.

The last step in learning about MapReduce in MongoDB was to take a look at some real use cases. Depending on your programming language preference, I’d recommend one of these two MongoDB MapReduce use cases:

  • Ruby: Visualizing log files with MongoDB, MapReduce, Ruby & Google Charts: ☞ part 1 and ☞ part 2
  • Perl: Using MongoDB and MapReduce on Apache Access Logs 

Summarizing our short tutorial on MongoDB MapReduce:

If you have other materials on MongoDB MapReduce that you consider essential, please share them with us!

 

Reposted from: MongoDB Tutorial: MapReduce
