Overview of MapReduce Algorithm Design

Although the MapReduce programming model forces one to express algorithms in terms of a small set of rigidly defined components, there are many tools at one's disposal to shape the flow of computation. Ultimately, this boils down to effective use of the following techniques:

  1. Constructing complex keys and values that bring together data necessary for a computation. 
  2. Executing user-specified initialization and termination code in either the mapper or reducer. For example, in-mapper combining depends on emitting intermediate key-value pairs in the map task's termination code.
  3. Preserving state across multiple inputs in the mapper and reducer. 
  4. Controlling the sort order of intermediate keys with built-in or user-defined sorters. 
  5. Controlling the partitioning of the intermediate key space with built-in or user-defined partitioners.
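
To make techniques 2 through 5 concrete, here is a minimal sketch of word count with in-mapper combining, written as a plain Python simulation rather than against any real MapReduce API. All names in it (InMapperCombiningMapper, setup, cleanup, partition, run_job) are hypothetical and chosen only for illustration, and the first-character partitioner is an assumed example policy, not a recommended one.

```python
from collections import defaultdict


class InMapperCombiningMapper:
    """Word-count mapper that aggregates counts locally before emitting."""

    def setup(self):
        # Initialization hook: create the state preserved across input records.
        self.counts = defaultdict(int)

    def map(self, _key, line):
        # Nothing is emitted here; only the in-memory state is updated.
        for word in line.split():
            self.counts[word] += 1

    def cleanup(self):
        # Termination hook: emit one pair per distinct word seen by this task.
        yield from self.counts.items()


def partition(key, num_reducers):
    # User-defined partitioner (illustrative policy): route by first character.
    return ord(key[0]) % num_reducers


def run_job(splits, num_reducers=2):
    # Map phase: one mapper instance per input split.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:
        mapper = InMapperCombiningMapper()
        mapper.setup()
        for offset, line in enumerate(split):
            mapper.map(offset, line)
        for word, count in mapper.cleanup():
            partitions[partition(word, num_reducers)][word].append(count)

    # Reduce phase: within each partition, keys are visited in sorted order.
    return [
        [(word, sum(part[word])) for word in sorted(part)]
        for part in partitions
    ]


if __name__ == "__main__":
    splits = [["a b a", "b c"], ["a c c"]]
    print(run_job(splits))
    # prints [[('b', 2)], [('a', 3), ('c', 3)]]
```

In this sketch the first mapper emits three pairs instead of five, because the counts for repeated words are summed in memory and only written out in the termination hook. The cost is that the mapper's state grows with the number of distinct keys it sees.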
