Hadoop output: learning MultipleOutputs and where to use it


MultipleOutputs makes it easy to split a job's output across multiple files.

Case 1: writing to additional outputs other than the job's default output.


Case 2: writing data to different files whose names are provided by the user.


Example:

Usage pattern for job submission:
<pre>
// Job comes from org.apache.hadoop.mapreduce; the format classes and
// MultipleOutputs come from its lib.input and lib.output packages;
// LongWritable and Text come from org.apache.hadoop.io.
Job job = Job.getInstance();

FileInputFormat.addInputPath(job, inDir);
FileOutputFormat.setOutputPath(job, outDir);

job.setMapperClass(MOMap.class);
job.setReducerClass(MOReduce.class);
...

// Defines an additional text-based output named "text" for the job
MultipleOutputs.addNamedOutput(job, "text", TextOutputFormat.class, LongWritable.class, Text.class);

// Defines an additional SequenceFile-based output named "seq" for the job
MultipleOutputs.addNamedOutput(job, "seq", SequenceFileOutputFormat.class, LongWritable.class, Text.class);
...

job.waitForCompletion(true);
...
</pre>
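One thing to watch when choosing the names passed to addNamedOutput: per the Hadoop Javadoc, a named output has to be a single word made of letters and digits only, and it cannot be "part", which is reserved for the default output. The plain-Java sketch below mirrors that rule so it can be tried without a cluster. The class NamedOutputNames and the helper isValidNamedOutput are hypothetical names made up for this illustration; they are not part of Hadoop.

```java
public class NamedOutputNames {

    // Hypothetical helper, NOT a Hadoop API: approximates the documented rule
    // for named output names (letters and digits only, never "part").
    static boolean isValidNamedOutput(String name) {
        if (name == null || name.isEmpty() || name.equals("part")) {
            return false;
        }
        for (char ch : name.toCharArray()) {
            boolean letter = (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
            boolean digit = ch >= '0' && ch <= '9';
            if (!letter && !digit) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Print the verdict for a few candidate names.
        for (String name : new String[] {"text", "seq", "seq_a", "part"}) {
            System.out.println(name + " -> " + isValidNamedOutput(name));
        }
    }
}
```

Under this rule "text" and "seq" are accepted, while "seq_a" is rejected because of the underscore. That is why names such as "seq_a" show up only as base output paths in mos.write(...), never as named outputs.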
Usage in Reducer:
<pre>
public class MOReduce extends
    Reducer<WritableComparable, Writable, WritableComparable, Writable> {
  private MultipleOutputs<WritableComparable, Writable> mos;

  // Builds a base output path from the key and value
  private <K, V> String generateFileName(K k, V v) {
    return k.toString() + "_" + v.toString();
  }

  @Override
  public void setup(Context context) {
    ...
    mos = new MultipleOutputs<>(context);
  }

  @Override
  public void reduce(WritableComparable key, Iterable<Writable> values,
      Context context) throws IOException, InterruptedException {
    ...
    // Named output "text", default file name
    mos.write("text", key, new Text("Hello"));
    // Named output "seq", written under the base paths "seq_a" and "seq_b"
    mos.write("seq", new LongWritable(1), new Text("Bye"), "seq_a");
    mos.write("seq", new LongWritable(2), new Text("Chau"), "seq_b");
    // Job's default output format, file name derived from key and value
    mos.write(key, new Text("value"), generateFileName(key, new Text("value")));
    ...
  }

  @Override
  public void cleanup(Context context) throws IOException, InterruptedException {
    mos.close();
    ...
  }
}
</pre>
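To get a feel for what ends up in the output directory after the reducer example runs, here is a small plain-Java sketch of the file naming. It assumes the usual Hadoop convention of base name, then "-r-" for reduce tasks or "-m-" for map tasks, then a five-digit partition number, and it ignores any extension added by compression. The class OutputFileNames and the helper uniqueFile are hypothetical names for this illustration, not Hadoop APIs; generateFileName is the helper from the reducer example.

```java
import java.util.Locale;

public class OutputFileNames {

    // Hypothetical helper, NOT a Hadoop API: approximates how a base output
    // path is turned into a per-task file name (assumed convention).
    static String uniqueFile(String base, boolean reduceTask, int partition) {
        String taskType = reduceTask ? "r" : "m";
        return String.format(Locale.ROOT, "%s-%s-%05d", base, taskType, partition);
    }

    // Same idea as generateFileName in the reducer example.
    static <K, V> String generateFileName(K k, V v) {
        return k.toString() + "_" + v.toString();
    }

    public static void main(String[] args) {
        int partition = 0; // first reduce task
        // mos.write("text", key, value): the base path is the named output itself
        System.out.println(uniqueFile("text", true, partition));
        // mos.write("seq", key, value, "seq_a") and "seq_b"
        System.out.println(uniqueFile("seq_a", true, partition));
        System.out.println(uniqueFile("seq_b", true, partition));
        // mos.write(key, value, generateFileName(key, value)), with a sample key "k1"
        System.out.println(uniqueFile(generateFileName("k1", "value"), true, partition));
    }
}
```

For the first reduce task this sketch prints text-r-00000, seq_a-r-00000, seq_b-r-00000 and k1_value-r-00000. Because generateFileName builds the path from the key and value, every distinct key/value pair gets its own file, so it is worth keeping the number of distinct names small.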
