Inheritance hierarchy 1

1. Mapper.Context

java.lang.Object
  |__ org.apache.hadoop.mapreduce.JobContext
        |__ org.apache.hadoop.mapreduce.TaskAttemptContext
              |__ org.apache.hadoop.mapreduce.TaskInputOutputContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
                    |__ org.apache.hadoop.mapreduce.MapContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
                          |__ org.apache.hadoop.mapreduce.Mapper.Context

Description:
public class Mapper.Context extends MapContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>

Constructor Summary:
Mapper.Context(Configuration conf, TaskAttemptID taskid, RecordReader<KEYIN,VALUEIN> reader, RecordWriter<KEYOUT,VALUEOUT> writer, OutputCommitter committer, StatusReporter reporter, InputSplit split)

Method Summary:
Methods inherited from org.apache.hadoop.mapreduce.MapContext:
getCurrentKey, getCurrentValue, getInputSplit, nextKeyValue
Methods inherited from org.apache.hadoop.mapreduce.TaskInputOutputContext:
getCounter, getCounter, getOutputCommitter, progress, setStatus, write
Methods inherited from org.apache.hadoop.mapreduce.TaskAttemptContext:
getStatus, getTaskAttemptID
Methods inherited from org.apache.hadoop.mapreduce.JobContext:
getCombinerClass, getConfiguration, getCredentials, getGroupingComparator, getInputFormatClass, getJar, getJobID, getJobName, getMapOutputKeyClass, getMapOutputValueClass, getMapperClass, getNumReduceTasks, getOutputFormatClass, getOutputKeyClass, getOutputValueClass, getPartitionerClass, getReducerClass, getSortComparator, getWorkingDirectory
Methods inherited from java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Inheritance hierarchy 2

2. Reducer.Context

java.lang.Object
  |__ org.apache.hadoop.mapreduce.JobContext
        |__ org.apache.hadoop.mapreduce.TaskAttemptContext
              |__ org.apache.hadoop.mapreduce.TaskInputOutputContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
                    |__ org.apache.hadoop.mapreduce.ReduceContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
                          |__ org.apache.hadoop.mapreduce.Reducer.Context

Description:
public class Reducer.Context extends ReduceContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>

Constructor Summary:
Reducer.Context(Configuration conf, TaskAttemptID taskid, RawKeyValueIterator input, Counter inputKeyCounter, Counter inputValueCounter, RecordWriter<KEYOUT,VALUEOUT> output, OutputCommitter committer, StatusReporter reporter, RawComparator<KEYIN> comparator, Class<KEYIN> keyClass, Class<VALUEIN> valueClass)

Method Summary:
Methods inherited from org.apache.hadoop.mapreduce.ReduceContext:
getCurrentKey, getCurrentValue, getValues, nextKey, nextKeyValue
Methods inherited from org.apache.hadoop.mapreduce.TaskInputOutputContext:
getCounter, getCounter, getOutputCommitter, progress, setStatus, write
Methods inherited from org.apache.hadoop.mapreduce.TaskAttemptContext:
getStatus, getTaskAttemptID
Methods inherited from org.apache.hadoop.mapreduce.JobContext:
getCombinerClass, getConfiguration, getCredentials, getGroupingComparator, getInputFormatClass, getJar, getJobID, getJobName, getMapOutputKeyClass, getMapOutputValueClass, getMapperClass, getNumReduceTasks, getOutputFormatClass, getOutputKeyClass, getOutputValueClass, getPartitionerClass, getReducerClass, getSortComparator, getWorkingDirectory
Methods inherited from java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
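Because Mapper.Context (and likewise Reducer.Context) sits at the bottom of this chain, a single `context` parameter exposes methods declared at every ancestor level. A minimal pure-Java sketch of that idea, using toy stand-in classes rather than the real Hadoop types:

```java
// Toy stand-ins for the real Hadoop context classes, to illustrate why one
// Context reference reaches methods declared at every level of the chain.
class ToyJobContext {
    String getJobName() { return "max-temperature"; }
}

class ToyTaskAttemptContext extends ToyJobContext {
    String getStatus() { return "RUNNING"; }
}

class ToyTaskInputOutputContext extends ToyTaskAttemptContext {
    final StringBuilder out = new StringBuilder();
    void write(String key, int value) {
        out.append(key).append('\t').append(value).append('\n');
    }
}

class ToyMapContext extends ToyTaskInputOutputContext {
    boolean nextKeyValue() { return false; } // toy: no more records
}

class ToyMapperContext extends ToyMapContext { }

public class ContextChainDemo {
    public static void main(String[] args) {
        ToyMapperContext context = new ToyMapperContext();
        // Methods from three different ancestors, all through one reference:
        context.write("1950", -11);               // TaskInputOutputContext level
        System.out.println(context.getJobName()); // JobContext level -> max-temperature
        System.out.println(context.getStatus());  // TaskAttemptContext level -> RUNNING
    }
}
```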
Code
1. MaxTemperatureMapper.java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper
    extends Mapper<LongWritable, Text, Text, IntWritable> {

  @Override
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {

    String line = value.toString();
    String year = line.substring(15, 19);
    int airTemperature = Integer.parseInt(line.substring(87, 92));
    context.write(new Text(year), new IntWritable(airTemperature));
  }
}
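The two substring calls pull fixed-width fields out of an NCDC weather record by character offset: positions 15-18 hold the year and positions 87-91 hold the signed temperature in tenths of a degree Celsius. A standalone check of that extraction against the same sample record the MRUnit test uses:

```java
// Standalone demonstration of the fixed-width field extraction performed by
// the mapper, run against the sample NCDC record from the MRUnit test.
public class RecordParseDemo {
    public static void main(String[] args) {
        String line = "0043011990999991950051518004+68750+023550FM-12+0382"
                    + "99999V0203201N00261220001CN9999999N9-00111+99999999999";
        String year = line.substring(15, 19);                           // "1950"
        int airTemperature = Integer.parseInt(line.substring(87, 92));  // "-0011" -> -11
        System.out.println(year + "\t" + airTemperature);               // prints 1950  -11
    }
}
```

Integer.parseInt accepts the leading sign character, so the "-0011" slice parses directly to -11 (i.e. -1.1 degrees C in tenths), which is exactly the value the test asserts.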
2. MaxTemperatureMapperTest.java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Test;

public class MaxTemperatureMapperTest {

  @Test
  public void processesValidRecord() throws IOException {
    Text value = new Text("0043011990999991950051518004+68750+023550FM-12+0382" +
                                  // Year ^^^^
        "99999V0203201N00261220001CN9999999N9-00111+99999999999");
                              // Temperature ^^^^^
    new MapDriver<LongWritable, Text, Text, IntWritable>()
        .withMapper(new MaxTemperatureMapper())
        .withInput(new LongWritable(1), value)
        .withOutput(new Text("1950"), new IntWritable(-11))
        .runTest();
  }
}
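Conceptually, MapDriver's runTest feeds the configured input pair to the mapper, collects everything the mapper writes through its context, and compares that list against the expected outputs. A hedged pure-Java sketch of that contract, using toy types rather than the real MRUnit API:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of what MapDriver.runTest does conceptually: run the mapper
// on one input pair, capture the writes, and compare them to expectations.
public class MiniMapDriverDemo {
    record Pair(String key, int value) { }

    interface ToyMapper {
        // The output list plays the role of context.write(...)
        void map(long key, String value, List<Pair> output);
    }

    static boolean runTest(ToyMapper mapper, long inKey, String inValue,
                           List<Pair> expected) {
        List<Pair> actual = new ArrayList<>();
        mapper.map(inKey, inValue, actual);
        return actual.equals(expected);
    }

    public static void main(String[] args) {
        String record = "0043011990999991950051518004+68750+023550FM-12+0382"
                      + "99999V0203201N00261220001CN9999999N9-00111+99999999999";
        // Same extraction logic as MaxTemperatureMapper:
        ToyMapper maxTemp = (key, value, out) ->
            out.add(new Pair(value.substring(15, 19),
                             Integer.parseInt(value.substring(87, 92))));
        System.out.println(runTest(maxTemp, 1L, record,
                List.of(new Pair("1950", -11))));   // prints true
    }
}
```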
Note some deprecated classes and methods:

The deprecation of org.apache.hadoop.mrunit.MapDriver<K1,V1,K2,V2> is easy to understand: that class was written for the old MapReduce API (org.apache.hadoop.mapred). For example, one of its methods is:

MapDriver<K1,V1,K2,V2> withMapper(org.apache.hadoop.mapred.Mapper<K1,V1,K2,V2> m)

The new MapReduce API lives in org.apache.hadoop.mapreduce.*; the corresponding MRUnit MapDriver (and likewise ReduceDriver) is org.apache.hadoop.mrunit.mapreduce.MapDriver<K1,V1,K2,V2>, and the method above becomes:

MapDriver<K1,V1,K2,V2> withMapper(org.apache.hadoop.mapreduce.Mapper<K1,V1,K2,V2> m)

Similarly, T withInputValue(V1 val) in the MapDriverBase class is deprecated in favor of T withInput(K1 key, V1 val). There are many more such deprecations; they are not listed in full here.
Execution steps:

Note: you need to download and build MRUnit first, set an MRUnit_HOME variable in /home/user/.bashrc, then edit $HADOOP_HOME/libexec/hadoop-config.sh to add $MRUnit_HOME/lib/*.jar to the classpath, and finally source $HADOOP_HOME/libexec/hadoop-config.sh. After that, run the following:

javac -d class/ MaxTemperatureMapper.java MaxTemperatureMapperTest.java
jar -cvf test.jar -C class ./
java -cp test.jar:$CLASSPATH org.junit.runner.JUnitCore MaxTemperatureMapperTest
# or
yarn -cp test.jar:$CLASSPATH org.junit.runner.JUnitCore MaxTemperatureMapperTest