[Hadoop] A common pitfall with the new API: expected LongWritable, recieved Text

In an earlier WordCount article we used the following statements:

 

job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);

These two lines look harmless, but they hide an easy mistake.

 

If you run into an error like the following (the spelling "recieved" comes from Hadoop's own message):

Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable, recieved org.apache.hadoop.io.Text

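then the cause is almost always the type setup around TextInputFormat. It hands the mapper LongWritable keys (byte offsets) and Text values, and if the map output types are never declared, the framework falls back to LongWritable as the expected map output key, so a mapper that emits Text keys fails this check.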

Here is an explanation from Stack Overflow:

A Stack Overflow answer says:
You are using TextOutputFormat which emits LongWritable key and Text value by default, but you are emitting Text as key and IntWritable as value. You need to tell this to the framework.

Solution: declare the map output types explicitly.

job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(IntWritable.class);
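
To see why these two calls make the error go away, here is a toy model of the fallback. It is only a sketch and not Hadoop code: the class MapOutputTypeModel and its methods are hypothetical, and class names are plain strings so that it runs without any Hadoop jars. It assumes the lookup order map output key class, then job output key class, then LongWritable.

```java
/** Toy model of how the expected map output key class is resolved. Not Hadoop code. */
final class MapOutputTypeModel {
    // null means "never set", like a job on which the setter was not called.
    private String outputKeyClass;     // models job.setOutputKeyClass(...)
    private String mapOutputKeyClass;  // models job.setMapOutputKeyClass(...)

    void setOutputKeyClass(String name) {
        this.outputKeyClass = name;
    }

    void setMapOutputKeyClass(String name) {
        this.mapOutputKeyClass = name;
    }

    /** Assumed lookup order: map output key, then job output key, then LongWritable. */
    String expectedMapOutputKey() {
        if (mapOutputKeyClass != null) {
            return mapOutputKeyClass;
        }
        if (outputKeyClass != null) {
            return outputKeyClass;
        }
        return "LongWritable";
    }

    /** Returns "ok", or a message whose wording mirrors the error quoted in this post. */
    String check(String emittedKeyClass) {
        String expected = expectedMapOutputKey();
        if (expected.equals(emittedKeyClass)) {
            return "ok";
        }
        return "Type mismatch in key from map: expected " + expected
                + ", recieved " + emittedKeyClass;
    }

    public static void main(String[] args) {
        // Nothing declared: the mapper emits Text, but LongWritable is expected.
        MapOutputTypeModel undeclared = new MapOutputTypeModel();
        System.out.println(undeclared.check("Text"));

        // Declaring the map output key class resolves the mismatch.
        MapOutputTypeModel declared = new MapOutputTypeModel();
        declared.setMapOutputKeyClass("Text");
        System.out.println(declared.check("Text"));
    }
}
```

The same model suggests why setting only the job output key class to Text can also be enough when the mapper and the reducer emit the same key type.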

Alternatively, leave these settings out from the start, as in the following:

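// Imports assumed: org.apache.hadoop.conf.Configuration, org.apache.hadoop.fs.Path,
// org.apache.hadoop.io.Text, org.apache.hadoop.mapreduce.Job,
// org.apache.hadoop.mapreduce.lib.input.FileInputFormat,
// org.apache.hadoop.mapreduce.lib.output.FileOutputFormat, java.io.IOException
public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
		Configuration conf = new Configuration();
		// Pass conf to the job so that settings made on it actually take effect.
		Job job = new Job(conf);
		job.setJarByClass(Dedup.class);
		job.setMapperClass(Map.class);
		job.setCombinerClass(Reduce.class);
		job.setReducerClass(Reduce.class);
		// The map output types default to these job output types, so the
		// commented-out setMapOutput*Class calls below are not needed here.
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(Text.class);
//		job.setInputFormatClass(TextInputFormat.class);
//		job.setOutputFormatClass(TextOutputFormat.class);
//		job.setMapOutputKeyClass(Text.class);
//		job.setMapOutputValueClass(Text.class);
		FileInputFormat.addInputPath(job, new Path("/home/hadoop/DataSet/Hadoop/Dedup-1"));
		FileOutputFormat.setOutputPath(job, new Path("/home/hadoop/DataSet/Hadoop/Dedup-output"));
		// Prints true if the job succeeded, false otherwise.
		System.out.println(job.waitForCompletion(true));
	}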

  
