2014-04-04 15:39:08,521 WARN org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles - Bulk load operation did not find any files to load in directory hdfs:// Does it contain files in subdirectories that correspond to column family names?
This must be a problem on the Reduce side, so let's look at what HFileOutputFormat.configureIncrementalLoad(job, htable); actually does.
job.setOutputKeyClass(ImmutableBytesWritable.class);
job.setOutputValueClass(KeyValue.class);
job.setOutputFormatClass(HFileOutputFormat.class);

// Based on the configured map output class, set the correct reducer to properly
// sort the incoming values.
// TODO it would be nice to pick one or the other of these formats.
if (KeyValue.class.equals(job.getMapOutputValueClass())) {
  job.setReducerClass(KeyValueSortReducer.class);
} else if (Put.class.equals(job.getMapOutputValueClass())) {
  job.setReducerClass(PutSortReducer.class);
} else if (Text.class.equals(job.getMapOutputValueClass())) {
  job.setReducerClass(TextSortReducer.class);
} else {
  LOG.warn("Unknown map output value type:" + job.getMapOutputValueClass());
}
getOutputValueClass      mapreduce.job.output.value.class
setOutputValueClass      mapreduce.job.output.value.class
setMapOutputValueClass   mapreduce.map.output.value.class
getMapOutputValueClass   mapreduce.map.output.value.class

/**
 * Set the value class for the map output data. This allows the user to
 * specify the map output value class to be different than the final output
 * value class.
 *
 * @param theClass the map output value class.
 * @throws IllegalStateException if the job is submitted
 */
public void setMapOutputValueClass(Class<?> theClass
                                   ) throws IllegalStateException {
  ensureState(JobState.DEFINE);
  conf.setMapOutputValueClass(theClass);
}

/**
 * Get the value class for the map output data. If it is not set, use the
 * (final) output value class This allows the map output value class to be
 * different than the final output value class.
 *
 * @return the map output value class.
 */
public Class<?> getMapOutputValueClass() {
  Class<?> retv = getClass(JobContext.MAP_OUTPUT_VALUE_CLASS, null, Object.class);
  if (retv == null) {
    retv = getOutputValueClass();
  }
  return retv;
}
- When setMapOutputValueClass has not been called, getMapOutputValueClass returns the value given to setOutputValueClass.
- The map output value class (getMapOutputValueClass) is allowed to differ from the final output value class, that is, the reduce output value class (getOutputValueClass). The generic declaration PutSortReducer<ImmutableBytesWritable, Put, ImmutableBytesWritable, KeyValue> shows that the map output value class is Put and the final one is KeyValue.
- The same applies to the key class.
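The fallback described in the points above can be sketched in plain Java without any Hadoop dependency. This is only an illustrative model, not the Hadoop or HBase implementation: the class OutputClassFallback, its string-based configuration map, and the helper pickReducer are hypothetical names introduced here, and class names are represented as plain strings. It suggests why calling setMapOutputValueClass before configureIncrementalLoad matters: configureIncrementalLoad sets the final output value class to KeyValue first, so an unset map output value class resolves to KeyValue and the KeyValue branch is taken.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the lookup logic quoted above; not the real Job class.
public class OutputClassFallback {
    static final String OUTPUT_VALUE_CLASS = "mapreduce.job.output.value.class";
    static final String MAP_OUTPUT_VALUE_CLASS = "mapreduce.map.output.value.class";

    // Stand-in for the job configuration: property name -> class name.
    private final Map<String, String> conf = new HashMap<>();

    void setOutputValueClass(String className) {
        conf.put(OUTPUT_VALUE_CLASS, className);
    }

    void setMapOutputValueClass(String className) {
        conf.put(MAP_OUTPUT_VALUE_CLASS, className);
    }

    String getOutputValueClass() {
        return conf.get(OUTPUT_VALUE_CLASS);
    }

    // Falls back to the final output value class when the map one is unset.
    String getMapOutputValueClass() {
        String value = conf.get(MAP_OUTPUT_VALUE_CLASS);
        return value != null ? value : getOutputValueClass();
    }

    // Mirrors the if/else chain in configureIncrementalLoad.
    // Returns null where the original only logs a warning.
    static String pickReducer(String mapOutputValueClass) {
        if ("KeyValue".equals(mapOutputValueClass)) {
            return "KeyValueSortReducer";
        } else if ("Put".equals(mapOutputValueClass)) {
            return "PutSortReducer";
        } else if ("Text".equals(mapOutputValueClass)) {
            return "TextSortReducer";
        }
        return null;
    }

    public static void main(String[] args) {
        // Case 1: the map output value class was never set.
        OutputClassFallback unset = new OutputClassFallback();
        unset.setOutputValueClass("KeyValue"); // done by configureIncrementalLoad
        System.out.println(pickReducer(unset.getMapOutputValueClass()));

        // Case 2: the job declared Put as the map output value class beforehand.
        OutputClassFallback withPut = new OutputClassFallback();
        withPut.setMapOutputValueClass("Put");
        withPut.setOutputValueClass("KeyValue");
        System.out.println(pickReducer(withPut.getMapOutputValueClass()));
    }
}
```

In this model the first case selects KeyValueSortReducer and the second selects PutSortReducer, so a mapper that emits Put would presumably need the explicit setMapOutputValueClass(Put.class) call on the real job before configureIncrementalLoad runs.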
rm -rf /tmp/hbase-root*
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/tmp/hbase-root</value> <!-- default -->
  <description>Property from ZooKeeper's config zoo.cfg.
  The directory where the snapshot is stored.
  </description>
</property>