My Hadoop version is hadoop-2.6.5 and my Elasticsearch version is elasticsearch-6.4.3. First, make sure your Hadoop cluster can run the wordcount example.
Below is the pom file. Note that the provided scope must be added here; otherwise the jar is also packaged into the job jar and, together with the copy placed under YARN's lib directory in the next step, the job fails with:
org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.Error: Multiple ES-Hadoop versions detected in the classpath; please use only one
jar:file:/usr/local/hadoop/hadoop-2.6.5/share/hadoop/yarn/lib/elasticsearch-hadoop-6.4.3.jar
jar:file:/home/bigdata/tmp/nm-local-dir/usercache/root/appcache/application_1551077535613_0047/filecache/10/job.jar/job.jar
So declare the dependency like this:
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch-hadoop</artifactId>
    <version>6.4.3</version>
    <scope>provided</scope>
</dependency>
Next, copy elasticsearch-hadoop-6.4.3.jar into the yarn lib directory of your Hadoop installation (on every node of the cluster).
On my setup that directory is /usr/local/hadoop/hadoop-2.6.5/share/hadoop/yarn/lib.
It must go in there, it must, it must!!! Otherwise you will get the following error:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.elasticsearch.hadoop.mr.EsOutputFormat not found
Now for the code:
import java.io.IOException;

import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONObject;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.elasticsearch.hadoop.mr.EsOutputFormat;

public class HdfsToES {

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        conf.setBoolean("mapred.map.tasks.speculative.execution", false);
        conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);
        conf.set("es.nodes", "localhost:9200");        // Elasticsearch node(s) to connect to
        conf.set("es.resource", "my_index/my_type");   // target index/type
        conf.set("es.mapping.id", "id");               // the "id" field of each document becomes the _id
        conf.set("es.input.json", "yes");              // the mapper emits ready-made JSON strings

        Job job = Job.getInstance(conf, "hadoop es write test");
        job.setJarByClass(HdfsToES.class);
        job.setMapperClass(HdfsToES.MyMapper.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(EsOutputFormat.class);
        job.setMapOutputKeyClass(NullWritable.class);
        job.setMapOutputValueClass(Text.class);

        // Set the input path
        FileInputFormat.setInputPaths(job, new Path("hdfs://node01:8020/xxxx/xxxx.json"));
        job.waitForCompletion(true);
    }

    public static class MyMapper extends Mapper<LongWritable, Text, NullWritable, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // The original mapper body is truncated in this post; parsing the input line with
            // fastjson is an assumption that matches the fragments shown further below.
            JSONObject jsonObject2 = JSON.parseObject(value.toString());
            Text valueout = new Text();
            valueout.set(jsonObject2.toJSONString().trim());
            context.write(NullWritable.get(), valueout);
        }
    }
}
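One detail worth spelling out: because the driver sets es.mapping.id to "id" and es.input.json to "yes", every value the mapper writes must be one complete JSON document that already contains an id field. The lines below are only an illustration of such a document built with fastjson (the field names and values are made up, and they would sit inside the map() method above):
JSONObject doc = new JSONObject();              // com.alibaba.fastjson.JSONObject
doc.put("id", "1");                             // hypothetical value; ES-Hadoop uses this field as the document _id
doc.put("name", "example");                     // hypothetical payload field; the rest of the JSON becomes the document body
Text valueout = new Text(doc.toJSONString());   // the value is the raw JSON string, which is what es.input.json=yes expects
context.write(NullWritable.get(), valueout);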
Below are some of the errors you may encounter:
org.apache.hadoop.mapred.YarnChild: Exception running child : org.elasticsearch.hadoop.serialization.EsHadoopSerializationException: org.codehaus.jackson.JsonParseException: Unexpected character ('c' (code 99)): was expecting double-quote to start field name
If you hit the error above, use job.setMapOutputValueClass(Text.class);
do not use job.setMapOutputValueClass(BytesWritable.class)
or job.setMapOutputValueClass(LinkedMapWritable.class).
The wrong value class can also show up as:
org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.io.ArrayWritable.
or
org.elasticsearch.hadoop.serialization.EsHadoopSerializationException: org.codehaus.jackson.JsonParseException: Unexpected character ('b' (code 98)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at [Source: [B@69c79f09; line: 1, column: 3]
If you see the errors above, again use job.setMapOutputValueClass(Text.class); do not use LinkedMapWritable.
Keep the data as JSON text (I used Alibaba's fastjson). With es.input.json set to "yes", ES-Hadoop treats the mapper's output value as the already-serialized document, so anything that is not a plain JSON string ends up in front of its JSON parser and produces the "Unexpected character" errors shown above. In the mapper:
Text valueout = new Text();
valueout.set(jsonObject2.toJSONString().trim());
context.write(NullWritable.get(), valueout);
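For context, LinkedMapWritable (and MapWritable) values belong to ES-Hadoop's other writing mode, in which es.input.json is left unset and ES-Hadoop serializes the map into JSON by itself. The sketch below shows that mode purely for contrast (the field names are hypothetical); mixing map-style values with es.input.json=yes is the kind of mismatch that leads to the parser errors above, so in this setup stick with Text:
// Contrast only: the map-based mode, used when es.input.json is NOT set to "yes".
MapWritable doc = new MapWritable();              // org.apache.hadoop.io.MapWritable
doc.put(new Text("id"), new Text("1"));           // hypothetical field
doc.put(new Text("name"), new Text("example"));   // hypothetical field
context.write(NullWritable.get(), doc);           // only valid when the job's map output value class is MapWritable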
When errors appear, don't panic, and above all don't blindly retry, retry, retry the fixes posted by CSDN bloggers. Versions and environments differ, so the solution that worked for someone else may not fit your situation. Do more research and think it through for yourself.