Hive write to ES fails with FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

Environment for the Hive and ES integration:

ES version: 6.5.2

Hive version: 2.3.4

Connector jar download: https://repo.maven.apache.org/maven2/org/elasticsearch/elasticsearch-hadoop-hive/

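Plenty of write-ups online cover the integration itself: download the connector jar that matches your ES version, then copy elasticsearch-hadoop-hive-6.5.2.jar into the lib directory of the Hive installation. My Hive lib directory already contained commons-httpclient-3.0.1.jar, so I did not need to copy that one again.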

Running a Hive SQL statement that writes table data to Elasticsearch then failed with the following error:

[2020-05-30 14:08:12] org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
	at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:379)
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:257)
	at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:348)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:362)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)

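This message does not say much about the cause. Locating the corresponding MR job in the YARN web UI and opening its task log shows that an ES class could not be found: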

Caused by: java.lang.ClassNotFoundException: org.elasticsearch.hadoop.hive.EsHiveInputFormat
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
	... 60 more

2020-05-30 14:56:01,845 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1590680711550_0090_m_000000_0: Error: java.lang.RuntimeException: Failed to load plan: hdfs://SINOTRUK/tmp/hive/hive/c7f92979-e7cc-4a39-beba-d8007ddcc817/hive_2020-05-30_14-55-44_617_4414312671758408146-2/-mr-10002/7db2e153-f315-4f63-9886-0774d2f85bf0/map.xml
	at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:481)
	at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:313)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:394)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:665)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:658)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:692)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:175)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:444)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: org.elasticsearch.hadoop.hive.EsHiveInputFormat
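
The same task log can usually be fetched from the command line instead of the YARN web UI. The sketch below is only an illustration: the function names attempt_to_app_id and yarn_logs_command are made up for this post, and it assumes log aggregation is enabled on the cluster so that `yarn logs` can return the logs of a finished job. It derives the application ID from a task attempt ID such as the one in the diagnostics line, and prints the command rather than running it.

```shell
#!/bin/sh
# Derive the YARN application ID from a MapReduce task attempt ID:
#   attempt_<clusterTimestamp>_<jobSeq>_<m|r>_<task>_<attempt>
#     -> application_<clusterTimestamp>_<jobSeq>
# Prints nothing if the argument does not look like a task attempt ID.
attempt_to_app_id() {
    printf '%s\n' "$1" |
        sed -n 's/^attempt_\([0-9][0-9]*\)_\([0-9][0-9]*\)_.*$/application_\1_\2/p'
}

# Print (not run) the command that fetches the aggregated logs of that job.
# Assumption: log aggregation is enabled on the cluster.
yarn_logs_command() {
    app_id=$(attempt_to_app_id "$1")
    if [ -z "$app_id" ]; then
        echo "not a task attempt ID: $1" >&2
        return 1
    fi
    printf 'yarn logs -applicationId %s\n' "$app_id"
}
```

For example, `yarn_logs_command attempt_1590680711550_0090_m_000000_0` prints `yarn logs -applicationId application_1590680711550_0090`; piping that command's output through `grep ClassNotFoundException` is one way to narrow it down to the lines shown here.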

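Because the statement runs as a MapReduce job, having elasticsearch-hadoop-hive-6.5.2.jar and commons-httpclient-3.0.1.jar on a single node is clearly not enough: the map tasks run on whichever nodes YARN picks, and each of those nodes has to be able to load the classes. Copy both jars into the lib directory of the Hadoop installation on every node, then restart Hadoop and Hive.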
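
The copy to every node can be scripted. This is a minimal sketch under some assumptions: the function name plan_jar_copy, the host list file, and the target directory are all hypothetical, and it assumes password-less scp access to each node. It only prints the scp commands so they can be reviewed before anything is copied.

```shell
#!/bin/sh
# Print the scp commands that would copy each jar to each node.
# Arguments: <target lib dir> <file with one hostname per line> <jar>...
# The target directory and host list are placeholders for your own cluster.
plan_jar_copy() {
    lib_dir=$1
    hosts_file=$2
    shift 2
    # "|| [ -n "$host" ]" also handles a last line without a trailing newline.
    while IFS= read -r host || [ -n "$host" ]; do
        # Skip blank lines and comment lines in the host list.
        case $host in ''|'#'*) continue ;; esac
        for jar in "$@"; do
            printf 'scp %s %s:%s/\n' "$jar" "$host" "$lib_dir"
        done
    done < "$hosts_file"
}
```

With a hypothetical nodes.txt and Hadoop lib path, `plan_jar_copy /opt/hadoop/lib nodes.txt elasticsearch-hadoop-hive-6.5.2.jar commons-httpclient-3.0.1.jar` lists one scp command per jar and node. Once the list looks right, appending `| sh` runs it, provided the paths contain no spaces.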

If commons-httpclient-3.0.1.jar is not copied as well, the task fails with the following error:

2020-05-30 15:22:51,047 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1590680711550_0092_m_000000_1: Error: java.lang.ClassNotFoundException: org.apache.commons.httpclient.Credentials
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransportFactory.create(CommonsHttpTransportFactory.java:40)
	at org.elasticsearch.hadoop.rest.NetworkClient.selectNextNode(NetworkClient.java:102)
	at org.elasticsearch.hadoop.rest.NetworkClient.<init>(NetworkClient.java:85)
	at org.elasticsearch.hadoop.rest.NetworkClient.<init>(NetworkClient.java:61)
	at org.elasticsearch.hadoop.rest.RestClient.<init>(RestClient.java:94)
	at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:615)
	at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.init(EsOutputFormat.java:173)
	at org.elasticsearch.hadoop.hive.EsHiveOutputFormat$EsHiveRecordWriter.write(EsHiveOutputFormat.java:58)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:762)
	at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:101)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
	at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:137)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
	at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.closeOp(VectorMapOperator.java:900)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)

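Summary: after the Hive SQL statement is submitted, YARN dispatches the MR tasks to whichever nodes it selects. If the Hadoop installation on such a node lacks either elasticsearch-hadoop-hive-6.5.2.jar or commons-httpclient-3.0.1.jar, the task throws a ClassNotFoundException.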

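Recommendation: copy both elasticsearch-hadoop-hive-6.5.2.jar and commons-httpclient-3.0.1.jar into the lib directory of the Hadoop installation on every node of the cluster, and into the lib directory of the Hive installation.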
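
Before restarting anything, it may be worth checking that both jars really are in place on a node. The helper below is a hypothetical sketch (the name missing_jars is invented here, and the directory you pass is whatever your own Hadoop or Hive lib directory is). It checks for the files by name only, not whether the jars are intact.

```shell
#!/bin/sh
# Report which of the required jars are absent from a lib directory.
# Arguments: <lib dir> <jar file name>...
# Prints one "MISSING <dir>/<jar>" line per absent jar and
# returns 1 if at least one jar is absent, 0 otherwise.
missing_jars() {
    dir=$1
    shift
    missing=0
    for jar in "$@"; do
        if [ ! -f "$dir/$jar" ]; then
            printf 'MISSING %s/%s\n' "$dir" "$jar"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling it once per lib directory, for example `missing_jars "$HIVE_HOME/lib" elasticsearch-hadoop-hive-6.5.2.jar commons-httpclient-3.0.1.jar`, prints nothing when both jars are present. HIVE_HOME here is assumed to point at the Hive installation directory.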
