[DONE] Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask: analysis and fix for one possible cause

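Disclaimer

To be clear up front: this article covers the case where the error is caused by conflicting yarn-site.xml settings across nodes. If that is not your cause, save your time and look elsewhere!

Problem description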

While running a Hive query, the following error appeared:

hive (default)> select * from emp order by sal; 
...
2019-04-09 09:56:49,924 Stage-1 map = 0%,  reduce = 0%
2019-04-09 09:56:58,231 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.32 sec
2019-04-09 09:57:58,271 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.32 sec
2019-04-09 09:58:59,020 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.32 sec
.....
Ended Job = job_1554732847545_0001 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
....
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 3.32 sec   HDFS Read: 6682 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 3 seconds 320 msec

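Tracking down the problem

The log, however, showed the following problem:
> Memory could not be allocated for the container. Roughly:
> the container requested 3096 MB of memory, while slave1 had only 2048 MB available to allocate.
> At the time, the nodes (a fully distributed cluster, master + slave1) were configured as follows: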

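master: 10 GB RAM, 4 cores

<property>
	<name>yarn.nodemanager.resource.memory-mb</name>
	<value>8192</value>
</property>

<property>
	<name>yarn.scheduler.maximum-allocation-mb</name>
	<value>8192</value>
</property>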

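slave1: 6 GB RAM, 4 cores

<property>
	<name>yarn.nodemanager.resource.memory-mb</name>
	<value>4096</value>
</property>

<property>
	<name>yarn.scheduler.maximum-allocation-mb</name>
	<value>4096</value>
</property>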

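Analysis

I searched around online and tried tuning all sorts of parameters, but nothing fixed it.
Eventually I came across a post whose conclusion was that, in the end, it is a configuration problem.
So I went back and looked at the configuration again.
Comparing yarn-site.xml on the two nodes reveals two problems:

1. Of master's 10 GB of memory, the operating system takes about 1.5 GB and various processes take another 2 GB, so the physical memory actually left to allocate is less than 8 GB. When a container requests the 8 GB maximum, master cannot allocate that container because it does not really have the memory, and the request has to be handed to another node. The same applies to slave1.
2. The yarn.scheduler.maximum-allocation-mb settings conflict: master = 8192, slave1 = 4096.

  • My understanding: suppose a 6144 MB container is requested on master. Because of the first point, master itself is short of memory and cannot allocate it, so master hands the task to slave1. But slave1 is configured with only 4096 MB of allocatable physical memory, so it is even less able to take the task.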
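The arithmetic behind the first point can be sketched in a few lines of Python. This is only an illustration using the figures quoted above; the helper name `usable_for_yarn_mb` is hypothetical, and the 1.5 GB and 2 GB overheads are the article's rough estimates rather than measured values.

```python
# Hypothetical helper (not part of Hadoop): estimate how much physical
# memory a node can really offer to YARN containers.
def usable_for_yarn_mb(total_mb, os_mb, other_processes_mb):
    """Memory left for containers after the OS and other processes."""
    return total_mb - os_mb - other_processes_mb


# Figures for the master node described above: 10 GB in total,
# roughly 1.5 GB used by the OS and roughly 2 GB by other processes.
master_usable_mb = usable_for_yarn_mb(10 * 1024, 1536, 2048)

# Value of yarn.nodemanager.resource.memory-mb configured on master.
master_configured_mb = 8192

print("usable MB:", master_usable_mb)
print("configured exceeds usable:", master_configured_mb > master_usable_mb)
```

With these estimates, the node advertises more memory to YARN than it physically has to spare, which is the mismatch described in point 1.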

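Solution

For the two problems above, I propose the following fixes:

  • Problem 1: give each node a sensible resource allocation. YARN does not automatically detect a node's total physical memory, so you need to account for the other processes on the machine and work out how much physical memory it can actually hand out.
    For how to work this out, see the articles "Scheduling and isolation of memory and CPU resources in Hadoop YARN" (HADOOP YARN中内存和CPU资源的调度和隔离) and "Memory and CPU configuration in YARN" (YARN的内存和CPU配置).
  • Problem 2: once the first problem is fixed, this one is largely fixed as well. What remains is to keep the configuration on all nodes in sync as far as possible.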
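One way to spot configuration drift like this is to diff the properties of each node's yarn-site.xml. The sketch below is a minimal illustration, not a tool from the original setup: the function names `load_properties` and `diff_properties` are hypothetical, and the XML strings are inlined from the values shown earlier. In practice you would read the files copied from each node instead.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name").strip(): (prop.findtext("value") or "").strip()
        for prop in root.iter("property")
        if prop.findtext("name")
    }


def diff_properties(a, b):
    """Return {name: (value_in_a, value_in_b)} for keys whose values differ."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Inlined copies of the two settings from each node (values from this article).
MASTER_XML = """<configuration>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>8192</value></property>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>8192</value></property>
</configuration>"""

SLAVE1_XML = """<configuration>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>4096</value></property>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>4096</value></property>
</configuration>"""

drift = diff_properties(load_properties(MASTER_XML), load_properties(SLAVE1_XML))
for name, (master_value, slave_value) in drift.items():
    print(f"{name}: master={master_value} slave1={slave_value}")
```

Not every difference is a mistake: yarn.nodemanager.resource.memory-mb describes one node's own capacity and can reasonably differ between machines with different hardware, whereas the scheduler limits are the ones worth keeping identical.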

Other

Below is the full session, from the command to the error:

hive (default)> select * from emp order by sal; 
Query ID = yeluo_20190408221546_79c1c3e3-72f8-4398-b2d6-02f8f6ddf09d
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1554732847545_0001, Tracking URL = http://master:8088/proxy/application_1554732847545_0001/
Kill Command = /usr/local/cloud/hadoop/bin/mapred job  -kill job_1554732847545_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2019-04-08 22:16:21,858 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_1554732847545_0001 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

Original work takes effort. If you repost this, please include a link to the original article!
