MapReduce Jobs in Uber Mode Fail to Load Native Libraries under Cloudera Manager


## Symptom

CDH was installed through Cloudera Manager (CM below). A query run in Hive, whose MapReduce job runs in Uber mode, fails with the following error:

hive> select count(*) from test;
Query ID = hdfs_20161013090909_7dcecca0-86d6-4fdf-b60c-e493a9c9f1ac
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1476374786958_0001, Tracking URL = http://cm:8088/proxy/application_1476374786958_0001/
Kill Command = /opt/cloudera/parcels/CDH-5.5.4-1.cdh5.5.4.p0.9/lib/hadoop/bin/hadoop job  -kill job_1476374786958_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2016-10-13 09:09:46,374 Stage-1 map = 0%,  reduce = 0%
2016-10-13 09:09:47,440 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_1476374786958_0001 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1476374786958_0001_m_000000 (and more) from job job_1476374786958_0001

Task with the most failures(1): 
-----
Task ID:
  task_1476374786958_0001_m_000000

URL:
  http://cm:8088/taskdetails.jsp?jobid=job_1476374786958_0001&tipid=task_1476374786958_0001_m_000000
-----
Diagnostic Messages for this Task:
Error: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
hive> 

## Log analysis

The detailed log of the job shows:

2016-10-13 08:55:48,228 FATAL [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher: Error running local (uberized) 'child' : java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
	at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native Method)
	at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:63)
	at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:133)
	at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:150)
	at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:165)
	at org.apache.hadoop.mapred.IFile$Writer.<init>(IFile.java:114)
	at org.apache.hadoop.mapred.IFile$Writer.<init>(IFile.java:97)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1606)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1486)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:460)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:388)
	at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:302)
	at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:187)
	at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:230)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

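Analysis: the MapReduce job ran in Uber mode, where the tasks execute inside the ApplicationMaster's own JVM (hence LocalContainerLauncher in the stack trace), but the ApplicationMaster had not loaded the Native Libraries.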
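
As a quick check of this diagnosis, the sketch below lists which directories on a given library path actually contain the native Hadoop and Snappy libraries. It assumes a Linux host and the conventional file names libhadoop.so and libsnappy.so. The function name find_native_libs is made up for illustration and is not part of Hadoop or CM.

```python
import os


def find_native_libs(library_path, names=("libhadoop.so", "libsnappy.so")):
    """Return {library name: [directories on library_path that contain it]}.

    library_path is a string in the same format as java.library.path or
    LD_LIBRARY_PATH (directories joined by os.pathsep, ':' on Linux).
    """
    found = {name: [] for name in names}
    for directory in library_path.split(os.pathsep):
        # Skip empty entries and directories that do not exist on this host.
        if not directory or not os.path.isdir(directory):
            continue
        entries = os.listdir(directory)
        for name in names:
            # Accept both "libhadoop.so" and versioned files such as
            # "libhadoop.so.1.0.0" (assumed naming convention).
            if any(e == name or e.startswith(name + ".") for e in entries):
                found[name].append(directory)
    return found


if __name__ == "__main__":
    # Native library directory of the CDH parcel used in this post.
    path = "/opt/cloudera/parcels/CDH-5.5.4-1.cdh5.5.4.p0.9/lib/hadoop/lib/native"
    print(find_native_libs(path))
```

Run on a NodeManager host against the path the ApplicationMaster is expected to use, this shows whether the libraries are present there at all. An empty list for libhadoop.so would be consistent with the UnsatisfiedLinkError in the log.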

The AM environment parameters configured in CM are:

  • yarn.app.mapreduce.am.admin.user.env is set to: LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native:$JAVA_LIBRARY_PATH
  • yarn.app.mapreduce.am.env is not set.

## Solutions

### What did not work

  1. In CM, setting yarn.app.mapreduce.am.env to LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native:$JAVA_LIBRARY_PATH
  2. In CM, adding export HADOOP_COMMON_HOME=/opt/cloudera/parcels/CDH-5.5.4-1.cdh5.5.4.p0.9/lib/hadoop to hadoop-env.sh. The error persisted.
  3. In the Hive client, running: set yarn.app.mapreduce.am.env=$HADOOP_COMMON_HOME/lib/native:$JAVA_LIBRARY_PATH

### What worked

  1. In CM, set yarn.app.mapreduce.am.command-opts to " -Djava.net.preferIPv4Stack=true -Djava.library.path=/opt/cloudera/parcels/CDH-5.5.4-1.cdh5.5.4.p0.9/lib/hadoop/lib/native"
  2. In the Hive client, specify the parameter directly: set yarn.app.mapreduce.am.command-opts=-Djava.library.path=/opt/cloudera/parcels/CDH-5.5.4-1.cdh5.5.4.p0.9/lib/hadoop/lib/native

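Caveat: when the parameter is set with set inside Hive, the value apparently cannot contain spaces, which means several -D options cannot be passed this way.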
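
The two working settings and the caveat can be captured in a small helper. This is a minimal sketch: java_library_path and hive_set_statement are hypothetical helper names, and the whitespace check encodes only the behaviour observed in this post, not a documented Hive rule.

```python
import shlex


def java_library_path(command_opts):
    """Return the value of the last -Djava.library.path=... option, or None."""
    value = None
    for token in shlex.split(command_opts):
        if token.startswith("-Djava.library.path="):
            value = token.split("=", 1)[1]
    return value


def hive_set_statement(key, value):
    """Build a Hive `set` statement for a single JVM option.

    Assumption taken from the observation in this post: a value containing
    whitespace (for example several -D options) does not work with `set`
    in the Hive client, so it is rejected here.
    """
    if any(ch.isspace() for ch in value):
        raise ValueError("value contains whitespace; configure it in CM instead")
    return f"set {key}={value};"


if __name__ == "__main__":
    native_dir = "/opt/cloudera/parcels/CDH-5.5.4-1.cdh5.5.4.p0.9/lib/hadoop/lib/native"
    # Value configured in CM: several options separated by spaces are fine there.
    cm_opts = f" -Djava.net.preferIPv4Stack=true -Djava.library.path={native_dir}"
    print(java_library_path(cm_opts))
    # Value for the Hive client: a single option without spaces.
    print(hive_set_statement("yarn.app.mapreduce.am.command-opts",
                             f"-Djava.library.path={native_dir}"))
```

When more than one JVM option is needed, the CM-side setting is the safer place for it, since that value is not subject to the limitation seen in the Hive client.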

Reposted from: https://my.oschina.net/yulongblog/blog/758405
