Please credit the source when reposting: https://blog.csdn.net/l1028386804/article/details/93750832
Today I set up a Hadoop cluster based on Hadoop 3.2.0, with HA enabled for both the NameNode and YARN. However, running the WordCount example that ships with Hadoop failed. The error was:
2019-06-26 16:08:50,513 INFO mapreduce.Job: Job job_1561536344763_0001 failed with state FAILED due to: Application application_1561536344763_0001 failed 2 times due to AM Container for appattempt_1561536344763_0001_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2019-06-26 16:08:48.218]Exception from container-launch.
Container id: container_1561536344763_0001_02_000001
Exit code: 1
[2019-06-26 16:08:48.287]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[2019-06-26 16:08:48.288]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
For more detailed output, check the application tracking page: http://binghe104:8088/cluster/app/application_1561536344763_0001 Then click on links to logs of each attempt.
. Failing the application.
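Besides the tracking page mentioned in the log, the full container logs can also be pulled from the command line using the application id shown above, assuming log aggregation is enabled on the cluster:

# Fetch the aggregated container logs for the failed application
yarn logs -applicationId application_1561536344763_0001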
After searching online for a long time, virtually every answer blamed a classpath problem, so I started by configuring the classpath. The steps are as follows:
Run the following command to view YARN's classpath:
-bash-4.1$ yarn classpath
/usr/local/hadoop-3.2.0/etc/hadoop:/usr/local/hadoop-3.2.0/share/hadoop/common/lib/*:/usr/local/hadoop-3.2.0/share/hadoop/common/*:/usr/local/hadoop-3.2.0/share/hadoop/hdfs:/usr/local/hadoop-3.2.0/share/hadoop/hdfs/lib/*:/usr/local/hadoop-3.2.0/share/hadoop/hdfs/*:/usr/local/hadoop-3.2.0/share/hadoop/mapreduce/lib/*:/usr/local/hadoop-3.2.0/share/hadoop/mapreduce/*:/usr/local/hadoop-3.2.0/share/hadoop/yarn:/usr/local/hadoop-3.2.0/share/hadoop/yarn/lib/*:/usr/local/hadoop-3.2.0/share/hadoop/yarn/*
Note: this shows the classpath values currently in effect.
If the classpath output above is empty, continue with the following steps.
Add the following to mapred-site.xml:

<property>
  <name>mapreduce.application.classpath</name>
  <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
Add the following to yarn-site.xml:

<property>
  <name>yarn.application.classpath</name>
  <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
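If the $HADOOP_MAPRED_HOME-based values do not resolve on your nodes, a commonly used variant of this fix is to paste the fully expanded classpath in as the property value instead. It can be printed with:

# Print the fully expanded Hadoop classpath; its output can be used verbatim
# as the value of mapreduce.application.classpath / yarn.application.classpath
hadoop classpath

Note that the expanded output is installation-specific, so the variable-based form above is more portable across nodes.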
sudo vim /etc/profile
Append the following lines to the end of the file:
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
Then reload the file to make the environment variables take effect:
source /etc/profile
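To confirm the variables are now visible in the current shell, a quick check like the following should print paths under /usr/local/hadoop-3.2.0 rather than empty lines:

# Quick sanity check on the exported environment variables
echo $HADOOP_MAPRED_HOME
echo $HADOOP_CONF_DIR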
But at this point, the problem was still not solved!
Calming down and analyzing the problem properly, the log makes one thing clear: the container running the ApplicationMaster exited without ever going to the ResourceManager to acquire resources for the job. So I suspected a communication problem between the AM and the RM. With one active RM and one standby RM, everything works inside YARN when the MR job asks the active RM for resources, but asking the standby RM produces exactly this failure. That pointed to the solution: explicitly configure the per-RM addresses in yarn-site.xml.
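As a quick sanity check on the HA state (not part of the fix itself), the rm-ids defined in yarn-site.xml can be queried to see which ResourceManager is currently active:

# Query the HA state of each ResourceManager (rm-ids as defined in yarn-site.xml)
yarn rmadmin -getServiceState rm1   # prints "active" or "standby"
yarn rmadmin -getServiceState rm2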
Open yarn-site.xml and add the following configuration:
<property>
  <name>yarn.resourcemanager.address.rm1</name>
  <value>binghe103:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.rm1</name>
  <value>binghe103:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <value>binghe103:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
  <value>binghe103:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm1</name>
  <value>binghe103:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.admin.address.rm1</name>
  <value>binghe103:23142</value>
</property>
<property>
  <name>yarn.resourcemanager.address.rm2</name>
  <value>binghe104:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.rm2</name>
  <value>binghe104:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>binghe104:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
  <value>binghe104:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm2</name>
  <value>binghe104:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.admin.address.rm2</name>
  <value>binghe104:23142</value>
</property>
The complete yarn-site.xml configuration is as follows:
<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>binghe103</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>binghe104</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>binghe105:2181,binghe106:2181,binghe107:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>binghe103:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>binghe103:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>binghe103:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>binghe103:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm1</name>
    <value>binghe103:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.admin.address.rm1</name>
    <value>binghe103:23142</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>binghe104:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>binghe104:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>binghe104:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>binghe104:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm2</name>
    <value>binghe104:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.admin.address.rm2</name>
    <value>binghe104:23142</value>
  </property>
</configuration>
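After saving yarn-site.xml, copy it to every node in the cluster and restart YARN so the new RM addresses take effect. A minimal sketch of the final verification, where /input and /output are placeholder HDFS paths for your own data:

# Restart YARN so the new per-RM addresses are picked up
stop-yarn.sh
start-yarn.sh

# Re-run the bundled WordCount example; /input and /output are placeholders
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar wordcount /input /output

With the per-RM addresses configured, the ApplicationMaster can reach whichever ResourceManager is currently active, and the WordCount job should run to completion.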