Setting up a Hadoop cluster on Alibaba Cloud ECS: fixing the "java.lang.IllegalArgumentException: java.net.UnknownHostException" error when running a job

Please credit the source when reposting: http://blog.csdn.net/dongdong9223/article/details/81257870
This article is from the blog 【我是干勾鱼的博客】

Ingredient:

  • Hadoop: hadoop-2.9.1.tar.gz (from Apache Hadoop Releases Downloads; all previous releases of Hadoop are available from the Apache release archive site)

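An earlier post, "Setting up a Hadoop cluster on Alibaba Cloud ECS: fixing the 'java.net.BindException: Cannot assign requested address' error at startup", described how to configure the internal and external network addresses when building Hadoop on ECS.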

Once Hadoop is set up on Alibaba Cloud ECS, running a computation hangs for a long time at:

18/07/28 10:04:05 INFO mapreduce.Job: Running job: job_1532742376217_0001

A quick aside:

root@iZ2ze72w***************:/opt/hadoop/hadoop-2.9.1# pwd
/opt/hadoop/hadoop-2.9.1

Note the server information in the prompt:

root@iZ2ze72w***************

and in particular the id in it:

iZ2ze72w***************

It comes up again below.

Check the log:

root@iZ2ze72w***************:/opt/hadoop/hadoop-2.9.1# tail -f logs/yarn-root-resourcemanager-iZ2ze.log

It shows this error:

2018-07-28 10:06:35,469 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: Error trying to assign container token and NM token to an updated container container_1532742376217_0001_01_000001
java.lang.IllegalArgumentException: java.net.UnknownHostException: iZuf67wb***************
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:443)
at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:309)
at org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:256)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.updateContainerAndNMToken(SchedulerApplicationAttempt.java:657)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.pullNewlyAllocatedContainers(SchedulerApplicationAttempt.java:735)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.getAllocation(FiCaSchedulerApp.java:711)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocate(CapacityScheduler.java:1000)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:1145)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:1138)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:908)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:115)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:958)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:939)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:201)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:127)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: iZuf67wb***************
… 20 more
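
Rather than reading the whole trace by eye, the unresolved hostname can be pulled out of the log text. The following is a minimal Python sketch and not part of the original setup; the function name `unknown_hosts` and the regular expression are illustrative assumptions based on the log format shown above.

```python
import re

# Matches the hostname that follows "java.net.UnknownHostException: ".
# Assumption: the hostname contains no whitespace, as in the log above.
_UNKNOWN_HOST = re.compile(r"java\.net\.UnknownHostException: (\S+)")


def unknown_hosts(log_text):
    """Return the distinct unresolved hostnames found in log_text, in order."""
    found = []
    for match in _UNKNOWN_HOST.finditer(log_text):
        host = match.group(1)
        if host not in found:
            found.append(host)
    return found


# Two lines copied from the ResourceManager log above (id masked as in the post).
sample = (
    "java.lang.IllegalArgumentException: "
    "java.net.UnknownHostException: iZuf67wb***************\n"
    "Caused by: java.net.UnknownHostException: iZuf67wb***************\n"
)
print(unknown_hosts(sample))
```

The same hostname appears twice in the trace, so the sketch reports it only once.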

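Setting the cause aside for a moment, look at the error message:

java.lang.IllegalArgumentException: java.net.UnknownHostException: iZuf67wb***************

It contains an id, some characters of which I have masked:

iZuf67wb***************

Notice how similar it is to the server information mentioned earlier:

iZ2ze72w***************

That is no coincidence: both are Alibaba Cloud ECS server ids. They differ because one belongs to the Master server and the other to the Slave server. Judging from the error: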

java.lang.IllegalArgumentException: java.net.UnknownHostException: iZuf67wb***************

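the id cannot be resolved to a network address. Suppose you have already mapped the IPs of the Master and Slave servers to their names in:

/etc/hosts

and have also set the internal and external IPs according to the rules described in "Setting up a Hadoop cluster on Alibaba Cloud ECS: fixing the 'java.net.BindException: Cannot assign requested address' error at startup", yet the error still appears. In that case the problem most likely lies in: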

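the order, within "/etc/hosts", of the newly added Master and Slave IP-to-hostname entries relative to the ECS server's original local hostname entry.

In other words, the entries you added to "/etc/hosts" sit after the ECS server's original local hostname entry (for example "iZuf67wb***************"). During startup Hadoop then works with the local hostname first but cannot resolve it correctly (I am not sure why the resolution fails), so it reports the error.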
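
The ordering described here can be checked mechanically. Below is a minimal Python sketch that works on the text of a hosts file; the function names, the IP addresses, and the hostnames "iZ-local-example" and "slave-example" are hypothetical placeholders, and only "test7972" comes from this post.

```python
def hostnames_in_order(hosts_text):
    """Return hostnames in the order they first appear in hosts-file text."""
    names = []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        for name in fields[1:]:  # fields[0] is the IP address
            if name not in names:
                names.append(name)
    return names


def cluster_names_come_first(hosts_text, cluster_names, local_name):
    """True if every cluster name is listed before the local ECS hostname."""
    order = hostnames_in_order(hosts_text)
    if local_name not in order:
        # No local entry to compare against: only require the names to exist.
        return all(name in order for name in cluster_names)
    local_pos = order.index(local_name)
    return all(
        name in order and order.index(name) < local_pos for name in cluster_names
    )


# Hypothetical hosts file with the problematic ordering: the cluster
# entries were appended after the original local hostname entry.
SAMPLE_HOSTS = """\
127.0.0.1 localhost
10.0.0.1 iZ-local-example
10.0.0.1 test7972
10.0.0.2 slave-example
"""

print(cluster_names_come_first(
    SAMPLE_HOSTS, ["test7972", "slave-example"], "iZ-local-example"))
```

For this sample the check prints False, because both cluster names appear after the local hostname.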

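The fix, then, is:

Place the newly added Master and Slave entries (for example "test7972") before the ECS server's original local hostname entry (for example "iZuf67wb***************"). Do not delete the original local hostname entry, though, because other parts of the operating system still use it.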
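
As an illustration of this reordering, here is a minimal Python sketch that moves the cluster entries ahead of the local hostname entry while keeping that entry. It only transforms text and does not write to "/etc/hosts"; the function name, the IP addresses, and the hostnames "iZ-local-example" and "slave-example" are hypothetical placeholders.

```python
def _names(line):
    """Hostnames on one hosts-file line, ignoring comments and the IP field."""
    return line.split("#", 1)[0].split()[1:]


def move_cluster_entries_first(hosts_text, cluster_names, local_name):
    """Return hosts text with cluster lines placed just before the local entry."""
    wanted = set(cluster_names)
    cluster_lines, other_lines = [], []
    for line in hosts_text.splitlines():
        names = _names(line)
        if wanted.intersection(names) and local_name not in names:
            cluster_lines.append(line)
        else:
            other_lines.append(line)  # includes the local entry, which is kept

    # Insert before the first line naming the local host; append if there is none.
    insert_at = len(other_lines)
    for i, line in enumerate(other_lines):
        if local_name in _names(line):
            insert_at = i
            break

    reordered = other_lines[:insert_at] + cluster_lines + other_lines[insert_at:]
    return "\n".join(reordered) + "\n"


# Hypothetical input with the cluster entries after the local hostname entry.
BEFORE = """\
127.0.0.1 localhost
10.0.0.1 iZ-local-example
10.0.0.1 test7972
10.0.0.2 slave-example
"""

AFTER = move_cluster_entries_first(
    BEFORE, ["test7972", "slave-example"], "iZ-local-example")
print(AFTER)
```

Reviewing the printed result before copying it into the real file keeps the change easy to verify and to undo.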

With that ordering in place, Hadoop no longer reports this error.
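
A quick way to confirm the change took effect is to check that every cluster hostname resolves on each node. The sketch below is a hedged Python illustration: it uses the standard `socket.gethostbyname` call and assumes Python and the JVM consult the same system resolver, which is typical but not guaranteed. The function name `unresolved_hosts` and the `resolve` parameter are illustrative choices.

```python
import socket


def unresolved_hosts(names, resolve=socket.gethostbyname):
    """Return the names that the given resolver cannot map to an address."""
    missing = []
    for name in names:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "test7972" is the example hostname from this post; replace the list
    # with the Master and Slave hostnames of your own cluster.
    print(unresolved_hosts(["localhost", "test7972"]))
```

An empty list means every name resolved; any name that is printed still needs a working "/etc/hosts" entry on that machine.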
