First, some context: this is my own test environment with 3 machines. When I started the Hadoop cluster this morning, I found that the DataNodes on 2 of the machines would not start.
The log showed:
java.io.IOException: Failed on local exception: java.io.EOFException; Host Details : ... (I did not paste the whole error.) The rest of it basically says that the host is supertool/127.0.0.1 while the DataNode is looking for supertool-super.9.local/127.0.0.1. Why? No idea. And I have no clue where the hostname supertool-super.9.local even came from!!!
Attempted fix: searching online turned up advice that the hosts file must map real IPs to hostnames and must not point the cluster hostname at localhost. But my Hadoop cluster used to start just fine before, so all my hosts files should already be correct. Reference: http://blog.csdn.net/codepeak/article/details/13170147
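For reference, the kind of hosts file that advice describes looks roughly like this. Only steven/192.168.5.53 actually appears in my logs; the address for supertool is made up for illustration:

# /etc/hosts on every node: each node's real LAN IP mapped to its hostname
# (do NOT point the cluster hostname at 127.0.0.1 or 127.0.1.1)
127.0.0.1      localhost
192.168.5.53   steven
192.168.5.54   supertool    # illustrative IP, not from my cluster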
Not finding a fix, I lost my patience and just reformatted the NameNode. But that brought a new problem.
The log showed:
FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-1546815463-192.168.5.53-1389928412605 (storage id DS-790586005-127.0.1.1-50010-1389070756744) service to steven/192.168.5.53:9000
java.io.IOException: Incompatible clusterIDs in /home/hadoop/dfs/data: namenode clusterID = CID-c237cc41-10c5-4905-b37e-8722ddac8cc1; datanode clusterID = CID-b7181014-b79f-4391-92d7-3c4e105c6c4e
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:391)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:191)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:837)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:808)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:280)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:222)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
    at java.lang.Thread.run(Thread.java:744)
Fix: from what I found online, every namenode format generates a new cluster ID, while tmp/dfs/data still holds the ID from the previous format. Formatting wipes the NameNode's data but does not clear the DataNode's data, so the DataNode fails at startup. What you have to do is clear out everything under tmp before each format.
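Incidentally, the clusterID the log complains about is stored in the VERSION file inside each storage directory, so you can compare the two sides directly. The datanode path below comes from the log above; the namenode path is an assumption modeled on the same layout, so adjust it to your dfs.namenode.name.dir:

# datanode side (path taken from the log above)
grep clusterID /home/hadoop/dfs/data/current/VERSION
# namenode side (assumed path)
grep clusterID /home/hadoop/dfs/name/current/VERSION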
Result: deleting everything under ${HADOOP_HOME}/tmp did the trick. Reference: http://www.cnblogs.com/agilework/archive/2013/06/08/3127286.html
Once everything under the tmp directory was deleted, the DataNodes started without any problem.
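Spelled out, the whole sequence amounts to roughly the following, assuming (as in my setup) that the HDFS storage directories live under ${HADOOP_HOME}/tmp and that losing the existing HDFS data is acceptable:

# stop HDFS first
${HADOOP_HOME}/sbin/stop-dfs.sh
# clear the old storage directories on the namenode AND on every datanode
rm -rf ${HADOOP_HOME}/tmp/*
# reformat -- this generates a fresh clusterID
${HADOOP_HOME}/bin/hdfs namenode -format
# restart; the datanodes now register with the new clusterID
${HADOOP_HOME}/sbin/start-dfs.sh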
One more problem worth mentioning here:
An error hit during Hadoop installation: mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.
...cache path : file:/tmp/hadoop-hadoop/nodemanager/local/usercache_DEL_1382455571973
2013-10-22 23:46:56,109 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker
2013-10-22 23:46:56,111 FATAL org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Failed to initialize mapreduce.shuffle
java.lang.IllegalArgumentException: The ServiceName: mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.The valid service name should only contain a-zA-Z0-9_ and can not start with numbers
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:98)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
    at ...
Fix:
The answer I found (quoted verbatim): "urg, my copy/paste was botched up but hopefully this still makes sense. The value mapreduce.shuffle is now mapreduce_shuffle and the name yarn.nodemanager.aux-services.mapreduce.shuffle.class is now yarn.nodemanager.aux-services.mapreduce_shuffle.class"
So it is a problem with the parameters in yarn-site.xml: both the yarn.nodemanager.aux-services value and the yarn.nodemanager.aux-services.mapreduce_shuffle.class entry were wrong.
The correct configuration is:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
  <description>shuffle service that needs to be set for Map Reduce to run</description>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
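After correcting yarn-site.xml on every node, YARN needs a restart to pick up the change; a minimal sketch using the stock sbin scripts:

# restart YARN so the NodeManagers re-read yarn-site.xml
${HADOOP_HOME}/sbin/stop-yarn.sh
${HADOOP_HOME}/sbin/start-yarn.sh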
Reference: http://blog.csdn.net/bamuta/article/details/12995139