When Hadoop starts, every other process comes up normally, but the datanode process starts and then stops by itself. The datanode log shows:
2014-09-26 10:20:14,225 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /tmpdir/dfs/data/in_use.lock acquired by nodename 2376@sn
2014-09-26 10:20:14,237 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to nn/192.168.1.105:9000. Exiting.
java.io.IOException: Incompatible clusterIDs in /tmpdir/dfs/data: namenode clusterID = CID-ed60dc6d-e40c-436e-99c9-8b2c55493e68; datanode clusterID = CID-4bade8cd-f4b4-472f-b64a-3b101c04d952
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:477)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:226)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:254)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:974)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:945)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:278)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
at java.lang.Thread.run(Thread.java:745)
2014-09-26 10:20:14,243 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to nn/192.168.1.105:9000
2014-09-26 10:20:14,251 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned)
2014-09-26 10:20:16,252 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2014-09-26 10:20:16,257 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2014-09-26 10:20:16,263 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at sn/192.168.1.106
************************************************************/
This was resolved with the same method as problem [2] below. What the two problems have in common: each run of hadoop namenode -format generates a fresh ID on the namenode (clusterID in 2.x, namespaceID in older releases), while the datanodes keep the old ID in their on-disk VERSION files, so the datanode refuses to register with the reformatted namenode.
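A minimal sketch of that fix for the clusterID case above, run on the affected datanode (the data path and CID are copied from the log; sed is just one way to make the edit):
# stop the datanode, then overwrite its clusterID with the namenode's value
# (the CID below comes from the namenode side of the error message)
hadoop-daemon.sh stop datanode
sed -i 's/^clusterID=.*/clusterID=CID-ed60dc6d-e40c-436e-99c9-8b2c55493e68/' /tmpdir/dfs/data/current/VERSION
hadoop-daemon.sh start datanode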
I have been learning Hadoop recently and have installed it many times, running into plenty of startup errors along the way, so here is a short summary:
Errors around formatting HDFS are usually related to hostname resolution, ${dfs.name.dir}/current/VERSION, or JAVA_HOME.
A few typical examples:
1. Hadoop reports an error when formatting HDFS.
Cause: the filesystem was previously formatted as root, so the files under dfs.name.dir were created owned by root, and the hadoop user has no permission on them, hence the error.
Fix: delete the ${dfs.name.dir}/current directory, or change its owner (see the sketch below).
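A minimal sketch of both options, assuming dfs.name.dir is /data/hadoop/name and the daemons run as user hadoop (substitute your own path and user):
# check who owns the namenode metadata
ls -l /data/hadoop/name
# option 1: hand the directory over to the hadoop user
chown -R hadoop:hadoop /data/hadoop/name
# option 2: remove the stale metadata and re-format as the hadoop user
rm -rf /data/hadoop/name/current
su - hadoop -c 'hadoop namenode -format'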
2. Hadoop previously started fine and all processes were normal. After reformatting HDFS with hadoop namenode -format and starting again with start-all.sh, there were no error messages, but jps showed that the datanode had not started. The datanode log reads:
# vim hadoop-hduser-datanode-cm134.jaybing.com.log
2014-08-10 10:16:06,693 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = cm134.jaybing.com/202.106.199.38
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2-cdh3u5
STARTUP_MSG: build = git://ubuntu-slave02/var/lib/jenkins/workspace/CDH3u5-Full-RC/build/cdh3/hadoop20/0.20.2-cdh3u5/source -r 30233064aaf5f2492bc687d61d72956876102109; compiled by 'jenkins' on Fri Oct 5 17:21:34 PDT 2012
************************************************************/
2014-08-10 10:16:08,098 INFO org.apache.hadoop.security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
2014-08-10 10:16:09,453 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /hadoop/tmp/dfs/data: namenode namespaceID = 2024141122; datanode namespaceID = 1824410798
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:238)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:153)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:423)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:314)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1683)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1623)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1641)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1767)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1784)
2014-08-10 10:16:09,461 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at cm134.jaybing.com/202.106.199.38
************************************************************/
Fix:
Edit the VERSION file at ${dfs.data.dir}/current/VERSION and change its namespaceID to the namenode namespaceID reported in the error log above (2024141122 here), then restart the datanode (see the sketch below).
Note: if there are many datanode nodes, every single one must be modified.
In a test environment, simply deleting the data directory also works.
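A minimal sketch of the edit, using the data path and namespaceID from the log above (sed is just one way to make the change; repeat on every datanode):
# point the datanode's namespaceID at the namenode's value from the error log
sed -i 's/^namespaceID=.*/namespaceID=2024141122/' /hadoop/tmp/dfs/data/current/VERSION
# restart the datanode
hadoop-daemon.sh start datanode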
3. In the newer 2.4.1 release, JAVA_HOME is set in both profile and hadoop-env.sh, and java -version works fine. (This error never came up under 0.20.2.) Startup still fails with:
[root@nn ~]# start-all.sh
Starting namenodes on []
localhost: Error: JAVA_HOME is not set and could not be found.
localhost: Error: JAVA_HOME is not set and could not be found.
...
starting yarn daemons
starting resourcemanager, logging to /home/lihanhui/open-source/hadoop-2.1.0-beta/logs/yarn-admin-resourcemanager-localhost.out
localhost: Error: JAVA_HOME is not set and could not be found
Running export JAVA_HOME=/PATH/TO/JDK directly on the command line did not help either (start-all.sh launches the daemons over ssh, so the interactive shell's environment is not inherited).
Eventually I grepped for the error message "JAVA_HOME is not set and could not be found" and found it in hadoop-2.4.1/etc/hadoop/libexec/hadoop-config.sh,
so I added export JAVA_HOME=/PATH/JDK to that configuration file,
and the problem was solved.
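A minimal sketch of that change (the JDK path below is a placeholder; the sed line inserts the export near the top of the script found above):
# insert an explicit JAVA_HOME at line 2 of the script the error comes from;
# substitute your real JDK path for the placeholder
sed -i '2i export JAVA_HOME=/usr/java/jdk1.7.0_45' hadoop-2.4.1/etc/hadoop/libexec/hadoop-config.sh
# verify that the daemons now start
start-dfs.sh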