DataNode fails to start when starting Hadoop 2.6

1. The problem

After starting HDFS with ./start-dfs.sh, there is no DataNode; jps shows only:

9235 NameNode
9646 Jps
9535 SecondaryNameNode
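
For reference, the command sequence that produces this symptom, assuming HADOOP_HOME points at the Hadoop 2.6 installation (the variable is only an assumption for illustration):

cd $HADOOP_HOME/sbin    # Hadoop 2.x keeps the start/stop scripts under sbin
./start-dfs.sh          # should bring up NameNode, DataNode(s) and SecondaryNameNode
jps                     # lists the running Java processes; DataNode is missing here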

2. Check the logs

Make sure to look at the .log file, which holds the relevant logging, not the .out file; just change the out suffix in the log path to log.
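
The DataNode log sits under $HADOOP_HOME/logs; its file name depends on the local user and hostname, so the pattern below (matching the hadoop000 host from this log) is only an example:

cd $HADOOP_HOME/logs
ls hadoop-*-datanode-*.log            # e.g. hadoop-hadoop-datanode-hadoop000.log
tail -n 100 hadoop-*-datanode-*.log   # the excerpt below was taken from this file

A partial excerpt of the DataNode log: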


2018-03-12 11:06:44,986 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 50020
2018-03-12 11:06:45,030 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened IPC server at /0.0.0.0:50020
2018-03-12 11:06:45,048 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Refresh request received for nameservices: null
2018-03-12 11:06:45,074 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting BPOfferServices for nameservices:
2018-03-12 11:06:45,088 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool (Datanode Uuid unassigned) service to hadoop000/192.168.0.105:8020 starting to offer service
2018-03-12 11:06:45,100 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2018-03-12 11:06:45,100 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2018-03-12 11:06:45,694 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/hadoop/app/tmp/dfs/data/in_use.lock acquired by nodename 6085@hadoop000
2018-03-12 11:06:45,696 WARN org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: Incompatible clusterIDs in /home/hadoop/app/tmp/dfs/data: namenode clusterID = CID-cce16c3d-5dbd-43c7-8b35-7905bc90309e; datanode clusterID = CID-06da8613-d7f4-4bfa-ac2e-55104c7a265f
2018-03-12 11:06:45,697 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to hadoop000/192.168.0.105:8020. Exiting. 
java.io.IOException: All specified directories are failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1394)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1355)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:228)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:829)
at java.lang.Thread.run(Thread.java:748)

The WARN line in the log pinpoints the problem:

the datanode clusterID does not match the namenode clusterID.


3. Cause of the problem

When we format the file system, a current/VERSION file is written to the namenode data directory (the local path set by dfs.name.dir in the configuration), recording identifiers such as the namespaceID and clusterID that mark that formatted namenode. If we format the namenode repeatedly, the current/VERSION file kept on the datanode side (under the path set by dfs.data.dir) still holds the IDs saved at the first format, so the IDs on the namenode and the datanode no longer agree.
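
The mismatch can be confirmed by comparing the two VERSION files directly, using the storage paths reported in the log above (the clusterID values in the comments are the ones from this log):

grep clusterID /home/hadoop/app/tmp/dfs/name/current/VERSION
# clusterID=CID-cce16c3d-5dbd-43c7-8b35-7905bc90309e   <- what the namenode expects
grep clusterID /home/hadoop/app/tmp/dfs/data/current/VERSION
# clusterID=CID-06da8613-d7f4-4bfa-ac2e-55104c7a265f   <- what the datanode still holds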

4. Solution

Following the path in the log, go one level up to the dfs directory: cd /home/hadoop/app/tmp/dfs

There you can see the data and name folders. Copy the clusterID from the VERSION file under name/current into the VERSION file under data/current, overwriting the original clusterID (a scripted version of this step is sketched below).
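
One way to script that edit, assuming the paths from the log above; this is only a sketch, so inspect both VERSION files by hand before overwriting anything:

# stop HDFS first so nothing is holding the storage directory
$HADOOP_HOME/sbin/stop-dfs.sh

# read the clusterID recorded by the namenode
NN_CID=$(grep '^clusterID=' /home/hadoop/app/tmp/dfs/name/current/VERSION | cut -d= -f2)

# write the same clusterID into the datanode's VERSION file
sed -i "s/^clusterID=.*/clusterID=${NN_CID}/" /home/hadoop/app/tmp/dfs/data/current/VERSION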

Then restart HDFS; once it is up, run jps to check the processes:

9235 NameNode
9337 DataNode
9646 Jps
9535 SecondaryNameNode


With that, HDFS has started successfully.



