NameNode fails to start after the Hadoop system is rebooted

Problem: after the Hadoop system was rebooted, I ran the sbin/start-dfs.sh startup script:

[hadoop@ruozedata001 hadoop]$ jps
6033 Jps
5304 SecondaryNameNode
5119 DataNode
[hadoop@ruozedata001 hadoop]$ 
The NameNode never comes up.

The NameNode log shows the following error:

tail -F  /home/hadoop/software/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-ruozedata001.log
2019-07-05 10:39:35,828 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: Maximum size of an xattr: 16384
2019-07-05 10:39:35,828 WARN org.apache.hadoop.hdfs.server.common.Storage: Storage directory /data/tmp/hadoop-hadoop/dfs/name does not exist
2019-07-05 10:39:35,829 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /data/tmp/hadoop-hadoop/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:314)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:202)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1063)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:767)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:609)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:670)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:838)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:817)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1538)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1606)
2019-07-05 10:39:35,832 INFO org.mortbay.log: Stopped [email protected]:50070
2019-07-05 10:39:35,933 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2019-07-05 10:39:35,933 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2019-07-05 10:39:35,933 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2019-07-05 10:39:35,933 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /data/tmp/hadoop-hadoop/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:314)

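My first fix was to run hadoop namenode -format and reformat the NameNode. This is not recommended in production, because it deletes all NameNode data. On reflection it was also the wrong fix: if I had to format after every reboot, the cluster would be finished. After some searching I found the real cause: by default hadoop.tmp.dir is /tmp/hadoop-${user.name}, and files under /tmp get cleaned up by the system. As shown below, /tmp/hadoop-hadoop/dfs/ no longer contains the NameNode directory (name):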

[hadoop@ruozedata001 dfs]$ cd /tmp/hadoop-hadoop/dfs/
[hadoop@ruozedata001 dfs]$ pwd
/tmp/hadoop-hadoop/dfs
[hadoop@ruozedata001 dfs]$ ll
total 0
drwx------ 2 hadoop hadoop  6 Jul  5 09:54 data
drwxrwxr-x 3 hadoop hadoop 20 Jul  5 10:39 namesecondary
[hadoop@ruozedata001 dfs]$ 
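
The check above can also be scripted. The following is a minimal sketch, not part of the original setup: check_name_dir is a hypothetical helper name, and it assumes the usual layout in which a formatted NameNode directory contains current/VERSION.

```shell
# Hypothetical helper (not a Hadoop command): classify a NameNode storage directory.
# Assumption: a formatted NameNode directory contains current/VERSION.
check_name_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        # The directory itself is gone, e.g. removed by /tmp cleanup.
        echo "missing"
    elif [ ! -f "$dir/current/VERSION" ]; then
        # The directory exists but holds no NameNode metadata.
        echo "unformatted"
    else
        echo "ok"
    fi
}
```

For the situation in this post, check_name_dir /tmp/hadoop-hadoop/dfs/name would report "missing".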

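The fix is to edit the core-site.xml configuration file and add the hadoop.tmp.dir property, pointing the storage directory at a location that will not be cleaned up: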


	
<property>
    <name>hadoop.tmp.dir</name>
    <value>/data/tmp/hadoop-${user.name}</value>
</property>
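
The configured directory must exist and be writable by the user that runs HDFS. Below is a rough sketch of preparing it; prepare_hadoop_tmp_dir is a hypothetical helper name, and the base path and user name are passed in as arguments rather than read from the Hadoop configuration.

```shell
# Hypothetical helper: create <base>/hadoop-<user> and print the resulting path.
# It mirrors the /data/tmp/hadoop-${user.name} pattern used in core-site.xml.
prepare_hadoop_tmp_dir() {
    base="$1"
    user="$2"
    target="$base/hadoop-$user"
    # Create the directory (and any missing parents); fail if that is not possible.
    mkdir -p "$target" || return 1
    # The current user must be able to write to it.
    if [ ! -w "$target" ]; then
        echo "not writable: $target" >&2
        return 1
    fi
    echo "$target"
}
```

Run as the hadoop user (with permission to create /data/tmp), prepare_hadoop_tmp_dir /data/tmp hadoop would create the directory that the property above resolves to for that user.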
	

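Then run the sbin/start-dfs.sh startup script again, and the NameNode process starts normally. If the new directory is still empty, the NameNode metadata has to be put there first (format once on a fresh setup, or copy over the old dfs directory if it survived); otherwise the log shows the same "storage directory does not exist" error for the new path.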

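Recommendation: to avoid this failure, set hadoop.tmp.dir in core-site.xml to a directory that will not be cleaned up right from the start, when you first configure and deploy the Hadoop environment.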
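
As a small safeguard, the configured value can be checked before starting HDFS. This is only a sketch: classify_tmp_dir is a hypothetical helper name, and it looks at the path prefix only, not at the system's actual cleanup policy.

```shell
# Hypothetical helper: report whether a path lives under /tmp.
# Only the path prefix is inspected; this does not query any cleanup service.
classify_tmp_dir() {
    case "$1" in
        /tmp|/tmp/*) echo "volatile" ;;
        *)           echo "outside-tmp" ;;
    esac
}
```

It can be combined with the effective configuration value, for example classify_tmp_dir "$(hdfs getconf -confKey hadoop.tmp.dir)", which should print "outside-tmp" once the property has been changed as described.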
