After a NameNode failure, the data can be recovered with either of the following two methods.
Method 1: copy the SecondaryNameNode data into the NameNode data directory.
(1) Kill the NameNode process with kill -9
[test@hadoop151 ~]$ jps
3764 DataNode
4069 NodeManager
3654 NameNode
7738 Jps
[test@hadoop151 ~]$ kill -9 3654
[test@hadoop151 ~]$ jps
3764 DataNode
4069 NodeManager
7770 Jps
(2) Delete the data stored by the NameNode (/opt/module/hadoop-2.7.2/data/tmp/dfs/name)
[test@hadoop151 ~]$ rm -rf /opt/module/hadoop-2.7.2/data/tmp/dfs/name/*
(3) Copy the SecondaryNameNode data into the original NameNode data directory
[test@hadoop151 ~]$ scp -r test@hadoop153:/opt/module/hadoop-2.7.2/data/tmp/dfs/namesecondary/* /opt/module/hadoop-2.7.2/data/tmp/dfs/name/
edits_0000000000000000135-0000000000000000136 100% 42 0.0KB/s 00:00
fsimage_0000000000000000288.md5 100% 62 0.1KB/s 00:00
edits_0000000000000000285-0000000000000000286 100% 42 0.0KB/s 00:00
edits_0000000000000000155-0000000000000000225 100% 5459 5.3KB/s 00:00
edits_0000000000000000277-0000000000000000278 100% 42 0.0KB/s 00:00
edits_0000000000000000137-0000000000000000138 100% 42 0.0KB/s 00:00
edits_0000000000000000283-0000000000000000284 100% 42 0.0KB/s 00:00
fsimage_0000000000000000290.md5 100% 62 0.1KB/s 00:00
edits_0000000000000000287-0000000000000000288 100% 42 0.0KB/s 00:00
edits_0000000000000000003-0000000000000000010 100% 1024KB 1.0MB/s 00:00
edits_0000000000000000266-0000000000000000268 100% 97 0.1KB/s 00:00
edits_0000000000000000273-0000000000000000274 100% 42 0.0KB/s 00:00
edits_0000000000000000281-0000000000000000282 100% 42 0.0KB/s 00:00
edits_0000000000000000227-0000000000000000228 100% 42 0.0KB/s 00:00
edits_0000000000000000279-0000000000000000280 100% 42 0.0KB/s 00:00
edits_0000000000000000269-0000000000000000270 100% 42 0.0KB/s 00:00
edits_0000000000000000271-0000000000000000272 100% 42 0.0KB/s 00:00
edits_0000000000000000289-0000000000000000290 100% 42 0.0KB/s 00:00
edits_0000000000000000001-0000000000000000002 100% 42 0.0KB/s 00:00
fsimage_0000000000000000288 100% 2327 2.3KB/s 00:00
edits_0000000000000000120-0000000000000000120 100% 1024KB 1.0MB/s 00:00
VERSION 100% 204 0.2KB/s 00:00
fsimage_0000000000000000290 100% 2327 2.3KB/s 00:00
edits_0000000000000000011-0000000000000000118 100% 12KB 12.3KB/s 00:00
edits_0000000000000000275-0000000000000000276 100% 42 0.0KB/s 00:00
edits_0000000000000000229-0000000000000000249 100% 1449 1.4KB/s 00:00
edits_0000000000000000139-0000000000000000154 100% 1005 1.0KB/s 00:00
edits_0000000000000000121-0000000000000000134
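Before restarting, it is worth checking that the fsimage and edits files actually landed in the NameNode data directory; a quick look (using the same paths as above) is enough:
[test@hadoop151 ~]$ ls -R /opt/module/hadoop-2.7.2/data/tmp/dfs/name/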
(4) Restart the NameNode
[test@hadoop151 ~]$ hadoop-daemon.sh start namenode
starting namenode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-test-namenode-hadoop151.out
[test@hadoop151 ~]$ jps
7844 NameNode
3764 DataNode
4069 NodeManager
7884 Jps
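To verify that the recovered metadata is actually usable, you can list the filesystem root from the restarted NameNode (a standard HDFS command; the output depends on what the cluster holds):
[test@hadoop151 ~]$ hdfs dfs -ls /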
Method 2: recover with hdfs namenode -importCheckpoint.
(1) Modify the hdfs-site.xml configuration file and add the following properties. Note that dfs.namenode.checkpoint.period is given in seconds, so 120 shortens the checkpoint interval to two minutes (the default is 3600, i.e. one hour).
<property>
    <name>dfs.namenode.checkpoint.period</name>
    <value>120</value>
</property>
<property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/module/hadoop-2.7.2/data/tmp/dfs/name</value>
</property>
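The value a node will actually use can be double-checked with the standard hdfs getconf utility once the file is in place on that node:
[test@hadoop151 hadoop]$ hdfs getconf -confKey dfs.namenode.checkpoint.period
[test@hadoop151 hadoop]$ hdfs getconf -confKey dfs.namenode.name.dir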
After changing the configuration on one machine, distribute it to the other nodes with the sync script:
[test@hadoop151 hadoop]$ xsync hdfs-site.xml
fname=hdfs-site.xml
pdir=/opt/module/hadoop-2.7.2/etc/hadoop
------------------- hadoop151 --------------
sending incremental file list
sent 36 bytes received 12 bytes 96.00 bytes/sec
total size is 1333 speedup is 27.77
------------------- hadoop152 --------------
sending incremental file list
hdfs-site.xml
sent 716 bytes received 43 bytes 1518.00 bytes/sec
total size is 1333 speedup is 1.76
------------------- hadoop153 --------------
sending incremental file list
hdfs-site.xml
sent 716 bytes received 43 bytes 1518.00 bytes/sec
total size is 1333 speedup is 1.76
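xsync here is a site-specific distribution script, not a stock Hadoop command. A minimal sketch of such a script, assuming passwordless SSH and the three hosts used in this walkthrough (hadoop151/152/153), might look like this; the real xsync may differ:
#!/bin/bash
# Hypothetical cluster-sync sketch: rsync one file to every node, keeping the same path.
if [ $# -lt 1 ]; then
    echo "usage: xsync <file>"
    exit 1
fi
fname=$(basename "$1")
pdir=$(cd -P "$(dirname "$1")" && pwd)
user=$(whoami)
echo "fname=$fname"
echo "pdir=$pdir"
for host in hadoop151 hadoop152 hadoop153; do
    echo "------------------- $host --------------"
    rsync -av "$pdir/$fname" "$user@$host:$pdir/"
done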
(2) Kill the NameNode process with kill -9
[test@hadoop151 ~]$ jps
7844 NameNode
3764 DataNode
7973 Jps
4069 NodeManager
[test@hadoop151 ~]$ kill -9 7844
[test@hadoop151 ~]$ jps
7988 Jps
3764 DataNode
4069 NodeManager
(3) Delete the data stored by the NameNode (/opt/module/hadoop-2.7.2/data/tmp/dfs/name)
[test@hadoop151 ~]$ rm -rf /opt/module/hadoop-2.7.2/data/tmp/dfs/name/*
(4) If the SecondaryNameNode is not on the same host as the NameNode, copy the SecondaryNameNode data directory to the same level as the NameNode data directory, and delete the in_use.lock file
[test@hadoop151 dfs]$ scp -r test@hadoop153:/opt/module/hadoop-2.7.2/data/tmp/dfs/namesecondary ./
in_use.lock 100% 14 0.0KB/s 00:00
edits_0000000000000000135-0000000000000000136 100% 42 0.0KB/s 00:00
fsimage_0000000000000000293.md5 100% 62 0.1KB/s 00:00
edits_0000000000000000285-0000000000000000286 100% 42 0.0KB/s 00:00
edits_0000000000000000155-0000000000000000225 100% 5459 5.3KB/s 00:00
edits_0000000000000000277-0000000000000000278 100% 42 0.0KB/s 00:00
edits_0000000000000000137-0000000000000000138 100% 42 0.0KB/s 00:00
fsimage_0000000000000000295.md5 100% 62 0.1KB/s 00:00
edits_0000000000000000283-0000000000000000284 100% 42 0.0KB/s 00:00
fsimage_0000000000000000295 100% 2327 2.3KB/s 00:00
edits_0000000000000000287-0000000000000000288 100% 42 0.0KB/s 00:00
edits_0000000000000000003-0000000000000000010 100% 1024KB 1.0MB/s 00:00
edits_0000000000000000266-0000000000000000268 100% 97 0.1KB/s 00:00
edits_0000000000000000273-0000000000000000274 100% 42 0.0KB/s 00:00
edits_0000000000000000281-0000000000000000282 100% 42 0.0KB/s 00:00
edits_0000000000000000294-0000000000000000295 100% 42 0.0KB/s 00:00
edits_0000000000000000227-0000000000000000228 100% 42 0.0KB/s 00:00
edits_0000000000000000279-0000000000000000280 100% 42 0.0KB/s 00:00
edits_0000000000000000269-0000000000000000270 100% 42 0.0KB/s 00:00
edits_0000000000000000271-0000000000000000272 100% 42 0.0KB/s 00:00
edits_0000000000000000289-0000000000000000290 100% 42 0.0KB/s 00:00
edits_0000000000000000001-0000000000000000002 100% 42 0.0KB/s 00:00
edits_0000000000000000120-0000000000000000120 100% 1024KB 1.0MB/s 00:00
edits_0000000000000000291-0000000000000000292 100% 42 0.0KB/s 00:00
VERSION 100% 204 0.2KB/s 00:00
edits_0000000000000000011-0000000000000000118 100% 12KB 12.3KB/s 00:00
edits_0000000000000000275-0000000000000000276 100% 42 0.0KB/s 00:00
edits_0000000000000000229-0000000000000000249 100% 1449 1.4KB/s 00:00
edits_0000000000000000139-0000000000000000154 100% 1005 1.0KB/s 00:00
fsimage_0000000000000000293 100% 2327 2.3KB/s 00:00
edits_0000000000000000121-0000000000000000134 100% 1177 1.2KB/s 00:00
edits_0000000000000000250-0000000000000000265 100% 1021 1.0KB/s 00:00
[test@hadoop151 dfs]$ ll
total 12
drwx------ 3 test test 4096 Feb  3 19:08 data
drwxrwxr-x 2 test test 4096 Feb  3 19:29 name
drwxrwxr-x 3 test test 4096 Feb  3 19:29 namesecondary
[test@hadoop151 dfs]$ cd namesecondary/
[test@hadoop151 namesecondary]$ ll
total 8
drwxrwxr-x 2 test test 4096 Feb  3 19:29 current
-rw-rw-r-- 1 test test 14 Feb  3 19:29 in_use.lock
[test@hadoop151 namesecondary]$ rm -rf in_use.lock
[test@hadoop151 namesecondary]$ ll
total 4
drwxrwxr-x 2 test test 4096 Feb  3 19:29 current
(5) Import the checkpoint data (let it run for a while, then stop it with Ctrl+C). The -importCheckpoint option loads the latest checkpoint from the checkpoint directory (by default dfs/namesecondary, i.e. the directory copied above) and saves it into dfs.namenode.name.dir; the command keeps the NameNode running in the foreground, which is why it is interrupted once the import has finished.
[test@hadoop151 hadoop]$ hdfs namenode -importCheckpoint
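After interrupting the import, the name directory should contain a freshly saved fsimage; a quick sanity check (same paths as above):
[test@hadoop151 hadoop]$ ll /opt/module/hadoop-2.7.2/data/tmp/dfs/name/current/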
(6) Start the NameNode
[test@hadoop151 hadoop]$ hadoop-daemon.sh start namenode
starting namenode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-test-namenode-hadoop151.out
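Finally, confirm that the NameNode process is back and that the DataNodes re-register with it (standard commands; the report contents depend on your cluster):
[test@hadoop151 hadoop]$ jps
[test@hadoop151 hadoop]$ hdfs dfsadmin -report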