如果所有的文件满足最小副本的要求,那么就认为文件系统是健康的。
(HDFS is considered healthy if—and only if—all files have a minimum number of replicas available)
hadoop提供了fsck tool来对整个文件系统或者单独的文件、目录来进行健康状态的检查。
hdfs fsck <path>
[-list-corruptfileblocks |
[-move | -delete | -openforwrite]
[-files [-blocks [-locations | -racks]]]
[-includeSnapshots]
[-storagepolicies] [-blockId <blk_Id>]
参数 | 描述 |
---|---|
指定诊断的路径(既可以是目录也可以是文件) | |
-delete | 删除损坏的文件 |
-files | 打印出被诊断的文件 |
-files -blocks | 打印被诊断的文件的块信息 |
-files -blocks -locations | 打印每个块的位置 |
-files -blocks -racks | 打印数据块的网络拓扑结构 |
-includeSnapshots | 如果指定路径包含快照的路径或者快照在该路径下,则包含快照数据 |
-list-corruptfileblocks | 打印出丢失的块列表以及块所属的文件 |
-move | 将损坏的文件移动至/lost+found目录 |
-openforwrite | 打印出正在被写入的文件 |
-storagepolicies | 打印块的存储策略摘要 |
-blockId | 打印出有关该块的信息 |
Please Note:
1. By default fsck ignores files opened for write, use -openforwrite to report such files. They are usually tagged CORRUPT or HEALTHY depending on their block allocation status
2. Option -includeSnapshots should not be used for comparing stats, should be used only for HEALTH check, as this may contain duplicates if the same file present in both original fs tree and inside snapshots.
hadoop fsck / 这个命令可以检查整个文件系统的健康状况,但是要注意它不会主动恢复备份缺失的block,这个是由NameNode单独的线程异步处理的。
使用hdfs fsck / -openforwrite命令检查HDFS健康状态,输出如图。
参数描述:
下面介绍每一个选项的含义和用法。
[hadoop@dev ~]$ hdfs fsck /hivedata/warehouse/liuxiaowen.db/lxw_product_names/ -list-corruptfileblocks
The filesystem under path '/hivedata/warehouse/liuxiaowen.db/lxw_product_names/' has 0 CORRUPT files
[hadoop@dev ~]$ hdfs fsck /hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00168 -move
FSCK started by hadoop (auth:SIMPLE) from /172.16.212.17 for path /hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00168 at Thu Aug 13 09:36:35 CST 2015
.Status: HEALTHY
Total size: 13497058 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 13497058 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 15
Number of racks: 1
FSCK ended at Thu Aug 13 09:36:35 CST 2015 in 1 milliseconds
The filesystem under path '/hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00168' is HEALTHY
[hadoop@dev ~]$ hdfs fsck /hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00168 -delete
FSCK started by hadoop (auth:SIMPLE) from /172.16.212.17 for path /hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00168 at Thu Aug 13 09:37:58 CST 2015
.Status: HEALTHY
Total size: 13497058 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 13497058 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 15
Number of racks: 1
FSCK ended at Thu Aug 13 09:37:58 CST 2015 in 1 milliseconds
The filesystem under path '/hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00168' is HEALTHY
[hadoop@dev ~]$ hdfs fsck /hivedata/warehouse/liuxiaowen.db/lxw_product_names/ -files
FSCK started by hadoop (auth:SIMPLE) from /172.16.212.17 for path /hivedata/warehouse/liuxiaowen.db/lxw_product_names/ at Thu Aug 13 09:39:38 CST 2015
/hivedata/warehouse/liuxiaowen.db/lxw_product_names/ dir
/hivedata/warehouse/liuxiaowen.db/lxw_product_names/_SUCCESS 0 bytes, 0 block(s): OK
/hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00000 13583807 bytes, 1 block(s): OK
/hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00001 13577427 bytes, 1 block(s): OK
/hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00002 13588601 bytes, 1 block(s): OK
/hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00003 13479213 bytes, 1 block(s): OK
/hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00004 13497012 bytes, 1 block(s): OK
/hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00005 13557451 bytes, 1 block(s): OK
/hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00006 13580267 bytes, 1 block(s): OK
/hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00007 13486035 bytes, 1 block(s): OK
/hivedata/warehouse/liuxiaowen.db/lxw_product_names/part-00008 13481498 bytes, 1 block(s): OK
...
[hadoop@dev ~]$ hdfs fsck /hivedata/warehouse/liuxiaowen.db/lxw_product_names/ -openforwrite
FSCK started by hadoop (auth:SIMPLE) from /172.16.212.17 for path /hivedata/warehouse/liuxiaowen.db/lxw_product_names/ at Thu Aug 13 09:41:28 CST 2015
....................................................................................................
....................................................................................................
.Status: HEALTHY
Total size: 2704782548 B
Total dirs: 1
Total files: 201
Total symlinks: 0
Total blocks (validated): 200 (avg. block size 13523912 B)
Minimally replicated blocks: 200 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 15
Number of racks: 1
FSCK ended at Thu Aug 13 09:41:28 CST 2015 in 10 milliseconds
The filesystem under path '/hivedata/warehouse/liuxiaowen.db/lxw_product_names/' is HEALTHY
[hadoop@dev ~]$ hdfs fsck /logs/site/2015-08-08/lxw1234.log -files -blocks
FSCK started by hadoop (auth:SIMPLE) from /172.16.212.17 for path /logs/site/2015-08-08/lxw1234.log at Thu Aug 13 09:45:59 CST 2015
/logs/site/2015-08-08/lxw1234.log 7408754725 bytes, 56 block(s): OK
0. BP-1034052771-172.16.212.130-1405595752491:blk_1075892982_2152381 len=134217728 repl=2
1. BP-1034052771-172.16.212.130-1405595752491:blk_1075892983_2152382 len=134217728 repl=2
2. BP-1034052771-172.16.212.130-1405595752491:blk_1075892984_2152383 len=134217728 repl=2
3. BP-1034052771-172.16.212.130-1405595752491:blk_1075892985_2152384 len=134217728 repl=2
4. BP-1034052771-172.16.212.130-1405595752491:blk_1075892997_2152396 len=134217728 repl=2
5. BP-1034052771-172.16.212.130-1405595752491:blk_1075892998_2152397 len=134217728 repl=2
6. BP-1034052771-172.16.212.130-1405595752491:blk_1075892999_2152398 len=134217728 repl=2
7. BP-1034052771-172.16.212.130-1405595752491:blk_1075893000_2152399 len=134217728 repl=2
8. BP-1034052771-172.16.212.130-1405595752491:blk_1075893001_2152400 len=134217728 repl=2
9. BP-1034052771-172.16.212.130-1405595752491:blk_1075893002_2152401 len=134217728 repl=2
10. BP-1034052771-172.16.212.130-1405595752491:blk_1075893003_2152402 len=134217728 repl=2
11. BP-1034052771-172.16.212.130-1405595752491:blk_1075893004_2152403 len=134217728 repl=2
12. BP-1034052771-172.16.212.130-1405595752491:blk_1075893005_2152404 len=134217728 repl=2
13. BP-1034052771-172.16.212.130-1405595752491:blk_1075893006_2152405 len=134217728 repl=2
14. BP-1034052771-172.16.212.130-1405595752491:blk_1075893007_2152406 len=134217728 repl=2
...
其中,/logs/site/2015-08-08/lxw1234.log 7408754725 bytes, 56 block(s): 表示文件的总大小和block数;
前面的0. 1. 2.代表该文件的block索引,56的文件块,就从0-55;
BP-1034052771-172.16.212.130-1405595752491:blk_1075892982_2152381表示block id;
len=134217728 表示该文件块大小;
repl=2 表示该文件块副本数;
[hadoop@dev ~]$ hdfs fsck /logs/site/2015-08-08/lxw1234.log -files -blocks -locations
FSCK started by hadoop (auth:SIMPLE) from /172.16.212.17 for path /logs/site/2015-08-08/lxw1234.log at Thu Aug 13 09:45:59 CST 2015
/logs/site/2015-08-08/lxw1234.log 7408754725 bytes, 56 block(s): OK
0. BP-1034052771-172.16.212.130-1405595752491:blk_1075892982_2152381 len=134217728 repl=2 [172.16.212.139:50010, 172.16.212.135:50010]
1. BP-1034052771-172.16.212.130-1405595752491:blk_1075892983_2152382 len=134217728 repl=2 [172.16.212.140:50010, 172.16.212.133:50010]
2. BP-1034052771-172.16.212.130-1405595752491:blk_1075892984_2152383 len=134217728 repl=2 [172.16.212.136:50010, 172.16.212.141:50010]
3. BP-1034052771-172.16.212.130-1405595752491:blk_1075892985_2152384 len=134217728 repl=2 [172.16.212.133:50010, 172.16.212.135:50010]
4. BP-1034052771-172.16.212.130-1405595752491:blk_1075892997_2152396 len=134217728 repl=2 [172.16.212.142:50010, 172.16.212.139:50010]
5. BP-1034052771-172.16.212.130-1405595752491:blk_1075892998_2152397 len=134217728 repl=2 [172.16.212.133:50010, 172.16.212.139:50010]
6. BP-1034052771-172.16.212.130-1405595752491:blk_1075892999_2152398 len=134217728 repl=2 [172.16.212.141:50010, 172.16.212.135:50010]
7. BP-1034052771-172.16.212.130-1405595752491:blk_1075893000_2152399 len=134217728 repl=2 [172.16.212.144:50010, 172.16.212.142:50010]
8. BP-1034052771-172.16.212.130-1405595752491:blk_1075893001_2152400 len=134217728 repl=2 [172.16.212.133:50010, 172.16.212.138:50010]
9. BP-1034052771-172.16.212.130-1405595752491:blk_1075893002_2152401 len=134217728 repl=2 [172.16.212.140:50010, 172.16.212.134:50010]
...
和打印出的文件块信息相比,多了一个文件块的位置信息:[172.16.212.139:50010, 172.16.212.135:50010]
[hadoop@dev ~]$ hdfs fsck /logs/site/2015-08-08/lxw1234.log -files -blocks -locations -racks
FSCK started by hadoop (auth:SIMPLE) from /172.16.212.17 for path /logs/site/2015-08-08/lxw1234.log at Thu Aug 13 09:45:59 CST 2015
/logs/site/2015-08-08/lxw1234.log 7408754725 bytes, 56 block(s): OK
0. BP-1034052771-172.16.212.130-1405595752491:blk_1075892982_2152381 len=134217728 repl=2 [/default-rack/172.16.212.139:50010, /default-rack/172.16.212.135:50010]
1. BP-1034052771-172.16.212.130-1405595752491:blk_1075892983_2152382 len=134217728 repl=2 [/default-rack/172.16.212.140:50010, /default-rack/172.16.212.133:50010]
2. BP-1034052771-172.16.212.130-1405595752491:blk_1075892984_2152383 len=134217728 repl=2 [/default-rack/172.16.212.136:50010, /default-rack/172.16.212.141:50010]
3. BP-1034052771-172.16.212.130-1405595752491:blk_1075892985_2152384 len=134217728 repl=2 [/default-rack/172.16.212.133:50010, /default-rack/172.16.212.135:50010]
4. BP-1034052771-172.16.212.130-1405595752491:blk_1075892997_2152396 len=134217728 repl=2 [/default-rack/172.16.212.142:50010, /default-rack/172.16.212.139:50010]
5. BP-1034052771-172.16.212.130-1405595752491:blk_1075892998_2152397 len=134217728 repl=2 [/default-rack/172.16.212.133:50010, /default-rack/172.16.212.139:50010]
...
和前面打印出的信息相比,多了机架信息:[/default-rack/172.16.212.139:50010, /default-rack/172.16.212.135:50010]