hadoop fsck
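Usage: DFSck <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]
<path> The directory (or file) whose files are checked for integrity
-move Move corrupted files to the /lost+found directory
-delete Delete corrupted files
-openforwrite Print files that are currently open for writing
-files Print the name of each file being checked
-blocks Print the block report (must be used together with -files)
-locations Print the location of every block (must be used together with -files -blocks)
-racks Print the network topology (rack) of each block location (must be used together with -files -blocks)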
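The nesting in the usage line (-blocks inside -files, -locations and -racks inside -blocks) can be sketched as a small helper that assembles the command line. This is only an illustration: build_fsck_args and its parameters are hypothetical names, and rejecting badly nested options is a choice made in this sketch, not behavior the tool itself is claimed to have.

```python
# Hypothetical helper: assemble a `hadoop fsck` argument list following the
# option nesting shown in the usage line. All names here are illustrative.
ACTIONS = ("move", "delete", "openforwrite")


def build_fsck_args(path, action=None, files=False, blocks=False,
                    locations=False, racks=False):
    """Return the argument list for `hadoop fsck <path> [options]`."""
    if action is not None and action not in ACTIONS:
        raise ValueError("unknown action: %r" % (action,))
    if blocks and not files:
        raise ValueError("-blocks is only meaningful together with -files")
    if (locations or racks) and not blocks:
        raise ValueError("-locations/-racks need -files -blocks")

    args = ["hadoop", "fsck", path]
    if action is not None:
        args.append("-" + action)
    # Append the reporting flags in the same order as the usage line.
    for enabled, flag in ((files, "-files"), (blocks, "-blocks"),
                          (locations, "-locations"), (racks, "-racks")):
        if enabled:
            args.append(flag)
    return args


print(" ".join(build_fsck_args("/", files=True, blocks=True)))
# prints: hadoop fsck / -files -blocks
```

The returned list can be handed to subprocess.run on a machine where the hadoop client is installed.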
hadoop fsck /
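This command checks the health of the entire file system. Note that fsck only reports problems: it does not itself restore blocks whose replicas are missing. Re-replication is handled asynchronously by a separate thread in the NameNode.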
....................................................................................................
.................................
/user/distribute-hadoop-boss/tmp/pgv/20090813/1000000103/input/JIFEN.QQ.COM.2009-08-13-18.30: Replica placement policy is violated for blk_7596595208988121840_5377589. Block should be additionally replicated on 1 more rack(s).
....................................................
/user/distribute-hadoop-boss/tmp/pgv/20090813/1000000310/input/PAY.QQ.COM.2009-08-13-20.30: Replica placement policy is violated for blk_8146588794511444453_5379501. Block should be additionally replicated on 1 more rack(s).
...............
....................................................................................................
....................................................................................................
.........................................................................................Status: HEALTHY
Total size: 5042961147529 B (Total open files size: 1610612736 B)
Total dirs: 723
Total files: 128089 (Files currently being written: 2)
Total blocks (validated): 171417 (avg. block size 29419259 B) (Total open file blocks (not validated): 24)
Minimally replicated blocks: 171417 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 476 (0.2776854 %)
Default replication factor: 3 (the default number of replicas is 3)
Average block replication: 3.000146
Corrupt blocks: 0 (the number of corrupt blocks is 0)
Missing replicas: 0 (0.0 %)
Number of data-nodes: 107
Number of racks: 4
The filesystem under path '/' is HEALTHY
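A report like the one above is easy to post-process. The following is a minimal sketch, assuming the output has exactly the line shapes shown here (other Hadoop versions may print differently); parse_fsck_summary, misplaced_files and SAMPLE are names made up for this example. It collects the numeric summary fields, lists the files flagged for replica placement, and recomputes the mis-replicated percentage as 476 / 171417.

```python
import re

# "Key: number" lines of the summary. Keys may contain spaces, hyphens and
# parentheses, e.g. "Total blocks (validated)".
SUMMARY_LINE = re.compile(r"^\s*([A-Za-z][A-Za-z0-9 ()\-]*):\s+(\d+(?:\.\d+)?)")
# Per-file warnings printed between the progress dots.
VIOLATION_LINE = re.compile(
    r"^(/.+?): Replica placement policy is violated for (blk_-?\d+_\d+)\."
)


def parse_fsck_summary(text):
    """Map each numeric summary field to its first number (int or float)."""
    summary = {}
    for line in text.splitlines():
        m = SUMMARY_LINE.match(line)
        if m:
            raw = m.group(2)
            summary[m.group(1)] = float(raw) if "." in raw else int(raw)
    return summary


def misplaced_files(text):
    """Return (path, block id) pairs for placement-policy warnings."""
    matches = (VIOLATION_LINE.match(line) for line in text.splitlines())
    return [(m.group(1), m.group(2)) for m in matches if m]


# Excerpt copied from the report above.
SAMPLE = """\
/user/distribute-hadoop-boss/tmp/pgv/20090813/1000000103/input/JIFEN.QQ.COM.2009-08-13-18.30: Replica placement policy is violated for blk_7596595208988121840_5377589. Block should be additionally replicated on 1 more rack(s).
...............Status: HEALTHY
Total files: 128089 (Files currently being written: 2)
Total blocks (validated): 171417 (avg. block size 29419259 B) (Total open file blocks (not validated): 24)
Mis-replicated blocks: 476 (0.2776854 %)
Average block replication: 3.000146
"""

summary = parse_fsck_summary(SAMPLE)
ratio = 100.0 * summary["Mis-replicated blocks"] / summary["Total blocks (validated)"]
print("mis-replicated: %.7f %%" % ratio)
for path, block in misplaced_files(SAMPLE):
    print(block, path)
```

Lines that carry no number after the colon, such as "Status: HEALTHY", are skipped by the summary parser on purpose.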
hadoop fsck /user/distribute-hadoop-boss/tmp/pgv/20090813/1000000103/input/JIFEN.QQ.COM.2009-08-13-18.30 -files -blocks -locations -racks
This prints detailed information for every block of the file, including the rack of each DataNode that holds a replica.
/user/distribute-hadoop-boss/tmp/pgv/20090813/1000000103/input/JIFEN.QQ.COM.2009-08-13-18.30 74110492 bytes, 2 block(s): Replica placement policy is violated for blk_7596595208988121840_5377589. Block should be additionally replicated on 1 more rack(s). (Note: this block has three replicas, but all of them are in the same rack; one replica should be placed on a different rack. See the previous section on the replica placement policy for details.)
0. blk_-4839761191731553520_5377588 len=67108864 repl=3 [/lg/dminterface0/172.16.236.158:50010, /lg/dminterface1/172.16.218.108:50010, /lg/dminterface1/172.16.236.36:50010]
1. blk_7596595208988121840_5377589 len=7001628 repl=3 [/lg/dminterface2/172.16.236.51:50010, /lg/dminterface2/172.16.218.217:50010, /lg/dminterface2/172.16.218.200:50010]
(Note: the DataNodes holding the three replicas of block 1 are all in /lg/dminterface2.)
Status: HEALTHY
Total size: 74110492 B
Total dirs: 0
Total files: 1
Total blocks (validated): 2 (avg. block size 37055246 B)
Minimally replicated blocks: 2 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 1 (50.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 107
Number of racks: 4
The filesystem under path '/user/distribute-hadoop-boss/tmp/pgv/20090813/1000000103/input/JIFEN.QQ.COM.2009-08-13-18.30' is HEALTHY
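The rack check done by eye above (all three replicas of block 1 sit in /lg/dminterface2) can be sketched in code. This assumes the block lines look exactly like the two shown in this output, with each location written as /rack/path/ip:port, so that everything before the last "/" is the rack. racks_per_block, single_rack_blocks and BLOCKS are hypothetical names for this example.

```python
import re

# One block line of `-files -blocks -locations -racks` output:
#   <index>. <block id> len=<bytes> repl=<n> [<location>, <location>, ...]
BLOCK_LINE = re.compile(
    r"^\s*\d+\.\s+(blk_-?\d+_\d+)\s+len=(\d+)\s+repl=(\d+)\s+\[(.*)\]"
)


def racks_per_block(text):
    """Map each block id to the set of racks that hold one of its replicas."""
    result = {}
    for line in text.splitlines():
        m = BLOCK_LINE.match(line)
        if not m:
            continue
        locations = [loc.strip() for loc in m.group(4).split(",") if loc.strip()]
        # Assumption: the rack is everything before the final "/ip:port" part.
        result[m.group(1)] = {loc.rsplit("/", 1)[0] for loc in locations}
    return result


def single_rack_blocks(text):
    """Blocks whose replicas all share one rack (a likely placement problem)."""
    return [block for block, racks in racks_per_block(text).items()
            if len(racks) == 1]


# The two block lines from the output above.
BLOCKS = """\
0. blk_-4839761191731553520_5377588 len=67108864 repl=3 [/lg/dminterface0/172.16.236.158:50010, /lg/dminterface1/172.16.218.108:50010, /lg/dminterface1/172.16.236.36:50010]
1. blk_7596595208988121840_5377589 len=7001628 repl=3 [/lg/dminterface2/172.16.236.51:50010, /lg/dminterface2/172.16.218.217:50010, /lg/dminterface2/172.16.218.200:50010]
"""

print(single_rack_blocks(BLOCKS))
# prints: ['blk_7596595208988121840_5377589']
```

A block with a single replica would be reported as well, so the result is only meaningful for files with a replication factor above 1 on a cluster that has more than one rack.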