HDFS fsck Command Output

HDFS provides the fsck command for checking the health of files and directories on HDFS and for retrieving a file's block information and block locations.

The fsck command must be run by the HDFS superuser; ordinary users do not have permission to run it.
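For example, on a cluster where the superuser account is named hdfs (the account name and sudo setup may differ in your environment), a check of the whole filesystem can be started with:

sudo -u hdfs hdfs fsck /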

List files with corrupt blocks (-list-corruptfileblocks)

hdfs fsck  /data/test/cdh9  -list-corruptfileblocks
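The same check can also be run against the filesystem root to find corrupt blocks anywhere on the cluster (this may take a while on a large namespace):

hdfs fsck / -list-corruptfileblocks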

Move corrupted files to the /lost+found directory (-move)

hdfs fsck /data/test/cdh9  -move
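After a -move run, any salvaged files can be inspected under /lost+found in HDFS (assuming the default location, and only if something was actually moved):

hdfs dfs -ls /lost+found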

Delete corrupted files (-delete)

hdfs fsck /data/test/cdh9  -delete

Check for and print files that are currently open for write (-openforwrite)

hdfs fsck /data/test/cdh9  -openforwrite
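Files open for write are tagged OPENFORWRITE in the report, so they can be filtered out of the output, e.g. (assuming the output format of recent Hadoop releases):

hdfs fsck /data/test/cdh9 -openforwrite | grep OPENFORWRITE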

Print the file's block report (-blocks)

List all files being checked and their status (-files)

Print the location of each block (-locations)

Print the rack information for each block location (-racks; an example follows the output below)

Example: hdfs fsck /data/test/cdh9 -files -blocks -locations

Connecting to namenode via http://xxxx:50070/fsck?ugi=hdfs&files=1&blocks=1&locations=1&path=%2Fdata%2Ftest%2Fcdh9
FSCK started by hdfs (auth:SIMPLE) from /xxxx for path /data/test/cdh9 at Mon Jul 04 09:36:44 CST 2022
/data/test/cdh9 3700657201 bytes, 28 block(s):  OK

0. BP-1582775312-10.193.81.148-1656473793148:blk_1073745470_4646 len=134217728 Live_repl=3 [DatanodeInfoWithStorage[10.193.224.30:50010,DS-0ca68924-ede5-4a24-83be-c2e7488fa568,DISK], DatanodeInfoWithStorage[10.193.224.27:50010,DS-d455ccc0-ca31-491b-a86e-cb177a603a7a,DISK], DatanodeInfoWithStorage[10.193.82.172:50010,DS-c4a619ec-6443-47c1-9139-c5374038b98f,DISK]]

1. BP-1582775312-10.193.81.148-1656473793148:blk_1073745471_4647 len=134217728 Live_repl=3 [DatanodeInfoWithStorage[10.193.224.30:50010,DS-27fc81c4-e599-4d40-b9a6-efb568273046,DISK], DatanodeInfoWithStorage[10.193.82.172:50010,DS-d6e167dd-483d-437c-a8e1-1b75af772c24,DISK], DatanodeInfoWithStorage[10.193.81.148:50010,DS-33f4cf6c-2a46-4180-bdae-ac0888375737,DISK]]
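The example above prints replica locations with -locations; to see the rack placement of each replica instead, -racks can be used (meaningful rack names require a topology script to be configured on the cluster):

hdfs fsck /data/test/cdh9 -files -blocks -racks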

 

Meaning of the fields in the output:

  1. BP-1582775312-10.193.81.148-1656473793148: the Block Pool ID. A block pool is the set of blocks that belongs to a single namespace; put simply, all blocks managed by a given NameNode are in the same block pool. The Block Pool ID is formed roughly as:
    String bpid = "BP-" + rand + "-" + ip + "-" + Time.now();
    Where: 
    rand = a random number
    ip = IP address of the NameNode
    Time.now() = the current system time
    
    Read more about block pools here: Apache Hadoop 3.3.3 – HDFS Federation
  2. blk_1073745470_4646: the block's identifier. Every block in HDFS has a unique ID. The block name is formed as:
    blk_<blockid>_<genstamp>
    Where: 
    blockid = ID of the block
    genstamp = generation stamp, an incrementing integer that records the version of a particular block
    
    Read more about generation stamps here: Cloudera Blog
  3. len=134217728: the block length, i.e. the number of bytes in this block (134217728 bytes = 128 MB, the common default dfs.blocksize; see the check after this list)
  4. Live_repl=3: the block has 3 live replicas
  5. DatanodeInfoWithStorage[10.193.224.30:50010,DS-0ca68924-ede5-4a24-83be-c2e7488fa568,DISK] Where:
    10.193.224.30 => IP address of the DataNode holding this replica
    50010 => data streaming (data transfer) port of the DataNode
    DS-0ca68924-ede5-4a24-83be-c2e7488fa568 => Storage ID, an internal ID assigned when the DataNode registers with the NameNode
    DISK => storageType; here it is DISK. The storage type can be RAM_DISK, SSD, DISK or ARCHIVE
    

The description in point 5 applies equally to the remaining two replicas:

DatanodeInfoWithStorage[10.193.224.27:50010,DS-d455ccc0-ca31-491b-a86e-cb177a603a7a,DISK], DatanodeInfoWithStorage[10.193.82.172:50010,DS-c4a619ec-6443-47c1-9139-c5374038b98f,DISK]]
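Two quick checks related to the fields above, both assuming a reasonably recent Hadoop release: the configured block size (which the len value of a full block should match) can be read with getconf, and a single block can be looked up by its ID with fsck's -blockId option (exact usage may vary by version):

hdfs getconf -confKey dfs.blocksize
hdfs fsck -blockId blk_1073745470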

Reference: "hdfs fsck命令查看HDFS文件对应的文件块信息(Block)和位置信息(Locations)" (using the hdfs fsck command to view a file's block information and location information) - 走看看

 
