Hadoop: What Happens During a File put

This post traces what happens during a file put, using the NameNode and DataNode logs.

Command:

hadoop fs -put test /

NameNode log:

2018-12-24 22:20:52,051 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocate blk_1073741828_1004, replicas=192.168.228.130:9866 for /test._COPYING_

2018-12-24 22:20:52,119 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* blk_1073741828_1004 is COMMITTED but not COMPLETE(numNodes= 0 <  minimum = 1) in file /test._COPYING_

2018-12-24 22:20:52,526 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /test._COPYING_ is closed by DFSClient_NONMAPREDUCE_-230280993_1

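Interpretation:
The NameNode allocates a block named blk_1073741828_1004 for the temporary file /test._COPYING_, with a single replica on 192.168.228.130:9866. The put command writes to a temporary file with the ._COPYING_ suffix first and renames it to the target path once the copy succeeds; the rename does not appear in these log lines.
The block is then committed. At that moment it is COMMITTED but not yet COMPLETE, because no DataNode has reported the replica to the NameNode (numNodes = 0 < minimum = 1). About 400 ms later the file is closed by the client DFSClient_NONMAPREDUCE_-230280993_1.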
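
The block name in the allocate line packs two numbers together: the block ID and the generation stamp. The following is a minimal sketch of pulling those fields out of the log line; the helper names parse_block_name and parse_allocate are hypothetical and written for this post, not part of Hadoop.

```python
import re

# Hypothetical helper: split a block name of the form
# blk_<blockId>_<generationStamp> into its two numeric parts.
BLOCK_NAME = re.compile(r"blk_(-?\d+)_(\d+)")

def parse_block_name(name):
    match = BLOCK_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a block name: {name!r}")
    return int(match.group(1)), int(match.group(2))

# Pattern for the "BLOCK* allocate" message as it appears in the NameNode log above.
ALLOCATE = re.compile(r"BLOCK\* allocate (blk_-?\d+_\d+), replicas=(.+?) for (\S+)")

def parse_allocate(line):
    match = ALLOCATE.search(line)
    if match is None:
        return None
    block_id, generation_stamp = parse_block_name(match.group(1))
    return {
        "block_id": block_id,
        "generation_stamp": generation_stamp,
        # Assumption: multiple replicas, if present, are comma-separated.
        "replicas": [r.strip() for r in match.group(2).split(",")],
        "path": match.group(3),
    }

line = ("2018-12-24 22:20:52,051 INFO org.apache.hadoop.hdfs.StateChange: "
        "BLOCK* allocate blk_1073741828_1004, replicas=192.168.228.130:9866 "
        "for /test._COPYING_")
print(parse_allocate(line))
```

The block ID (1073741828) is what names the data file on the DataNode's disk, while the generation stamp (1004) only shows up in the logs and in the name of the accompanying .meta file.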

DataNode log:

2018-12-25 06:20:43,302 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1758452078-192.168.228.128-1545657860161:blk_1073741828_1004 src: /192.168.228.128:40704 dest: /192.168.228.130:9866

2018-12-25 06:20:43,314 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.228.128:40704, dest: /192.168.228.130:9866, bytes: 44, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-230280993_1, offset: 0, srvID: 8f7f078d-2a2c-41a9-bc1e-e9fd46e8c190, blockid: BP-1758452078-192.168.228.128-1545657860161:blk_1073741828_1004, duration(ns): 8279872

2018-12-25 06:20:43,314 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1758452078-192.168.228.128-1545657860161:blk_1073741828_1004, type=LAST_IN_PIPELINE terminating

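Interpretation:
The DataNode at 192.168.228.130 receives the block blk_1073741828_1004 from the master (192.168.228.128).
The clienttrace line then records the details of the write: 44 bytes, operation HDFS_WRITE, offset 0, the client ID (cliID), the DataNode's own ID (srvID), the block ID prefixed with its block pool ID, and the duration in nanoseconds. These fields describe the transfer; they are logged, not created, at this point. The final line shows the PacketResponder terminating with type=LAST_IN_PIPELINE, meaning this DataNode is the last (and here the only) node in the write pipeline.
The DataNode timestamps are roughly eight hours ahead of the NameNode's. This most likely means the two hosts are set to different time zones, not that the events happened hours apart.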
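
The clienttrace message is a flat list of "key: value" pairs, so it is easy to turn into a dictionary. This is only a sketch based on the single line shown above; parse_clienttrace is a hypothetical helper, and the format is assumed to be stable across the lines you feed it.

```python
def parse_clienttrace(line):
    # Everything after the logger name is a ", "-separated list of "key: value" pairs.
    _, _, payload = line.partition("clienttrace: ")
    fields = {}
    for part in payload.split(", "):
        # Split on the first ": " only; values such as /192.168.228.128:40704
        # contain a colon but no colon followed by a space.
        key, _, value = part.partition(": ")
        fields[key] = value
    return fields

line = ("2018-12-25 06:20:43,314 INFO "
        "org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: "
        "src: /192.168.228.128:40704, dest: /192.168.228.130:9866, bytes: 44, "
        "op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-230280993_1, offset: 0, "
        "srvID: 8f7f078d-2a2c-41a9-bc1e-e9fd46e8c190, "
        "blockid: BP-1758452078-192.168.228.128-1545657860161:blk_1073741828_1004, "
        "duration(ns): 8279872")

fields = parse_clienttrace(line)
print(fields["op"], fields["bytes"], "bytes")
print("duration in ms:", int(fields["duration(ns)"]) / 1_000_000)
```

The cliID value matches the client named in the NameNode's completeFile line, which is how the two logs can be tied to the same put.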

Storage location:

[root@slave2 subdir0]# pwd
/data/hadoop/hdfs/dn/current/BP-1758452078-192.168.228.128-1545657860161/current/finalized/subdir0/subdir0
[root@slave2 subdir0]# cat blk_1073741828
1111
2222222222
33333333333333
444444444444
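
Two details of this listing can be checked against the logs. First, the file content should add up to the 44 bytes reported in the clienttrace line. Second, the subdir0/subdir0 path is derived from the block ID. The sketch below assumes the file ends with a trailing newline, and it assumes the 32 x 32 directory layout that, as far as I know, Hadoop 3 uses in DatanodeUtil.idToBlockDir (older releases used a different mask); block_subdirs is a hypothetical re-implementation for illustration, not Hadoop code.

```python
def block_subdirs(block_id):
    # Assumed Hadoop 3 layout: two directory levels, each taken from
    # 5 bits of the block ID (so 32 x 32 possible subdirectories).
    d1 = (block_id >> 16) & 0x1F
    d2 = (block_id >> 8) & 0x1F
    return f"subdir{d1}/subdir{d2}"

# File content as shown by cat, assuming a newline after the last line.
content = "1111\n2222222222\n33333333333333\n444444444444\n"

print(block_subdirs(1073741828))
print(len(content.encode("utf-8")))
```

Under these assumptions the computed directory and byte count agree with the pwd output and with the bytes: 44 field in the DataNode log.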
