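Exiting Hadoop Safe Mode
One option is to set the safe mode threshold property in hdfs-site.xml. The default value is 0.999f. A value less than or equal to 0 means the NameNode does not wait for any particular percentage of blocks before leaving safe mode, and a value greater than 1 makes safe mode permanent. (In Hadoop 2.x the property is named dfs.namenode.safemode.threshold-pct; dfs.safemode.threshold.pct is the older, deprecated name.)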
<property>
  <name>dfs.safemode.threshold.pct</name>
  <value>0.999f</value>
  <description>
    Specifies the percentage of blocks that should satisfy
    the minimal replication requirement defined by dfs.replication.min.
    Values less than or equal to 0 mean not to wait for any particular
    percentage of blocks before exiting safemode.
    Values greater than 1 will make safe mode permanent.
  </description>
</property>
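Because this is a hard change to a configuration file, it is awkward for administrators to apply and adjust, so this approach is not recommended.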
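The rules in the property description can be sketched in a few lines of Python. This only illustrates the documented semantics; it is not the NameNode's actual implementation. The function name threshold_satisfied and the handling of an empty namespace (total == 0) are assumptions made for the example.

```python
def threshold_satisfied(reported: int, total: int, threshold: float = 0.999) -> bool:
    """Return True if the reported-block ratio would satisfy the safe mode threshold.

    Hypothetical helper that follows the wording of the property description;
    it is not Hadoop code.
    """
    if threshold <= 0:
        # "Values less than or equal to 0 mean not to wait"
        return True
    if threshold > 1:
        # "Values greater than 1 will make safe mode permanent"
        return False
    if total == 0:
        # Assumption: with no blocks at all there is nothing to wait for.
        return True
    return reported / total >= threshold


# Figures from the NameNode log in this post: 549 of 553 blocks reported.
# 549 / 553 is about 0.9928, which is below 0.999.
print(threshold_satisfied(549, 553))  # prints False
```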
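Alternatively, while the NameNode is in safe mode, run:
hdfs dfsadmin -safemode leave
to leave safe mode, or
hdfs dfsadmin -safemode enter
to enter it. (On Hadoop 2.x the older form hadoop dfsadmin still works but prints a deprecation warning.)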
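To check the current state before or after running these commands, hdfs dfsadmin -safemode get reports whether safe mode is on. Below is a minimal Python sketch that wraps it, assuming the hdfs binary is on PATH and that the output contains "Safe mode is ON" or "Safe mode is OFF". The helper names are hypothetical.

```python
import subprocess


def parse_safemode_status(output: str) -> bool:
    """Return True if the dfsadmin output says safe mode is ON."""
    text = output.upper()
    if "SAFE MODE IS ON" in text:
        return True
    if "SAFE MODE IS OFF" in text:
        return False
    raise ValueError(f"unrecognized safemode output: {output!r}")


def namenode_in_safemode() -> bool:
    """Run `hdfs dfsadmin -safemode get` and interpret its output.

    Assumes the `hdfs` command is installed and configured for the cluster.
    """
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-safemode", "get"],
        capture_output=True, text=True, check=True,
    )
    return parse_safemode_status(result.stdout)
```

In startup scripts, hdfs dfsadmin -safemode wait can be used instead of polling; it blocks until the NameNode has left safe mode.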
Problem: after startup, HDFS stays in safe mode indefinitely
Troubleshooting:
1. Check the NameNode startup log
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = hadoop001/192.168.137.141
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.6.0-cdh5.7.0
STARTUP_MSG: classpath = …
The number of live datanodes 1 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
2019-01-16 21:23:02,807 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds
2019-01-16 21:23:02,807 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2019-01-16 21:23:32,808 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds
2019-01-16 21:23:32,808 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2019-01-16 21:23:48,686 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:org.apache.hadoop.hdfs.server.namenode.SafeModeException: Log not rolled. Name node is in safe mode.
The reported blocks 549 needs additional 4 blocks to reach the threshold 0.9990 of total blocks 553.
The number of live datanodes 1 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
2019-01-16 21:23:48,687 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000, call org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol.rollEditLog from 192.168.137.141:53271 Call#84 Retry#0: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Log not rolled. Name node is in safe mode.
The reported blocks 549 needs additional 4 blocks to reach the threshold 0.9990 of total blocks 553.
The number of live datanodes 1 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
2019-01-16 21:23:51,355 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /tmp/hive/hadoop/b8d6b3ec-3b4b-4e3c-8b9e-6d8d328203aa. Name node is in safe mode.
The reported blocks 549 needs additional 4 blocks to reach the threshold 0.9990 of total blocks 553.
The number of live datanodes 1 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
2019-01-16 21:23:51,355 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.mkdirs from 192.168.137.141:53272 Call#4 Retry#0: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /tmp/hive/hadoop/b8d6b3ec-3b4b-4e3c-8b9e-6d8d328203aa. Name node is in safe mode.
The reported blocks 549 needs additional 4 blocks to reach the threshold 0.9990 of total blocks 553.
The number of live datanodes 1 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
2019-01-16 21:24:02,809 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds
2019-01-16 21:24:02,811 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 2 millisecond(s).
2019-01-16 21:24:30,425 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: initializing replication queues
The cause is missing blocks: the number of reported blocks has not reached the 0.9990 threshold of the total block count (needs additional 4 blocks to reach the threshold 0.9990 of total blocks 553).
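When the NameNode log is long, the relevant numbers can be pulled out with a small script. The sketch below assumes the message has exactly the wording shown in the log above; the regular expression and the function name parse_safemode_message are illustrative, not part of Hadoop.

```python
import re

# Matches the safe mode status message as it appears in the log excerpt.
SAFEMODE_MSG = re.compile(
    r"The reported blocks (\d+) needs additional (\d+) blocks "
    r"to reach the threshold ([\d.]+) of total blocks (\d+)"
)


def parse_safemode_message(line: str):
    """Extract (reported, additional, threshold, total) from a log line, or None."""
    m = SAFEMODE_MSG.search(line)
    if m is None:
        return None
    reported, additional, threshold, total = m.groups()
    return int(reported), int(additional), float(threshold), int(total)


line = ("The reported blocks 549 needs additional 4 blocks to reach "
        "the threshold 0.9990 of total blocks 553.")
reported, additional, threshold, total = parse_safemode_message(line)
# prints: 549/553 = 0.9928 (threshold 0.999)
print(f"{reported}/{total} = {reported / total:.4f} (threshold {threshold})")
```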
Why are blocks missing?
2. Check the DataNode startup log
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = hadoop001/192.168.137.141
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.6.0-cdh5.7.0
STARTUP_MSG: classpath = …
2019-01-16 20:14:33,776 INFO org.apache.hadoop.hdfs.server.common.Storage: Locking is disabled for /home/hadoop/app/tmp/dfs/data/current/BP-848574762-192.168.137.141-1528737264517
2019-01-16 20:14:33,780 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Setting up storage: nsid=1759064178;bpid=BP-848574762-192.168.137.141-1528737264517;lv=-56;nsInfo=lv=-60;cid=CID-2ccbdca6-0244-4b19-8107-f20a1a74cfe0;nsid=1759064178;c=0;bpid=BP-848574762-192.168.137.141-1528737264517;dnuuid=3d4c88f5-6ffd-4b12-a90f-bba9b8e65b74
2019-01-16 20:14:34,167 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added new volume: DS-23637e15-c56f-4cc7-aecf-f1a2288cb71e
2019-01-16 20:14:34,167 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added volume - /home/hadoop/app/tmp/dfs/data/current, StorageType: DISK
2019-01-16 20:14:34,251 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Registered FSDatasetState MBean
2019-01-16 20:14:34,262 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Volume reference is released.
2019-01-16 20:14:34,263 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding block pool BP-848574762-192.168.137.141-1528737264517
2019-01-16 20:14:34,277 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning block pool BP-848574762-192.168.137.141-1528737264517 on volume /home/hadoop/app/tmp/dfs/data/current…
2019-01-16 20:14:34,392 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Cached dfsUsed found for /home/hadoop/app/tmp/dfs/data/current/BP-848574762-192.168.137.141-1528737264517/current: 739151872
2019-01-16 20:14:34,408 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken to scan block pool BP-848574762-192.168.137.141-1528737264517 on /home/hadoop/app/tmp/dfs/data/current: 128ms
2019-01-16 20:14:34,409 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Total time to scan all replicas for block pool BP-848574762-192.168.137.141-1528737264517: 146ms
2019-01-16 20:14:34,416 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding replicas to map for block pool BP-848574762-192.168.137.141-1528737264517 on volume /home/hadoop/app/tmp/dfs/data/current…
2019-01-16 20:14:34,980 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to add replicas to map for block pool BP-848574762-192.168.137.141-1528737264517 on volume /home/hadoop/app/tmp/dfs/data/current: 564ms
2019-01-16 20:14:34,980 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Total time to add all replicas to map: 568ms
2019-01-16 20:14:36,271 INFO org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/home/hadoop/app/tmp/dfs/data, DS-23637e15-c56f-4cc7-aecf-f1a2288cb71e): no suitable block pools found to scan. Waiting 846698096 ms.
2019-01-16 20:14:36,280 INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: Periodic Directory Tree Verification scan starting at 1547658816280ms with interval of 21600000ms
2019-01-16 20:14:36,295 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-848574762-192.168.137.141-1528737264517 (Datanode Uuid null) service to hadoop001/192.168.137.141:9000 beginning handshake with NN
2019-01-16 20:14:36,491 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-848574762-192.168.137.141-1528737264517 (Datanode Uuid null) service to hadoop001/192.168.137.141:9000 successfully registered with NN
2019-01-16 20:14:36,491 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: For namenode hadoop001/192.168.137.141:9000 using BLOCKREPORT_INTERVAL of 21600000msec CACHEREPORT_INTERVAL of 10000msec Initial delay: 0msec; heartBeatInterval=3000
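The DataNode startup log shows no problems.
(Figure: parameter settings related to the DataNode check.)
According to the official documentation, the default value of dfs.namenode.safemode.threshold-pct is 0.999f: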
http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
3. Run hdfs fsck / to check the current state of the HDFS blocks
[hadoop@hadoop001 hadoop]$ hdfs fsck /
Connecting to namenode via http://hadoop001:50070
FSCK started by hadoop (auth:SIMPLE) from /192.168.137.141 for path / at Wed Jan 16 22:39:38 CST 2019
…
/out/part-00000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744061_3237. Target Replicas is 3 but found 1 replica(s).
.
/out/part-00001: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744060_3236. Target Replicas is 3 but found 1 replica(s).
…
/spark/checkpointdata/checkpoint-1543419775000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745968_5150. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419775000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745967_5149. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419780000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745970_5152. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419780000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745969_5151. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419785000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745972_5154. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419785000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745971_5153. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419790000: CORRUPT blockpool BP-848574762-192.168.137.141-1528737264517 block blk_1073745974
/spark/checkpointdata/checkpoint-1543419790000: MISSING 1 blocks of total size 4924 B…
/spark/checkpointdata/checkpoint-1543419790000.bk: CORRUPT blockpool BP-848574762-192.168.137.141-1528737264517 block blk_1073745973
/spark/checkpointdata/checkpoint-1543419790000.bk: MISSING 1 blocks of total size 4943 B…
/spark/checkpointdata/checkpoint-1543419795000: CORRUPT blockpool BP-848574762-192.168.137.141-1528737264517 block blk_1073745976
/spark/checkpointdata/checkpoint-1543419795000: MISSING 1 blocks of total size 4924 B…
/spark/checkpointdata/checkpoint-1543419795000.bk: CORRUPT blockpool BP-848574762-192.168.137.141-1528737264517 block blk_1073745975
/spark/checkpointdata/checkpoint-1543419795000.bk: MISSING 1 blocks of total size 4946 B…
/spark/checkpointdata/receivedBlockMetadata/log-1543419757424-1543419817424: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745960_5159. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-167/_partitioner: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744412_3588. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-167/part-00000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744410_3586. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-167/part-00001: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744411_3587. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-180/_partitioner: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744419_3595. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-180/part-00000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744417_3593. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-180/part-00001: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744418_3594. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092435000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744405_3581. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092435000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744401_3577. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092440000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744407_3583. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092440000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744406_3582. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092445000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744413_3589. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092445000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744409_3585. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092450000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744415_3591. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092450000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744414_3590. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092455000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744420_3596. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092455000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744416_3592. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/receivedBlockMetadata/log-1538092380422-1538092440422: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744365_3541. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/receivedBlockMetadata/log-1538092445001-1538092505001: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744408_3597. Target Replicas is 3 but found 1 replica(s).
…
…
…
…
…
…Status: CORRUPT
Total size: 730163442 B
Total dirs: 113
Total files: 571
Total symlinks: 0
Total blocks (validated): 553 (avg. block size 1320367 B)
CORRUPT FILES: 4
MISSING BLOCKS: 4
MISSING SIZE: 19737 B
CORRUPT BLOCKS: 4
Minimally replicated blocks: 549 (99.27667 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 27 (4.882459 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 1
Average block replication: 0.99276674
Corrupt blocks: 4
Missing replicas: 54 (8.780488 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Wed Jan 16 22:39:39 CST 2019 in 626 milliseconds
The filesystem under path '/' is CORRUPT
The report shows 4 corrupt (missing) blocks.
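With a report this long, the affected files are easy to miss among the under-replication warnings. The following Python sketch filters fsck output for the CORRUPT and MISSING lines. It assumes each finding starts at the beginning of a line in the format shown in the excerpt above; the function name summarize_fsck is hypothetical.

```python
import re

CORRUPT_RE = re.compile(
    r"^(?P<path>/.+?): CORRUPT blockpool (?P<pool>\S+) block (?P<block>\S+)"
)
MISSING_RE = re.compile(
    r"^(?P<path>/.+?): MISSING (?P<count>\d+) blocks of total size (?P<size>\d+) B"
)


def summarize_fsck(output: str):
    """Collect corrupt block IDs per file and the total missing size in bytes."""
    corrupt = {}        # path -> list of corrupt block IDs
    missing_bytes = 0
    for raw in output.splitlines():
        line = raw.strip()
        m = CORRUPT_RE.match(line)
        if m:
            corrupt.setdefault(m["path"], []).append(m["block"])
            continue
        m = MISSING_RE.match(line)
        if m:
            missing_bytes += int(m["size"])
    return corrupt, missing_bytes


# A few lines taken from the fsck report in this post.
sample = """\
/out/part-00000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744061_3237. Target Replicas is 3 but found 1 replica(s).
/spark/checkpointdata/checkpoint-1543419790000: CORRUPT blockpool BP-848574762-192.168.137.141-1528737264517 block blk_1073745974
/spark/checkpointdata/checkpoint-1543419790000: MISSING 1 blocks of total size 4924 B
/spark/checkpointdata/checkpoint-1543419790000.bk: CORRUPT blockpool BP-848574762-192.168.137.141-1528737264517 block blk_1073745973
/spark/checkpointdata/checkpoint-1543419790000.bk: MISSING 1 blocks of total size 4943 B
"""
corrupt, missing_bytes = summarize_fsck(sample)
for path, blocks in corrupt.items():
    print(path, blocks)
print("missing bytes:", missing_bytes)
```

If only the list of affected files is needed, hdfs fsck / -list-corruptfileblocks prints it directly and avoids parsing the full report.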
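4. Run hdfs fsck / -delete to delete the corrupted files, that is, the files whose blocks are missing (Note: this causes data loss; the affected files are removed from HDFS)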
[hadoop@hadoop001 hadoop]$ hdfs fsck / -delete
Connecting to namenode via http://hadoop001:50070
FSCK started by hadoop (auth:SIMPLE) from /192.168.137.141 for path / at Wed Jan 16 22:44:36 CST 2019
…
/out/part-00000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744061_3237. Target Replicas is 3 but found 1 replica(s).
.
/out/part-00001: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744060_3236. Target Replicas is 3 but found 1 replica(s).
…
/spark/checkpointdata/checkpoint-1543419775000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745968_5150. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419775000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745967_5149. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419780000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745970_5152. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419780000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745969_5151. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419785000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745972_5154. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419785000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745971_5153. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419790000: CORRUPT blockpool BP-848574762-192.168.137.141-1528737264517 block blk_1073745974
/spark/checkpointdata/checkpoint-1543419790000: MISSING 1 blocks of total size 4924 B…
/spark/checkpointdata/checkpoint-1543419790000.bk: CORRUPT blockpool BP-848574762-192.168.137.141-1528737264517 block blk_1073745973
/spark/checkpointdata/checkpoint-1543419790000.bk: MISSING 1 blocks of total size 4943 B…
/spark/checkpointdata/checkpoint-1543419795000: CORRUPT blockpool BP-848574762-192.168.137.141-1528737264517 block blk_1073745976
/spark/checkpointdata/checkpoint-1543419795000: MISSING 1 blocks of total size 4924 B…
/spark/checkpointdata/checkpoint-1543419795000.bk: CORRUPT blockpool BP-848574762-192.168.137.141-1528737264517 block blk_1073745975
/spark/checkpointdata/checkpoint-1543419795000.bk: MISSING 1 blocks of total size 4946 B…
/spark/checkpointdata/receivedBlockMetadata/log-1543419757424-1543419817424: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745960_5159. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-167/_partitioner: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744412_3588. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-167/part-00000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744410_3586. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-167/part-00001: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744411_3587. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-180/_partitioner: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744419_3595. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-180/part-00000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744417_3593. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-180/part-00001: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744418_3594. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092435000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744405_3581. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092435000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744401_3577. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092440000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744407_3583. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092440000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744406_3582. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092445000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744413_3589. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092445000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744409_3585. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092450000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744415_3591. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092450000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744414_3590. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092455000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744420_3596. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092455000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744416_3592. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/receivedBlockMetadata/log-1538092380422-1538092440422: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744365_3541. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/receivedBlockMetadata/log-1538092445001-1538092505001: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744408_3597. Target Replicas is 3 but found 1 replica(s).
…
…
…
…
…
…Status: CORRUPT
5. Run hdfs fsck / again
[hadoop@hadoop001 hadoop]$ hdfs fsck /
Connecting to namenode via http://hadoop001:50070
FSCK started by hadoop (auth:SIMPLE) from /192.168.137.141 for path / at Wed Jan 16 22:45:04 CST 2019
…
/out/part-00000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744061_3237. Target Replicas is 3 but found 1 replica(s).
.
/out/part-00001: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744060_3236. Target Replicas is 3 but found 1 replica(s).
…
/spark/checkpointdata/checkpoint-1543419775000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745968_5150. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419775000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745967_5149. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419780000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745970_5152. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419780000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745969_5151. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419785000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745972_5154. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/checkpoint-1543419785000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745971_5153. Target Replicas is 3 but found 1 replica(s).
.
/spark/checkpointdata/receivedBlockMetadata/log-1543419757424-1543419817424: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073745960_5159. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-167/_partitioner: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744412_3588. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-167/part-00000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744410_3586. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-167/part-00001: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744411_3587. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-180/_partitioner: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744419_3595. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-180/part-00000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744417_3593. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/89d2e5e2-ad1a-497b-ba5a-823ab475cf16/rdd-180/part-00001: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744418_3594. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092435000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744405_3581. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092435000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744401_3577. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092440000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744407_3583. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092440000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744406_3582. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092445000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744413_3589. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092445000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744409_3585. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092450000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744415_3591. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092450000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744414_3590. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092455000: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744420_3596. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/checkpoint-1538092455000.bk: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744416_3592. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/receivedBlockMetadata/log-1538092380422-1538092440422: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744365_3541. Target Replicas is 3 but found 1 replica(s).
.
/streaming/checkpoint/receivedBlockMetadata/log-1538092445001-1538092505001: Under replicated BP-848574762-192.168.137.141-1528737264517:blk_1073744408_3597. Target Replicas is 3 but found 1 replica(s).
…
…
…
…
…
…Status: HEALTHY
Total size: 730143705 B
Total dirs: 113
Total files: 567
Total symlinks: 0
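The 4 files with missing blocks have been deleted, and the check now reports HEALTHY.
Of course, this is a crude method that is only acceptable when the data can be lost; deleting data outright like this is not an option in production.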
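The size of the loss can be read off the two fsck summaries. The sketch below compares the "Total size" and "Total files" lines from before and after the -delete run, using the figures printed in this post. The helper parse_fsck_totals is hypothetical and assumes both lines are present in the text it is given.

```python
import re


def parse_fsck_totals(summary: str) -> dict:
    """Read the 'Total size' and 'Total files' lines from an fsck summary.

    Assumes both lines are present; a missing line raises AttributeError.
    """
    size = re.search(r"Total size:\s*(\d+) B", summary)
    files = re.search(r"Total files:\s*(\d+)", summary)
    return {"size": int(size.group(1)), "files": int(files.group(1))}


before = parse_fsck_totals("Total size: 730163442 B\nTotal files: 571\n")
after = parse_fsck_totals("Total size: 730143705 B\nTotal files: 567\n")

print("files removed:", before["files"] - after["files"])  # 4
print("bytes removed:", before["size"] - after["size"])    # 19737
```

The 19737 bytes removed match the MISSING SIZE reported by the first fsck run, so only the four checkpoint files with missing blocks were deleted.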
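So how should this situation be handled in production?
(1) First run hdfs fsck <path> -files -blocks -locations to find where the blocks are located and which data is missing.
(2) Then run hdfs debug recoverLease -path <path> [-retries <num-retries>] to attempt recovery of each affected path found above (the command recovers the lease on the given file); the last argument is the number of retries.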
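The two steps can be scripted per affected file. The sketch below only builds and prints the command lines so they can be reviewed before anything is run. The helper names and the default of 3 retries are assumptions for illustration, and the paths are two of the files flagged as CORRUPT in the fsck report above.

```python
import shlex


def fsck_locations_cmd(path: str) -> list:
    """Command that lists files, blocks and replica locations under a path."""
    return ["hdfs", "fsck", path, "-files", "-blocks", "-locations"]


def recover_lease_cmd(path: str, retries: int = 3) -> list:
    """Command that asks the NameNode to recover the lease on one file.

    The default of 3 retries is an arbitrary choice for this sketch.
    """
    return ["hdfs", "debug", "recoverLease", "-path", path, "-retries", str(retries)]


affected = [
    "/spark/checkpointdata/checkpoint-1543419790000",
    "/spark/checkpointdata/checkpoint-1543419790000.bk",
]
for p in affected:
    # Print shell-quoted command lines instead of executing them.
    print(" ".join(shlex.quote(a) for a in fsck_locations_cmd(p)))
    print(" ".join(shlex.quote(a) for a in recover_lease_cmd(p)))
```

Note that if no DataNode still holds a replica of a block, there is nothing to recover from, and the file has to be restored from a backup or regenerated.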