阿龙学堂: HDFS Storage Data Skew

1. Symptom

The data storage skew looks like this:

[Image 1: HDFS storage data skew]

2. Solution

Add the following parameters to [hdfs-site.xml], then restart the NameNode and the DataNodes.

Parameters to set:

dfs.datanode.balance.bandwidthPerSec=52428800
dfs.datanode.balance.max.concurrent.moves=100
dfs.balance.bandwidthPerSec=52428800
dfs.datanode.max.xcievers=16384
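
hdfs-site.xml stores each setting as a `<property>` element rather than a `key=value` line. Below is a minimal sketch of that conversion for the four pairs above; the helper name `to_property` is hypothetical and exists only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: turn one "name=value" pair into an hdfs-site.xml <property> element.
to_property() {
  name=${1%%=*}    # text before the first '='
  value=${1#*=}    # text after the first '='
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$name" "$value"
}

# The four settings listed above, unchanged.
for kv in \
  dfs.datanode.balance.bandwidthPerSec=52428800 \
  dfs.datanode.balance.max.concurrent.moves=100 \
  dfs.balance.bandwidthPerSec=52428800 \
  dfs.datanode.max.xcievers=16384
do
  to_property "$kv"
done
```

The printed elements go inside the `<configuration>` element of hdfs-site.xml. The bandwidth value 52428800 is in bytes per second, that is 50 MiB/s.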

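Note: after changing these parameters, distribute the configuration to every node in the cluster and then restart HDFS. A rolling restart is sufficient.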
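
Distributing the file can be scripted. The sketch below is a dry run that only prints the copy commands; the hostnames and the `/etc/hadoop/conf` path are placeholders, not values from this cluster, and the restart itself is left to whatever tooling manages the cluster.

```shell
#!/bin/sh
# Placeholder values: replace with the real node list and config directory.
NODES="node1 node2 node3"
CONF_DIR="/etc/hadoop/conf"

# Print (not run) one scp command per node; remove the leading "echo" to copy for real.
push_conf() {
  for host in $NODES; do
    echo scp "$CONF_DIR/hdfs-site.xml" "$host:$CONF_DIR/"
  done
}

push_conf
```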

3. Start the balancer

nohup hdfs balancer -threshold 10 > balancer.log &
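
The `-threshold 10` argument is a percentage of disk capacity: the balancer treats a DataNode as balanced when its utilization is within 10 percentage points of the cluster-wide average. Here is a rough sketch of that check with whole-number percentages; the function name and the sample utilization figures are made up for illustration.

```shell
#!/bin/sh
# Hypothetical check: is a node within the threshold of the cluster average?
# Args: node utilization %, cluster average utilization %, threshold % (integers).
is_balanced() {
  diff=$(( $1 - $2 ))
  if [ "$diff" -lt 0 ]; then
    diff=$(( 0 - diff ))   # absolute value
  fi
  [ "$diff" -le "$3" ]
}

# Illustrative numbers only, not taken from the cluster in this post.
for node_util in 85 62 41; do
  if is_balanced "$node_util" 60 10; then
    echo "$node_util% used: balanced"
  else
    echo "$node_util% used: needs rebalancing"
  fi
done
```

A smaller threshold gives a more even cluster but makes the balancer move more blocks and run longer.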

4. Troubleshooting

20/07/27 17:26:56 WARN balancer.Dispatcher: Failed to move blk_1135077177_61336722 with size=46096019 from 172.16.32.10:4001:DISK to 172.16.32.9:4001:DISK through 172.16.32.13:4001
java.io.IOException: Got error, status=ERROR, status message Not able to receive block 1135077177 from /172.16.32.15:34634 because threads quota is exceeded., block move is failed
	at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:121)
	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:431)
	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:372)
	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$3000(Dispatcher.java:230)
	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:1056)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

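The exception above is a warning that the DataNode's thread limit is configured too low. Increasing the parameter [dfs.datanode.max.xcievers] resolves it. The message is raised when a DataNode has no free slot for another balancer block move, so the concurrent move limit [dfs.datanode.balance.max.concurrent.moves] is also worth checking.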
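
To gauge how often this failure occurs, the balancer output can be searched for the message text. This is a small sketch, assuming the output was redirected to balancer.log as in step 3; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count block moves rejected because the thread quota was exceeded.
# Arg: path to the balancer output file (balancer.log by default).
count_quota_errors() {
  grep -c 'threads quota is exceeded' "${1:-balancer.log}"
}

# Only report if the log exists in the current directory.
if [ -f balancer.log ]; then
  echo "quota-exceeded failures: $(count_quota_errors balancer.log)"
fi
```

If the count keeps growing after the configuration change and restart, the new values may not have reached every DataNode.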

In this case, the value of [dfs.datanode.max.xcievers] was raised from a small value to a larger one, for example:

[Image 2: HDFS storage data skew]
