Leftover Hive child process repeatedly hitting HDFS

The NameNode log was filling up with rename-failure warnings, and the NameNode was running GC very frequently.

2019-10-22 12:18:15,826 WARN  hdfs.StateChange (FSDirRenameOp.java:validateRenameSource(559)) - DIR* FSDirectory.unprotectedRenameTo: rename source /apps/hive/warehouse/zs_db.db/umetrip_client_all/.hive-staging_hive_2019-10-21_15-05-03_662_6055882197773119796-53816/-ext-10000/000080_0 is not found.
2019-10-22 12:18:15,827 WARN  hdfs.StateChange (FSDirRenameOp.java:validateRenameSource(559)) - DIR* FSDirectory.unprotectedRenameTo: rename source /apps/hive/warehouse/zs_db.db/umetrip_client_all/.hive-staging_hive_2019-10-21_15-05-03_662_6055882197773119796-53816/-ext-10000/000080_0 is not found.
2019-10-22 12:18:15,827 WARN  hdfs.StateChange (FSDirRenameOp.java:validateRenameSource(559)) - DIR* FSDirectory.unprotectedRenameTo: rename source /apps/hive/warehouse/zs_db.db/umetrip_client_all/.hive-staging_hive_2019-10-21_15-05-03_662_6055882197773119796-53816/-ext-10000/000080_0 is not found.
2019-10-22 12:18:15,828 WARN  hdfs.StateChange (FSDirRenameOp.java:validateRenameSource(559)) - DIR* FSDirectory.unprotectedRenameTo: rename source /apps/hive/warehouse/zs_db.db/umetrip_client_all/.hive-staging_hive_2019-10-21_15-05-03_662_6055882197773119796-53816/-ext-10000/000080_0 is not found.
2019-10-22 12:18:15,828 WARN  hdfs.StateChange (FSDirRenameOp.java:validateRenameSource(559)) - DIR* FSDirectory.unprotectedRenameTo: rename source /apps/hive/warehouse/zs_db.db/umetrip_client_all/.hive-staging_hive_2019-10-21_15-05-03_662_6055882197773119796-53816/-ext-10000/000080_0 is not found.

The log shows that some client keeps trying to rename the file /apps/hive/warehouse/zs_db.db/umetrip_client_all/.hive-staging_hive_2019-10-21_15-05-03_662_6055882197773119796-53816/-ext-10000/000080_0, but that file does not exist.
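To see how often each missing source path shows up, the warnings can be tallied with a small script. This is a minimal sketch: the regular expression is an assumption derived only from the sample lines above, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the WARN lines quoted in this post (an assumption;
# other Hadoop versions may format the message differently).
RENAME_MISSING = re.compile(
    r"unprotectedRenameTo: rename source (?P<path>\S+) is not found"
)


def count_missing_rename_sources(lines):
    """Return a Counter mapping each missing rename source path to its hit count."""
    counts = Counter()
    for line in lines:
        match = RENAME_MISSING.search(line)
        if match:
            counts[match.group("path")] += 1
    return counts


# Two of the log lines from this post, used as sample input.
SOURCE = (
    "/apps/hive/warehouse/zs_db.db/umetrip_client_all/"
    ".hive-staging_hive_2019-10-21_15-05-03_662_6055882197773119796-53816"
    "/-ext-10000/000080_0"
)
sample = [
    "2019-10-22 12:18:15,826 WARN  hdfs.StateChange (FSDirRenameOp.java:"
    "validateRenameSource(559)) - DIR* FSDirectory.unprotectedRenameTo: "
    f"rename source {SOURCE} is not found.",
    "2019-10-22 12:18:15,827 WARN  hdfs.StateChange (FSDirRenameOp.java:"
    "validateRenameSource(559)) - DIR* FSDirectory.unprotectedRenameTo: "
    f"rename source {SOURCE} is not found.",
]

for path, hits in count_missing_rename_sources(sample).most_common():
    print(hits, path)
```

Feeding it the lines of the real NameNode log instead of `sample` gives a quick ranking of which paths are being hammered.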

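Judging from the path, this looked like the work of a Hive job, but no Hive job was running on the cluster at the time. The symptom is similar to
https://issues.apache.org/jira/browse/HIVE-7273.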

Workaround:
Create the file /apps/hive/warehouse/zs_db.db/umetrip_client_all/.hive-staging_hive_2019-10-21_15-05-03_662_6055882197773119796-53816/-ext-10000/000080_0
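The post does not record which command was used to create the file. One plausible way, sketched here as an assumption, is `hdfs dfs -mkdir -p` for the parent directory followed by `hdfs dfs -touchz` for a zero-length file. The helper names and the dry-run switch are hypothetical.

```python
import posixpath
import subprocess


def build_placeholder_commands(missing_path):
    """Build the hdfs CLI calls that create an empty file at missing_path."""
    parent = posixpath.dirname(missing_path)
    return [
        ["hdfs", "dfs", "-mkdir", "-p", parent],   # make sure the parent directory exists
        ["hdfs", "dfs", "-touchz", missing_path],  # create a zero-length file
    ]


def create_placeholder(missing_path, dry_run=True):
    """Print the commands (dry run) or execute them against the cluster."""
    for cmd in build_placeholder_commands(missing_path):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Dry run for the path from the NameNode log; nothing is executed here.
create_placeholder(
    "/apps/hive/warehouse/zs_db.db/umetrip_client_all/"
    ".hive-staging_hive_2019-10-21_15-05-03_662_6055882197773119796-53816"
    "/-ext-10000/000080_0"
)
```

Keeping the default at `dry_run=True` lets the commands be reviewed before anything touches the warehouse directory.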

Once the file was created, the related warnings stopped appearing in the NameNode log and the GC frequency returned to normal.

A new file, 000080_0_copy_131756330, appeared under /apps/hive/warehouse/zs_db.db/umetrip_client_all/:

drwxr-xr-x   - umecron hadoop          0 2019-10-22 19:19 /apps/hive/warehouse/zs_db.db/umetrip_client_all/.hive-staging_hive_2019-10-21_15-05-03_662_6055882197773119796-53816
-rw-r--r--   3 umecron hadoop          0 2019-10-22 19:21 /apps/hive/warehouse/zs_db.db/umetrip_client_all/000080_0_copy_131756330

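A likely cause is that a Reducer thread from a Hive job that had run on the YARN cluster was left behind, stuck in an attempt to rename a file that did not exist. Once the file was created by hand, the rename succeeded and the leftover thread ran to completion.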
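The `_copy_131756330` suffix is consistent with a retry loop that appends an increasing `_copy_N` suffix to the destination every time a rename fails, and that never checks whether the source exists. The simulation below is only a sketch of that hypothesis: `FakeFs` is an in-memory stand-in rather than the HDFS API, and the loop is inferred from the file name, not copied from Hive's source.

```python
class FakeFs:
    """In-memory stand-in for a file system (hypothetical, not the HDFS API)."""

    def __init__(self, files=(), create_after=None):
        self.files = set(files)
        self.rename_calls = 0
        # create_after = (path, n): path appears once n rename calls have been
        # made, imitating an operator creating the file while the loop spins.
        self.create_after = create_after

    def rename(self, src, dst):
        if self.create_after and self.rename_calls == self.create_after[1]:
            self.files.add(self.create_after[0])
        self.rename_calls += 1
        # A missing source or an existing destination returns False, it does not raise.
        if src not in self.files or dst in self.files:
            return False
        self.files.remove(src)
        self.files.add(dst)
        return True


def move_with_copy_suffix(fs, src, dest_dir, max_attempts=None):
    """Retry rename with _copy_N destinations until it succeeds (assumed behaviour)."""
    name = src.rsplit("/", 1)[-1]
    dst = f"{dest_dir}/{name}"
    counter = 0
    while not fs.rename(src, dst):
        counter += 1
        if max_attempts is not None and counter >= max_attempts:
            return None  # safety cap; the assumed original loop has none
        dst = f"{dest_dir}/{name}_copy_{counter}"
    return dst


src = "/warehouse/t/.hive-staging_x/-ext-10000/000080_0"

# Source never appears: every attempt fails, so only the cap stops the loop.
stuck = FakeFs()
print(move_with_copy_suffix(stuck, src, "/warehouse/t", max_attempts=5), stuck.rename_calls)

# Source is created after three failed attempts: the next rename succeeds.
fixed = FakeFs(create_after=(src, 3))
print(move_with_copy_suffix(fixed, src, "/warehouse/t"), fixed.rename_calls)
```

Under this reading, each failed attempt costs one rename RPC and one WARN line on the NameNode, and the number in the final file name is simply how many attempts had failed before the file was created.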
