"Could not get block locations" in Hive

Symptom: the Hive job fails because a file cannot be found during computation.
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.io.IOException: Could not get block locations. Source file "/user/hive/warehouse/dwd.db/dwd_mpi_patient_info/.hive-staging_hive_2020-07-15_12-51-27_860_3463437152013258885-1/_task_tmp.-ext-10000/ppi=2019-04-18/_tmp.000098_2" - Aborting...block==null
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:286)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:454)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:393)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Could not get block locations. Source file "/user/hive/warehouse/dwd.db/dwd_mpi_patient_info/.hive-staging_hive_2020-07-15_12-51-27_860_3463437152013258885-1/_task_tmp.-ext-10000/ppi=2019-04-18/_tmp.000098_2" - Aborting...block==null
at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:198)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1058)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:686)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:700)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:700)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:277)
... 7 more
Caused by: java.io.IOException: Could not get block locations. Source file "/user/hive/warehouse/dwd.db/dwd_mpi_patient_info/.hive-staging_hive_2020-07-15_12-51-27_860_3463437152013258885-1/_task_tmp.-ext-10000/ppi=2019-04-18/_tmp.000098_2" - Aborting...block==null
at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1477)
at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1256)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:667)

Task attempt attempt_1594715484213_1313_r_000098_2 is done from TaskUmbilicalProtocol's point of view. However, it stays in finishing state for too long
[2020-07-15 13:00:16.795]Container killed by the ApplicationMaster.
[2020-07-15 13:00:16.795]Sent signal OUTPUT_THREAD_DUMP (SIGQUIT) to pid 115208 as user ngariHZ for container container_1594715484213_1313_01_000230, result=success
[2020-07-15 13:00:16.809]Container killed on request. Exit code is 143
[2020-07-15 13:00:16.819]Container exited with a non-zero exit code 143.

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 26 Reduce: 99 Cumulative CPU: 903.72 sec HDFS Read: 6643488692 HDFS Write: 46424204 HDFS EC Read: 0 FAIL
Total MapReduce CPU Time Spent: 15 minutes 3 seconds 720 msec

Cause: mapred.task.timeout is set too short. As the log above shows, the task's status did not change for about 200 seconds, so Hadoop killed the task and cleaned up its temporary directory; the subsequent stages then could not find the temporary data.
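To confirm the effective value before changing anything, you can echo it from the Hive CLI: a SET command without a value prints the current setting. Note that on Hadoop 2.x and later the canonical property name is mapreduce.task.timeout; mapred.task.timeout, as seen in this log, is its deprecated alias.

    -- Print the effective task timeout in the current Hive session
    SET mapred.task.timeout;
    -- Non-deprecated name on Hadoop 2.x+
    SET mapreduce.task.timeout;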

Fix: modify the parameter.

mapred.task.timeout is currently 200000. Per its documentation: "The number of milliseconds before a task will be terminated if it neither reads an input, writes an output, nor updates its status string." Changing mapred.task.timeout to 10 minutes (600000) resolves the problem.
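A minimal sketch of the fix from a Hive session, assuming the job is launched from the Hive CLI or Beeline (a session-level SET only affects jobs submitted from that session):

    -- Raise the task timeout to 10 minutes (value is in milliseconds)
    SET mapred.task.timeout=600000;
    -- Equivalent non-deprecated name on Hadoop 2.x+
    SET mapreduce.task.timeout=600000;

For a permanent, cluster-wide fix, set mapreduce.task.timeout in mapred-site.xml instead. Avoid setting it to 0: that disables the timeout check entirely, so genuinely hung tasks would never be killed.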
