Hive exit code 143

Examining task ID: task_1468807885116_444914_r_000469 (and more) from job job_1468807885116_444914
Examining task ID: task_1468807885116_444914_r_000490 (and more) from job job_1468807885116_444914

Task with the most failures(1): 
-----
Task ID:
  task_1468807885116_444914_r_000075

URL:
  http://hadoop-jr-nn02.pekdc1.jdfin.local:8088/taskdetails.jsp?jobid=job_1468807885116_444914&tipid=task_1468807885116_444914_r_000075
-----
Diagnostic Messages for this Task:
Task KILL is received. Killing attempt!
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143


FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 19008  Reduce: 501   Cumulative CPU: 173568.6 sec   HDFS Read: 128356858364 HDFS Write: 2411144083 FAIL
Total MapReduce CPU Time Spent: 2 days 0 hours 12 minutes 48 seconds 600 msec

Searching online, I found that exit code 143 is generally caused by running out of memory, so I increased the memory settings:

set mapreduce.map.memory.mb=16384;
set mapreduce.map.java.opts=-Xmx13106M;
set mapred.map.child.java.opts=-Xmx13106M;
set mapreduce.reduce.memory.mb=16384;
set mapreduce.reduce.java.opts=-Xmx13106M; -- roughly reduce.memory.mb * 0.8
set mapreduce.task.io.sort.mb=512;
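
For context, the -Xmx values follow the rule of thumb hinted at by the reduce.memory.mb * 0.8 comment: keep the JVM heap at roughly 80% of the YARN container size, leaving headroom for off-heap usage (thread stacks, direct buffers, metaspace). A quick arithmetic check with this job's numbers; the 80% figure is my reading of the comment, not something stated in the logs:

-- container:  mapreduce.reduce.memory.mb = 16384 MB
-- heap:       16384 MB * 0.8 = 13107.2 MB  ->  -Xmx13106M
-- If -Xmx were set near the full 16384 MB, YARN could kill the container
-- (exit code 143) once total JVM memory use exceeded the container allocation.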

After these changes the job still failed with 143, so my gut feeling was that memory was not really the problem. Reading the logs, I saw that the reduce tasks started running very early and that a batch of tasks was eventually killed by the ApplicationMaster, which made me suspect the parallelism was too high. I therefore decided to delay the point at which reduce tasks start executing. The log below shows reducers already running while the maps were only about half done:

2016-07-29 11:01:55,483 Stage-1 map = 47%,  reduce = 0%, Cumulative CPU 70616.67 sec
2016-07-29 11:02:02,433 Stage-1 map = 48%,  reduce = 0%, Cumulative CPU 73860.9 sec
2016-07-29 11:02:03,594 Stage-1 map = 49%,  reduce = 2%, Cumulative CPU 75543.31 sec
2016-07-29 11:02:04,735 Stage-1 map = 51%,  reduce = 8%, Cumulative CPU 80289.2 sec
2016-07-29 11:02:11,624 Stage-1 map = 61%,  reduce = 18%, Cumulative CPU 95621.34 sec
2016-07-29 11:02:12,772 Stage-1 map = 64%,  reduce = 18%, Cumulative CPU 98355.46 sec
2016-07-29 11:02:30,284 Stage-1 map = 82%,  reduce = 27%, Cumulative CPU 121605.43 sec
2016-07-29 11:02:33,633 Stage-1 map = 85%,  reduce = 28%, Cumulative CPU 124751.68 sec
2016-07-29 11:02:39,631 Stage-1 map = 88%,  reduce = 29%, Cumulative CPU 127854.76 sec
2016-07-29 11:02:44,473 Stage-1 map = 90%,  reduce = 30%, Cumulative CPU 130540.72 sec
2016-07-29 11:02:48,743 Stage-1 map = 94%,  reduce = 31%, Cumulative CPU 133265.67 sec
2016-07-29 11:05:10,438 Stage-1 map = 100%,  reduce = 96%, Cumulative CPU 188791.68 sec
[Fatal Error] total number of created files now is 50062, which exceeds 50000. Killing the job.
MapReduce Total cumulative CPU time: 2 days 4 hours 26 minutes 31 seconds 680 msec
Ended Job = job_1468807885116_443473 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1468807885116_443473_m_005492 (and more) from job job_1468807885116_443473
Examining task ID: task_1468807885116_443473_m_003818 (and more) from job job_1468807885116_443473
Examining task ID: task_1468807885116_443473_m_001767 (and more) from job job_1468807885116_443473
………………
Examining task ID: task_1468807885116_443473_m_002210 (and more) from job job_1468807885116_443473
Change the reduce tasks to start only after 80% of the maps have completed:
set mapreduce.job.reduce.slowstart.completedmaps=0.8;
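
As a side note, the current value of a property can be checked before overriding it; in the Hive CLI, `set` with just the key prints the effective value. The default mentioned in the comment below is the stock Hadoop 2.x default, which is my addition, not something from this job's logs:

set mapreduce.job.reduce.slowstart.completedmaps;
-- prints the effective value; the Hadoop 2.x default is 0.05, i.e. reducers
-- are scheduled as soon as 5% of the maps finish

With 19008 maps and 501 reducers, the 0.05 default meant reduce containers could be scheduled, and sit holding resources, while almost all maps were still running, which is why so many early attempts ended up killed.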

After this change, the number of tasks killed by the ApplicationMaster dropped sharply, but a new error appeared, as shown below:

2016-07-29 12:44:32,119 Stage-1 map = 100%,  reduce = 97%, Cumulative CPU 178926.55 sec
[Fatal Error] total number of created files now is 50074, which exceeds 50000. Killing the job.
MapReduce Total cumulative CPU time: 2 days 1 hours 42 minutes 6 seconds 550 msec
Ended Job = job_1468807885116_449213 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 

This job performs a dynamic partition insert, and because the data volume is so large, the number of files produced by dynamic partitioning exceeded the cluster's limit:

set hive.exec.max.created.files=60000;

Raising the limit on the number of created files finally solved the problem.
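
Raising hive.exec.max.created.files treats the symptom. An alternative worth knowing (not used in this job; the table and column names below are hypothetical) is to cut the number of files dynamic partitioning creates in the first place, by routing all rows of a given partition value to the same reducer:

-- Hypothetical sketch: DISTRIBUTE BY the partition column so each partition
-- is written by a single reducer, instead of every reducer writing a file
-- into every partition it happens to see rows for.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE target_table PARTITION (dt)
SELECT col_a, col_b, dt
FROM source_table
DISTRIBUTE BY dt;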

For reference, the job's final parameters:

set mapred.output.compress=true;
set hive.exec.compress.output=true;
set mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
set io.compression.codecs=com.hadoop.compression.lzo.LzopCodec;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set mapreduce.map.memory.mb=16384;
set mapreduce.map.java.opts=-Xmx13106M;
set mapred.map.child.java.opts=-Xmx13106M;
set mapreduce.reduce.memory.mb=16384;
set mapreduce.reduce.java.opts=-Xmx13106M;
set mapreduce.job.reduce.slowstart.completedmaps=0.8;
set hive.exec.max.created.files=60000;
set hive.merge.mapredfiles=true; -- merge small files when the MapReduce job finishes
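
Note that hive.merge.mapredfiles only switches the merge stage on; the thresholds that control it have their own settings. The values below are the Hive defaults as I know them, shown for context rather than taken from this job:

set hive.merge.mapfiles=true;               -- also merge the output of map-only jobs
set hive.merge.size.per.task=256000000;     -- target size (bytes) of merged files
set hive.merge.smallfiles.avgsize=16000000; -- merge when avg output file size falls below this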
