Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hado

While merging data today, I found that the map phase succeeded but the reduce phase kept failing.

Brief description of the task: each day's collected data is merged into a summary table that is partitioned by day.

As the output below shows, the map tasks all succeed, but the job fails in the final reduce stage. Looking for the cause:

Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$PathComponentTooLongException): The maximum path component name limit of query_date=

hive> from log_169_searchd_pro_20141122 insert into table searchd_pro1 PARTITION (query_date)

    > select to_date(query_date),real_time,wall_time,match_mode,filters_count,sort_mode,total_matches,offset,index_name,query;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapreduce.job.reduces=
Starting Job = job_1417056252041_0024, Tracking URL = http://master:8088/proxy/application_1417056252041_0024/
Kill Command = /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop/bin/hadoop job  -kill job_1417056252041_0024
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2014-12-02 08:22:48,257 Stage-1 map = 0%,  reduce = 0%
2014-12-02 08:23:10,167 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 17.47 sec
2014-12-02 08:23:24,801 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 22.34 sec
2014-12-02 08:23:27,935 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 17.47 sec
2014-12-02 08:23:39,476 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 22.16 sec
2014-12-02 08:23:43,664 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 17.47 sec
2014-12-02 08:23:55,185 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 22.22 sec
2014-12-02 08:23:58,325 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 17.47 sec
2014-12-02 08:24:09,757 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 22.25 sec
2014-12-02 08:24:12,887 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 17.47 sec
MapReduce Total cumulative CPU time: 17 seconds 470 msec
Ended Job = job_1417056252041_0024 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1417056252041_0024_m_000000 (and more) from job job_1417056252041_0024


Task with the most failures(4): 
-----
Task ID:
  task_1417056252041_0024_r_000000


URL:
  http://master:8088/taskdetails.jsp?jobid=job_1417056252041_0024&tipid=task_1417056252041_0024_r_000000
-----
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$PathComponentTooLongException): The maximum path component name limit of query_date=%2212 位 太阳能 计算器 , 8 位 礼品 计算器 , 语音 计算器 12 位 , 8 位 太阳能 计算器 , 12 位 白色 计算器 , 8 位 数字 计算器 , 8 位 硅胶 计算器 , 8 位 翻盖 计算器 , 8 位 塑胶 计算器 , 12 位 台式 计算器 , 8 位 台式 计算器 , 8 位数 显 计算器%22%2F24 in directory /tmp/hive-hdfs/hive_2014-12-02_08-22-34_112_762235043055488695-1/_task_tmp.-ext-10000 is exceeded: limit=255 length=329
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxComponentLength(FSDirectory.java:1915)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1989)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1759)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsRecursively(FSNamesystem.java:4149)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2625)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2509)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2397)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:550)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:108)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:388)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)


        at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:283)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)

        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)


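Troubleshooting: the trace above shows that a limit was exceeded during the merge. The path component starting with query_date= has length 329 against a limit of 255. To narrow it down, this time insert only 2 records: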

insert into table searchd_pro1 PARTITION (query_date) 
select to_date(query_date),real_time,wall_time,match_mode,filters_count,sort_mode,total_matches,offset,index_name,query from log_170_searchd_pro_20141130 limit 2;

MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 4.75 sec   HDFS Read: 65802 HDFS Write: 261 SUCCESS
Stage-Stage-2: Map: 1  Reduce: 1   Cumulative CPU: 3.45 sec   HDFS Read: 621 HDFS Write: 324 SUCCESS
Total MapReduce CPU Time Spent: 8 seconds 200 msec
OK
Time taken: 71.38 seconds

It succeeded. Looking at the data:

hive> select *
    >  from searchd_pro1;
OK
NULL    0.05    0.051   NULL    0       NULL    1       (0,4000)        product_distri  "遂宁" "仔猪" "行情"
NULL    0.055   0.055   NULL    0       NULL    0       (0,4000)        product_new_distri      wb 200 f "充电器" "包邮"
Time taken: 0.178 seconds, Fetched: 2 row(s)

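There are NULL values, and the data is clearly wrong. With a dynamic partition insert, Hive takes the partition value from the last column of the SELECT list, so here query_date was filled from the query column instead of the date. That column holds a large number of distinct and often long search strings, so Hive tries to create one partition directory per query string. Once a directory name exceeds the limit of 255 reported in the error, the reducer fails.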
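
The length check that rejects the insert can be reproduced in isolation. This is a minimal Python sketch, not Hive or HDFS code. It assumes that the limit of 255 in the error message counts the UTF-8 bytes of the `query_date=<value>` directory name. The helper names are hypothetical, and the long string is a made-up stand-in for a real search query. Hive's escaping (for example `"` becoming `%22`, as seen in the error) is ignored, which only makes real names longer.

```python
MAX_COMPONENT_BYTES = 255  # limit reported in the error message


def partition_dir_name(column, value):
    """Build the directory name used for one partition, e.g. query_date=2014-11-30."""
    return f"{column}={value}"


def fits_hdfs_limit(column, value, limit=MAX_COMPONENT_BYTES):
    """Return True if the partition directory name is within the byte limit (assumed UTF-8)."""
    return len(partition_dir_name(column, value).encode("utf-8")) <= limit


# A real date is a short partition value.
print(fits_hdfs_limit("query_date", "2014-11-30"))  # True

# A search query used as the partition value by mistake (made-up example).
long_query = '"8 位 太阳能 计算器" ' * 20
print(fits_hdfs_limit("query_date", long_query))    # False
```

A date such as 2014-11-30 stays far below the limit, while a query string of a few hundred bytes does not, which is the situation in the failing job.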

Solution: put the partition column last in the SELECT list, so that every column lines up with its correct position.

hive> insert into table searchd_pro PARTITION (query_date)
    > select real_time,wall_time,match_mode,filters_count,sort_mode,total_matches,offset,index_name,query,to_date(query_date) from log_170_searchd_pro_20141130;
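
The difference between the failing and the corrected statement comes down to positional mapping. The Python sketch below only illustrates that idea. The target column order is an assumption inferred from the corrected SELECT list (regular columns first, partition column last), and `map_by_position` is a hypothetical helper, not a Hive API.

```python
# Assumed column order of the target table: regular columns first,
# then the dynamic partition column (query_date) last.
TARGET_COLUMNS = [
    "real_time", "wall_time", "match_mode", "filters_count", "sort_mode",
    "total_matches", "offset", "index_name", "query",
    "query_date",  # partition column
]

# SELECT list of the failing statement: the date expression comes first.
FAILING_SELECT = ["to_date(query_date)"] + TARGET_COLUMNS[:-1]

# SELECT list of the corrected statement: the date expression comes last.
FIXED_SELECT = TARGET_COLUMNS[:-1] + ["to_date(query_date)"]


def map_by_position(select_exprs):
    """Pair each SELECT expression with the target column at the same position."""
    return dict(zip(TARGET_COLUMNS, select_exprs))


failing = map_by_position(FAILING_SELECT)
fixed = map_by_position(FIXED_SELECT)

print(failing["real_time"])   # to_date(query_date)
print(failing["query_date"])  # query
print(fixed["query_date"])    # to_date(query_date)
```

In the failing order every value shifts one column to the right, which is consistent with the NULLs seen in the two-row test, and the query text ends up as the partition value.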


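Another approach: adding enough reducers can also make the job succeed, but the data would still be wrong in this case. So the real problem is not a shortage of reducers.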

Done.




