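1. Background
A table is imported from the business system's MySQL database to serve as the raw layer (ODS);
the DWD table is partitioned, with time split into day and hour partitions.
In a relational database, when rows are inserted into a partitioned table, the database automatically routes each row to the matching partition based on the value of the partition column. Hive offers a similar mechanism, dynamic partitioning (Dynamic Partition), but it has to be configured before it can be used.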
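As a minimal sketch of that configuration, the statements below enable dynamic partitioning and let Hive derive the day and hour partitions from the data. The table names (ods_work_order, dwd_work_order) and columns (id, order_no, create_time) are hypothetical and only illustrate the pattern; the day and hour formats are modeled on the values visible in the error log further down ("20230525" and "23").

```sql
-- Enable dynamic partitioning for the current session.
SET hive.exec.dynamic.partition=true;
-- nonstrict allows all partition columns to be dynamic;
-- strict mode requires at least one static partition value.
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Hypothetical tables and columns, for illustration only.
-- The dynamic partition columns must come last in the SELECT list,
-- in the same order as in the PARTITION clause.
INSERT OVERWRITE TABLE dwd_work_order PARTITION (`day`, `hour`)
SELECT
    id,
    order_no,
    create_time,
    date_format(create_time, 'yyyyMMdd') AS `day`,
    date_format(create_time, 'HH')       AS `hour`
FROM ods_work_order;
```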
2. Problems encountered
2.1 The default per-node dynamic partition limit of 100 is too small.
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2023-05-31 14:53:40,618 Stage-1 map = 0%, reduce = 0%
2023-05-31 14:53:47,728 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 4.17 sec
2023-05-31 14:54:18,126 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.17 sec
MapReduce Total cumulative CPU time: 4 seconds 170 msec
Ended Job = job_1684310429942_0076 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1684310429942_0076_m_000000 (and more) from job job_1684310429942_0076
Task with the most failures(4):
-----
Task ID:
task_1684310429942_0076_r_000000
URL:
http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1684310429942_0076&tipid=task_1684310429942_0076_r_000000
-----
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{},"value":{"_col0":170936,"_col1":"MO20230518020-GD_1","_col2":"LINE_ASS_5","_col3":"704528829","_col4":"单相表模块A","_col5":80,"_col6":80,"_col7":"2023-05-25 23:54:00","_col8":1,"_col9":"20230525","_col10":"23"}}
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:241)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:445)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:393)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{},"value":{"_col0":170936,"_col1":"MO20230518020-GD_1","_col2":"LINE_ASS_5","_col3":"704528829","_col4":"单相表模块A","_col5":80,"_col6":80,"_col7":"2023-05-25 23:54:00","_col8":1,"_col9":"20230525","_col10":"23"}}
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:229)
... 7 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveFatalException: [Error 20004]: Fatal error occurred when node tried to create too many dynamic partitions. The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. Maximum was set to 100 partitions per node, number of dynamic partitions on this node: 101
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:951)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:722)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:882)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:220)
... 7 more
23/05/31 14:54:42 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 4.17 sec HDFS Read: 591765 HDFS Write: 0 HDFS EC Read: 0 FAIL
Total MapReduce CPU Time Spent: 4 seconds 170 msec
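Cause: the source data covers a full year, so the day column alone has 365 distinct values, and with the default limit of 100 the job fails. Because the table is also partitioned by hour, the number of dynamic partitions is the number of distinct (day, hour) combinations, which can reach 365 × 24 = 8760, so the limits must be set above that count rather than just above 365.
Solution: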
set hive.exec.max.dynamic.partitions.pernode=600000;
set hive.exec.max.dynamic.partitions=6000000;
set hive.exec.max.created.files=6000000;
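Before picking the limits, it can help to count how many partitions the insert will actually create. The query below is a sketch under the assumption of a hypothetical source table ods_work_order whose create_time column is used to derive day and hour; the values in the SET statements are illustrative, not a recommendation.

```sql
-- Count the distinct (day, hour) combinations the insert would produce.
-- ods_work_order and create_time are hypothetical names.
SELECT COUNT(*) AS partition_cnt
FROM (
    SELECT DISTINCT
        date_format(create_time, 'yyyyMMdd') AS `day`,
        date_format(create_time, 'HH')       AS `hour`
    FROM ods_work_order
) t;

-- If the count is at most 8760 (365 days x 24 hours),
-- limits modestly above it are sufficient.
SET hive.exec.max.dynamic.partitions.pernode=10000;
SET hive.exec.max.dynamic.partitions=10000;
```

Keeping the limits close to the real partition count preserves their purpose as a guard against a query that accidentally creates a huge number of partitions.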
2.2 Error: GC overhead limit exceeded
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2023-05-31 15:01:52,502 Stage-1 map = 0%, reduce = 0%
2023-05-31 15:01:58,589 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 4.5 sec
2023-05-31 15:02:38,095 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.5 sec
MapReduce Total cumulative CPU time: 4 seconds 500 msec
Ended Job = job_1684310429942_0078 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1684310429942_0078_m_000000 (and more) from job job_1684310429942_0078
Task with the most failures(4):
-----
Task ID:
task_1684310429942_0078_r_000000
URL:
http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1684310429942_0078&tipid=task_1684310429942_0078_r_000000
-----
Diagnostic Messages for this Task:
Error: GC overhead limit exceeded
23/05/31 15:03:02 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 4.5 sec HDFS Read: 591780 HDFS Write: 0 HDFS EC Read: 0 FAIL
Total MapReduce CPU Time Spent: 4 seconds 500 msec
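Cause: insufficient memory.
When the JVM throws java.lang.OutOfMemoryError: GC overhead limit exceeded, it is signaling that too large a share of time is being spent on garbage collection and too little on useful work. By default, the JVM throws this error when more than 98% of the time is spent in GC and less than 2% of the heap is recovered.
Source: https://blog.csdn.net/qq_36908872/article/details/102685311
Solution: configure the JVM parameters: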
set mapred.child.java.opts=-Xmx8000m;
set mapreduce.map.java.opts=-Xmx8096m;
set mapreduce.reduce.java.opts=-Xmx8096m;
set mapreduce.map.memory.mb=8096;
set mapreduce.reduce.memory.mb=8096;
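One caveat on sizing: the JVM also needs memory outside the heap, so the -Xmx value is usually kept below the YARN container size rather than equal to it. The fragment below is a sketch that assumes a heap of roughly 80% of the container, a common rule of thumb; the exact ratio and sizes are assumptions to adjust for the cluster.

```sql
-- Container size requested from YARN for each task, in MB.
SET mapreduce.map.memory.mb=8192;
SET mapreduce.reduce.memory.mb=8192;

-- JVM heap at roughly 80% of the container (8192 * 0.8 is about 6553),
-- leaving headroom for non-heap memory.
SET mapreduce.map.java.opts=-Xmx6553m;
SET mapreduce.reduce.java.opts=-Xmx6553m;
```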
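Another optimization: append CLUSTER BY on a key column of the inserted data to the end of the insert query. This spreads the rows out and produces a number of reduce tasks, each of which processes only part of the data.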
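A sketch of the CLUSTER BY approach, again with hypothetical table and column names (ods_work_order, dwd_work_order, id, order_no, create_time). Clustering on the partition columns sends all rows of one partition to the same reducer, so each reducer writes only a subset of the partitions. The reducer count of 8 is an arbitrary illustrative value; it is set explicitly because with a small input Hive may otherwise plan a single reducer, as in the logs above.

```sql
-- Illustrative reducer count; tune to the data volume.
SET mapreduce.job.reduces=8;

-- Hypothetical tables and columns, for illustration only.
INSERT OVERWRITE TABLE dwd_work_order PARTITION (`day`, `hour`)
SELECT id, order_no, create_time, `day`, `hour`
FROM (
    SELECT
        id,
        order_no,
        create_time,
        date_format(create_time, 'yyyyMMdd') AS `day`,
        date_format(create_time, 'HH')       AS `hour`
    FROM ods_work_order
) t
-- Rows with the same (day, hour) go to the same reducer and arrive sorted.
CLUSTER BY `day`, `hour`;
```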
2.3 ERROR Unable to invoke factory method in class HushableRandomAccessFileAppender
main ERROR Unable to invoke factory method in class class org.apache.hadoop.hive.ql.log.HushableRandomAccessFileAppender for element HushableMutableRandomAccess. java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:132)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:952)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:892)
at org.apache.logging.log4j.core.appender.routing.RoutingAppender.createAppender(RoutingAppender.java:271)
at org.apache.logging.log4j.core.appender.routing.RoutingAppender.getControl(RoutingAppender.java:255)
at org.apache.logging.log4j.core.appender.routing.RoutingAppender.append(RoutingAppender.java:225)
at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:156)
at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:129)
at org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:120)
at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:448)
at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:433)
at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:417)
at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:403)
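Cause: MapReduce identified the small table as the large one and the large table as the small one, so the large table was loaded into memory and the job failed.
Solution: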
set hive.execution.engine=mr;
set hive.mapjoin.smalltable.filesize=55000000;
-- Disable automatic map join conversion so the small table is no longer loaded into memory
set hive.auto.convert.join = false;
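To check which join strategy Hive actually chooses, EXPLAIN can be run on the join before and after changing these settings. The sketch below uses hypothetical tables big_fact and small_dim. A plan containing a Map Join Operator means one side of the join is loaded into memory; with automatic conversion disabled, the plan should instead show a regular Join Operator in a reduce stage.

```sql
-- Hypothetical tables: big_fact (large) joined to small_dim (small).
SET hive.auto.convert.join=false;

-- Inspect the plan without running the job.
EXPLAIN
SELECT f.id, d.name
FROM big_fact f
JOIN small_dim d
  ON f.dim_id = d.id;
```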