Summary of Kylin Errors and Solutions

1. Error in the Build Cube step: Value not exists!

The MR log for this step shows "Not a valid value: 2017-05-31". There are two possible causes:

1. The dimension (lookup) table data changed during the build, so looking up this value in the dimension table finds no matching row.

2. The OLAP (fact) table joins a dimension table but uses only the join column; if a code in the fact table does not exist in the dimension table, this error is raised.

    Solution:

         1. Confirm whether the value exists in the dimension table.

         2. Determine why the dimension table does not contain it.

         3. Check whether the value is reasonable in the OLAP table.

         4. If the cause is case 2 above (the fact table joins the dimension table but uses only the join column, and a fact-table code missing from the dimension table triggers "Value not exists!"):

              you can set kylin.hive.config.override.forcejoin.enabled=true, which forces Kylin to join the dimension table and filters out fact-table values that are missing from it.

              However, if the dimension table itself is flawed (incomplete or empty), fact-table rows will be filtered out as well, so apply this setting according to your scenario.

               After setting it, verify that the generated flat-table SQL matches your expectation (a query like the sketch below can also be used to find the offending values beforehand).
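For steps 1-3, a quick way to find fact-table codes with no match in the dimension table is a left join in Hive; this is only a sketch, and the table and column names below are placeholders to be replaced with your own:

# Codes present in the fact (OLAP) table but missing from the dimension table;
# any value returned here can trigger "Value not exists!" during the build.
hive -e "SELECT DISTINCT f.code FROM olap.fact_table f LEFT JOIN olap.dim_table d ON f.code = d.code WHERE d.code IS NULL"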

              

2. Error in build step 3, Extract Fact Table Distinct Columns: ArrayIndexOutOfBoundsException: -1

Cause: the OLAP table has a column with the same name as a dimension-table column, or two dimension tables share a column name (or the model is wrong: for join columns, the dimension should be selected from the fact table; resolved).

Solution: remove the duplicated column names.
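A quick way to spot duplicated column names before building is to compare the schemas of the fact table and each lookup table; a minimal sketch, with placeholder table names:

# Collect the column names of the fact table and a lookup table, then print the names that appear in both.
hive -e "DESCRIBE olap.fact_table" | awk 'NF && $1 !~ /^#/ {print $1}' | sort -u > /tmp/fact_cols.txt
hive -e "DESCRIBE olap.dim_table"  | awk 'NF && $1 !~ /^#/ {print $1}' | sort -u > /tmp/dim_cols.txt
comm -12 /tmp/fact_cols.txt /tmp/dim_cols.txt   # columns defined in both tables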


3. Build step 4 (Build Dimension Dictionary) fails

 Failed to create dictionary on OLAP.OLAP_CUSTOMER_STOCK_2_DA.CUST_ID_STOCK

java.lang.RuntimeException: Failed to create dictionary on OLAP.OLAP_CUSTOMER_STOCK_2_DA.CUST_ID_STOCK

	at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:325)
	at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:222)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
	at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
	at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File does not exist: /kylin/kylin-kylin_metadata/resources/GlobalDict/dict/OLAP.OLAP_CUSTOMER_STOCK_2_DA/CUST_ID_STOCK/.index
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1828)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1712)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:588)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:365)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

	at org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:127)
	at org.apache.kylin.dict.DictionaryManager.getDictionary(DictionaryManager.java:114)
	at org.apache.kylin.dict.GlobalDictionaryBuilder.build(GlobalDictionaryBuilder.java:65)
	at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81)
	at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:323)
	... 14 more
Or a similar error log.

Cause: the column is configured with a global dictionary. If multiple segment build jobs are submitted at the same time and happen to reach the Build Dimension Dictionary step simultaneously, they operate on the same dictionary file and trigger this exception.

Avoid building segments that share a global dictionary in parallel. When this happens, resume the job or resubmit the build. (A distributed lock has since been added, so parallel builds should no longer hit this error.)
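Before resuming, it can help to confirm what actually exists on HDFS for the dictionary mentioned in the stack trace; a minimal sketch using the path from the log above (adjust the metadata prefix to your deployment):

# Inspect the global dictionary directory referenced by the failed job;
# the .index file reported as missing in the error should (or should not) show up here.
hadoop fs -ls -R /kylin/kylin-kylin_metadata/resources/GlobalDict/dict/OLAP.OLAP_CUSTOMER_STOCK_2_DA/CUST_ID_STOCK/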

4.

When a global dictionary is used, build step 4 (Build Dimension Dictionary) fails.

The error is shown below; OLAP.OLAP_LOG_WEB_TS_DI.ORIGINAL_SESSION_ID is the column configured as a global dictionary.

 

 

java.lang.RuntimeException: Failed to create dictionary on OLAP.OLAP_LOG_WEB_TS_DI.ORIGINAL_SESSION_ID
	at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:325)
	at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:222)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
	at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException
	at org.apache.kylin.dict.CachedTreeMap.writeValue(CachedTreeMap.java:240)
	at org.apache.kylin.dict.CachedTreeMap.write(CachedTreeMap.java:374)
	at org.apache.kylin.dict.AppendTrieDictionary.flushIndex(AppendTrieDictionary.java:1043)
	at org.apache.kylin.dict.AppendTrieDictionary$Builder.build(AppendTrieDictionary.java:954)
	at org.apache.kylin.dict.GlobalDictionaryBuilder.build(GlobalDictionaryBuilder.java:82)
	at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81)
	at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:323)
	... 14 more

result code:2

Cause: the global dictionary has a capacity limit; the string length of a count-distinct measure column must not exceed 255. Check the length of the raw data in the failing column.
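A minimal sketch for checking the raw value length of the failing column; the table and column come from the error above, while the partition column dt and its value are placeholders:

# The longest value of the global-dictionary column; anything above 255 characters breaks the dictionary build.
hive -e "SELECT MAX(LENGTH(original_session_id)) FROM olap.olap_log_web_ts_di WHERE dt = '2017-05-31'"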


4-2:

When a global dictionary is used, build step 4 (Build Dimension Dictionary) fails.

This is currently an intermittent exception; discard the job and resubmit the build.

 

4-3:

Build step 4 (Build Dimension Dictionary) fails.

java.lang.RuntimeException: Failed to create dictionary on OLAP.OLAP_MKT_LOG_ACCESS_PAGE_INDICATOR_DI.USER_ID
	at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:325)
	at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:222)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
	at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:185)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.io.EOFException
	at org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:127)
	at org.apache.kylin.dict.DictionaryManager.getDictionary(DictionaryManager.java:114)
	at org.apache.kylin.dict.AppendTrieDictionary$Builder.createNewBuilder(AppendTrieDictionary.java:884)
	at org.apache.kylin.dict.AppendTrieDictionary$Builder.getInstance(AppendTrieDictionary.java:844)
	at org.apache.kylin.dict.AppendTrieDictionary$Builder.getInstance(AppendTrieDictionary.java:838)
	at org.apache.kylin.dict.GlobalDictionaryBuilder.build(GlobalDictionaryBuilder.java:65)
	at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81)
	at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:323)
	... 14 more
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:392)
	at org.apache.kylin.dict.AppendTrieDictionary.readFields(AppendTrieDictionary.java:1238)
	at org.apache.kylin.dict.DictionaryInfoSerializer.deserialize(DictionaryInfoSerializer.java:74)
	at org.apache.kylin.dict.DictionaryInfoSerializer.deserialize(DictionaryInfoSerializer.java:34)
	at org.apache.kylin.common.persistence.ResourceStore.getResource(ResourceStore.java:146)
	at org.apache.kylin.dict.DictionaryManager.load(DictionaryManager.java:421)
	at org.apache.kylin.dict.DictionaryManager$1.load(DictionaryManager.java:103)
	at org.apache.kylin.dict.DictionaryManager$1.load(DictionaryManager.java:100)
	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257)
	at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
	at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
	at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
	at org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:120)
	... 21 more

Contact the administrator to handle this:

1. Look up the global dictionary path, find the entries whose .index file has size 0, and delete or move them (see the sketch below).
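A minimal sketch for locating the zero-size .index files; the metadata prefix is taken from the earlier stack trace and should be adjusted to your deployment:

# Print every .index file of size 0 under the GlobalDict directory
# (in `hadoop fs -ls` output the 5th field is the file size and the last field is the path).
hadoop fs -ls -R /kylin/kylin-kylin_metadata/resources/GlobalDict/dict/ | awk '$NF ~ /\.index$/ && $5 == 0 {print $NF}'
# Move a broken entry aside rather than deleting it outright, e.g.:
# hadoop fs -mv /kylin/kylin-kylin_metadata/resources/GlobalDict/dict/<TABLE>/<COLUMN> /kylin/kylin-kylin_metadata/resources/GlobalDict/dict/<TABLE>/<COLUMN>.bak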


5.

GlobalDict /dict/OLAP.OLAP_CUSTOMER_NEW_3_DI/CUST_PHONE1_ENCRYPTED should have 0 or 1 append dict but 2

Contact the administrator to handle this.

Solution: http://apache-kylin.74782.x6.nabble.com/Re-more-than-1-append-dict-for-globalDict-td7028.html

1. Find the inconsistent global dictionaries from the cube desc.

2. Delete those dictionaries and the corresponding segments.

Additional note:

1. The above procedure may still leave the dictionary in a bad state and make count distinct inaccurate. It is safest to purge the entire cube, run metastore.sh clean, and then rebuild the data (see the sketch below).
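A minimal sketch of the related checks and cleanup, assuming the GlobalDict layout shown earlier and the standard metastore.sh tool; the HDFS prefix is a placeholder:

# Check how many append dictionaries exist for the column from the error (there should be 0 or 1).
hadoop fs -ls /kylin/kylin-kylin_metadata/resources/GlobalDict/dict/OLAP.OLAP_CUSTOMER_NEW_3_DI/CUST_PHONE1_ENCRYPTED/
# After purging the cube and dropping the bad segments, clean up dangling metadata:
./bin/metastore.sh clean                 # dry run: lists the unused resources it would remove
./bin/metastore.sh clean --delete true   # actually removes them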

 


6.

Caused by an HBase problem; discard the job and retry.


7. Refreshing the job list fails

The exception is:

 

java.lang.NullPointerException
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
at org.apache.kylin.rest.service.JobService.parseToJobStep(JobService.java:309)
at org.apache.kylin.rest.service.JobService.parseToJobInstance(JobService.java:303)
at org.apache.kylin.rest.service.JobService.access$000(JobService.java:73)
at org.apache.kylin.rest.service.JobService$1.apply(JobService.java:134)
at org.apache.kylin.rest.service.JobService$1.apply(JobService.java:131)
at com.google.common.collect.Iterators$8.transform(Iterators.java:860)
at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
at com.google.common.collect.Lists.newArrayList(Lists.java:145)
at com.google.common.collect.Lists.newArrayList(Lists.java:125)
at org.apache.kylin.rest.service.JobService.listCubeJobInstance(JobService.java:131)
at org.apache.kylin.rest.service.JobService.listAllJobs(JobService.java:103)
at org.apache.kylin.rest.service.JobService.listAllJobs(JobService.java:84)
at org.apache.kylin.rest.service.JobService$$FastClassBySpringCGLIB$$83a44b2a.invoke()
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:629)
at org.apache.kylin.rest.service.JobService$$EnhancerBySpringCGLIB$$29ce7197.listAllJobs()
at org.apache.kylin.rest.controller.JobController.list(JobController.java:133)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)

Fix:

Refresh the job list on server05 and check kylin.out for the record where it last got stuck.

Use ./metastore.sh remove "/execute/${id}" to delete that record.
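A minimal sketch of the procedure; the log path is an assumption about a typical deployment, and ${id} is the job UUID taken from the last stuck entry in kylin.out:

# Find the job record the job list got stuck on (usually the last lines of kylin.out).
tail -n 100 $KYLIN_HOME/logs/kylin.out
# Remove the broken job record from the metastore, using the id seen above.
$KYLIN_HOME/bin/metastore.sh remove "/execute/${id}"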

 

8. HBase table creation conflict

1) Discard the job.

2) Call the delete-segment API (see the sketch below).

3) Delete the table in HBase.
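A minimal sketch of steps 2 and 3, assuming the standard Kylin REST API and the HBase shell; the host, credentials, cube name, segment name, and HTable name are all placeholders:

# Step 2: delete the conflicting segment through the REST API.
curl -X DELETE --user ADMIN:KYLIN "http://kylin-host:7070/kylin/api/cubes/my_cube/segs/20170531000000_20170601000000"
# Step 3: drop the conflicting HTable (the table name appears in the job log).
echo -e "disable 'KYLIN_EXAMPLE_TABLE'\ndrop 'KYLIN_EXAMPLE_TABLE'" | hbase shell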

9.

A global dictionary is used and the Build Base Cuboid Data step takes too long.

The MR counters show a large GC time.

1) Use an ad hoc query to check the distinct cardinality of the columns configured with a global dictionary (see the sketch after the settings below).

2) Add the cube-level overrides kylin.job.mr.config.override.mapred.map.child.java.opts=-Xmx8g and kylin.job.mr.config.override.mapreduce.map.memory.mb=8500.

If memory is still insufficient, change them to the following (adjust the map settings, i.e. the first two properties, first; if that is not enough, set all four):

kylin.job.mr.config.override.mapreduce.map.java.opts=-Xmx15g kylin.job.mr.config.override.mapreduce.map.memory.mb=16000

kylin.job.mr.config.override.mapreduce.reduce.java.opts=-Xmx15g kylin.job.mr.config.override.mapreduce.reduce.memory.mb=16000
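A minimal sketch of the cardinality check in step 1; the table, column, and partition filter are placeholders:

# Estimate the distinct cardinality of the global-dictionary column;
# a very high cardinality explains the long Build Base Cuboid Data step and the GC pressure.
hive -e "SELECT COUNT(DISTINCT user_id) FROM olap.olap_fact_table WHERE dt = '2017-05-31'"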


10. Build step 3

Problem:

 

java.lang.IllegalStateException
	at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:98)
	at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92)
	at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

Solution: reload the metadata (reference below).

http://www.cnblogs.com/aprilrain/p/6916280.html


11. Query error: Not a valid ID

Possible cause 1: adding a dimension by modifying an existing cube leaves the cube metadata out of sync.

After a new dimension column is added to an existing cube, the cube build still completes successfully, but any query that includes the new dimension fails with "Not a valid ID", because the dimension is not included in the cube's metadata. Avoid modifying a cube in place; create a new cube, or clone the cube and modify the clone.

Possible cause 2: auto merge is enabled on the cube.

Kylin's auto merge has a bug, so it is recommended to disable it. Time ranges that return wrong query results can be repaired by rebuilding them.


12. Build Dimension Dictionary error

 

java.lang.IllegalStateException: Dup key found, key=[0], value1=[0,未知], value2=[0,null]
	at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:85)
	at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:68)
	at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79)
	at org.apache.kylin.dict.lookup.LookupTable.(LookupTable.java:56)
	at org.apache.kylin.dict.lookup.LookupStringTable.(LookupStringTable.java:65)
	at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:674)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:60)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
	at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)

Cause: the dimension table primary key has duplicate values; the key 0 appears twice in the dimension table, as [0,未知] and [0,null].
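A minimal sketch for finding the duplicated keys in the lookup table; the table and key column are placeholders:

# Dimension-table keys that appear more than once; each of them triggers the "Dup key found" error.
hive -e "SELECT dim_key, COUNT(*) AS cnt FROM olap.dim_table GROUP BY dim_key HAVING COUNT(*) > 1"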


13.

Loading an OLAP table that already exists in Kylin fails.

Cause: a cube in DESCBROKEN state exists; dropping that cube resolves the issue (see the sketch below).
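If the broken cube cannot be dropped from the web UI, the standard cube REST API can be used; a sketch with placeholder host, credentials, and cube name:

# Drop the DESCBROKEN cube so the table can be loaded again.
curl -X DELETE --user ADMIN:KYLIN "http://kylin-host:7070/kylin/api/cubes/broken_cube_name"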

14.

Build step 3 fails.

Cause: the configuration is out of sync across the build servers, for example

the value of kylin.cube.aggrgroup.max.combination differs between machines (see the sketch below).
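A minimal sketch for comparing the configuration across two build servers; the hostnames and the Kylin install path are placeholders:

# Any difference in settings such as kylin.cube.aggrgroup.max.combination between build servers can break the build.
diff <(ssh build01 "grep -v '^#' /opt/kylin/conf/kylin.properties | sort") \
     <(ssh build02 "grep -v '^#' /opt/kylin/conf/kylin.properties | sort")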

15.

Symptom: after the build finishes, a measure that uses a global dictionary returns results that differ from the same query in Hive.

Solution: this is a global dictionary problem. Purge the cube data, delete that dictionary, and rebuild (a consistency check is sketched below).
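A minimal sketch of the consistency check, assuming the standard Kylin query REST API; the host, credentials, project, table, column, and date filter are placeholders:

# Baseline from Hive: the exact distinct count the cube should return for the same time range.
hive -e "SELECT COUNT(DISTINCT user_id) FROM olap.olap_fact_table WHERE dt = '2017-05-31'"
# The same query against Kylin; the two results should agree after the rebuild.
curl -X POST --user ADMIN:KYLIN -H "Content-Type: application/json" \
  -d '{"sql":"select count(distinct user_id) from olap_fact_table where dt = '\''2017-05-31'\''","project":"my_project"}' \
  "http://kylin-host:7070/kylin/api/query"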




 

16.

An MR task fails with "Error: GC overhead limit exceeded"; the MR task does not have enough memory.

Add the cube-level overrides kylin.job.mr.config.override.mapred.map.child.java.opts=-Xmx8g and kylin.job.mr.config.override.mapreduce.map.memory.mb=8500.

If memory is still insufficient, change them to the following (adjust the map settings, i.e. the first two properties, first; if that is not enough, set all four):

 

kylin.job.mr.config.override.mapreduce.map.java.opts=-Xmx15g

 kylin.job.mr.config.override.mapreduce.map.memory.mb=16000

kylin.job.mr.config.override.mapreduce.reduce.java.opts=-Xmx15g

 kylin.job.mr.config.override.mapreduce.reduce.memory.mb=16000

