Errors when building the sample cube on Kylin 1.6.0

After installing Kylin and the environment it needs, I tried to build the sample cube and hit the following error:

2017-02-07 11:47:34,278 ERROR [pool-9-thread-2] execution.AbstractExecutable:370 : job:63a77287-4de9-44aa-a714-a40c780dec22-01 execute finished with exception
java.io.FileNotFoundException: File /kylin/kylin_metadata/kylin-63a77287-4de9-44aa-a714-a40c780dec22/row_count does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1557)
        at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:674)
        at org.apache.kylin.source.hive.HiveMRInput$RedistributeFlatHiveTableStep.doWork(HiveMRInput.java:338)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
        at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
        at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

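But when I looked for this file on HDFS with the hdfs command, it was there. Note that the stack trace goes through RawLocalFileSystem, which means Kylin was looking on the local filesystem rather than on HDFS. After some googling, what finally solved it was an answer by 史少锋 (Shaofeng Shi) on the Kylin forum:

Copy core-site.xml from the Hadoop configuration files into Kylin's conf directory. Reference links: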

http://apache-kylin.74782.x6.nabble.com/Can-not-build-sample-cube-td7088.html

http://apache-kylin.74782.x6.nabble.com/Error-in-kylin-with-standalone-HBase-cluster-td6893.html

The thread also suggests trying a lower Hadoop version; I did not test that.

At least copying this XML file did make the current error go away.
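A quick way to sanity-check this fix is to read fs.defaultFS out of the core-site.xml that Kylin will see. The following is only a minimal sketch: the function name read_default_fs and the sample XML are made up for illustration, and the fallback value file:/// is Hadoop's documented default when the property is not set, which would match the RawLocalFileSystem frames in the trace above.

```python
import xml.etree.ElementTree as ET


def read_default_fs(core_site_xml: str) -> str:
    """Return fs.defaultFS from the text of a core-site.xml file.

    Falls back to "file:///" (Hadoop's default) when the property is absent,
    in which case paths are resolved on the local filesystem.
    """
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "fs.defaultFS":
            return prop.findtext("value", "").strip()
    return "file:///"


# Hypothetical sample content; the URI matches the one seen in the later logs.
sample = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>"""

print(read_default_fs(sample))
print(read_default_fs("<configuration></configuration>"))
```

If the second case (file:///) is what the process actually sees, a lookup of /kylin/kylin_metadata/... would go to the local disk even though the file exists on HDFS.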

Further along, though, another problem showed up. My versions are Hadoop 2.7.3, HBase 1.2.3, Hive 1.2.1 and Kylin 1.6.0, and the error was:

2017-02-07 18:24:06,161 ERROR [pool-9-thread-1] threadpool.DefaultScheduler:140 : ExecuteException job:20d6e927-f2d8-490c-8beb-054fa58708a9
org.apache.kylin.job.exception.ExecuteException: org.apache.kylin.job.exception.ExecuteException: java.lang.NoSuchMethodError: org.apache.hadoop.yarn.proto.YarnProtos$LocalResourceProto.hashLong(J)I
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:123)
        at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kylin.job.exception.ExecuteException: java.lang.NoSuchMethodError: org.apache.hadoop.yarn.proto.YarnProtos$LocalResourceProto.hashLong(J)I
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:123)
        at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
        ... 4 more
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.yarn.proto.YarnProtos$LocalResourceProto.hashLong(J)I
        at org.apache.hadoop.yarn.proto.YarnProtos$LocalResourceProto.hashCode(YarnProtos.java:11864)
        at org.apache.hadoop.yarn.api.records.impl.pb.LocalResourcePBImpl.hashCode(LocalResourcePBImpl.java:62)
        at java.util.HashMap.hash(HashMap.java:338)
        at java.util.HashMap.put(HashMap.java:611)
        at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:133)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
        at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
        at org.apache.kylin.engine.mr.common.AbstractHadoopJob.waitForCompletion(AbstractHadoopJob.java:149)
        at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:108)
        at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92)
        at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
        ... 6 more

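I mentioned this error in an earlier blog post: it is a protobuf-java version problem. I tried upgrading the protobuf-java jar under HBase's lib directory, but that did not work. After the change Kylin could not even find an online regionserver, and failed with: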

2017-02-07 16:53:49,374 WARN  [localhost-startStop-1] support.XmlWebApplicationContext:487 : Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'mvcContentNegotiationManager': BeanPostProcessor before instantiation of bean failed; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'org.springframework.cache.config.internalCacheAdvisor': Cannot resolve reference to bean 'org.springframework.cache.annotation.AnnotationCacheOperationSource#0' while setting bean property 'cacheOperationSource'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'org.springframework.cache.annotation.AnnotationCacheOperationSource#0': BeanPostProcessor before instantiation of bean failed; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'org.springframework.security.methodSecurityMetadataSourceAdvisor': Cannot resolve reference to bean 'org.springframework.security.access.method.DelegatingMethodSecurityMetadataSource#0' while setting constructor argument; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'org.springframework.security.access.method.DelegatingMethodSecurityMetadataSource#0': Cannot create inner bean '(inner bean)#37310ffe' of type [org.springframework.security.access.prepost.PrePostAnnotationSecurityMetadataSource] while setting constructor argument with key [0]; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name '(inner bean)#37310ffe': Cannot create inner bean '(inner bean)#2157a4d9' of type [org.springframework.security.access.expression.method.ExpressionBasedAnnotationAttributeFactory] while setting constructor argument; nested exception is 
org.springframework.beans.factory.BeanCreationException: Error creating bean with name '(inner bean)#2157a4d9': Cannot resolve reference to bean 'expressionHandler' while setting constructor argument; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'expressionHandler' defined in class path resource [kylinSecurity.xml]: Cannot resolve reference to bean 'permissionEvaluator' while setting bean property 'permissionEvaluator'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'permissionEvaluator' defined in class path resource [kylinSecurity.xml]: Cannot resolve reference to bean 'aclService' while setting constructor argument; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'aclService': Invocation of init method failed; nested exception is org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=5, exceptions:
Tue Feb 07 16:53:16 CST 2017, RpcRetryingCaller{globalStartTime=1486457595635, pause=3000, retries=5}, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on dmc-tst-ser01,16201,1486457483627
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2910)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1057)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2388)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33648)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2178)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
        at java.lang.Thread.run(Thread.java:745)

Tue Feb 07 16:53:19 CST 2017, RpcRetryingCaller{globalStartTime=1486457595635, pause=3000, retries=5}, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on dmc-tst-ser01,16201,1486457483627
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2910)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1057)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2388)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33648)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2178)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
        at java.lang.Thread.run(Thread.java:745)


Tue Feb 07 16:53:25 CST 2017, RpcRetryingCaller{globalStartTime=1486457595635, pause=3000, retries=5}, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on dmc-tst-ser01,16201,1486457483627
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2910)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1057)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2388)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33648)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2178)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
        at java.lang.Thread.run(Thread.java:745)


Tue Feb 07 16:53:34 CST 2017, RpcRetryingCaller{globalStartTime=1486457595635, pause=3000, retries=5}, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on dmc-tst-ser01,16201,1486457483627
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2910)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1057)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2388)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33648)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2178)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)

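In other words, none of my HBase regionservers were online. So had something already gone wrong when I started HBase?

I reverted the jar to its original version, but the earlier error above (the NoSuchMethodError) was still there. I suspected that lowering the Hadoop version might help. It was such a hassle that I wanted to give up; if any expert has run into this, please explain. (Then came a twist.)

After reading 史少锋 (Shaofeng Shi)'s reply again, I was convinced that this was basically a version problem: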

https://www.mail-archive.com/[email protected]/msg01127.html

https://issues.apache.org/jira/browse/KUDU-1318

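Looking again, I found that the Hadoop-related jars in my HBase lib directory were all version 2.7.3, whereas HBase 1.2.3 ships with Hadoop 2.5.1 jars by default (I must have replaced them during installation). I put the original jars back and the problem was solved; the error was gone. Quite a twist.

That was fixed, but a new problem appeared. On the next build I got the following error: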
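A mismatch like this can be spotted by listing the hadoop-*.jar files in a lib directory and grouping them by version. The sketch below is an illustration only: the function name hadoop_versions and the sample file names are assumptions, and it relies on the usual name-version.jar naming convention rather than inspecting the jars themselves.

```python
import re
from collections import defaultdict

# Matches names such as hadoop-common-2.5.1.jar; other jars are ignored.
HADOOP_JAR = re.compile(r"^(hadoop-[a-z-]+?)-(\d+(?:\.\d+)+)\.jar$")


def hadoop_versions(jar_names):
    """Group Hadoop artifact names by the version embedded in the file name."""
    versions = defaultdict(list)
    for name in jar_names:
        match = HADOOP_JAR.match(name)
        if match:
            versions[match.group(2)].append(match.group(1))
    return dict(versions)


# Hypothetical listing; in practice this could come from os.listdir() on the
# HBase lib directory.
jars = [
    "hadoop-common-2.5.1.jar",
    "hadoop-yarn-api-2.7.3.jar",
    "hadoop-mapreduce-client-core-2.7.3.jar",
    "hbase-client-1.2.3.jar",
    "protobuf-java-2.5.0.jar",
]

found = hadoop_versions(jars)
for version, artifacts in sorted(found.items()):
    print(version, artifacts)
if len(found) > 1:
    print("mixed Hadoop versions in one lib directory")
```

More than one version key in the result is a hint that jars from different Hadoop releases ended up on the same classpath.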

2017-02-08 16:59:04,179 ERROR [pool-9-thread-1] common.MapReduceExecutable:127 : error execute MapReduceExecutable{id=20d6e927-f2d8-490c-8beb-054fa58708a9-02, name=Extract Fact Table Distinct Columns, state=RUNNING}
java.io.FileNotFoundException: File does not exist: hdfs://localhost:9000/home/root/apache-hive-1.2.1-bin/lib/hive-exec-1.2.1.jar
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1072)
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1064)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1064)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
	at org.apache.kylin.engine.mr.common.AbstractHadoopJob.waitForCompletion(AbstractHadoopJob.java:149)
	at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:108)
	at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92)
	at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

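Why would Hive go looking on HDFS for a jar under its installation directory when running a job? /home/root/apache-hive-1.2.1-bin/ is my Hive installation directory. It must be a configuration problem somewhere, but I have not found the cause yet and am still investigating.

In the end, out of options, I uploaded Hive's entire lib directory to the matching path on HDFS where it was looking (a temporary workaround). That jar could now be found, but there was still an error: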
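One general Hadoop behaviour that can produce a path like this: a path with no scheme is qualified against fs.defaultFS, so a local-looking path is read as an HDFS path once the default filesystem is HDFS. Whether that is what happened here is not confirmed. The sketch below only mimics that resolution rule in plain Python; qualify is a hypothetical helper, not a Hadoop API.

```python
from urllib.parse import urlparse


def qualify(path: str, default_fs: str) -> str:
    """Mimic resolving a path against fs.defaultFS.

    A path that already carries a scheme (file://, hdfs://) is left alone;
    a scheme-less path is prefixed with the default filesystem URI.
    """
    if urlparse(path).scheme:
        return path
    return default_fs.rstrip("/") + path


jar = "/home/root/apache-hive-1.2.1-bin/lib/hive-exec-1.2.1.jar"
print(qualify(jar, "hdfs://localhost:9000"))
print(qualify("file://" + jar, "hdfs://localhost:9000"))
```

The first call yields the same hdfs://localhost:9000/home/root/... location that appears in the exception, while the explicitly file:// qualified path stays local.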

java.io.FileNotFoundException: File does not exist: hdfs://localhost:9000/home/root/hadoop/tmp/mapred/staging/root398825514/.staging/job_local398825514_0001/libjars/hive-exec-1.2.1.jar 
        at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1072) 
        at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1064) 
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) 
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1064) 
        at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288) 
        at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224) 
        at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99) 
        at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57) 
        at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265) 
        at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301) 
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389) 
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285) 
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282) 
        at java.security.AccessController.doPrivileged(Native Method) 
        at javax.security.auth.Subject.doAs(Subject.java:422) 
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) 
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282) 
        at org.apache.kylin.engine.mr.common.AbstractHadoopJob.waitForCompletion(AbstractHadoopJob.java:149) 
        at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:108) 
        at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92) 
        at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120) 
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113) 
        at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57) 
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113) 
        at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136) 
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
        at java.lang.Thread.run(Thread.java:745)

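This temporary directory is generated when the job is built, so how could Hive's jar ever be found there? I was speechless. I posted several threads on the official forum and so far nobody has been able to solve it. Waiting...

Update:

Someone on the official forum did solve it: switch the Hadoop environment to a CDH or HDP distribution and the problem goes away.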
