Environment prerequisites for installing Kylin:
hadoop
hive
zookeeper (HBase depends on ZooKeeper; I run a standalone ZooKeeper instead of the one bundled with HBase)
hbase
jdk
spark (optional)
Download the Kylin package from the official site, extract it, and set the environment variables (for example KYLIN_HOME).
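Note: to run the official sample, the physical machine needs at least 16 GB of RAM and the virtual machine hosting the master node needs at least 4 GB; otherwise HBase keeps crashing. Too little memory also affects the other clusters (for example, I hit a case where the Hadoop DataNode failed to start, and there were many similar problems). This applies whether Kylin runs as a cluster or as a single node. Also watch the size of the log files: during testing they grew very quickly, reaching roughly 20 GB in about half an hour, so it is best to test on a real server.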
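Because the logs can fill a disk this quickly, it may help to watch the log directory while testing. The following is a minimal sketch, not part of Kylin: the helper names and the assumption that the logs live under $KYLIN_HOME/logs are mine, and the 20 GB default simply mirrors the figure above.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # The file may have been rotated or deleted while walking.
                pass
    return total


def logs_over_limit(path, limit_gb=20.0):
    """Return True if the directory holds more than limit_gb gigabytes."""
    return dir_size_bytes(path) > limit_gb * 1024 ** 3


if __name__ == "__main__":
    # Assumption: logs are written to $KYLIN_HOME/logs.
    log_dir = os.path.join(os.environ.get("KYLIN_HOME", "."), "logs")
    size_gb = dir_size_bytes(log_dir) / 1024 ** 3
    print(f"{log_dir}: {size_gb:.2f} GB, over limit: {logs_over_limit(log_dir)}")
```

Running it from cron or in a loop during a test gives an early warning before the disk fills up.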
Start (from Kylin's bin directory):
./kylin.sh start
Web UI:
hostname:7070/kylin; the default username is ADMIN and the password is KYLIN
For the detailed setup steps, see the official site: http://kylin.apache.org/cn/
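The most important file is kylin.properties. When setting kylin.server.mode=xx, use all on the Kylin master node and query on the other nodes; this is the only setting that differs between them.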
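A small check can catch a wrong mode before the nodes are started. This is only a sketch under the layout described above (one node in all mode, the rest in query mode); the function names and node names are hypothetical, and the properties parser is deliberately simplistic.

```python
VALID_MODES = {"all", "job", "query"}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_server_modes(configs):
    """configs maps a node name to its kylin.properties text.

    Returns a list of problems; an empty list means the layout matches
    the assumption of exactly one job-capable node (all or job).
    """
    problems = []
    job_nodes = []
    for node, text in configs.items():
        mode = parse_properties(text).get("kylin.server.mode")
        if mode not in VALID_MODES:
            problems.append(f"{node}: unexpected kylin.server.mode={mode!r}")
        elif mode in ("all", "job"):
            job_nodes.append(node)
    if len(job_nodes) != 1:
        problems.append(f"expected one job-capable node, found {len(job_nodes)}")
    return problems


if __name__ == "__main__":
    print(check_server_modes({
        "node-01": "kylin.server.mode=all",
        "node-02": "kylin.server.mode=query",
    }))
```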
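While integrating HBase (1.3.1) with Hive (1.2.1) I ran into a version mismatch. Switching HBase to 1.2.1 did not fix it, so I reinstalled 1.3.1, after which Phoenix could not connect to HBase. Then Kylin started but its web UI could not be reached. The files under logs showed: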
Caused by: org.apache.hadoop.hbase.TableExistsException: kylin_metadata_acl
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
    at org.apache.hadoop.hbase.util.ForeignExceptionUtil.toIOException(ForeignExceptionUtil.java:45)
    at org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.convertResult(HBaseAdmin.java:4713)
    at org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.waitProcedureResult(HBaseAdmin.java:4671)
    at org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.get(HBaseAdmin.java:4604)
    at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:679)
    at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:609)
    at org.apache.kylin.storage.hbase.HBaseConnection.createHTableIfNeeded(HBaseConnection.java:294)
    at org.apache.kylin.storage.hbase.HBaseConnection.createHTableIfNeeded(HBaseConnection.java:265)
    at org.apache.kylin.rest.security.RealAclHBaseStorage.prepareHBaseTable(RealAclHBaseStorage.java:49)
    at org.apache.kylin.rest.security.MockAclHBaseStorage.prepareHBaseTable(MockAclHBaseStorage.java:53)
    at org.apache.kylin.rest.service.AclService.init(AclService.java:121)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(InitDestroyAnnotationBeanPostProcessor.java:344)
    at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:295)
    at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:130)
    ... 125 more
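The error says the table already exists. Kylin creates its metadata tables automatically on first startup, so these entries were presumably left over from the earlier HBase installation. Because we run a standalone ZooKeeper, go to the ZooKeeper installation directory, run ./zkCli.sh under bin, and list the HBase tables with ls /hbase/table. There were three entries starting with kylin: kylin_metadata_acl, kylin_metadata, and kylin_metadata_user. Deleting only one of them had no effect; after deleting all three and restarting HBase and Kylin, the web UI was reachable again. The Phoenix problem was solved in a similar way (by deleting its system tables).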
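To avoid missing one of the three entries, the output of ls /hbase/table can be filtered mechanically. The sketch below only parses text pasted from zkCli and returns the znode paths that match; it does not talk to ZooKeeper or delete anything. The function name and the sample output (including the two hbase: entries) are illustrative assumptions.

```python
def kylin_table_znodes(ls_output, parent="/hbase/table", prefix="kylin_"):
    """Extract znode paths for Kylin tables from zkCli `ls` output.

    zkCli prints children as one bracketed, comma-separated list;
    the last such line in the pasted text is used.
    """
    for line in reversed(ls_output.splitlines()):
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            names = [n.strip() for n in line[1:-1].split(",") if n.strip()]
            return sorted(f"{parent}/{n}" for n in names if n.startswith(prefix))
    return []


if __name__ == "__main__":
    sample = "[kylin_metadata_acl, hbase:meta, kylin_metadata, hbase:namespace, kylin_metadata_user]"
    for path in kylin_table_znodes(sample):
        print(path)
```

Removing these znodes by hand is the workaround described above, so it is worth double-checking the printed paths before deleting them in zkCli.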
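When resources allow, give jobs (such as MapReduce) as much memory as possible, for example:
mapreduce.reduce.memory.mb = 3096 (memory required by each Reduce task, in MB)
mapreduce.reduce.java.opts = -Xmx3096m (JVM heap for the reduce task)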
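These two settings normally go into mapred-site.xml as property elements. The sketch below renders them with Python's standard XML library so the fragment can be pasted into the file; the helper name is mine and the values simply mirror the ones above. A common rule of thumb keeps -Xmx somewhat below the container size, so treat the heap value as a starting point rather than a recommendation.

```python
import xml.etree.ElementTree as ET


def property_xml(name, value, description=""):
    """Render one Hadoop <property> element as a string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    if description:
        ET.SubElement(prop, "description").text = description
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    # Values taken from the text above; adjust them to the cluster.
    print(property_xml("mapreduce.reduce.memory.mb", "3096",
                       "memory required by each Reduce task"))
    print(property_xml("mapreduce.reduce.java.opts", "-Xmx3096m",
                       "JVM heap for the reduce task"))
```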
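With too little memory, building a cube keeps failing and the job has to be resumed again and again (Resume). Running Kylin's bundled example (sample.sh) also produced a failure to connect to Hadoop: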
Exception: java.net.ConnectException: Call From node-02/192.168.8.129 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
java.net.ConnectException: Call From node-02/192.168.8.129 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
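Looking up the MR job ID in the web UI on port 8088 showed the job status as success, but Kylin reported it as failed. For now the only option was to run it again (a painful process).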
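Port 10020 is the default port of mapreduce.jobhistory.address, so one plausible explanation (not confirmed in my setup) is that the MapReduce JobHistory Server was not running, or that its address was not configured on the Kylin node and fell back to 0.0.0.0. A quick way to test the first possibility is to check whether anything is listening on that port. The sketch below is generic TCP code; the host name in the example is a placeholder.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    # Placeholder host: replace with the node expected to run the JobHistory Server.
    host = "node-01"
    print(f"{host}:10020 reachable: {port_open(host, 10020)}")
```

If the port is closed, starting the history server (on Hadoop 2.x, mr-jobhistory-daemon.sh start historyserver) and pointing mapreduce.jobhistory.address at that host would be the next thing to try.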
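While jobs were running, one node of the HBase cluster went down, and later the whole cluster went down. Kylin itself was still reachable, but the build cube jobs could not be queried. I have not solved this yet. The error was: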
2019-03-21 13:45:44,739 ERROR [pool-7-thread-1] manager.ExecutableManager:209 : error get All Job Ids
org.apache.kylin.job.exception.PersistentException: org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x10fe55ee closed
    at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:149)
    at org.apache.kylin.job.manager.ExecutableManager.getAllJobIds(ExecutableManager.java:207)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:85)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x10fe55ee closed
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1174)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.relocateRegion(ConnectionManager.java:1154)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1359)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1183)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:156)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:314)
    at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:289)
    at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:164)
    at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:159)
    at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:796)
    at org.apache.kylin.storage.hbase.HBaseResourceStore.visitFolder(HBaseResourceStore.java:137)
    at org.apache.kylin.storage.hbase.HBaseResourceStore.listResourcesImpl(HBaseResourceStore.java:107)
    at org.apache.kylin.common.persistence.ResourceStore.listResources(ResourceStore.java:121)
    at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:138)
    ... 9 more
My guess is that this is caused by insufficient memory.