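Zeppelin version: 0.8.2; Hive version: 3.0.0
2.1. Install and start Hive 3
Omitted.
2.1. Configure HiveServer2
To integrate Zeppelin with Hive, both the Hive metastore service and the HiveServer2 service must be running.
First, configure HiveServer2 in Hive's configuration file: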
conf/hive-site.xml
Start HiveServer2 (run from the Hive installation directory):
nohup bin/hive --service hiveserver2 &
2.2. Connect with beeline:
beeline -u "jdbc:hive2://hadoop01:10005/feedback;principal=hive/_HOST@YOUR_REALM;auth=kerberos"
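The URL in the beeline command above has a fixed shape: host, port, database, then the semicolon-separated session variables. As a minimal sketch, the same string can be assembled in shell so that beeline and the Zeppelin default.url setting share one definition. The function name hive_jdbc_url and its argument order are made up for this illustration; they are not part of Hive or beeline.

```shell
# Hypothetical helper (not part of Hive or beeline): assemble a Kerberos
# HiveServer2 JDBC URL from its parts.
# Usage: hive_jdbc_url HOST PORT DATABASE REALM
hive_jdbc_url() {
  local host="$1" port="$2" db="$3" realm="$4"
  # _HOST is kept literal, exactly as in the beeline command above.
  printf 'jdbc:hive2://%s:%s/%s;principal=hive/_HOST@%s;auth=kerberos\n' \
    "$host" "$port" "$db" "$realm"
}

# Example call (commented out because it needs a running HiveServer2):
# beeline -u "$(hive_jdbc_url hadoop01 10005 feedback YOUR_REALM)"
```

Keeping the URL in one place makes it easier to confirm that the host, port and principal used by beeline are the same ones later entered in the Zeppelin interpreter.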
2.3. [Zeppelin] Configuring the Hive interpreter under Kerberos authentication
default.driver = org.apache.hive.jdbc.HiveDriver
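# The Thrift (binary) port is 10000; the HTTP port is 10001
# If load balancing is enabled for Hive, do not connect directly to these ports; connect to the load balancer's address and port instead
# YOUR_REALM is the Kerberos realm and must be replaced with the actual realm; if unsure, check /etc/krb5.conf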
default.url = jdbc:hive2://localhost:10000/default;principal=hive/_HOST@YOUR_REALM
default.user = hive
zeppelin.jdbc.auth.type = KERBEROS
# The user that runs Zeppelin needs read permission on hive.keytab
zeppelin.jdbc.keytab.location = /home/xxxxx/hive.keytab
# Replace with the actual principal
zeppelin.jdbc.principal = hive@YOUR_REALM
default.proxy.user.property = hive.server2.proxy.user
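The keytab setting above only works if the account running Zeppelin can actually read the file. The following is a small pre-flight sketch, not part of Zeppelin; the function name check_keytab is hypothetical, and the check should be run as the same user that starts Zeppelin.

```shell
# Hypothetical pre-flight check (not part of Zeppelin): confirm that the
# keytab exists and is readable by the current user.
# Usage: check_keytab /path/to/hive.keytab
check_keytab() {
  local keytab="$1"
  if [ ! -f "$keytab" ]; then
    echo "missing: $keytab"
    return 1
  fi
  if [ ! -r "$keytab" ]; then
    echo "not readable by $(id -un): $keytab"
    return 2
  fi
  echo "ok: $keytab"
}

# With the MIT Kerberos client tools installed, the keytab contents and the
# principal can then be checked by hand (commented out; needs a real KDC):
# klist -kt /home/xxxxx/hive.keytab
# kinit -kt /home/xxxxx/hive.keytab hive@YOUR_REALM
```

If kinit succeeds with the same keytab and principal that are entered in the interpreter settings, a later authentication failure is more likely to come from the JDBC URL or the proxy-user settings than from the keytab itself.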
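2.4. Add the dependency jars
You can copy the jars directly into the interpreter/jdbc/ directory, or add them through the Zeppelin UI.
Why the interpreter/jdbc directory? The Hive interpreter configured here belongs to the JDBC interpreter group, and the interpreter/jdbc directory is added to the CLASSPATH when the interpreter starts.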
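The copy step can be sketched as a small shell function. It assumes the required jars have already been collected in a staging directory; which jars are needed depends on the Hive and Hadoop versions in use and is not listed here. The function name install_jdbc_jars and both of its arguments are hypothetical.

```shell
# Hypothetical helper: copy every .jar from a staging directory into
# Zeppelin's interpreter/jdbc directory.
# Usage: install_jdbc_jars SRC_DIR ZEPPELIN_HOME
install_jdbc_jars() {
  local src="$1"
  local dest="$2/interpreter/jdbc"
  local count=0
  local jar
  if [ ! -d "$dest" ]; then
    echo "no such directory: $dest" >&2
    return 1
  fi
  for jar in "$src"/*.jar; do
    [ -e "$jar" ] || continue   # the glob matched nothing
    cp "$jar" "$dest"/ || return 1
    count=$((count + 1))
  done
  echo "copied $count jar(s) to $dest"
}
```

Since the directory is added to the CLASSPATH when the interpreter starts, restart the interpreter after copying so that the new jars are picked up.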
2.5. Error encountered during configuration:
org.apache.zeppelin.interpreter.InterpreterException: Error in doAs
    at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnection(JDBCInterpreter.java:479)
    at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:692)
    at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:820)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:103)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:632)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
    at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1700)
    at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnection(JDBCInterpreter.java:471)
    ... 13 more
Caused by: java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://10.16.0.83:10005/feedback;principal=mcloud/[email protected];auth=kerberos: GSS initiate failed
    at org.apache.hive.jdbc.HiveConnection.
Solution: adding the following property to the interpreter settings resolved the error:
default.proxy.user.property=hive.server2.proxy.user