If JDBC access to Hive breaks, here is how to troubleshoot it. (We hit quite a few problems across different environments, and most of the baffling ones turned out to be caused by version mismatches.)
Both HiveServer and HiveServer2 let remote clients, written in a variety of programming languages, operate on data in Hive without starting the CLI.
Both are Thrift-based, but HiveServer cannot handle concurrent requests from more than one client, while HiveServer2 supports multi-client concurrency and authentication and offers better support for open-API clients such as JDBC and ODBC.
JDBC differences between HiveServer1 and HiveServer2:

| HiveServer version | Connection URL | Driver class |
| --- | --- | --- |
| HiveServer2 | `jdbc:hive2://<host>:<port>` | `org.apache.hive.jdbc.HiveDriver` |
| HiveServer1 | `jdbc:hive://<host>:<port>` | `org.apache.hadoop.hive.jdbc.HiveDriver` |

The default port is 10000.
JDBC access requires the hiveserver2 service to be up. Before writing any code, try the JDBC URL from beeline on the server (for example `beeline -u "jdbc:hive2://<host>:10000/default"`); if beeline connects, the URL itself is fine.
JDBC access to Hive comes in two flavors, plain access and Kerberos-authenticated access; both are covered below.
First, plain access. Straight to the code:
```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;

public class HiveJdbcTest {
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";
    private static String CONNECTION_URL = "jdbc:hive2://****:10000/hive";

    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
        // Plain access: user name only, empty password.
        Connection connection = DriverManager.getConnection(CONNECTION_URL, "hadoop", "");
        ResultSet resultSet = connection
                .prepareStatement("select * from vbapffba9dca5df44dc088cc151ee4e69f91_7 limit 10")
                .executeQuery();
        while (resultSet.next()) {
            System.out.println(resultSet.getString(1));
        }
        connection.close();
    }
}
```
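For anything beyond a quick smoke test, it is safer to let Java close the resources for you. A minimal sketch of the same query using try-with-resources (the table name here is a placeholder):

```java
// Same query with automatic cleanup: Connection, PreparedStatement and
// ResultSet all implement AutoCloseable, so they are closed even on error.
try (Connection conn = DriverManager.getConnection(CONNECTION_URL, "hadoop", "");
     PreparedStatement ps = conn.prepareStatement("select * from some_table limit 10");
     ResultSet rs = ps.executeQuery()) {
    while (rs.next()) {
        System.out.println(rs.getString(1));
    }
}
```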
Next, Kerberos-authenticated access. Again, straight to the code:
```java
import java.io.IOException;
import java.security.PrivilegedAction;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class HiveJdbcKerberosTest {
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";
    private static String CONNECTION_URL =
            "jdbc:hive2://****:10001/devtest;principal=hs2/[email protected];auth=kerberos";

    public static void main(String[] args) throws SQLException, IOException {
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
        // Note: this path must be absolute, not relative, or you get a
        // "Can't get Kerberos realm" error. This line is mandatory.
        System.setProperty("java.security.krb5.conf", "E:\\study_workSpace\\ceshi\\src\\main\\resources\\ceshi\\krb5.conf");
        // Uncomment while debugging krb5 issues:
        // System.setProperty("sun.security.krb5.debug", "true");
        Configuration configuration = new Configuration();
        configuration.addResource(new Path("ceshi/core-site.xml"));
        configuration.addResource(new Path("ceshi/hdfs-site.xml"));
        configuration.set("hadoop.security.authentication", "Kerberos");
        UserGroupInformation.setConfiguration(configuration);
        // The keytab path must also be absolute.
        UserGroupInformation UGI = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
                "[email protected]",
                "E:\\study_workSpace\\ceshi\\src\\main\\resources\\ceshi\\test001.ketab");
        Connection connection = UGI.doAs(new PrivilegedAction<Connection>() {
            @Override
            public Connection run() {
                try {
                    return DriverManager.getConnection(CONNECTION_URL, "hadoop", "");
                } catch (Exception e) {
                    // Don't swallow this silently in real code.
                    e.printStackTrace();
                    return null;
                }
            }
        });
        ResultSet resultSet = connection
                .prepareStatement("select * from vbapfea10b1fcfc8067ebc69ec0d limit 10")
                .executeQuery();
        while (resultSet.next()) {
            System.out.println(resultSet.getString(1));
        }
        connection.close();
    }
}
```
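One addition worth noting for long-running clients (our sketch, not part of the original code): the TGT obtained from the keytab expires after its ticket lifetime. Hadoop's UserGroupInformation offers checkTGTAndReloginFromKeytab(), which re-logs in from the keytab only when needed, so calling it before each query is cheap:

```java
// No-op while the current TGT is still valid; re-login from the keytab otherwise.
// Useful in services that keep the UGI around for hours or days.
UGI.checkTGTAndReloginFromKeytab();
```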
Now the errors we actually hit, and their fixes.

###### 1. User root is not allowed to impersonate anonymous

```
java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException):
User root is not allowed to impersonate anonymous
```
Fix: edit the Hadoop config file etc/hadoop/core-site.xml and add the following properties, then restart HDFS (or refresh the proxyuser settings with `hdfs dfsadmin -refreshSuperUserGroupsConfiguration`):

```xml
<property>
  <name>hadoop.proxyuser.root.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.root.groups</name>
  <value>*</value>
</property>
```
Reference: https://blog.csdn.net/zengmingen/article/details/78607795
###### 2. Required field 'client_protocol' is unset

```
org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null, configuration:{use:database=default})
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:156)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:143)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:583)
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:192)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:215)
    at HiveJdbcJobTest.main(HiveJdbcJobTest.java:28)
```
Cause: the hive-jdbc version on the classpath does not match the Hive version installed on the cluster. Switch the client to the installed Hive version and the error goes away.
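As an illustration (the version number below is a placeholder; use whatever `hive --version` reports on the cluster), pin hive-jdbc in the Maven POM to the server's release:

```xml
<!-- Placeholder version: replace with the Hive release actually installed. -->
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>2.3.9</version>
</dependency>
```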
###### 3. Java JDBC to the Spark Thrift Server fails with HiveException: Unable to move source

```
java.sql.SQLException: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://master:9000/user/hive/warehouse/datacenter.db/test/.hive-staging_hive_2019-01-21_09-30-22_299_3322687924153036286-9/-ext-10000/part-00000-82fd3ed3-2734-4044-a779-9405d97caeaa-c000 to destination hdfs://master:9000/user/hive/warehouse/datacenter.db/test/part-00000-82fd3ed3-2734-4044-a779-9405d97caeaa-c000;
    at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:296)
    at org.apache.hive.jdbc.HiveStatement.executeUpdate(HiveStatement.java:406)
    at net.itxw.example.HiveTest.run(HiveTest.java:24)
    at net.itxw.example.HiveTest.main(HiveTest.java:47)
```
Fix: add the following to spark/conf/hive-site.xml:

```xml
<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>
```
Cause: Spark and HDFS go through the same underlying FileSystem API, and FileSystem instances are cached per JVM. When one insert finishes and the JDBC connection.close() runs, it also closes the cached HDFS FileSystem, which is the very instance the Thrift Server itself is holding. That is why the first insert after starting the thriftserver succeeds while the second fails, with the thriftserver log reporting "Filesystem closed". Disabling the cache gives each caller its own FileSystem instance.
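A small self-contained sketch of the caching behavior behind this bug (an illustration we added, using Hadoop's FileSystem API):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsCacheDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // With caching on (the default), both calls return the SAME instance,
        // keyed by scheme, authority and user.
        FileSystem fs1 = FileSystem.get(conf);
        FileSystem fs2 = FileSystem.get(conf);
        System.out.println(fs1 == fs2); // true
        fs1.close();
        // fs2 is now closed as well; its next operation throws
        // "java.io.IOException: Filesystem closed".
        // Setting fs.hdfs.impl.disable.cache=true turns the sharing off.
    }
}
```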
###### 4. Unable to read HiveServer2 uri from ZooKeeper
```
Caused by: java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: Unable to read HiveServer2 uri from ZooKeeper
```

This error only shows up when the JDBC URL uses ZooKeeper service discovery, in this form:

```
jdbc:hive2://hadoop001.potato.hamburg:2188,hadoop002.potato.hamburg:2188,hadoop003.potato.hamburg:2188/hsp;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hs2/[email protected]
```
Cause: the hive-jdbc pulled into the Maven-built jar was version 1.2.1 while the cluster runs Hive 2.3, and the version mismatch triggers this error. Change the hive-jdbc version in the POM to match the installed Hive version (as in the snippet under error 2; `mvn dependency:tree -Dincludes=org.apache.hive` shows which hive-jdbc actually lands on the classpath).