Official link: https://cwiki.apache.org/confluence/display/Hive/HiveClient
CDH 5.0.1
Hadoop 2.3
Hive 0.13.0
Eclipse Helios
OS: Windows 7 or CentOS 6.2
Hive's metastore comes in three flavors: Embedded, Local, and Remote.
Embedded means an in-process local database engine; Hive uses Derby for this, and I suspect SQLite would work as well.
Local and Remote are, as far as I can tell, essentially the same thing: both keep the metastore in a standalone database engine.
CDH 5.0 starts a local MySQL metastore by default, so multiple sessions can access it concurrently.
For the details, look up ConnectionURL and ConnectionDriver in hive-site.xml (under /hive/lib/conf in my installation).
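If you would rather check those two settings from code than open the file, a minimal sketch like the one below works, assuming hive-site.xml (and the hive-common/hive-exec jars) are on the classpath; it only prints the standard property names behind ConnectionURL and ConnectionDriver:

import org.apache.hadoop.hive.conf.HiveConf;

public class ShowMetastoreConf {
    public static void main(String[] args) {
        // HiveConf picks up hive-site.xml from the classpath / HIVE_CONF_DIR.
        HiveConf conf = new HiveConf();
        System.out.println("ConnectionURL    = " + conf.get("javax.jdo.option.ConnectionURL"));
        System.out.println("ConnectionDriver = " + conf.get("javax.jdo.option.ConnectionDriverName"));
    }
}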
The jars the JDBC client needs come mainly from hive/lib, hadoop/lib, and hadoop/client, so copying everything out of those directories will generally do the job.
In practice only a handful are actually used. Below is the set that worked for me, which should save you some time; this is exactly where I got stuck, nothing online spelled it out, and in the end I had to work it out myself. Some people say httpclient is not needed; maybe so, you can try leaving it out.
commons-configuration-1.6.jar
commons-logging-1.1.3.jar
hadoop-common-2.3.0-cdh5.1.0.jar
hive-exec-0.12.0-cdh5.1.0.jar
hive-jdbc-0.12.0-cdh5.1.0.jar
hive-metastore-0.12.0-cdh5.1.0.jar
hive-service-0.12.0-cdh5.1.0.jar
httpclient-4.2.5.jar
httpcore-4.2.5.jar
libfb303-0.9.0.jar
libthrift-0.9.0.cloudera.2.jar
log4j-1.2.17.jar
slf4j-api-1.7.5.jar
slf4j-log4j12-1.7.5.jar
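Before wiring up a full program, you can sanity-check that the jars above actually made it onto the classpath with a tiny throwaway class like this (just a sketch; it only tries to load the HiveServer2 driver class, which lives in hive-jdbc and pulls in the rest transitively):

public class CheckDriver {
    public static void main(String[] args) {
        try {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            System.out.println("HiveDriver found on the classpath.");
        } catch (ClassNotFoundException e) {
            System.out.println("HiveDriver missing - check hive-jdbc and the jars listed above.");
        }
    }
}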
Hive has two servers, HiveServer1 and HiveServer2; the former will be deprecated.
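The two also take different driver classes and JDBC URL schemes, which is why the example below uses org.apache.hive.jdbc.HiveDriver with a jdbc:hive2:// URL. For reference (host is a placeholder):

// HiveServer1 (on its way out): org.apache.hadoop.hive.jdbc.HiveDriver, jdbc:hive://<host>:10000/default
// HiveServer2:                  org.apache.hive.jdbc.HiveDriver,        jdbc:hive2://<host>:10000/default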
package com.jinbao.hive.client;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveClient2 {

    private static String driverName = "org.apache.hive.jdbc.HiveDriver";

    public static void main(String[] args) throws SQLException {
        System.out.print("Test for access hive by jdbc.\n");

        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }

        // change to user 'cloudera' if having some problem.
        Connection con = DriverManager.getConnection(
                "jdbc:hive2://192.168.1.1:10000/default", "hive", "cloudera");
        Statement stmt = con.createStatement();

        String tableName = "testHiveDriverTable";
        stmt.execute("drop table if exists " + tableName);

        String create = "create table " + tableName + " (key int, value string)";
        create += " row format delimited fields terminated by ','";
        stmt.execute(create);

        // show tables
        String sql = "show tables '" + tableName + "'";
        System.out.println("Running: " + sql);
        ResultSet res = stmt.executeQuery(sql);
        if (res.next()) {
            System.out.println(res.getString(1));
        }

        // describe table
        sql = "describe " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(res.getString(1) + "\t" + res.getString(2));
        }

        // load data into table
        // NOTE: filepath has to be local to the hive server
        // NOTE: /tmp/a.txt is a comma-separated file with two fields per line
        String filepath = "/tmp/a.txt";
        sql = "load data local inpath '" + filepath + "' into table " + tableName;
        System.out.println("Running: " + sql);
        stmt.execute(sql);

        // select * query
        sql = "select * from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
        }

        // regular hive query
        sql = "select count(1) from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(res.getString(1));
        }

        System.out.print("finally exit.\n");
    }
}
Over JDBC, Hive can execute add jar, create function, show functions, show tables, and so on, but it currently cannot run list jars, source *.hql, etc., so the session initialization that an HQL script would normally do has to be added by hand, statement by statement (see the sketch right after this paragraph).
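A minimal sketch of that kind of hand-rolled initialization, reusing the same connection settings as the example above; the jar path and UDF class name are made-up placeholders, not something from my cluster:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HiveSessionInit {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection con = DriverManager.getConnection(
                "jdbc:hive2://192.168.1.1:10000/default", "hive", "cloudera");
        Statement stmt = con.createStatement();

        // "source init.hql" is not available over JDBC, so replay the
        // statements one by one. Paths and names here are hypothetical examples.
        String[] initSql = {
                "add jar /tmp/my-udfs.jar",
                "create temporary function my_upper as 'com.example.udf.MyUpper'"
        };
        for (String sql : initSql) {
            stmt.execute(sql);
        }

        stmt.close();
        con.close();
    }
}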
Finally, one complex HQL query kept failing with the same error no matter what I did. Checking the YARN ResourceManager showed that the statement had not made it through parsing/analysis, even though the exact same statement ran fine in the CLI. The error was roughly:
return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.
The fix: restart hive-server2 with the command: service hive-server2 --full-restart
Hive is a nice tool to work with, but sometimes a painfully slow one.