Spark: web JDBC queries via the Thrift JDBC/ODBC server

Note: see the Spark documentation on the Thrift JDBC/ODBC server.
Environment: Spark 1.6, MySQL

  • Spark SQL can serve as a distributed SQL query engine; a web application can then run SQL queries against it over a JDBC connection.
  • Steps:
    1. Start the Thrift JDBC/ODBC server
      In the Spark directory, run the following command to start a JDBC/ODBC server:
      sbin/start-thriftserver.sh --master spark://192.168.172.103:7077 --driver-class-path /usr/local/hive-2.1/lib/mysql-connector-java-5.1.32.jar
      The script accepts all of the options supported by the bin/spark-submit command, plus a --hiveconf option for specifying Hive properties. Run ./sbin/start-thriftserver.sh --help for the complete list of options. By default, the server listens on localhost:10000. To change the listening host or port, you can either set environment variables:
      export HIVE_SERVER2_THRIFT_PORT=<listening-port>
      export HIVE_SERVER2_THRIFT_BIND_HOST=<listening-host>
      ./sbin/start-thriftserver.sh --master <master-uri>
      or specify the corresponding Hive system properties:
./sbin/start-thriftserver.sh \
  --hiveconf hive.server2.thrift.port=<listening-port> \
  --hiveconf hive.server2.thrift.bind.host=<listening-host> \
  --master <master-uri>
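      For example, to bind the server to all interfaces on port 10001 against the standalone master used above (the port and bind host here are illustrative values):
./sbin/start-thriftserver.sh \
  --hiveconf hive.server2.thrift.port=10001 \
  --hiveconf hive.server2.thrift.bind.host=0.0.0.0 \
  --master spark://192.168.172.103:7077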
  2. Connect with beeline
    Next, you can test the Thrift JDBC/ODBC server from beeline.
    Beeline may prompt you for a username and password. In non-secure mode, simply enter your local username and a blank password. For secure mode, refer to the beeline documentation.
    ./bin/beeline
    beeline> !connect jdbc:hive2://192.168.172.103:10000
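    Once connected, you can issue statements at the beeline prompt to verify that the server answers queries, for example:
    beeline> show databases;
    beeline> show tables;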

  3. Connect from code

    ```
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class HiveJdbcClient {
      private static String driverName = "org.apache.hive.jdbc.HiveDriver";

      public static void main(String[] args) throws SQLException {
        try {
          Class.forName(driverName);
        } catch (ClassNotFoundException e) {
          e.printStackTrace();
          System.exit(1);
        }
        // replace "hive" here with the name of the user the queries should run as
        Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "");
        Statement stmt = con.createStatement();
        String tableName = "testHiveDriverTable";
        stmt.execute("drop table if exists " + tableName);
        stmt.execute("create table " + tableName + " (key int, value string)");

        // show tables
        String sql = "show tables '" + tableName + "'";
        System.out.println("Running: " + sql);
        ResultSet res = stmt.executeQuery(sql);
        if (res.next()) {
          System.out.println(res.getString(1));
        }

        // describe table
        sql = "describe " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
          System.out.println(res.getString(1) + "\t" + res.getString(2));
        }

        // load data into table
        // NOTE: filepath has to be local to the hive server
        // NOTE: /tmp/a.txt is a ctrl-A separated file with two fields per line
        String filepath = "/tmp/a.txt";
        sql = "load data local inpath '" + filepath + "' into table " + tableName;
        System.out.println("Running: " + sql);
        stmt.execute(sql);

        // select * query
        sql = "select * from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
          System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
        }

        // regular hive query
        sql = "select count(1) from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
          System.out.println(res.getString(1));
        }
      }
    }
    ```
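
    To compile and run this client, the Hive JDBC driver must be on the classpath. A minimal sketch of the Maven dependencies, assuming Hive 2.1 to match the environment above (adjust the versions to your cluster):

    ```
    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-jdbc</artifactId>
      <version>2.1.0</version>
    </dependency>
    <dependency>
      <!-- hive-jdbc needs Hadoop's common classes at runtime -->
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.6.0</version>
    </dependency>
    ```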
    
    
  4. Configure beeline with hive-site.xml
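
    For the Thrift server and beeline to share the same Hive metastore, place a hive-site.xml under $SPARK_HOME/conf; both read their configuration from there. A minimal sketch, assuming a MySQL-backed metastore on the host used above (the database name and credentials are placeholder values):

    ```
    <configuration>
      <property>
        <!-- JDBC URL of the MySQL database backing the metastore (placeholder host/db) -->
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://192.168.172.103:3306/hive?createDatabaseIfNotExist=true</value>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
      </property>
      <property>
        <!-- placeholder credentials -->
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hive</value>
      </property>
    </configuration>
    ```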
