Spark: web JDBC queries via the Thrift JDBC/ODBC server

Note: see the Spark documentation on the Thrift JDBC/ODBC server.
Environment: Spark 1.6, MySQL

  • Spark SQL can serve as a distributed SQL query engine; a web application can then run SQL queries against it over a JDBC connection.
  • Steps:
    1. Start the Thrift JDBC/ODBC server
      In the Spark directory, run the following command to start a JDBC/ODBC server:
      sbin/start-thriftserver.sh --master spark://192.168.172.103:7077 --driver-class-path /usr/local/hive-2.1/lib/mysql-connector-java-5.1.32.jar
      The script accepts all of the options supported by the bin/spark-submit command, plus a --hiveconf option for specifying Hive properties. Run ./sbin/start-thriftserver.sh --help for the complete list of options. By default, the server listens on localhost:10000. To change the listening host or port, you can either set environment variables:
      export HIVE_SERVER2_THRIFT_PORT=<listening-port>
      export HIVE_SERVER2_THRIFT_BIND_HOST=<listening-host>
      ./sbin/start-thriftserver.sh --master <master-uri>
      or specify the corresponding Hive system properties:
./sbin/start-thriftserver.sh \
  --hiveconf hive.server2.thrift.port=<listening-port> \
  --hiveconf hive.server2.thrift.bind.host=<listening-host> \
  --master <master-uri>
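      For example, to bind the server to all interfaces on port 10001 against the standalone master used above (the port and bind host here are illustrative values):
./sbin/start-thriftserver.sh \
  --hiveconf hive.server2.thrift.port=10001 \
  --hiveconf hive.server2.thrift.bind.host=0.0.0.0 \
  --master spark://192.168.172.103:7077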
  2. Connect with beeline
    Next, you can test the Thrift JDBC/ODBC server from beeline.
    Beeline may prompt you for a username and password. In non-secure mode, simply enter your local username and a blank password. For secure mode, refer to the beeline documentation.
    ./bin/beeline
    beeline> !connect jdbc:hive2://192.168.172.103:10000
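    Once connected, you can issue statements at the beeline prompt to verify that the server answers queries, for example:
    beeline> show databases;
    beeline> show tables;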

  3. Connect from code

    ```
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class HiveJdbcClient {
      private static String driverName = "org.apache.hive.jdbc.HiveDriver";

      public static void main(String[] args) throws SQLException {
        try {
          Class.forName(driverName);
        } catch (ClassNotFoundException e) {
          e.printStackTrace();
          System.exit(1);
        }
        // replace "hive" here with the name of the user the queries should run as
        Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "");
        Statement stmt = con.createStatement();
        String tableName = "testHiveDriverTable";
        stmt.execute("drop table if exists " + tableName);
        stmt.execute("create table " + tableName + " (key int, value string)");

        // show tables
        String sql = "show tables '" + tableName + "'";
        System.out.println("Running: " + sql);
        ResultSet res = stmt.executeQuery(sql);
        if (res.next()) {
          System.out.println(res.getString(1));
        }

        // describe table
        sql = "describe " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
          System.out.println(res.getString(1) + "\t" + res.getString(2));
        }

        // load data into table
        // NOTE: filepath has to be local to the hive server
        // NOTE: /tmp/a.txt is a ctrl-A separated file with two fields per line
        String filepath = "/tmp/a.txt";
        sql = "load data local inpath '" + filepath + "' into table " + tableName;
        System.out.println("Running: " + sql);
        stmt.execute(sql);

        // select * query
        sql = "select * from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
          System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
        }

        // regular hive query
        sql = "select count(1) from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
          System.out.println(res.getString(1));
        }
      }
    }
    ```
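
    To compile and run this client, the Hive JDBC driver must be on the classpath. A minimal sketch of the Maven dependencies, assuming Hive 2.1 to match the environment above (adjust the versions to your cluster):

    ```
    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-jdbc</artifactId>
      <version>2.1.0</version>
    </dependency>
    <dependency>
      <!-- hive-jdbc needs Hadoop's common classes at runtime -->
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.6.0</version>
    </dependency>
    ```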
    
    
  4. Configure beeline with hive-site.xml
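
    For the Thrift server and beeline to share the same Hive metastore, place a hive-site.xml under $SPARK_HOME/conf; both read their configuration from there. A minimal sketch, assuming a MySQL-backed metastore on the host used above (the database name and credentials are placeholder values):

    ```
    <configuration>
      <property>
        <!-- JDBC URL of the MySQL database backing the metastore (placeholder host/db) -->
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://192.168.172.103:3306/hive?createDatabaseIfNotExist=true</value>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
      </property>
      <property>
        <!-- placeholder credentials -->
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hive</value>
      </property>
    </configuration>
    ```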
