Accessing a Hive Source from SparkSQL

Software Environment

hadoop-2.7.6
spark-2.3.0
scala-2.11.12
hive-2.1.1

SparkSQL command-line mode can connect to Hive directly

Copy the hive-site.xml file from the Hive directory, D:\Soft\apache-hive-2.1.1-bin\conf\hive-site.xml, into the Spark conf directory, D:\Soft\spark\conf.

Put the mysql-connector-java-5.1.30.jar package into the D:\Soft\spark\jars directory, so that Spark can reach the MySQL-backed Hive metastore.
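With these two files in place, launching the Spark SQL shell and running show tables; should list the Hive tables, assuming the MySQL-backed metastore is reachable.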

Connecting to Hive from SparkSQL in a Java Program

1) Copy the hive-site.xml file from the Hive directory, D:\Soft\apache-hive-2.1.1-bin\conf\hive-site.xml, into the project's \src\main\resources resource directory.

2) Add the Maven dependencies

 
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.11</artifactId>
        <version>2.3.1</version>
        <scope>provided</scope>
    </dependency>

    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>5.1.30</version>
    </dependency>
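Note that the provided scope assumes the Spark jars are supplied by the runtime environment (for example a Spark installation or cluster); when running the example straight from an IDE, the provided-scope dependencies may also need to be included on the run classpath.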

3) Create the SparkSession

    import org.apache.spark.sql.SparkSession;

    /**
     * Builds a SparkSession.
     * Supported data source: Hive
     * @return a SparkSession with Hive support enabled
     */
    public static SparkSession getSparkSessionForHive() {
        return SparkSession
                .builder()
                .appName("SparkSQLForHive")
                .master("local[*]")      // run locally, using all available cores
                .enableHiveSupport()     // picks up hive-site.xml from the classpath
                .getOrCreate();
    }
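If hive-site.xml cannot be placed on the classpath, the same connection settings can be passed to the builder directly. The following is a minimal sketch, assuming a metastore service at a placeholder thrift URI and a placeholder warehouse path (the values and the method name are illustrative, not from the original project):

    import org.apache.spark.sql.SparkSession;

    public class SparkUtil {
        // Hypothetical variant: configures Hive access on the builder
        // instead of relying on a hive-site.xml file on the classpath.
        public static SparkSession getSparkSessionForHiveExplicit() {
            return SparkSession
                    .builder()
                    .appName("SparkSQLForHive")
                    .master("local[*]")
                    // Placeholder metastore URI; point it at your metastore service.
                    .config("hive.metastore.uris", "thrift://localhost:9083")
                    // Placeholder location for managed tables.
                    .config("spark.sql.warehouse.dir", "D:/tmp/spark-warehouse")
                    .enableHiveSupport()
                    .getOrCreate();
        }
    }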

4) Test code

    public static void main(String[] args) {
        SparkSession spark = SparkUtil.getSparkSessionForHive();
        spark.sql("show tables").show();          // list tables in the default database
        spark.sql("select * from test1").show();  // read a Hive table through SQL
    }
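Beyond raw SQL strings, the same session also exposes the Dataset API. A short sketch, assuming the test1 table shown below with its name/age/address/school columns (the class name is illustrative):

    import static org.apache.spark.sql.functions.col;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class ReadHiveDatasetExample {
        public static void main(String[] args) {
            SparkSession spark = SparkUtil.getSparkSessionForHive();

            // Load the Hive table as a Dataset and filter with a column
            // expression instead of embedding the predicate in SQL.
            Dataset<Row> df = spark.table("test1");
            df.filter(col("age").gt(1)).show();

            // Stop the session explicitly when the job is done.
            spark.stop();
        }
    }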

5) Run results

18/11/18 22:36:44 INFO CodeGenerator: Code generated in 234.231366 ms
18/11/18 22:36:44 INFO CodeGenerator: Code generated in 11.285122 ms
+--------+--------------+-----------+
|database|     tableName|isTemporary|
+--------+--------------+-----------+
| default|bucket_persion|      false|
| default|   bucket_temp|      false|
| default|         hdfs1|      false|
| default|         hdfs2|      false|
| default|           pt1|      false|
| default|        tbcsv1|      false|
| default|        tbcsv2|      false|
| default|         test1|      false|
| default|  test_table_2|      false|
+--------+--------------+-----------+

.........

18/11/18 22:36:46 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1346 bytes result sent to driver
18/11/18 22:36:46 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 237 ms on localhost (executor driver) (1/1)
18/11/18 22:36:46 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
18/11/18 22:36:46 INFO DAGScheduler: ResultStage 0 (show at redHive.java:14) finished in 0.313 s
18/11/18 22:36:46 INFO DAGScheduler: Job 0 finished: show at redHive.java:14, took 0.352593 s
+-------+---+-------+------+
|   name|age|address|school|
+-------+---+-------+------+
|    chy|  1|     芜湖|    示范|
|    zjj|  2|     南京|    南开|
|gaoxing|  3|    马鞍山|   安工大|
+-------+---+-------+------+

18/11/18 22:36:46 INFO SparkContext: Invoking stop() from shutdown hook
