
# Hive and Spark Integration (Docker-based)

Hive 3.1.2

Spark 3.1.2

Reference: the Hive Tables (Hive metastore) section of the official Spark 3.1.2 documentation.

1. Upload all the jars under $HIVE_HOME to HDFS (a quick verification check follows the commands below):

    1. hdfs dfs -mkdir -p /opt/apache-hive-3.1.2-bin

    2. hdfs dfs -put $HIVE_HOME/hcatalog /opt/apache-hive-3.1.2-bin

    3. hdfs dfs -put $HIVE_HOME/jdbc /opt/apache-hive-3.1.2-bin

    4. hdfs dfs -put $HIVE_HOME/lib /opt/apache-hive-3.1.2-bin

    5. hdfs dfs -put $HIVE_HOME/scripts /opt/apache-hive-3.1.2-bin

    (The directory name must match the path used later in spark.sql.hive.metastore.jars.path.)
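To confirm the upload worked, a minimal check (reusing the HDFS path from the commands above) is to list the uploaded lib directory and compare it against the local $HIVE_HOME/lib:

    # Show a few of the uploaded Hive jars on HDFS
    hdfs dfs -ls /opt/apache-hive-3.1.2-bin/lib | head

    # Summary of directories, files, and bytes under the uploaded Hive directory
    hdfs dfs -count /opt/apache-hive-3.1.2-bin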

2. Rename $SPARK_HOME/conf/spark-defaults.conf.template to spark-defaults.conf (Spark only reads a file with this exact name) and add the Hive metastore settings (a shell sketch follows the settings):

    vi spark-defaults.conf:

        spark.sql.hive.metastore.version    3.1.2

        spark.sql.hive.metastore.jars   path

        spark.sql.hive.metastore.jars.path  hdfs://hadoop-master:9000/opt/apache-hive-3.1.2-bin/lib/*.jar
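As a sketch, the rename and the three settings above can also be applied from a shell (paths assumed from the earlier steps):

    # Create spark-defaults.conf from the shipped template
    cp $SPARK_HOME/conf/spark-defaults.conf.template $SPARK_HOME/conf/spark-defaults.conf

    # Append the Hive metastore settings, pointing at the jars uploaded in step 1
    {
      echo 'spark.sql.hive.metastore.version    3.1.2'
      echo 'spark.sql.hive.metastore.jars       path'
      echo 'spark.sql.hive.metastore.jars.path  hdfs://hadoop-master:9000/opt/apache-hive-3.1.2-bin/lib/*.jar'
    } >> $SPARK_HOME/conf/spark-defaults.conf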

3. Copy hive-site.xml into $SPARK_HOME/conf/
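A minimal sketch of the copy, assuming hive-site.xml sits in $HIVE_HOME/conf as in a default Hive install:

    # Make the Hive client configuration visible to Spark
    cp $HIVE_HOME/conf/hive-site.xml $SPARK_HOME/conf/hive-site.xml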

    In hive-site.xml, change this property:

        <property>
            <name>hive.server2.thrift.bind.host</name>
            <value>0.0.0.0</value>
            <description>Bind host on which to run the HiveServer2 Thrift service.</description>
        </property>

    to:

        <property>
            <name>spark.sql.warehouse.dir</name>
            <value>hdfs://hadoop-master:9000/user/hive/warehouse</value>
        </property>

4. Rebuild and start the Zeppelin container

    docker-compose -f docker-compose.yml up -d zeppelin

        Then configure the Spark interpreter (see the property sketch after this list):

            SPARK_HOME: /opt/spark

            master: yarn

            deploy mode: client
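As a sketch, these values typically map onto the following interpreter properties (spark.master and spark.submit.deployMode are standard Spark settings; verify the exact property names in your Zeppelin version's interpreter UI):

    # Zeppelin Spark interpreter settings (sketch)
    SPARK_HOME=/opt/spark
    spark.master=yarn
    spark.submit.deployMode=client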

Test:

    spark.sql("show databases").show()

    Check the job status on the YARN web UI (port 8088).
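If the web UI is not convenient, a rough equivalent from a shell on the cluster (assuming the standard yarn CLI is available) is:

    # List YARN applications; the Zeppelin/Spark session should show up here
    yarn application -list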

Done. If the databases are listed, the integration succeeded.
