Installing and Deploying Hive 3.1.2 on Tez 0.10.1

I. Setting Up the Hive Environment

  1. Installing Hive
1) Upload apache-hive-3.1.2-bin.tar.gz to the /opt/software directory on the Linux host
2) Extract apache-hive-3.1.2-bin.tar.gz into /opt/module/
[yili@hadoop102 software]$ tar -zxvf /opt/software/apache-hive-3.1.2-bin.tar.gz -C /opt/module/
3) Rename the extracted apache-hive-3.1.2-bin directory to hive-3.1.2
[yili@hadoop102 software]$ mv /opt/module/apache-hive-3.1.2-bin/ /opt/module/hive-3.1.2
4) Edit /etc/profile.d/my_env.sh and add the environment variables
[yili@hadoop102 software]$ sudo vim /etc/profile.d/my_env.sh
Add the following:
#HIVE_HOME
export HIVE_HOME=/opt/module/hive-3.1.2
export PATH=$PATH:$HIVE_HOME/bin
Restart the Xshell session, or source /etc/profile.d/my_env.sh, so that the environment variables take effect
[yili@hadoop102 software]$ source /etc/profile.d/my_env.sh
5) Resolve the logging JAR conflict: go to /opt/module/hive-3.1.2/lib and rename the conflicting JAR
[yili@hadoop102 lib]$ mv log4j-slf4j-impl-2.10.0.jar log4j-slf4j-impl-2.10.0.jar.bak
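The installation steps above can be double-checked with a small script. This is only a sketch: check_hive_install is a hypothetical helper (it is not part of Hive), and it assumes the layout used in this article, that is, a bin/hive launcher and a lib directory in which no log4j-slf4j-impl-*.jar should remain active.

```shell
# Sketch: sanity-check a Hive installation directory.
# check_hive_install is a hypothetical helper, not something Hive ships.
check_hive_install() {
    local hive_home="$1" status=0 jar
    # The launcher script should exist and be executable.
    if [ ! -x "$hive_home/bin/hive" ]; then
        echo "missing: $hive_home/bin/hive"
        status=1
    fi
    # A log4j-slf4j-impl jar that still ends in .jar has not been renamed yet.
    for jar in "$hive_home"/lib/log4j-slf4j-impl-*.jar; do
        if [ -e "$jar" ]; then
            echo "conflict: $jar"
            status=1
        fi
    done
    if [ "$status" -eq 0 ]; then
        echo "ok"
    fi
    return "$status"
}
```

Usage: check_hive_install /opt/module/hive-3.1.2 prints "ok" once the launcher is in place and the JAR from step 5 has been renamed.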
  1. Storing Hive Metadata in MySQL
    (1) Copy the driver
Copy the MySQL JDBC driver into Hive's lib directory
[yili@hadoop102 lib]$ cp /opt/software/mysql-connector-java-bin-5.1.27.jar /opt/module/hive-3.1.2/lib/

(2) Configure MySQL as the metastore

Create a new hive-site.xml file in the $HIVE_HOME/conf directory
[yili@hadoop102 conf]$ vim hive-site.xml
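Add the following content
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- JDBC connection URL of the metastore database -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hadoop102:3306/metastore?useSSL=false&amp;useUnicode=true&amp;characterEncoding=UTF-8</value>
    </property>

    <!-- JDBC driver class -->
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>

    <!-- MySQL user name -->
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>

    <!-- MySQL password -->
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
    </property>

    <!-- Warehouse directory on HDFS -->
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>

    <!-- Metastore schema version verification -->
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>

    <!-- HiveServer2 Thrift port -->
    <property>
        <name>hive.server2.thrift.port</name>
        <value>10000</value>
    </property>

    <!-- HiveServer2 bind host -->
    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>hadoop102</value>
    </property>

    <!-- Authorization for the metastore event notification API -->
    <property>
        <name>hive.metastore.event.db.notification.api.auth</name>
        <value>false</value>
    </property>

    <!-- Print column headers in the CLI -->
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>

    <!-- Show the current database in the CLI prompt -->
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>
</configuration>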
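Because hive-site.xml is XML, every & that separates JDBC URL parameters has to be written as &amp;, and a bare & makes the file unparsable. The following is a minimal sketch of a check for that mistake; find_bare_ampersands is a hypothetical helper name, and it only looks for this one class of error rather than validating the whole file.

```shell
# Hypothetical helper: print the lines of an XML file that contain a bare '&'.
# Exit status is 0 when at least one such line is found, 1 when none is found.
find_bare_ampersands() {
    # Strip well-formed entity references first, then report any '&' that is left.
    sed -E 's/&(amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);//g' "$1" | grep -n '&'
}
```

Usage: find_bare_ampersands "$HIVE_HOME/conf/hive-site.xml" prints nothing when every ampersand is escaped. If xmllint from libxml2 happens to be installed, xmllint --noout on the same file gives a full well-formedness check.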
  1. Starting Hive
    (1) Initialize the metastore database

1) Log in to MySQL

[yili@hadoop102 conf]$ mysql -uroot -p123456

2) Create the Hive metastore database

mysql> create database metastore;
mysql> quit;

3) Initialize the Hive metastore schema

[yili@hadoop102 conf]$ schematool -initSchema -dbType mysql -verbose

(2) Start the Hive client
1) Start the Hive client

[yili@hadoop102 hive-3.1.2]$ bin/hive

2) List the databases

hive (default)> show databases;
OK
database_name
default
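  1. Hive Execution Engines
    Hive supports three execution engines: MapReduce (the default), Tez, and Spark.
  2. Configuring Hive on Tez
    (1) Deploy Tez on the node where Hive is installed
    1) Tez download:
    Link: https://pan.baidu.com/s/1HgyIgO3GgNtqQaF8EYD4oQ
    Extraction code: yyds
    2) Copy the Tez packages to the cluster and unpack the tarball. Note that the one to unpack is the "minimal" tarball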
 mkdir /opt/module/tez
 tar -zxvf /opt/software/tez-0.10.1-SNAPSHOT-minimal.tar.gz -C /opt/module/tez

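3) Upload the Tez dependencies to HDFS (upload the tarball without "minimal" in its name)

# Create the /tez directory on HDFS first, then upload; pay attention to the path
hadoop fs -mkdir /tez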
hadoop fs -put /opt/software/tez-0.10.1-SNAPSHOT.tar.gz /tez
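Steps 2) and 3) are easy to mix up, since the minimal tarball is unpacked locally while the full tarball goes to HDFS. The sketch below wraps both steps in one function with a dry-run switch so the commands can be reviewed first. The names run, deploy_tez, DRY_RUN and SOFTWARE_DIR are hypothetical, the default paths are the ones used in this article, and -p is added to both mkdir calls so that a rerun does not fail on existing directories.

```shell
# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# deploy_tez VERSION: unpack the minimal tarball locally, upload the full one to HDFS.
deploy_tez() {
    local version="$1"
    local software="${SOFTWARE_DIR:-/opt/software}"   # where the tarballs were uploaded
    local tez_home="${TEZ_HOME:-/opt/module/tez}"     # local extraction directory
    run mkdir -p "$tez_home"
    run tar -zxf "$software/tez-$version-minimal.tar.gz" -C "$tez_home"
    run hadoop fs -mkdir -p /tez
    run hadoop fs -put "$software/tez-$version.tar.gz" /tez
}
```

Usage: set DRY_RUN=1 and call deploy_tez 0.10.1-SNAPSHOT to print the four commands, then unset DRY_RUN and call it again to execute them.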

4) Create tez-site.xml under $HADOOP_HOME/etc/hadoop/ (note: do not put it in hive/conf/, where it has no effect), and remember to sync tez-site.xml to the other machines in the cluster.

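<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- Make sure the path and file name match your own setup -->
<property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/tez/tez-0.10.1-SNAPSHOT.tar.gz</value>
</property>
<property>
    <name>tez.use.cluster.hadoop-libs</name>
    <value>true</value>
</property>
<property>
    <name>tez.am.resource.memory.mb</name>
    <value>1024</value>
</property>
<property>
    <name>tez.am.resource.cpu.vcores</name>
    <value>1</value>
</property>
<property>
    <name>tez.container.max.java.heap.fraction</name>
    <value>0.4</value>
</property>
<property>
    <name>tez.task.resource.memory.mb</name>
    <value>1024</value>
</property>
<property>
    <name>tez.task.resource.cpu.vcores</name>
    <value>1</value>
</property>
</configuration>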
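To get a feel for what tez.container.max.java.heap.fraction means in practice, here is a small worked calculation. It rests on an assumption about Tez's behavior: when the launch options carry no explicit -Xmx, the default JVM heap is taken to be the container memory multiplied by this fraction, truncated to whole megabytes. The function name default_heap_mb is hypothetical.

```shell
# default_heap_mb CONTAINER_MB FRACTION: container memory times heap fraction,
# truncated to an integer number of megabytes (assumed Tez default-heap rule).
default_heap_mb() {
    awk -v mem="$1" -v frac="$2" 'BEGIN { printf "%d\n", mem * frac }'
}

default_heap_mb 1024 0.4   # prints 409
```

With the values above, a 1024 MB container would leave roughly 409 MB for the Java heap and the rest for off-heap and JVM overhead.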

4) Update the Hadoop environment by adding the following shell profile

cd /opt/module/hadoop-3.1.3/etc/hadoop/shellprofile.d
vi example.sh
hadoop_add_profile tez
function _tez_hadoop_classpath
{
    hadoop_add_classpath "$HADOOP_HOME/etc/hadoop" after
    hadoop_add_classpath "/opt/module/tez/*" after
    hadoop_add_classpath "/opt/module/tez/lib/*" after
}
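Once the shell profile is in place, the output of hadoop classpath should contain the Tez directories. The sketch below filters a colon-separated classpath string for them; tez_classpath_entries is a hypothetical helper, and matching on "/tez" assumes the /opt/module/tez location used in this article.

```shell
# Hypothetical helper: print the Tez-related entries of a colon-separated classpath.
# Exit status is 1 when no entry matches.
tez_classpath_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | grep '/tez'
}
```

Usage: tez_classpath_entries "$(hadoop classpath)" should list /opt/module/tez/* and /opt/module/tez/lib/*. Empty output suggests the profile was not picked up.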

5) Switch Hive's execution engine: run vim $HIVE_HOME/conf/hive-site.xml and add the following


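<!-- Place these inside the existing <configuration> element -->
<property>
    <name>hive.execution.engine</name>
    <value>tez</value>
</property>
<property>
    <name>hive.tez.container.size</name>
    <value>1024</value>
</property>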

6) Add the Tez paths to hive-env.sh

export TEZ_HOME=/opt/module/tez    # the directory where you extracted Tez
export TEZ_JARS=""
for jar in `ls $TEZ_HOME |grep jar`; do
    export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/$jar
done
for jar in `ls $TEZ_HOME/lib`; do
    export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/lib/$jar
done

# TEZ_JARS already begins with ":", so it is appended directly after the hadoop-lzo JAR path
export HIVE_AUX_JARS_PATH=/opt/module/hadoop-3.1.3/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar$TEZ_JARS
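The two loops above parse the output of ls, which works here but is fragile with unusual file names. As an alternative sketch, the same colon-prefixed list can be built with shell globs. build_tez_jars is a hypothetical helper, and it differs slightly from the loops above: it picks up only files ending in .jar, both in the Tez directory and in its lib subdirectory.

```shell
# build_tez_jars DIR: print ":jar1:jar2..." for every *.jar directly in DIR and DIR/lib.
build_tez_jars() {
    local tez_home="$1" jars="" jar
    for jar in "$tez_home"/*.jar "$tez_home"/lib/*.jar; do
        # An unmatched glob stays literal, so keep only entries that are real files.
        if [ -f "$jar" ]; then
            jars="$jars:$jar"
        fi
    done
    printf '%s\n' "$jars"
}
```

Usage in hive-env.sh: export TEZ_JARS=$(build_tez_jars /opt/module/tez). The result still starts with a colon, so it can be appended to HIVE_AUX_JARS_PATH the same way.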

7) Resolve the logging JAR conflict

rm /opt/module/tez/lib/slf4j-log4j12-1.7.10.jar
  1. Testing Hive on Tez
1) Start Hive
[yili@hadoop102 hive]$ hive
2) Create a table
hive (default)> create table student(
id int,
name string);
3) Insert data into the table
hive> insert into table student values(3,'guodong');
4) If no error is reported, the setup works
Query ID = yili_20220721144352_ebaa6f16-71e2-4bd5-bfcf-b611af819ebb
Total jobs = 1
Launching Job 1 out of 1
Tez session was closed. Reopening...
Session re-established.
Session re-established.
Status: Running (Executing on YARN cluster with App id application_1658371098258_0011)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED  
----------------------------------------------------------------------------------------------
Map 1 .......... container     SUCCEEDED      1          1        0        0       0       0  
Reducer 2 ...... container     SUCCEEDED      1          1        0        0       0       0  
----------------------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 17.20 s    
----------------------------------------------------------------------------------------------
Loading data to table default.student
OK
Time taken: 31.1 seconds
hive> select * from student;
OK
1       zhangsan
2       linxin
3       guodong

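In practice, things rarely go this smoothly. Running the hive command and inserting data each raised an error; the two errors are described below.
error1:
[Figure 1: screenshot of the error]
Description: while Hive is loading, a required dependency is missing.
Fix: download commons-collections4-4.1.jar and place it in the $HIVE_HOME/lib directory.
error2:
[Figure 2: screenshot of the error]
Description: the error shown above is reported when inserting data into a Hive table.
Fix:
Run a LOAD once to load data from the local machine into the table. After that, the error no longer appears.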
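For the LOAD workaround, the local file has to match the table's storage format. The student table was created without a ROW FORMAT clause, so Hive's default text format applies and fields are separated by the \001 (Ctrl-A) character. The sketch below writes such a file. write_student_rows is a hypothetical helper, and the two rows simply mirror the first two rows of the query output shown earlier; how those rows were originally loaded is not stated in this article, so treat this as an assumption.

```shell
# write_student_rows FILE: write two sample rows using Hive's default \001 field delimiter.
write_student_rows() {
    printf '1\001zhangsan\n2\001linxin\n' > "$1"
}
```

Usage: after write_student_rows /tmp/student.txt (the path is only an example), load the file with hive -e "load data local inpath '/tmp/student.txt' into table student;".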
