Switching Hive's Execution Engine to Tez

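1. A note on versions:

hive-1.2.1+tez-0.9.0+hadoop-2.7.7
hive-2.3.6+tez-0.9.0+hadoop-2.7.1
I tried both combinations and both work. While configuring, I assumed the exact version pairing mattered a great deal, but after testing it turned out to matter little, and I have not run into any problems so far.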

2. Installing and configuring Tez

1. Extract the archive and rename the directory

[hadoop@hadoop01 tez]$ tar -zxvf apache-tez-0.9.0-bin.tar.gz
[hadoop@hadoop01 tez]$ mv apache-tez-0.9.0-bin tez-0.9.0

2. Create the /tez-0.9.0 directory on HDFS and upload tez.tar.gz into it

[hadoop@hadoop01 hadoop]$ hdfs dfs -mkdir /tez-0.9.0
[hadoop@hadoop01 hadoop]$ hdfs dfs -put /home/hadoop/app/tez/tez-0.9.0/share/tez.tar.gz  /tez-0.9.0

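3. Replace the Hadoop jars in Tez's lib directory so that their version matches the cluster's Hadoop version.
Delete the Hadoop jars in the local tez-0.9.0/lib directory and copy the matching jars from the Hadoop installation into it.

[hadoop@hadoop01 lib]$ rm hadoop-mapreduce-client-core-2.7.0.jar hadoop-mapreduce-client-common-2.7.0.jar
[hadoop@hadoop01 lib]$ cp /home/hadoop/app/hdfs/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.7.jar /home/hadoop/app/hdfs/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.7.jar ./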
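The manual swap can also be scripted. The following is only a sketch, not part of the original setup: the function name swap_hadoop_jars and its three arguments are hypothetical. It checks that both replacement jars exist before deleting anything, then removes the bundled hadoop-mapreduce-client-core and hadoop-mapreduce-client-common jars and copies in the cluster's versions.

```shell
# Hypothetical helper: replace the MapReduce client jars bundled with Tez
# by the ones shipped with the cluster's Hadoop version.
#   $1 = Tez lib directory
#   $2 = Hadoop share/hadoop/mapreduce directory
#   $3 = Hadoop version, for example 2.7.7
swap_hadoop_jars() {
    local tez_lib=$1 hadoop_mr=$2 version=$3 name
    # First pass: refuse to touch anything if a replacement jar is missing.
    for name in hadoop-mapreduce-client-core hadoop-mapreduce-client-common; do
        if [ ! -f "$hadoop_mr/$name-$version.jar" ]; then
            echo "missing $hadoop_mr/$name-$version.jar" >&2
            return 1
        fi
    done
    # Second pass: drop the old versions and copy the matching ones in.
    for name in hadoop-mapreduce-client-core hadoop-mapreduce-client-common; do
        rm -f "$tez_lib/$name"-[0-9]*.jar
        cp "$hadoop_mr/$name-$version.jar" "$tez_lib/"
    done
}
```

With the paths used in this post, the call would be: swap_hadoop_jars /home/hadoop/app/tez/tez-0.9.0/lib /home/hadoop/app/hdfs/hadoop/hadoop-2.7.7/share/hadoop/mapreduce 2.7.7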

4. Add a tez-site.xml file under hadoop/hadoop-2.7.7/etc/hadoop/

[hadoop@hadoop01 hadoop]$ vi tez-site.xml



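<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/tez-0.9.0/tez.tar.gz</value>
  </property>
  <property>
    <name>tez.container.max.java.heap.fraction</name>
    <value>0.2</value>
  </property>
  <property>
    <name>tez.use.cluster.hadoop-libs</name>
    <value>true</value>
  </property>
  <property>
    <name>tez.history.logging.service.class</name>
    <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
  </property>
</configuration>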


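To make tez.container.max.java.heap.fraction concrete: when no explicit heap size is given, Tez applies this fraction to the container's memory to derive the JVM -Xmx. A rough sketch of that arithmetic follows; the 1024 MB container size and the function name heap_mb are illustrative assumptions, not values taken from this cluster.

```shell
# Illustrative only: approximate the JVM heap (MB) derived from a
# container size (MB) and a heap fraction, truncated to an integer.
heap_mb() {
    awk -v mem="$1" -v frac="$2" 'BEGIN { printf "%d\n", mem * frac }'
}

heap_mb 1024 0.2   # fraction set in tez-site.xml here; prints 204
heap_mb 1024 0.8   # Tez's default fraction; prints 819
```

So with 0.2, most of each container's memory stays outside the Java heap. If tasks fail with heap-space errors, this value is one of the first things worth revisiting.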
Then copy the file to the other nodes

[hadoop@hadoop01 hadoop]$ scp tez-site.xml hadoop@hadoop02:/home/hadoop/app/hdfs/hadoop/hadoop-2.7.7/etc/hadoop
[hadoop@hadoop01 hadoop]$ scp tez-site.xml hadoop@hadoop03:/home/hadoop/app/hdfs/hadoop/hadoop-2.7.7/etc/hadoop
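With more than a couple of nodes, the copy is easier as a loop. A minimal sketch, assuming the host names and paths used in this post; the function name print_sync_cmds is hypothetical, and it only prints the scp commands so they can be reviewed before anything is run.

```shell
# Hypothetical helper: print one scp command per target node.
#   $1 = file to copy, $2 = destination directory, remaining args = host names
print_sync_cmds() {
    local file=$1 dest=$2 host
    shift 2
    for host in "$@"; do
        echo "scp $file hadoop@$host:$dest"
    done
}

print_sync_cmds tez-site.xml /home/hadoop/app/hdfs/hadoop/hadoop-2.7.7/etc/hadoop hadoop02 hadoop03
```

Once the printed commands look right, piping the output to sh executes them.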

3. Configuring Hive to use Tez

1. Add the following to hive-env.sh

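export TEZ_HOME=/home/hadoop/app/tez/tez-0.9.0
export TEZ_JARS=""
# Collect the jars in the top-level Tez directory
for jar in `ls $TEZ_HOME | grep jar`; do
    export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/$jar
done
# Collect the dependency jars under lib
for jar in `ls $TEZ_HOME/lib`; do
    export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/lib/$jar
done
# Point this at the hadoop-lzo jar found in your own Hadoop installation.
# TEZ_JARS already starts with ":", so it is appended directly.
export HIVE_AUX_JARS_PATH=$HADOOP_HOME/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar$TEZ_JARS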
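A wrong entry in HIVE_AUX_JARS_PATH, such as a hadoop-lzo jar with a different version number, is easy to miss. The sketch below lists the entries of a colon-separated jar list that do not exist on disk. The function name missing_jars is hypothetical and not part of Hive.

```shell
# Hypothetical helper: print every entry of a colon-separated jar list
# that is not an existing regular file. Empty entries are skipped.
#   $1 = colon-separated list of jar paths
missing_jars() {
    local entry old_ifs=$IFS
    IFS=:
    for entry in $1; do
        if [ -n "$entry" ] && [ ! -f "$entry" ]; then
            echo "$entry"
        fi
    done
    IFS=$old_ifs
}
```

After sourcing hive-env.sh, running missing_jars "$HIVE_AUX_JARS_PATH" should print nothing if every jar is where the variable says it is.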

2. Start Hadoop and Hive, then run a test query

hive> set hive.execution.engine=tez;
hive> select
    > age,
    > count(*)
    > from tz
    > group by age
    > ;
Query ID = hadoop_20190920100059_baf244cb-d9ce-435b-a74a-f884a0d95835
Total jobs = 1
Launching Job 1 out of 1


Status: Running (Executing on YARN cluster with App id application_1568944648748_0001)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 26.91 s
--------------------------------------------------------------------------------
OK
16	2
18	3
Time taken: 64.555 seconds, Fetched: 2 row(s)

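The query ran successfully on Tez.

Note: making every MapReduce job on YARN use Tez

Configuring Tez at the Hadoop level affects the whole cluster: every MapReduce job that runs on YARN will then go through the Tez engine.

1. Add the following to hadoop-env.sh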

export TEZ_HOME=/opt/moudle/tez  # the directory where you extracted Tez
for jar in `ls $TEZ_HOME |grep jar`; do
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$TEZ_HOME/$jar
done
for jar in `ls $TEZ_HOME/lib`; do
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$TEZ_HOME/lib/$jar
done

2. Edit mapred-site.xml


<property>
        <name>mapreduce.framework.name</name>
        <value>yarn-tez</value>
</property>

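Before syncing, it can help to confirm which value a site file actually carries. A minimal sketch using awk, assuming the usual Hadoop layout in which each name element is followed by its value element on a separate line; the function name get_prop is hypothetical, and this is a line-based lookup rather than a real XML parser.

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in a Hadoop-style *-site.xml. Assumes one tag per line.
#   $1 = property name, $2 = path to the XML file
get_prop() {
    awk -v key="$1" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$2"
}
```

For example, get_prop mapreduce.framework.name "$HADOOP_HOME/etc/hadoop/mapred-site.xml" should print yarn-tez once the edit above is in place.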
3. Sync the modified files to the other nodes in the cluster
