Installing and Testing Tez

Tez 0.5 and later versions are not compatible with Hive 0.13.

1. First, unpack the compiled Tez package
tar -xvf tez-0.7.0.tar.gz -C /home/hadoop/tez
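
Note that tar's -C option fails if the target directory does not exist yet, so it can help to create the directory first. A minimal sketch, assuming the archive name and target directory from the command above; the function name unpack_tez is only illustrative:

```shell
# Hypothetical helper: unpack a Tez tarball into a directory,
# creating the directory first because `tar -C` requires it to exist.
unpack_tez() {
    tarball=$1
    dest=$2
    mkdir -p "$dest" || return 1
    tar -xf "$tarball" -C "$dest"
}

# Example call with the paths used in this post:
#   unpack_tez tez-0.7.0.tar.gz /home/hadoop/tez
```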
2. Create a directory on HDFS and upload the Tez package to it

[hadoop@master tez]$ hadoop fs -mkdir /tez

16/04/04 06:47:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[hadoop@master tez]$ hadoop fs -put ../software/tez-0.7.0.tar.gz  /tez

16/04/04 06:48:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[hadoop@master tez]$ hadoop fs -ls /tez

16/04/04 06:48:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r-- 3 hadoop supergroup 39070341 2016-04-04 06:48 /tez/tez-0.7.0.tar.gz
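
If this step has to be repeated, a plain -mkdir or -put fails once the directory or file already exists. The sketch below wraps the same three commands with the -p and -f options of hadoop fs so they can be re-run; the function name upload_tez is an assumption made for illustration:

```shell
# Hypothetical helper: upload the Tez tarball to HDFS so that re-running
# the step does not fail when the directory or file already exists.
upload_tez() {
    tarball=$1      # local path, e.g. ../software/tez-0.7.0.tar.gz
    hdfs_dir=$2     # HDFS directory, e.g. /tez
    hadoop fs -mkdir -p "$hdfs_dir" || return 1            # -p: no error if it exists
    hadoop fs -put -f "$tarball" "$hdfs_dir" || return 1   # -f: overwrite an old copy
    hadoop fs -ls "$hdfs_dir"
}

# Example: upload_tez ../software/tez-0.7.0.tar.gz /tez
```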

3. On the Hadoop master node, create a tez-site.xml file in the ${HADOOP_HOME}/etc/hadoop directory with the following content:



<configuration>
<property>
<name>tez.lib.uris</name>
<value>${fs.defaultFS}/tez/tez-0.7.0.tar.gz</value>
</property>
</configuration>
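
The file can also be generated from the shell. This is a sketch under the assumption that the tarball sits at /tez/tez-0.7.0.tar.gz on HDFS as in step 2; write_tez_site is a made-up function name. The point worth noting is the quoted heredoc delimiter, which stops the shell from trying to expand ${fs.defaultFS}:

```shell
# Hypothetical helper: write a minimal tez-site.xml into a Hadoop conf directory.
write_tez_site() {
    conf_dir=$1   # e.g. "$HADOOP_HOME/etc/hadoop"
    # The quoted 'EOF' keeps the shell from expanding ${fs.defaultFS};
    # Hadoop resolves that placeholder itself when it loads the file.
    cat > "$conf_dir/tez-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/tez/tez-0.7.0.tar.gz</value>
  </property>
</configuration>
EOF
}

# Example: write_tez_site "$HADOOP_HOME/etc/hadoop"
```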

This completes the basic Tez configuration. There are two ways to make jobs run on Tez:

1. MapReduce on Tez
1.1 Edit hadoop-env.sh

    # Tez install directory (the directory the package was unpacked into in step 1)
    export TEZ_HOME=/home/hadoop/tez
    # Append the Tez jars and the dependency jars under lib/ to HADOOP_CLASSPATH
    for jar in "$TEZ_HOME"/*.jar "$TEZ_HOME"/lib/*.jar; do
        export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$jar
    done
Add the code above after the line where HADOOP_CLASSPATH is declared and assigned; it appends all of the Tez jars to HADOOP_CLASSPATH.
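
To check that the jars actually ended up on the classpath, the entries can be counted. A small sketch; count_tez_entries is a hypothetical helper and it only looks at file names that start with "tez-":

```shell
# Hypothetical helper: count the entries of a colon-separated classpath
# whose file name starts with "tez-".
count_tez_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -c '/tez-[^/]*$'
}

# Example, in a shell that has sourced hadoop-env.sh:
#   count_tez_entries "$HADOOP_CLASSPATH"
```

A result of 0 suggests that TEZ_HOME points at the wrong directory or that the loop was not picked up.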
1.2 Edit mapred-site.xml
<property>
         <name>mapreduce.framework.name</name>
         <value>yarn-tez</value>
 </property>

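Changing the value of mapreduce.framework.name from yarn to yarn-tez means that the MR jobs in this cluster will run on Tez.
After modifying the three files above (tez-site.xml, hadoop-env.sh, and mapred-site.xml), sync them to the other nodes and restart the Hadoop cluster.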
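
The sync step could look roughly like the following. This is a sketch, not the author's procedure: it assumes passwordless SSH, the same configuration path on every node, and placeholder host names (slave1, slave2); sync_tez_conf is a made-up function name.

```shell
# Hypothetical helper: copy the three edited files to each worker node.
# Assumes passwordless SSH and the same conf directory path on every node.
sync_tez_conf() {
    conf_dir=$1; shift
    for node in "$@"; do
        scp "$conf_dir/tez-site.xml" "$conf_dir/hadoop-env.sh" \
            "$conf_dir/mapred-site.xml" "$node:$conf_dir/" || return 1
    done
}

# Example (slave1 and slave2 are placeholder host names):
#   sync_tez_conf "$HADOOP_HOME/etc/hadoop" slave1 slave2
# Then restart the cluster with the standard scripts:
#   "$HADOOP_HOME/sbin/stop-yarn.sh" && "$HADOOP_HOME/sbin/stop-dfs.sh"
#   "$HADOOP_HOME/sbin/start-dfs.sh" && "$HADOOP_HOME/sbin/start-yarn.sh"
```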

Submit a job on Hadoop:

[hadoop@master tez]$ hadoop jar tez-examples-0.7.0.jar orderedwordcount /input1/EMP /output3 

The YARN web UI on port 8088 shows that the program is running on Tez.

If the application type shown there is TEZ, Tez has been installed successfully.
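
The same check can be made from the command line instead of the web UI. A minimal sketch using the yarn application -list command with its type and state filters; the wrapper name list_tez_apps is only illustrative:

```shell
# Hypothetical helper: list YARN applications of type TEZ in any state,
# the command-line counterpart of the application list on port 8088.
list_tez_apps() {
    yarn application -list -appTypes TEZ -appStates ALL
}
```

If the orderedwordcount job submitted above appears in this list, it ran as a Tez application.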
This configuration is global: from now on every MapReduce job goes through Tez. If only Hive jobs should run on Tez, configure it as follows instead.

2. Running Hive's MapReduce jobs on Tez
Note: first roll back all of the changes made in method 1.
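
For mapred-site.xml, the rollback amounts to setting the value back to yarn. The sketch below does that with sed and only handles the one-line value form used in 1.2; revert_framework_name is a hypothetical helper. The loop added to hadoop-env.sh would presumably have to be removed by hand, and the files synced and the cluster restarted again.

```shell
# Hypothetical helper: set mapreduce.framework.name in a mapred-site.xml
# from yarn-tez back to yarn. Only handles the one-line <value> form.
revert_framework_name() {
    file=$1
    tmp="$file.tmp.$$"
    sed 's|<value>yarn-tez</value>|<value>yarn</value>|' "$file" > "$tmp" &&
        mv "$tmp" "$file"
}

# Example: revert_framework_name "$HADOOP_HOME/etc/hadoop/mapred-site.xml"
```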

2.1 Copy all jars from the Tez install directory and from its lib subdirectory into the ${HIVE_HOME}/lib directory

[hadoop@master tez]$ cp tez-*.jar ../apache-hive-0.13.0-bin/lib/
[hadoop@master lib]$ cp *.jar ~/apache-hive-0.13.0-bin/lib/

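With this setup, running a job fails with the error below. The NoSuchMethodError shows that Hive 0.13 calls MRHelpers.getBaseMRConfiguration, a method this Tez version does not provide; in other words, Tez and Hive 0.13 are incompatible, so we need to download Hive 0.14.
Error message: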
java.lang.NoSuchMethodError: org.apache.tez.mapreduce.hadoop.MRHelpers.getBaseMRConfiguration(Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/conf/Configuration;
at org.apache.hadoop.hive.ql.exec.tez.DagUtils.createConfiguration(DagUtils.java:857)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:121)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.tez.TezTask. org.apache.tez.mapreduce.hadoop.MRHelpers.getBaseMRConfiguration(Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/conf/Configuration;

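After installing Hive 0.14, remember to update the value of HIVE_HOME.
Then copy all jars from the Tez install directory and from its lib subdirectory into the ${HIVE_HOME}/lib directory: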

[hadoop@master tez]$ cp *.jar ~/hive/apache-hive-0.14.0-bin/lib/
[hadoop@master tez]$ cp ./lib/*.jar ~/hive/apache-hive-0.14.0-bin/lib/
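
The two steps (pointing HIVE_HOME at the new install, then copying the jars) can be combined so the copy always targets whatever HIVE_HOME is set to. A sketch under the paths used in this post; copy_tez_jars is a made-up function name:

```shell
# Hypothetical helper: copy the Tez jars and their lib/ dependencies
# into Hive's lib directory.
copy_tez_jars() {
    tez_home=$1    # e.g. /home/hadoop/tez
    hive_home=$2   # e.g. ~/hive/apache-hive-0.14.0-bin
    cp "$tez_home"/*.jar "$tez_home"/lib/*.jar "$hive_home/lib/"
}

# Example, after pointing HIVE_HOME at the new install:
#   export HIVE_HOME=~/hive/apache-hive-0.14.0-bin
#   copy_tez_jars /home/hadoop/tez "$HIVE_HOME"
```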

If running a Hive statement produces output like the following, Hive on Tez is working:

hive> set hive.execution.engine=tez;
hive> select count(1) from xiaomi;  
Query ID = hadoop_20160405065252_0fa4e612-3b2f-468a-8d61-d700c96105d3
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1459863530667_0002)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 10.96 s    
--------------------------------------------------------------------------------
OK
300
Time taken: 23.82 seconds, Fetched: 1 row(s)

To run a job on MapReduce instead, use set hive.execution.engine=mr;
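
The engine can also be chosen per invocation from the shell instead of inside the Hive prompt. A minimal sketch using the hive CLI's --hiveconf and -e options; run_with_engine is a hypothetical wrapper and the query is the one from the session above:

```shell
# Hypothetical helper: run one query with an explicit execution engine.
run_with_engine() {
    engine=$1   # tez or mr
    query=$2
    hive --hiveconf hive.execution.engine="$engine" -e "$query"
}

# Example: run_with_engine tez 'select count(1) from xiaomi;'
```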

With the configuration above, the cluster is already in Tez mode, so set hive.execution.engine=mr; seems to have no effect: the application type in the job list on port 8088 is still TEZ.
