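Tez official site: http://tez.apache.org
Before using Tez as a compute engine, a word about tez-ui. tez-ui is the web interface for viewing the execution logs of Tez jobs, and it depends on the YARN timeline service. Tez 0.8.3 also adds tez-ui2.
The timeline service was added to Apache Hadoop 2.6.0 as a YARN sub-service. The JobHistoryServer can only store MapReduce history logs and does not support access to the history logs of other compute engines such as Tez or Spark, which is why the timeline service was introduced. The timeline server gives access to the history logs of MapReduce, Tez, Spark on YARN and other engines when jobs run in non-local mode; the JobHistoryServer can still be run alongside it.
Apache Hadoop 2.6.4+ or Apache Hadoop 2.7.2+ is recommended, because earlier versions have more timeline service bugs.
See http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-yarn/CHANGES.txt for the detailed per-release improvements and bug fixes.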
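1. Install JDK 1.7, Maven 3.3.* and protobuf 2.5.0.
2. Download the source from http://www.apache.org/dyn/closer.lua/tez/ (tez-ui is supported from 0.6.* onward, so 0.7.* or 0.8.* is recommended).
Extract it to the following directory: ${project_home}/apache-tez-0.8.3-src
3. Edit these parameters in pom.xml.
The Hadoop version: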
<hadoop.version>2.6.4</hadoop.version>
The location of the protoc command after protobuf is installed:
<protoc.path>/usr/local/protobuf-2.5.0/bin/protoc</protoc.path>
4. With the configuration changed, run the build from the source directory (you can also build from Eclipse or IntelliJ IDEA):
mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true
The screen then scrolls for a while and ends with a column of SUCCESS lines (or, of course, failures):
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] tez ............................................... SUCCESS [1.626s]
[INFO] hadoop-shim ....................................... SUCCESS [1.720s]
[INFO] tez-api ........................................... SUCCESS [5.598s]
[INFO] tez-common ........................................ SUCCESS [0.425s]
[INFO] tez-runtime-internals ............................. SUCCESS [0.612s]
[INFO] tez-runtime-library ............................... SUCCESS [1.688s]
[INFO] tez-mapreduce ..................................... SUCCESS [0.988s]
[INFO] tez-examples ...................................... SUCCESS [0.202s]
[INFO] tez-dag ........................................... SUCCESS [2.407s]
[INFO] tez-tests ......................................... SUCCESS [0.572s]
[INFO] tez-ext-service-tests ............................. SUCCESS [0.487s]
[INFO] tez-ui ............................................ SUCCESS [10.163s]
[INFO] tez-ui2 ........................................... SUCCESS [1:51.654s]
[INFO] tez-plugins ....................................... SUCCESS [0.023s]
[INFO] tez-yarn-timeline-history ......................... SUCCESS [0.383s]
[INFO] tez-yarn-timeline-history-with-acls ............... SUCCESS [0.254s]
[INFO] tez-history-parser ................................ SUCCESS [7.432s]
[INFO] tez-tools ......................................... SUCCESS [0.022s]
[INFO] tez-perf-analyzer ................................. SUCCESS [0.022s]
[INFO] tez-job-analyzer .................................. SUCCESS [0.272s]
[INFO] tez-javadoc-tools ................................. SUCCESS [0.095s]
[INFO] hadoop-shim-impls ................................. SUCCESS [0.021s]
[INFO] hadoop-shim-2.6 ................................... SUCCESS [0.118s]
[INFO] tez-dist .......................................... SUCCESS [9.088s]
[INFO] Tez ............................................... SUCCESS [0.052s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2:36.974s
[INFO] Finished at: Sun Apr 24 19:10:56 CST 2016
[INFO] Final Memory: 94M/1298M
[INFO] ------------------------------------------------------------------------

Process finished with exit code 0
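If the build fails instead, one thing worth ruling out is the protoc version, since step 1 pins protobuf 2.5.0. The following is a minimal sketch, not part of the Tez build: the helper names are my own, and it assumes `protoc --version` prints a line such as `libprotoc 2.5.0`.

```python
import re
import subprocess

REQUIRED_PROTOC = "2.5.0"  # version named in step 1 of this article


def parse_protoc_version(output):
    """Extract the version from `protoc --version` output, or None if absent."""
    match = re.search(r"libprotoc\s+(\d+(?:\.\d+)*)", output)
    return match.group(1) if match else None


def protoc_matches(protoc_path, required=REQUIRED_PROTOC):
    """Run the given protoc binary and compare its version with `required`.

    Returns False when the binary is missing or cannot be executed.
    """
    try:
        result = subprocess.run(
            [protoc_path, "--version"], capture_output=True, text=True
        )
    except OSError:
        return False
    return parse_protoc_version(result.stdout) == required
```

Calling `protoc_matches("/usr/local/protobuf-2.5.0/bin/protoc")` with the `<protoc.path>` value from pom.xml tells you whether that binary reports the expected version.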
The build produces these packages:
${project_home}/apache-tez-0.8.3-src/tez-dist/target/tez-0.8.3-minimal.tar.gz
${project_home}/apache-tez-0.8.3-src/tez-dist/target/tez-0.8.3.tar.gz
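The tarball then has to reach HDFS, because tez-site.xml later points tez.lib.uris at hdfs://beh/engine/tez/tez.tar.gz. Here is a rough sketch of that upload, assuming the `hdfs` command is on the PATH and that /engine/tez is the target directory; the function names are hypothetical helpers, not a Tez tool.

```python
import subprocess


def upload_commands(local_tar, hdfs_dir="/engine/tez", target_name="tez.tar.gz"):
    """Build the `hdfs dfs` commands that place the tarball at hdfs_dir/target_name."""
    return [
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        # -f overwrites an existing tarball from an earlier build
        ["hdfs", "dfs", "-put", "-f", local_tar, f"{hdfs_dir}/{target_name}"],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

For example, `run_all(upload_commands("tez-dist/target/tez-0.8.3.tar.gz"))` run from the source directory would create the directory and upload the full package under the name used in tez.lib.uris.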
Add a tez-site.xml file under $HADOOP_HOME/etc/hadoop with the following content (there are many more performance parameters; add them according to your own environment):
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>hdfs://beh/engine/tez/tez.tar.gz</value>
    <!--<value>file:///opt/beh/core/hadoop/lib/tez.tar.gz</value>-->
  </property>
  <property>
    <name>tez.history.logging.service.class</name>
    <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
  </property>
  <property>
    <description>Publish configuration information to Timeline server.</description>
    <name>tez.runtime.convert.user-payload.to.history-text</name>
    <value>true</value>
  </property>
  <property>
    <description>URL for where the Tez UI is hosted</description>
    <name>tez.tez-ui.history-url.base</name>
    <value>http://hadoop001:8280/tez-ui/</value>
  </property>
  <property>
    <name>tez.allow.disabled.timeline-domains</name>
    <value>true</value>
  </property>
</configuration>
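Notes:
# This parameter points to the compiled Tez package. Upload the tar package to HDFS; storing it locally is best avoided. Either the minimal package or the full package can be used here (per the Tez install guide, as far as I know, the minimal package also expects tez.use.cluster.hadoop-libs to be true so that the cluster's own Hadoop jars are used).
<property>
  <name>tez.lib.uris</name>
  <value>hdfs://beh/engine/tez/tez.tar.gz</value>
  <!--<value>file:///opt/beh/core/hadoop/lib/tez.tar.gz</value>-->
</property>
# This parameter is the address of the web service that hosts tez-ui. A hostname or an IP address can be used, and the port is your choice. Because tez-ui is a web app it needs a web server; I use Tomcat here, and how to set it up is covered further down.
<property>
  <description>URL for where the Tez UI is hosted</description>
  <name>tez.tez-ui.history-url.base</name>
  <value>http://hadoop001:8280/tez-ui/</value>
</property>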
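A typo in either of these two properties is easy to miss, so a small check can help. This sketch reads a Hadoop-style configuration file and flags the two settings from the notes above; the function names and the exact checks are my own assumptions, not anything Tez provides.

```python
import xml.etree.ElementTree as ET


def load_hadoop_conf(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config document."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            conf[name.strip()] = (prop.findtext("value") or "").strip()
    return conf


def check_tez_site(conf):
    """Collect readable problems with the two settings discussed in the notes."""
    problems = []
    if not conf.get("tez.lib.uris", "").startswith("hdfs://"):
        problems.append("tez.lib.uris should point at a tarball on HDFS")
    if not conf.get("tez.tez-ui.history-url.base", "").startswith("http"):
        problems.append("tez.tez-ui.history-url.base should be an http(s) URL")
    return problems
```

Reading $HADOOP_HOME/etc/hadoop/tez-site.xml into a string and passing it through `check_tez_site(load_hadoop_conf(text))` returns an empty list when both values look plausible.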
Add the Tez environment variables to $HADOOP_HOME/etc/hadoop/hadoop-env.sh:
##tez
export BEH_HOME=/opt/beh
export TEZ_HOME=${BEH_HOME}/core/tez
export TEZ_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:${TEZ_CONF_DIR}:${TEZ_HOME}/*:${TEZ_HOME}/lib/*
TEZ_HOME is the directory where you extracted the Tez package.
Optional: to run existing MapReduce jobs on Tez, modify mapred-site.xml to change the "mapreduce.framework.name" property from "yarn" to "yarn-tez":
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn-tez</value>
</property>
Configure the timeline service in $HADOOP_HOME/etc/hadoop/yarn-site.xml.
For the relevant settings, refer to the official YARN and Tez documentation.
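As a starting point, the sketch below prints the yarn-site.xml properties that, to my understanding, the Tez timeline documentation asks for. Treat the list as an assumption to verify against the official pages; the hostname hadoop001 is simply taken from the URLs used elsewhere in this article, and `render_properties` is a hypothetical helper.

```python
from xml.sax.saxutils import escape

# Assumed minimal set of timeline settings; verify against the YARN and Tez docs.
TIMELINE_PROPS = {
    "yarn.timeline-service.enabled": "true",
    "yarn.timeline-service.hostname": "hadoop001",
    "yarn.timeline-service.http-cross-origin.enabled": "true",
    "yarn.resourcemanager.system-metrics-publisher.enabled": "true",
    "yarn.timeline-service.generic-application-history.enabled": "true",
}


def render_properties(props):
    """Render a dict as Hadoop <property> blocks ready to paste into a *-site.xml."""
    lines = []
    for name, value in props.items():
        lines += [
            "<property>",
            f"  <name>{escape(name)}</name>",
            f"  <value>{escape(value)}</value>",
            "</property>",
        ]
    return "\n".join(lines)


print(render_properties(TIMELINE_PROPS))
```

The printed blocks go inside the `<configuration>` element of yarn-site.xml; the timeline server and the ResourceManager need a restart for them to take effect.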
Modify $HIVE_HOME/conf/hive-site.xml and add the following settings:
<!--tez Start-->
<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>
<property>
  <name>hive.tez.container.size</name>
  <value>4096</value>
</property>
<property>
  <name>hive.tez.java.opts</name>
  <value>-server -Xmx4096m -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseParallelGC</value>
</property>
<property>
  <name>hive.server2.tez.initialize.default.sessions</name>
  <value>false</value>
</property>
<property>
  <name>hive.server2.tez.default.queues</name>
  <value>default</value>
</property>
<property>
  <name>hive.tez.input.format</name>
  <value>org.apache.hadoop.hive.ql.io.HiveInputFormat</value>
</property>
<property>
  <name>hive.server2.tez.sessions.per.default.queue</name>
  <value>1</value>
</property>
<!--tez End-->
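One detail in these settings is worth a second look: hive.tez.container.size is 4096 MB and -Xmx is also 4096m, which leaves no room in the container for non-heap memory. A commonly quoted rule of thumb, which is an assumption here and not something taken from the Tez or Hive documentation cited in this article, is to give the heap roughly 80% of the container. The arithmetic is trivial but easy to get wrong by hand:

```python
def suggested_xmx_mb(container_mb, heap_fraction=0.8):
    """Heap size in MB for a container, under the assumed 80% rule of thumb."""
    return int(container_mb * heap_fraction)


# For the 4096 MB container configured above
print(f"-Xmx{suggested_xmx_mb(4096)}m")
```

Whether 80% is right for your workload depends on how much off-heap memory the tasks use, so treat the fraction as a tunable rather than a fixed value.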
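Extracting the Tez package yields tez-ui-0.8.3.war (or another version, depending on what you built). In configs.js, under the scripts directory inside this war, set the host and port of the ResourceManager service and of the timeline service.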
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

App.setConfigs({

  /* Environment configurations */
  envDefaults: {
    version: "0.8.3",

    /*
     * By default TEZ UI looks for timeline server at http://localhost:8188, uncomment and change
     * the following value for pointing to a different domain.
     */
    // timelineBaseUrl: 'http://localhost:8188',
    timelineBaseUrl: 'http://hadoop001:8188',

    /*
     * By default RM web interface is expected to be at http://localhost:8088, uncomment and change
     * the following value to point to a different domain.
     */
    // RMWebUrl: 'http://localhost:8088',
    RMWebUrl: 'http://hadoop001:23188',

    /*
     * Ensures that some of the UI features work with old versions of Tez
     */
    compatibilityMode: false,

    /*
     * Default time zone for UI display. Set to undefined for local timezone
     * For configuration see http://momentjs.com/timezone/docs/
     */
    //timezone: "UTC",
  },

  /*
   * Visibility of table columns can be controlled using the column selector. Also an optional set of
   * file system counters can be enabled as columns for most of the tables. For adding more counters
   * as columns edit the following 'tables' object. Counters must be added as configuration objects
   * of the following format.
   *    {
   *      counterName: '<Counter ID>',
   *      counterGroupName: '<Group ID>',
   *    }
   *
   * Note: Till 0.6.0 the properties were counterId and groupId, their use is deprecated now.
   */
  tables: {
    /*
     * Entity specific columns must be added into the respective array.
     */
    entity: {
      dag: [
        // { // Following is a sample configuration object.
        //   counterName: 'FILE_BYTES_READ',
        //   counterGroupName: 'org.apache.tez.common.counters.FileSystemCounter',
        // }
      ],
      vertex: [],
      task: [],
      taskAttempt: [],
      tezApp: [],
    },

    /*
     * User sharedColumns to add counters that must be displayed in all tables.
     */
    sharedColumns:[]
  }

});
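If you redeploy often, the two URL edits can be scripted instead of done by hand. The sketch below rewrites the first line that sets a given key, uncommenting it when necessary. It assumes the key sits on its own line, as in the file above; `set_js_config` is a hypothetical helper, meant to be applied to scripts/configs.js after extracting the war.

```python
import re


def set_js_config(text, key, url):
    """Point `key` at `url` on the first line that sets it, commented out or not."""
    pattern = re.compile(
        r"^([ \t]*)(?://[ \t]*)?" + re.escape(key) + r":.*$", re.MULTILINE
    )
    # A function replacement keeps backslashes in `url` from being misread as escapes.
    return pattern.sub(lambda m: f"{m.group(1)}{key}: '{url}',", text, count=1)


def patch_configs(text, timeline_url, rm_url):
    """Apply both edits described in this article to the contents of configs.js."""
    text = set_js_config(text, "timelineBaseUrl", timeline_url)
    return set_js_config(text, "RMWebUrl", rm_url)
```

Only the first matching line is changed, so on a pristine file the commented default becomes the active setting and the rest of the file stays untouched.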
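Then copy the war into the webapps directory of the Tomcat installation, start Tomcat, and open the tez-ui URL. Tomcat derives the context path from the war's file name, so name the copy tez-ui.war if you want the /tez-ui/ path used here; port 8280 likewise assumes Tomcat's HTTP connector has been changed from its default of 8080.
The URL used in this article is: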
http://hadoop001:8280/tez-ui/
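If the page loads but shows no DAGs, it helps to query the timeline server directly and see whether Tez is publishing anything. This is a minimal sketch assuming the timeline server's REST root is /ws/v1/timeline on port 8188 (the timelineBaseUrl configured above) and that Tez DAG entities are stored under the TEZ_DAG_ID entity type; the helper names are my own.

```python
import json
import urllib.request


def timeline_url(host, port=8188, entity_type=None):
    """Build the timeline REST URL, optionally for one entity type."""
    base = f"http://{host}:{port}/ws/v1/timeline"
    return f"{base}/{entity_type}" if entity_type else base


def fetch_json(url, timeout=5):
    """GET a URL and decode the JSON body; raises on network or HTTP errors."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_json(timeline_url("hadoop001", entity_type="TEZ_DAG_ID"))` after running a Hive query on Tez should return the published DAG entities; an error or an empty result points at the yarn-site.xml or tez-site.xml history settings rather than at Tomcat.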