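1. Installing and configuring Tez
1.1 Requirements
- CDH5.9.2 (hadoop2.6.0)
- Build tools: gcc, gcc-c++, make, build
- Nodejs and npm (required by tez-ui)
- Git
- Protocol Buffers 2.5.0
- maven3
- Tez 0.8.5
1.2 Preparing the build environment
Install gcc, gcc-c++, make, build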
yum install gcc gcc-c++ libstdc++-devel make build
Install Nodejs and npm
wget http://nodejs.org/dist/v0.8.14/node-v0.8.14.tar.gz
tar -zxvf node-v0.8.14.tar.gz
cd node-v0.8.14
./configure
make && make install
Install Git
https://git-scm.com/download
./configure
make
make install
Install Protocol Buffers 2.5.0
https://github.com/google/protobuf/releases/tag/v2.5.0
./configure
make && make install
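Before moving on to the build, it can help to confirm that every tool from the requirements list is actually on the PATH. The following is a minimal sketch, not part of the original procedure; the executable names `g++`, `protoc`, and `mvn` are assumed to be the commands provided by gcc-c++, Protocol Buffers, and Maven.

```shell
#!/bin/sh
# Report which of the given build tools are missing from PATH.
# Prints one line per tool and returns the number of missing tools.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}

# Tool names follow the requirements list; g++, protoc, and mvn are the
# assumed executable names for gcc-c++, Protocol Buffers, and Maven.
check_tools gcc g++ make node npm git protoc mvn \
  || echo "install the missing tools before building"
```

If `protoc` is found, `protoc --version` should report `libprotoc 2.5.0` for the release used here.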
1.3 Building Tez
1.3.1 Download Tez from the official website
1.3.2 Extract the archive
1.3.3 Modify the source code
/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
diff --git a/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java b/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
index 12491ed..b4ca24c 100644
--- a/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
+++ b/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
@@ -475,5 +475,16 @@ public class JobContextImpl implements JobContext {
   public Progressable getProgressible() {
     return progress;
   }
+
+  /**
+   * Get the boolean value for the property that specifies which classpath
+   * takes precedence when tasks are launched. True - user's classes takes
+   * precedence. False - system's classes takes precedence.
+   * @return true if user's classes should take precedence
+   */
+  @Override
+  public boolean userClassesTakesPrecedence() {
+    return getJobConf().getBoolean(MRJobConfig.MAPREDUCE_JOB_USER_CLASSPATH_FIRST, false);
+  }
 }
/tez-ext-service-tests/src/test/java/org/apache/tez/shufflehandler/ShuffleHandler.java
In this file, use headers().set() and headers().get() instead of setHeader() and getHeader().
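The replacement of setHeader()/getHeader() with headers().set()/headers().get() could be applied mechanically with sed. This is only a sketch: it is a plain text substitution rather than a Java-aware refactoring, the two sample lines are hypothetical and not taken from ShuffleHandler.java, and the result should be reviewed by hand before building.

```shell
#!/bin/sh
# Rewrite setHeader()/getHeader() calls to headers().set()/headers().get().
# $1: path of the Java source file to rewrite in place.
rewrite_header_calls() {
  sed -e 's/\.setHeader(/.headers().set(/g' \
      -e 's/\.getHeader(/.headers().get(/g' \
      "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Demonstration on a throwaway file with two illustrative (hypothetical) lines.
demo=$(mktemp)
cat > "$demo" <<'EOF'
response.setHeader("Content-Type", "text/plain");
String agent = request.getHeader("User-Agent");
EOF
rewrite_header_calls "$demo"
cat "$demo"
```

To apply it to the real file, call `rewrite_header_calls` with the path to ShuffleHandler.java inside the extracted Tez source tree, then inspect the changes with `git diff` or a similar tool.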
1.3.4 Modify the configuration
vi pom.xml
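<profile>
  <id>cdh5.9.2</id>
  <activation><activeByDefault>false</activeByDefault></activation>
  <properties><hadoop.version>2.5.0-cdh5.2.5</hadoop.version></properties>
  <pluginRepositories>
    <pluginRepository><id>cloudera</id><url>https://repository.cloudera.com/artifactory/cloudera-repos/</url></pluginRepository>
  </pluginRepositories>
  <repositories>
    <repository><id>cloudera</id><url>https://repository.cloudera.com/artifactory/cloudera-repos/</url></repository>
  </repositories>
</profile>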
vi tez-ui/pom.xml
<nodeVersion>v0.12.9</nodeVersion>
<npmVersion>2.14.9</npmVersion>
1.3.5 Run the build
mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true -Dfrontend-maven-plugin.version=0.0.23
Note: if downloading the node tar.gz file fails, build manually from the tez-ui directory.
Build tez-ui and tez-ui2 manually, using the Taobao npm registry mirror:
npm --registry=https://registry.npm.taobao.org install --verbose
2. Hive on Tez
- Copy the tez-0.8.5-minimal directory to HDFS
hdfs dfs -put tez-dist/target/tez-0.8.5-minimal /tez-dir/
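- Copy hadoop-mapreduce-client-common-2.6.0-cdh5.9.2.jar to the /tez-dir/tez-0.8.5-minimal directory on HDFS
- Copy the JARs from the tez-0.8.5 directory and from its lib directory into the lib directory of the Hive client installation, and delete hive-exec-1.1.0-cdh5.9.3-core.jar and hive-exec-core.jar from hive/auxlib; otherwise Kryo errors occur.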
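- Create tez-site.xml in /etc/hive/conf/ with the following properties inside its <configuration> element
<property>
  <name>tez.lib.uris</name>
  <value>${fs.defaultFS}/tez-dir/tez-0.8.5-minimal,${fs.defaultFS}/tez-dir/tez-0.8.5-minimal/lib</value>
</property>
<property>
  <name>tez.use.cluster.hadoop-libs</name>
  <value>true</value>
</property>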
- Use Tez in Hive
set hive.execution.engine=tez;
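The copy steps in the list above are described mostly in prose, so the following dry-run sketch prints the corresponding commands for review instead of executing them. The locations in `MR_COMMON_JAR` and `HIVE_LIB_DIR` are hypothetical placeholders, and the assumption that the Tez JARs come from `tez-dist/target/tez-0.8.5` is mine; adjust all of them to the actual cluster layout.

```shell
#!/bin/sh
# Dry-run sketch: prints the deployment commands instead of running them.
# Every path below is an assumption and must be adapted to the cluster.
TEZ_VERSION="0.8.5"
TEZ_BUILD_DIR="tez-dist/target"   # assumed output directory of the Maven build
HDFS_TEZ_DIR="/tez-dir"           # must match the paths used in tez.lib.uris
MR_COMMON_JAR="/path/to/hadoop-mapreduce-client-common-2.6.0-cdh5.9.2.jar"  # hypothetical location
HIVE_LIB_DIR="/path/to/hive/lib"  # hypothetical location

print_deploy_commands() {
  minimal="$TEZ_BUILD_DIR/tez-$TEZ_VERSION-minimal"
  full="$TEZ_BUILD_DIR/tez-$TEZ_VERSION"
  # Upload the minimal Tez build and the extra MapReduce JAR to HDFS.
  echo "hdfs dfs -mkdir -p $HDFS_TEZ_DIR"
  echo "hdfs dfs -put $minimal $HDFS_TEZ_DIR/"
  echo "hdfs dfs -put $MR_COMMON_JAR $HDFS_TEZ_DIR/tez-$TEZ_VERSION-minimal/"
  # Copy the Tez JARs into the Hive client lib directory (globs are printed, not expanded).
  echo "cp $full/*.jar $full/lib/*.jar $HIVE_LIB_DIR/"
}

print_deploy_commands
```

Printing the commands first makes it easy to check each path before anything is written to HDFS or to the Hive installation; once they look right, they can be run one at a time.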
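As of 2017-09-27, this setup has been running in our company's production environment for more than three months. Because of time constraints, tez-ui has not been integrated successfully; I will look into the UI problem when time allows. If anyone has integrated tez-ui successfully, please share how.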