Integrating Tez with CDH 5.9.2

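1. Installing and configuring Tez

1.1 Requirements

  • CDH 5.9.2 (Hadoop 2.6.0)
  • Build toolchain: gcc, gcc-c++, make, build
  • Node.js and npm (required by tez-ui)
  • Git
  • Protocol Buffers 2.5.0
  • Maven 3
  • Tez 0.8.5

1.2 Preparing the build environment

Install gcc, gcc-c++, make, and build: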

yum install gcc gcc-c++ libstdc++-devel make build

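Install Node.js and npm:

wget http://nodejs.org/dist/v0.8.14/node-v0.8.14.tar.gz
tar -xzf node-v0.8.14.tar.gz
cd node-v0.8.14
./configure
make && make install

Install Git:

Download the source from https://git-scm.com/download, unpack it, and build it from the source directory:

./configure
make
make install

Install Protocol Buffers 2.5.0:

Download the source from https://github.com/google/protobuf/releases/tag/v2.5.0, unpack it, and build it from the source directory:

./configure
make && make install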
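
Before building Tez it can help to confirm that the toolchain is actually on the PATH and that protoc reports version 2.5.0. The following is a minimal sketch, not part of the original procedure; the function names missing_tools and is_protoc_2_5_0 are made up for this example.

```shell
#!/usr/bin/env bash
# Pre-flight check for the Tez build (illustrative sketch; helper names are hypothetical).

# Print every argument that is not found on the PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
  return 0
}

# Succeed only if the given `protoc --version` output is exactly "libprotoc 2.5.0".
is_protoc_2_5_0() {
  [ "$1" = "libprotoc 2.5.0" ]
}

# Usage:
#   missing_tools gcc g++ make git node npm mvn protoc
#   is_protoc_2_5_0 "$(protoc --version)" || echo "unexpected protoc version"
```

missing_tools prints nothing when every tool is installed, so an empty result means the environment from section 1.1 is in place.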

1.3 Building Tez

1.3.1 Download Tez from the official site

1.3.2 Extract the archive
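
The download and extraction steps could look like the sketch below. The URL layout on archive.apache.org and the apache-tez-<version>-src file name are assumptions that do not come from the original post, so check them against the Tez download page before relying on them. The helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: fetch and unpack the Tez source release.
# ASSUMPTION: the archive URL layout below is not taken from the original post.

# Build the (assumed) source-tarball URL for a given Tez version.
tez_src_url() {
  local version="$1"
  echo "https://archive.apache.org/dist/tez/${version}/apache-tez-${version}-src.tar.gz"
}

# Download the tarball into the current directory and unpack it.
fetch_tez_src() {
  local version="$1"
  wget "$(tez_src_url "$version")" &&
    tar -xzf "apache-tez-${version}-src.tar.gz"
}

# Usage (needs network access): fetch_tez_src 0.8.5
```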

1.3.3 Patch the source

/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java

diff --git a/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java b/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
index 12491ed..b4ca24c 100644
--- a/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
+++ b/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
@@ -475,5 +475,16 @@ public class JobContextImpl implements JobContext {
   public Progressable getProgressible() {
     return progress;
   }
+
+  /**
+   * Get the boolean value for the property that specifies which classpath
+   * takes precedence when tasks are launched. True - user's classes takes
+   * precedence. False - system's classes takes precedence.
+   * @return true if user's classes should take precedence
+   */
+  @Override
+  public boolean userClassesTakesPrecedence() {
+    return getJobConf().getBoolean(MRJobConfig.MAPREDUCE_JOB_USER_CLASSPATH_FIRST, false);
+  }
 
 }

/tez-ext-service-tests/src/test/java/org/apache/tez/shufflehandler/ShuffleHandler.java

In this file, replace the calls to setHeader() and getHeader() with headers().set() and headers().get(). (A coworker pointed out this fix.)
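
To locate the call sites that need changing, a grep over the file is enough. This is a small sketch; find_old_header_calls is a hypothetical helper name, and the pattern only matches method calls written as .setHeader( or .getHeader(.

```shell
#!/usr/bin/env bash
# Sketch: list the lines of a Java file that still call setHeader()/getHeader().
# The helper name is hypothetical.

# find_old_header_calls <java-file>
# Prints "line-number:line" for each match; returns non-zero if nothing matches.
find_old_header_calls() {
  grep -nE '\.(setHeader|getHeader)\(' "$1"
}

# Usage:
#   find_old_header_calls tez-ext-service-tests/src/test/java/org/apache/tez/shufflehandler/ShuffleHandler.java
```

Once the function prints nothing for ShuffleHandler.java, no calls of the old form remain.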

1.3.4 Edit the build configuration

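vi pom.xml

Add a profile for CDH 5.9.2 that pulls Hadoop from the Cloudera repository. Set hadoop.version to the cluster's Hadoop build, 2.6.0-cdh5.9.2 (the same version as the hadoop-mapreduce-client-common jar used in section 2):

<profile>
    <id>cdh5.9.2</id>
    <activation>
      <activeByDefault>false</activeByDefault>
    </activation>
    <properties>
      <hadoop.version>2.6.0-cdh5.9.2</hadoop.version>
    </properties>
    <pluginRepositories>
      <pluginRepository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
      </pluginRepository>
    </pluginRepositories>
    <repositories>
      <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
      </repository>
    </repositories>
</profile>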

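vi tez-ui/pom.xml

Set the Node.js and npm versions used by the tez-ui build:

<nodeVersion>v0.12.9</nodeVersion>
<npmVersion>2.14.9</npmVersion>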

1.3.5 Run the build

mvn clean package -Pcdh5.9.2 -DskipTests=true -Dmaven.javadoc.skip=true  -Dfrontend-maven-plugin.version=0.0.23

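Note: if the download of the Node.js tarball fails during the build, build the UI modules by hand.

Run the following inside both tez-ui and tez-ui2, using the Taobao npm registry: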

npm --registry=https://registry.npm.taobao.org install --verbose

2. Hive on Tez

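  1. Copy the tez-0.8.5-minimal directory to HDFS:
hdfs dfs -mkdir -p /tez-dir
hdfs dfs -put tez-dist/target/tez-0.8.5-minimal /tez-dir/
  2. Copy hadoop-mapreduce-client-common-2.6.0-cdh5.9.2.jar into the /tez-dir/tez-0.8.5-minimal directory on HDFS.
  3. Copy the jars in tez-0.8.5 and in its lib directory into the lib directory of the Hive client installation. Delete hive-exec-1.1.0-cdh5.9.3-core.jar and hive-exec-core.jar from hive/auxlib; otherwise Kryo errors occur.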
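  4. Create tez-site.xml and save it to /etc/hive/conf/:

<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/tez-dir/tez-0.8.5-minimal,${fs.defaultFS}/tez-dir/tez-0.8.5-minimal/lib</value>
  </property>
  <property>
    <name>tez.use.cluster.hadoop-libs</name>
    <value>true</value>
  </property>
</configuration>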

  5. Use Tez in Hive:
set hive.execution.engine=tez;
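
The tez-site.xml file for /etc/hive/conf/ can also be generated by a script, which keeps the Tez version in one place. This is a sketch only: write_tez_site and its arguments are hypothetical, and the two properties are the ones listed in the configuration step.

```shell
#!/usr/bin/env bash
# Sketch: write a tez-site.xml for a given Tez version (helper name is hypothetical).

# write_tez_site <output-file> <tez-version>
write_tez_site() {
  local out="$1" version="$2"
  local dir="/tez-dir/tez-${version}-minimal"
  # \${fs.defaultFS} is escaped so that Hadoop, not the shell, expands it.
  cat > "$out" <<EOF
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>\${fs.defaultFS}${dir},\${fs.defaultFS}${dir}/lib</value>
  </property>
  <property>
    <name>tez.use.cluster.hadoop-libs</name>
    <value>true</value>
  </property>
</configuration>
EOF
}

# Usage: write_tez_site /etc/hive/conf/tez-site.xml 0.8.5
```

The generated paths assume the minimal Tez distribution was uploaded under /tez-dir on HDFS, matching the tez.lib.uris value.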

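As of 2017-09-27, this setup has been running in our company's production environment for more than three months. For lack of time, tez-ui has not been integrated successfully; I will look into the UI problem when time allows. If anyone has integrated tez-ui successfully, please share how.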
