Tez Installation

Reference

  1. Tez depends on Hadoop and must be compiled against the specific Hadoop version you run: change the hadoop.version property in the root pom to that version. protobuf 2.5 must be installed on the build machine. tez-ui may fail to compile; if so, comment out the tez-ui/tez-ui2 modules in the root pom (see the checklist after the command below).

    mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true
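
    If your build environment is in doubt, here is a hedged pre-build checklist (the grep is just a convenience, not from the original post):

    protoc --version                      # must report libprotoc 2.5.0
    grep -n '<hadoop.version>' pom.xml    # confirm it matches your cluster's Hadoop version
    # if the UI fails to build, comment out the tez-ui / tez-ui2 <module> entries in the root pom.xml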
    
  2. After the build finishes, tez-x.y.z.tar.gz and tez-x.y.z.minimal.tar.gz are generated under tez/dist/target. Copy tez-x.y.z.tar.gz to HDFS and extract the minimal archive locally:

    hadoop fs -copyFromLocal tez-x.y.z.tar.gz /hdfs/tez/
    tar xzvf tez-x.y.z.minimal.tar.gz -C /usr/local/tez
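
    The target HDFS directory has to exist before the upload; if needed, create it and verify the archive landed (hedged sketch, directory taken from the command above):

    hadoop fs -mkdir -p /hdfs/tez
    hadoop fs -ls /hdfs/tez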
    
  3. Place tez-site.xml under $HADOOP_CONF_DIR:

    
    <property>
      <name>tez.lib.uris</name>
      <value>/hdfs/tez/tez.tgz</value>
    </property>
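
    The tez.lib.uris value must point at the exact archive uploaded in step 2 (the post uses /hdfs/tez/tez.tgz, presumably a renamed copy of tez-x.y.z.tar.gz); a quick hedged sanity check:

    hadoop fs -ls /hdfs/tez/tez.tgz    # must exist, otherwise Tez jobs fail at submit time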
    
    
  4. In $HADOOP_CONF_DIR/hadoop-env.sh, add the Tez libraries to the classpath:

    export TEZ_JARS=/usr/local/tez    # directory where the minimal tarball was extracted in step 2
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$TEZ_JARS/*:$TEZ_JARS/lib/*
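
    To confirm the change took effect, inspect the effective classpath on a node that sources the updated hadoop-env.sh (hedged sketch):

    hadoop classpath | tr ':' '\n' | grep -i tez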
    
  5. Test Tez:

    yarn jar tez-examples.jar orderedwordcount <hdfs-input-dir> <hdfs-output-dir>
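
    A more complete, hedged run (all paths and the examples-jar location are assumptions, not from the original post):

    hadoop fs -mkdir -p /tmp/tez-test/in
    hadoop fs -put /etc/hosts /tmp/tez-test/in/
    yarn jar /usr/local/tez/tez-examples-*.jar orderedwordcount /tmp/tez-test/in /tmp/tez-test/out
    hadoop fs -cat /tmp/tez-test/out/part*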
    

    You may hit ClassNotFoundException during testing; I ran into missing LZO-related classes. Simply add the relevant jars into tez-x.y.z.tar.gz before step 2 and upload the archive to HDFS again, as sketched below.
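
    A hedged sketch of repacking the archive (the lib/ layout and the LZO jar path are assumptions):

    mkdir tez-repack && tar xzf tez-x.y.z.tar.gz -C tez-repack
    cp /path/to/hadoop-lzo-*.jar tez-repack/lib/
    tar czf tez-x.y.z.tar.gz -C tez-repack .
    hadoop fs -copyFromLocal -f tez-x.y.z.tar.gz /hdfs/tez/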

  6. Enable Tez in Hive:

     set hive.execution.engine=tez;
    

    Running an HQL query, I hit the following error:

    java.lang.RuntimeException: java.io.IOException: Previous writer likely failed to write hdfs://biphdfs/tmp/hive/pplive/_tez_session_dir/7ecca665-ddbd-4f31-87a7-11d972e096bc/bip.common-0.2.7.518.jar. Failing because I am unlikely to write too.
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:523)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:79)
    Caused by: java.io.IOException: Previous writer likely failed to write hdfs://biphdfs/tmp/hive/pplive/_tez_session_dir/7ecca665-ddbd-4f31-87a7-11d972e096bc/bip.common-0.2.7.518.jar. Failing because I am unlikely to write too.
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeResource(DagUtils.java:978)
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.addTempResources(DagUtils.java:859)
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeTempFilesFromConf(DagUtils.java:802)
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.refreshLocalResourcesFromConf(TezSessionState.java:228)
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:154)
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:123)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:520)

    It turned out that the jars under $HIVE_HOME/auxlib duplicated the ones added via .hiverc; removing them from auxlib fixed it (a quick way to spot the duplicates is sketched below).
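
    A hedged way to spot the duplicates (the .hiverc location is an assumption):

    ls $HIVE_HOME/auxlib/*.jar
    grep -i 'add jar' ~/.hiverc    # compare the two lists and drop the overlap from auxlib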

  7. Install tez-ui; verification at the end fails with the following error:

    [FATAL] [HistoryEventHandlingThread] |yarn.YarnUncaughtExceptionHandler|: Thread Thread[HistoryEventHandlingThread,5,main] threw an Error.  Shutting down now...
    java.lang.AbstractMethodError: org.codehaus.jackson.map.AnnotationIntrospector.findSerializer(Lorg/codehaus/jackson/map/introspect/Annotated;)Ljava/lang/Object;
    at org.codehaus.jackson.map.ser.BasicSerializerFactory.findSerializerFromAnnotation(BasicSerializerFactory.java:362)
    at org.codehaus.jackson.map.ser.BeanSerializerFactory.createSerializer(BeanSerializerFactory.java:252)
    at org.codehaus.jackson.map.ser.StdSerializerProvider._createUntypedSerializer(StdSerializerProvider.java:782)
    at org.codehaus.jackson.map.ser.StdSerializerProvider._createAndCacheUntypedSerializer(StdSerializerProvider.java:735)
    at org.codehaus.jackson.map.ser.StdSerializerProvider.findValueSerializer(StdSerializerProvider.java:344)
    at org.codehaus.jackson.map.ser.StdSerializerProvider.findTypedValueSerializer(StdSerializerProvider.java:420)
    at org.codehaus.jackson.map.ser.StdSerializerProvider._serializeValue(StdSerializerProvider.java:601)
    at org.codehaus.jackson.map.ser.StdSerializerProvider.serializeValue(StdSerializerProvider.java:256)
    at org.codehaus.jackson.map.ObjectMapper.writeValue(ObjectMapper.java:1604)
    at org.codehaus.jackson.jaxrs.JacksonJsonProvider.writeTo(JacksonJsonProvider.java:527)
    at com.sun.jersey.api.client.RequestWriter.writeRequestEntity(RequestWriter.java:300)
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:204)
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:147)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineJerseyRetryFilter$1.run(TimelineClientImpl.java:226)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:162)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineJerseyRetryFilter.handle(TimelineClientImpl.java:237)
    at com.sun.jersey.api.client.Client.handle(Client.java:648)
    at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
    at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
    at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingObject(TimelineClientImpl.java:472)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPosting(TimelineClientImpl.java:321)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:301)
    at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.handleEvents(ATSHistoryLoggingService.java:357)
    at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.access$700(ATSHistoryLoggingService.java:53)
    at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService$1.run(ATSHistoryLoggingService.java:190)
    at java.lang.Thread.run(Thread.java:745)

    This is most likely a version conflict between the jackson-mapper bundled when Tez is compiled against CDH 5.5 and jersey-json: the CDH-based build ships jersey 1.9 with jackson 1.8, whereas a build against Hadoop 2.6 ships 1.9 for both. CDH apparently is not going to support Tez, so for now I simply do without the web UI. The bundled versions can be checked as sketched below.
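
    A hedged check of which jackson / jersey versions actually ended up in the Tez lib directory (path from step 2):

    ls /usr/local/tez/lib | grep -Ei 'jackson|jersey'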
