Tez Installation

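#Introduction
Tez lets projects such as Apache Hive and Apache Pig run a complex DAG of tasks. Data processing that previously took multiple MapReduce (MR) jobs can now run as a single Tez job.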

#Official site
http://tez.apache.org/

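#Features
1. Expressive dataflow definition APIs
2. Flexible Input-Processor-Output runtime model
3. Data type agnostic
4. Simplified deployment
5. Performance gains over MapReduce
6. Optimal resource management
7. Plan reconfiguration at runtime
8. Dynamic physical data flow decisions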

#Installation and deployment
1. Requires Hadoop 2.7 or later
2. Upload the Tez package to HDFS
hadoop fs -mkdir -p /apps/apache-tez-0.9.0-bin
hadoop fs -copyFromLocal /opt/soft/apache-tez-0.9.0-bin.tar.gz /apps/apache-tez-0.9.0-bin/
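3. Configure tez-site.xml (tez.lib.uris must point at the Tez archive uploaded to HDFS; replace x.y.z and the path with your actual values)
tez.lib.uris=${fs.defaultFS}/apps/tez-x.y.z-SNAPSHOT/tez-x.y.z-SNAPSHOT.tar.gz
tez.use.cluster.hadoop-libs=false
4. Edit mapred-site.xml
mapreduce.framework.name=yarn-tez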
5. Extract the client files
tar -xvzf tez-dist/target/tez-x.y.z-minimal.tar.gz -C "$TEZ_JARS"
6. Point TEZ_CONF_DIR at tez-site.xml
Set TEZ_CONF_DIR to the directory that contains tez-site.xml
7. Set the environment variable
export HADOOP_CLASSPATH=${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*
8. Run the example (orderedwordcount takes an input path and an output path)
$HADOOP_PREFIX/bin/hadoop jar tez-examples.jar orderedwordcount <input> <output>
9. To use Tez sessions, set -DUSE_TEZ_SESSION=true
$HADOOP_PREFIX/bin/hadoop jar tez-tests.jar testorderedwordcount -DUSE_TEZ_SESSION=true
10. Submit an MR job
$HADOOP_PREFIX/bin/hadoop jar hadoop-mapreduce-client-jobclient-3.0.0-SNAPSHOT-tests.jar sleep -mt 1 -rt 1 -m 1 -r 1
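
Steps 3, 6 and 7 above can be scripted. The following is a minimal sketch, not a definitive setup: it writes a tez-site.xml into TEZ_CONF_DIR and builds HADOOP_CLASSPATH. The temporary directories and the HDFS path in TEZ_LIB_URIS are assumptions for illustration (the path mirrors the upload in step 2); replace them with your own locations.

```shell
# Hypothetical locations; temp directories are used so the sketch runs anywhere.
TEZ_JARS="${TEZ_JARS:-$(mktemp -d)}"          # where the minimal client tarball is unpacked
TEZ_CONF_DIR="${TEZ_CONF_DIR:-$(mktemp -d)}"  # directory that holds tez-site.xml

# Assumed archive location. Single quotes keep ${fs.defaultFS} literal so that
# Hadoop, not the shell, resolves it.
TEZ_LIB_URIS='${fs.defaultFS}/apps/apache-tez-0.9.0-bin/apache-tez-0.9.0-bin.tar.gz'

# Step 3: write the two properties into tez-site.xml.
cat > "${TEZ_CONF_DIR}/tez-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${TEZ_LIB_URIS}</value>
  </property>
  <property>
    <name>tez.use.cluster.hadoop-libs</name>
    <value>false</value>
  </property>
</configuration>
EOF

# Steps 6 and 7: expose the config directory and the client jars to Hadoop.
export TEZ_CONF_DIR TEZ_JARS
export HADOOP_CLASSPATH="${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*"

echo "tez-site.xml written to ${TEZ_CONF_DIR}"
echo "HADOOP_CLASSPATH=${HADOOP_CLASSPATH}"
```

The classpath entries end in a literal `/*`, which Hadoop's JVM expands to all jars in that directory, so the value is quoted to keep the shell from globbing it.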

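#Tez UI
1. Configuration
tez-site.xml
-------------
...
<property>
  <description>Enable Tez to use the Timeline Server for History Logging</description>
  <name>tez.history.logging.service.class</name>
  <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>

<property>
  <description>URL for where the Tez UI is hosted</description>
  <name>tez.tez-ui.history-url.base</name>
  <value>http://</value>
</property>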

yarn-site.xml
-------------
...
<property>
  <description>Indicate to clients whether Timeline service is enabled or not.
  If enabled, the TimelineClient library used by end-users will post entities
  and events to the Timeline server.</description>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>

<property>
  <description>The hostname of the Timeline service web application.</description>
  <name>yarn.timeline-service.hostname</name>
  <value>localhost</value>
</property>

<property>
  <description>Enables cross-origin support (CORS) for web services where
  cross-origin web response headers are needed. For example, javascript making
  a web services request to the timeline server.</description>
  <name>yarn.timeline-service.http-cross-origin.enabled</name>
  <value>true</value>
</property>

<property>
  <description>Publish YARN information to Timeline Server</description>
  <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
  <value>true</value>
</property>

2. Download tez-ui.war
http://tez.apache.org/releases/apache-tez-0-9-0.html

3. In scripts/configs.js inside tez-ui.war, uncomment the following lines
 // timelineBaseUrl: 'http://localhost:8188',
 // RMWebUrl: 'http://localhost:8088',

4. Deploy and run the WAR in Tomcat
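
Step 3 above can be done with a small script instead of by hand. The sketch below is one possible approach: a helper function (the name `uncomment_tez_ui_urls` is hypothetical) removes the leading `//` from the `timelineBaseUrl` and `RMWebUrl` lines only. It is demonstrated on a throwaway sample file rather than a real WAR; the third sample line is invented to show that other commented settings are left alone.

```shell
# Uncomment the timelineBaseUrl and RMWebUrl entries in a configs.js file.
# $1: path to scripts/configs.js extracted from tez-ui.war
uncomment_tez_ui_urls() {
  js_file="$1"
  tmp_file="$(mktemp)"
  # Drop the leading "//" only on the two targeted keys, keeping indentation.
  sed -E 's#^([[:space:]]*)//[[:space:]]*((timelineBaseUrl|RMWebUrl):)#\1\2#' \
    "$js_file" > "$tmp_file" && cat "$tmp_file" > "$js_file"
  rm -f "$tmp_file"
}

# Demo on a sample file that mirrors the two lines shown in step 3.
WORK_DIR="$(mktemp -d)"
CONFIGS_JS="${WORK_DIR}/configs.js"
printf '%s\n' \
  " // timelineBaseUrl: 'http://localhost:8188'," \
  " // RMWebUrl: 'http://localhost:8088'," \
  " // otherSetting: 'left commented'," > "$CONFIGS_JS"

uncomment_tez_ui_urls "$CONFIGS_JS"
cat "$CONFIGS_JS"
```

Because a WAR is a zip archive, the real file can be pulled out with `unzip tez-ui.war scripts/configs.js`, edited, and written back with `zip tez-ui.war scripts/configs.js`. If the Timeline Server or ResourceManager is not on localhost, edit the URLs after uncommenting.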


#Tez Shuffle Handler
1. Edit the configuration
tez-site.xml
-------------
...
<property>
  <name>tez.am.shuffle.auxiliary-service.id</name>
  <value>tez_shuffle</value>
</property>

2. Deploy the Tez Shuffle Handler
The Tez Shuffle Handler jar artifact org.apache.tez:tez-aux-services must be placed on the NodeManager classpath, and the NodeManager must then be restarted

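3. Set up the NodeManager
yarn-site.xml
-------------
...
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>tez_shuffle</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services.tez_shuffle.class</name>
  <value>org.apache.tez.auxservices.ShuffleHandler</value>
</property>
...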
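
yarn.nodemanager.aux-services is a comma-separated list, and a cluster that already runs MapReduce usually has mapreduce_shuffle in it. In that case tez_shuffle is normally appended rather than replacing the existing value. The helper below is a small sketch of that merge; the function name `add_aux_service` is hypothetical, and it only computes the new value without touching yarn-site.xml.

```shell
# Print the aux-services list with the given service appended if it is missing.
# $1: current value of yarn.nodemanager.aux-services (may be empty)
# $2: service id to add, e.g. tez_shuffle
add_aux_service() {
  current="$1"
  service="$2"
  case ",${current}," in
    *",${service},"*) printf '%s\n' "$current" ;;            # already listed
    ",,")             printf '%s\n' "$service" ;;            # list was empty
    *)                printf '%s\n' "${current},${service}" ;;
  esac
}

add_aux_service "mapreduce_shuffle" "tez_shuffle"
add_aux_service "" "tez_shuffle"
add_aux_service "mapreduce_shuffle,tez_shuffle" "tez_shuffle"
```

Wrapping the list in commas before matching avoids false hits on ids that merely contain the service name as a substring.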

#Miscellaneous
tez.lib.uris accepts a comma-separated list of plain files, directories, and archives ('tgz', 'tar.gz', 'zip', etc.)
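
As an illustration of such a list, the sketch below joins several entries into one tez.lib.uris value and then labels each entry by its suffix. All three HDFS paths are made-up examples, both function names are hypothetical, and treating a trailing `/` as "directory" is only a convention of this sketch, not something Tez requires.

```shell
# Join all arguments with commas to build a tez.lib.uris value.
join_uris() {
  ( IFS=,; printf '%s\n' "$*" )
}

# Label one entry by its suffix (heuristic used only in this sketch).
classify_uri() {
  case "$1" in
    *.tgz|*.tar.gz|*.zip) echo "archive" ;;
    */)                   echo "directory" ;;
    *)                    echo "file" ;;
  esac
}

# Single quotes keep ${fs.defaultFS} literal for Hadoop to resolve.
TEZ_LIB_URIS="$(join_uris \
  '${fs.defaultFS}/apps/tez/tez.tar.gz' \
  '${fs.defaultFS}/apps/tez/extra-libs/' \
  '${fs.defaultFS}/apps/tez/custom-udf.jar')"
echo "tez.lib.uris=${TEZ_LIB_URIS}"

# Split the value on commas again and show how each entry is classified.
old_ifs="$IFS"
IFS=,
for uri in $TEZ_LIB_URIS; do
  echo "$(classify_uri "$uri"): $uri"
done
IFS="$old_ifs"
```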
