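Tez is an open-source Apache computing framework that supports DAG jobs. It can merge multiple dependent jobs into a single job, which greatly improves DAG job performance. Tez is not aimed directly at end users; instead, it lets developers build faster, more scalable applications for them. Hadoop has traditionally been a batch-processing platform for large volumes of data. Many use cases, however, need near-real-time query performance, and some workloads, such as machine learning, are a poor fit for MapReduce. Tez is meant to help Hadoop handle these cases.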
1. Unpack Tez
sudo mkdir /usr/local/src/tez
sudo chown -R ucmed:ucmed /usr/local/src/tez
cp /tmp/apache-tez-0.9.2-bin.tar.gz /usr/local/src/tez/
cd /usr/local/src/tez/ && tar -xzvf ./apache-tez-0.9.2-bin.tar.gz
2. Upload the Tez dependencies to HDFS
hadoop fs -mkdir /tez
hadoop fs -put /usr/local/src/tez/apache-tez-0.9.2-bin/share/tez.tar.gz /tez
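Before moving on, it can help to confirm that the archive actually landed in HDFS. This is a minimal sketch, assuming the `hadoop` CLI is on the PATH; the function name `tez_archive_uploaded` is hypothetical. `hadoop fs -test -e` exits with status 0 when the path exists.

```shell
# Hypothetical helper: succeeds only if the Tez archive exists in HDFS.
# The default path matches the upload command above; pass another path to override it.
tez_archive_uploaded() {
    hadoop fs -test -e "${1:-/tez/tez.tar.gz}"
}
```

For example, `tez_archive_uploaded && echo "tez.tar.gz is in HDFS"`.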
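3. Configure tez-site.xml
Create a tez-site.xml file under hadoop/etc/hadoop with the following configuration:
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/tez/tez.tar.gz</value>
  </property>
</configuration>
After saving, copy the file to the other nodes.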
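Copying the file to the other nodes can be scripted. The sketch below is one possible way to do it, not the author's procedure: `copy_tez_site` is a hypothetical helper, and it assumes passwordless `scp` between nodes and the same Hadoop install path on every machine.

```shell
# Hypothetical helper: copy tez-site.xml to the same directory on each listed node.
# Usage: copy_tez_site <conf_dir> <node>...
copy_tez_site() {
    conf_dir="$1"
    shift
    for node in "$@"; do
        # Stop at the first failed copy so an unreachable node is noticed.
        scp "$conf_dir/tez-site.xml" "$node:$conf_dir/" || return 1
    done
}
```

A call might look like `copy_tez_site /usr/local/src/hadoop/hadoop-3.3.1/etc/hadoop node2 node3`, where `node2` and `node3` are placeholder host names.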
4. Configure hadoop-env.sh
Append the following lines:
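# TEZ_CONF_DIR is the directory that contains tez-site.xml, not the file itself,
# because a classpath entry must be a directory or a jar.
TEZ_CONF_DIR=/usr/local/src/hadoop/hadoop-3.3.1/etc/hadoop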
TEZ_JARS=/usr/local/src/tez/apache-tez-0.9.2-bin
export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*
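To check that the Tez jars were picked up, the output of `hadoop classpath` can be inspected. This is a minimal sketch; `classpath_has_tez` is a hypothetical helper, and the default directory is the install path used in this article.

```shell
# Hypothetical helper: reads a colon-separated classpath on stdin and
# succeeds if at least one entry mentions the Tez install directory.
classpath_has_tez() {
    tez_dir="${1:-/usr/local/src/tez/apache-tez-0.9.2-bin}"
    # Split the classpath into one entry per line, then search for a fixed string.
    tr ':' '\n' | grep -qF "$tez_dir"
}
```

For example, `hadoop classpath | classpath_has_tez && echo "Tez is on the classpath"`.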
5. Configure hive-site.xml
<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>
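6. Resolve a compatibility issue: configure hdfs-site.xml
<property>
  <name>dfs.client.datanode-restart.timeout</name>
  <value>30</value>
</property>
(This step is easy to get wrong: the Hadoop 2 and Hadoop 3 configuration files are not fully compatible. Hadoop 3 writes this default as 30s, while Hadoop 2 era client code expects a plain number of seconds.)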
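One way to confirm the override took effect is to read the effective value and check that it carries no unit suffix. This is a sketch under the assumption that the failure comes from a suffixed value such as `30s` being parsed as a number; `is_plain_number` is a hypothetical helper.

```shell
# Hypothetical helper: succeeds if the argument is a non-empty string of digits only.
is_plain_number() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit such as "s"
        *) return 0 ;;
    esac
}
```

For example, `is_plain_number "$(hdfs getconf -confKey dfs.client.datanode-restart.timeout)" && echo "value is numeric"`.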
7. Restart Hadoop
Submit the job again; it now runs on Tez.
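Performance comparison
Three machines, each with 32 GB of RAM and 8 cores
YARN on each machine is allocated 16 GB of memory and 8 cores
MapReduce performance
Performance after switching to Tez
Roughly ten times faster, and the entire gain comes from switching the execution engine alone.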
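A comparison like this can be repeated by running the same query under each engine and timing it. The sketch below is an assumption about how to do that, not the method used for the numbers above: `time_query` is a hypothetical helper, and it relies on the Hive CLI's `--hiveconf` and `-e` options.

```shell
# Hypothetical helper: run one query under the given engine ("mr" or "tez")
# and print the elapsed wall-clock time in whole seconds. Query output is discarded.
time_query() {
    engine="$1"
    sql="$2"
    start=$(date +%s)
    hive --hiveconf hive.execution.engine="$engine" -e "$sql" > /dev/null 2>&1 || return 1
    end=$(date +%s)
    echo $((end - start))
}
```

For example, compare `time_query mr "SELECT COUNT(*) FROM some_table"` with `time_query tez "SELECT COUNT(*) FROM some_table"`, where `some_table` is a placeholder table name.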
Tomcat deployment
sudo mkdir /usr/local/src/tomcat
sudo chown -R ucmed:ucmed /usr/local/src/tomcat
cp /tmp/apache-tomcat-8.5.60.zip /usr/local/src/tomcat/
cd /usr/local/src/tomcat/ && unzip ./apache-tomcat-8.5.60.zip
Deploy the WAR package
Create a tez-ui directory under webapps:
mkdir /usr/local/src/tomcat/apache-tomcat-8.5.60/webapps/tez-ui
cd /usr/local/src/tomcat/apache-tomcat-8.5.60/webapps/tez-ui
cp /usr/local/src/tez/apache-tez-0.9.2-bin/tez-ui-0.9.2.war ./
unzip ./tez-ui-0.9.2.war
vim config/configs.env
timeline: "http://master:8188"
rm: "http://master:8088"
Configure the Timeline Server
vim yarn-site.xml
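<property>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.timeline-service.hostname</name>
  <value>master</value>
</property>
<property>
  <name>yarn.timeline-service.http-cross-origin.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.timeline-service.generic-application-history.enabled</name>
  <value>true</value>
</property>
<property>
  <description>Address for the Timeline server to start the RPC server.</description>
  <name>yarn.timeline-service.address</name>
  <value>master:10201</value>
</property>
<property>
  <description>The http address of the Timeline service web application.</description>
  <name>yarn.timeline-service.webapp.address</name>
  <value>master:8188</value>
</property>
<property>
  <description>The https address of the Timeline service web application.</description>
  <name>yarn.timeline-service.webapp.https.address</name>
  <value>master:2191</value>
</property>
<property>
  <name>yarn.timeline-service.handler-thread-count</name>
  <value>24</value>
</property>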
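The Timeline Server has to be running before the Tez UI can query it. The text does not show how it is started; the sketch below assumes the Hadoop 3 daemon syntax (`yarn --daemon start timelineserver`) and then probes the Timeline REST root at the host and port configured above. The function names and the `TIMELINE_WAIT_SECONDS` variable are hypothetical.

```shell
# Hypothetical helper: succeeds if the URL answers with a successful HTTP status.
url_ok() {
    # -f: fail on HTTP errors, -sS: quiet but still report errors, -o: discard the body
    curl -fsS -o /dev/null --max-time 5 "$1"
}

# Hypothetical helper: start the Timeline Server and check its REST root.
start_and_check_timeline() {
    yarn --daemon start timelineserver || return 1
    # Assumed grace period for startup; adjust for the cluster at hand.
    sleep "${TIMELINE_WAIT_SECONDS:-10}"
    url_ok "http://master:8188/ws/v1/timeline/"
}
```

Running `start_and_check_timeline` on the master node returns a non-zero status if the daemon fails to start or the endpoint does not respond.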
vim tez-site.xml
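<property>
  <name>tez.history.logging.service.class</name>
  <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>
<property>
  <name>tez.tez-ui.history-url.base</name>
  <value>http://master:8880/tez-ui/</value>
</property>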
/usr/local/src/tomcat/apache-tomcat-8.5.60/bin/startup.sh
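Tomcat takes a moment to deploy the web application after startup.sh returns, so a short polling loop is a convenient way to confirm the Tez UI is reachable. This is a minimal sketch; `wait_for_url` is a hypothetical helper, and the URL in the usage note is the `tez.tez-ui.history-url.base` value from tez-site.xml.

```shell
# Hypothetical helper: poll a URL until it responds or the attempts run out.
# Usage: wait_for_url <url> [attempts] [delay_seconds]
wait_for_url() {
    url="$1"
    attempts="${2:-10}"
    delay="${3:-3}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f makes curl return non-zero on HTTP errors such as 404.
        if curl -fsS -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `wait_for_url http://master:8880/tez-ui/ && echo "Tez UI is up"`.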