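Oozie is a workflow scheduling framework for Hadoop. It is an Apache open-source project, used here in the version shipped with Cloudera's CDH. Each scheduled action is started by a launcher MapReduce job, and the tasks are organized and executed as a directed acyclic graph (DAG) defined in XML. If you only need a standalone scheduler, Azkaban can be used instead; here Oozie is used together with Hue.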
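Client: submits jobs.
Server: receives jobs and prepares them for execution; runs inside Tomcat.
SQL database: stores the submitted jobs.
workflow: defines a series of actions; each action is one task.
coordinator: Oozie's scheduling module, which runs workflows on a timed schedule.
bundle: a group of several coordinators.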
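To make the workflow/action relationship concrete, here is a minimal sketch that writes a one-action workflow definition to a local directory. The workflow name `shell-demo-wf`, the node names, and the `WF_DIR` location are hypothetical and chosen only for illustration; `${jobTracker}` and `${nameNode}` are placeholders that would be supplied when the job is submitted, typically from a job.properties file.

```shell
# Sketch only: generate a minimal Oozie workflow with a single shell action.
# WF_DIR is a hypothetical local staging directory, not a path from this guide.
WF_DIR=${WF_DIR:-${TMPDIR:-/tmp}/oozie_works/shell-demo}
mkdir -p "$WF_DIR"

# The quoted 'EOF' keeps ${...} literal so Oozie, not the shell, resolves it.
cat > "$WF_DIR/workflow.xml" <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-demo-wf">
    <start to="shell-node"/>
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>echo</exec>
            <argument>hello</argument>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF

echo "wrote $WF_DIR/workflow.xml"
```

The graph here has a single action between `start` and `end`; a coordinator would then reference the HDFS path of this workflow to run it on a schedule.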
Step 1: Edit core-site.xml
cd /export/servers/hadoop-2.6.0-cdh5.14.0/etc/hadoop
vim core-site.xml
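<property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
</property>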
After making this change, restart the Hadoop HDFS cluster, the YARN cluster, and the JobHistory server.
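The restart can be sketched as follows, assuming the stock sbin scripts from the Hadoop tarball are used and are run on the nodes that normally start each service. The `run` helper and the `DRY_RUN` switch are hypothetical additions for illustration: with `DRY_RUN=1` (the default here) the commands are only printed, not executed.

```shell
# Sketch only: restart HDFS, YARN and the JobHistory server after editing core-site.xml.
HADOOP_HOME=${HADOOP_HOME:-/export/servers/hadoop-2.6.0-cdh5.14.0}
DRY_RUN=${DRY_RUN:-1}   # hypothetical switch: 1 = print commands, anything else = execute

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

restart_hadoop() {
    # Stop everything first, then bring the services back up.
    run "$HADOOP_HOME/sbin/stop-dfs.sh"
    run "$HADOOP_HOME/sbin/stop-yarn.sh"
    run "$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" stop historyserver
    run "$HADOOP_HOME/sbin/start-dfs.sh"
    run "$HADOOP_HOME/sbin/start-yarn.sh"
    run "$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" start historyserver
}

restart_hadoop
```

Running it with `DRY_RUN=0` would execute the scripts for real.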
Step 2: Upload the installation package and extract it
cd /export/softwares/
tar -zxvf oozie-4.1.0-cdh5.14.0.tar.gz -C ../servers/
Step 3: Extract hadooplibs at the same level as the oozie directory
cd /export/servers/oozie-4.1.0-cdh5.14.0
tar -zxvf oozie-hadooplibs-4.1.0-cdh5.14.0.tar.gz -C ../
Step 4: Create the libext directory
cd /export/servers/oozie-4.1.0-cdh5.14.0
mkdir -p libext
Step 5: Copy the dependency jars into libext
cd /export/servers/oozie-4.1.0-cdh5.14.0
cp -ra hadooplibs/hadooplib-2.6.0-cdh5.14.0.oozie-4.1.0-cdh5.14.0/* libext/
Copy the MySQL driver jar
cp /export/softwares/mysql-connector-java-5.1.45/mysql-connector-java-5.1.45-bin.jar libext/
Step 6: Add the ext-2.2.zip archive
Copy ext-2.2.zip (the ExtJS library that the Oozie web console needs) into the libext directory.
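Before moving on, it can help to confirm that libext really contains the files added in steps 5 and 6. The helper below is a minimal sketch; the function name `check_libext` is hypothetical, and it only checks for the MySQL connector jar and ext-2.2.zip, not for the Hadoop jars.

```shell
# Sketch only: report which expected files are missing from a libext directory.
check_libext() {
    dir=$1
    missing=0
    for pattern in 'mysql-connector-java-*.jar' 'ext-2.2.zip'; do
        # $pattern is left unquoted on purpose so the shell expands the glob;
        # ls fails when nothing matches.
        if ! ls "$dir"/$pattern >/dev/null 2>&1; then
            echo "missing: $pattern"
            missing=1
        fi
    done
    return $missing
}

LIBEXT=${OOZIE_HOME:-/export/servers/oozie-4.1.0-cdh5.14.0}/libext
if [ -d "$LIBEXT" ]; then
    check_libext "$LIBEXT" || echo "libext is incomplete"
fi
```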
Step 7: Edit oozie-site.xml
cd /export/servers/oozie-4.1.0-cdh5.14.0/conf
vim oozie-site.xml
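If any of these properties are missing, simply add them. Oozie uses the UTC time zone by default, so remember to set the time zone to GMT+0800 in oozie-site.xml.
<property>
    <name>oozie.service.JPAService.jdbc.driver</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.url</name>
    <value>jdbc:mysql://node03.hadoop.com:3306/oozie</value>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.username</name>
    <value>root</value>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.password</name>
    <value>123456</value>
</property>
<property>
    <name>oozie.processing.timezone</name>
    <value>GMT+0800</value>
</property>
<property>
    <name>oozie.service.ProxyUserService.proxyuser.hue.hosts</name>
    <value>*</value>
</property>
<property>
    <name>oozie.service.ProxyUserService.proxyuser.hue.groups</name>
    <value>*</value>
</property>
<property>
    <name>oozie.service.coord.check.maximum.frequency</name>
    <value>false</value>
</property>
<property>
    <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
    <value>*=/export/servers/hadoop-2.6.0-cdh5.14.0/etc/hadoop</value>
</property>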
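Since the step says to add any property that is not already present, a quick way to see which ones are missing is sketched below. The function name `missing_props` is hypothetical, and the check assumes each entry in oozie-site.xml is written as `<name>property</name>` with no extra whitespace inside the tag.

```shell
# Sketch only: print every property name that does not appear in the given XML file.
missing_props() {
    file=$1
    shift
    for name in "$@"; do
        # -F matches a fixed string, so the dots in property names are not regex wildcards.
        if ! grep -qF "<name>$name</name>" "$file"; then
            echo "$name"
        fi
    done
}

OOZIE_SITE=${OOZIE_SITE:-/export/servers/oozie-4.1.0-cdh5.14.0/conf/oozie-site.xml}
if [ -f "$OOZIE_SITE" ]; then
    missing_props "$OOZIE_SITE" \
        oozie.service.JPAService.jdbc.driver \
        oozie.service.JPAService.jdbc.url \
        oozie.service.JPAService.jdbc.username \
        oozie.service.JPAService.jdbc.password \
        oozie.processing.timezone \
        oozie.service.ProxyUserService.proxyuser.hue.hosts \
        oozie.service.ProxyUserService.proxyuser.hue.groups \
        oozie.service.coord.check.maximum.frequency \
        oozie.service.HadoopAccessorService.hadoop.configurations
fi
```

Any name it prints still needs to be added to the file; no output means all nine are present.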
Step 8: Create the MySQL database
mysql -uroot -p
create database oozie;
Step 9: Upload the jars Oozie depends on to HDFS
Upload the yarn.tar.gz sharelib archive from the extracted Oozie directory to HDFS
bin/oozie-setup.sh sharelib create -fs hdfs://node01:8020 -locallib oozie-sharelib-4.1.0-cdh5.14.0-yarn.tar.gz
Step 10: Create the Oozie database tables
bin/oozie-setup.sh db create -run -sqlfile oozie.sql
Step 11: Package the project and build the WAR file
cd /export/servers/oozie-4.1.0-cdh5.14.0
bin/oozie-setup.sh prepare-war
Step 12: Configure the Oozie environment variables
vim /etc/profile
export OOZIE_HOME=/export/servers/oozie-4.1.0-cdh5.14.0
export OOZIE_URL=http://node03.hadoop.com:11000/oozie
export PATH=$OOZIE_HOME/bin:$PATH
source /etc/profile
Step 13: Start and stop the Oozie service
cd /export/servers/oozie-4.1.0-cdh5.14.0
bin/oozied.sh start
bin/oozied.sh stop
Step 14: Access Oozie from a browser
http://node03:11000/oozie/
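Besides the browser, the server can also be checked from the command line with the Oozie client's `admin -status` sub-command, which reports the system mode of a running server. The wrapper below is a minimal sketch: the function name `check_oozie` is hypothetical, and the default URL is the `OOZIE_URL` value configured in step 12.

```shell
# Sketch only: ask the Oozie server for its status via the CLI.
OOZIE_URL=${OOZIE_URL:-http://node03.hadoop.com:11000/oozie}

check_oozie() {
    # Fail early with a readable message if the client is not on PATH.
    if ! command -v oozie >/dev/null 2>&1; then
        echo "oozie CLI not found on PATH"
        return 1
    fi
    oozie admin -oozie "$OOZIE_URL" -status
}

check_oozie || echo "Oozie is not reachable yet"
```

If the client is missing or the server has not finished starting, the function returns a non-zero status, so it can also be used in a retry loop.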
First stop the Oozie and Hue processes.
Edit Hue's configuration file hue.ini
cd /export/servers/hue-3.9.0-cdh5.14.0/desktop/conf
vim hue.ini
[liboozie]
# The URL where the Oozie service runs on. This is required in order for
# users to submit jobs. Empty value disables the config check.
oozie_url=http://node03.hadoop.com:11000/oozie
# Requires FQDN in oozie_url if enabled
## security_enabled=false
# Location on HDFS where the workflows/coordinator are deployed when submitted.
remote_deployement_dir=/user/root/oozie_works
# Around line 1151 of hue.ini
[oozie]
# Location on local FS where the examples are stored.
# local_data_dir=/export/servers/oozie-4.1.0-cdh5.14.0/examples/apps
# Location on local FS where the data for the examples is stored.
# sample_data_dir=/export/servers/oozie-4.1.0-cdh5.14.0/examples/input-data
# Location on HDFS where the oozie examples and workflows are stored.
# Parameters are $TIME and $USER, e.g. /user/$USER/hue/workspaces/workflow-$TIME
# remote_data_dir=/user/root/oozie_works/examples/apps
# Maximum number of Oozie workflows or coordinators to retrieve in one API call.
oozie_jobs_count=100
# Use Cron format for defining the frequency of a Coordinator instead of the old frequency number/unit.
enable_cron_scheduling=true
# Flag to enable the saved Editor queries to be dragged and dropped into a workflow.
enable_document_action=true
# Flag to enable Oozie backend filtering instead of doing it at the page level in Javascript. Requires Oozie 4.3+.
enable_oozie_backend_filtering=true
# Flag to enable the Impala action.
enable_impala_action=true
[filebrowser]
# Location on local filesystem where the uploaded archives are temporarily stored.
archive_upload_tempdir=/tmp
# Show Download Button for HDFS file browser.
show_download_button=true
# Show Upload Button for HDFS file browser.
show_upload_button=true
# Flag to enable the extraction of a uploaded archive in HDFS.
enable_extract_uploaded_archive=true
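After editing, it is easy to confirm that the `[liboozie]` section really carries the intended `oozie_url`. The helper below is a minimal sketch; the function name `hue_oozie_url` is hypothetical, and it assumes the value itself contains no `=` character and that `[liboozie]` starts at the beginning of a line.

```shell
# Sketch only: print the active (uncommented) oozie_url from the [liboozie] section.
hue_oozie_url() {
    awk -F= '
        /^\[/ { section = $0 }                      # remember the current top-level section
        section == "[liboozie]" && $1 ~ /^[ \t]*oozie_url[ \t]*$/ {
            print $2                                # value after the first "="
            exit
        }
    ' "$1"
}

HUE_INI=${HUE_INI:-/export/servers/hue-3.9.0-cdh5.14.0/desktop/conf/hue.ini}
if [ -f "$HUE_INI" ]; then
    hue_oozie_url "$HUE_INI"
fi
```

Commented-out lines such as `## oozie_url=...` are skipped, so an empty result means the setting is still disabled.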
Start the Hue and Oozie processes
cd /export/servers/hue-3.9.0-cdh5.14.0
build/env/bin/supervisor
cd /export/servers/oozie-4.1.0-cdh5.14.0
bin/oozied.sh start
Open Hue in the browser
http://node03.hadoop.com:8888/