vim simple.job
type=command
command=echo 'simple job started'
zip simpleJob.zip simple.job
Download the file to the local machine
sz simpleJob.zip
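The interactive vim step above can also be scripted; a minimal sketch that writes simple.job with a heredoc instead of vim (the zip step is guarded in case the zip utility is not installed):

```shell
# Create the job file non-interactively (same contents as the vim session above)
cat > simple.job <<'EOF'
type=command
command=echo 'simple job started'
EOF

# Package it for upload; Azkaban expects the .job file at the zip root
if command -v zip >/dev/null; then zip -q simpleJob.zip simple.job; fi
```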
The current job depends on the previous job: the previous job must finish before the current job can run.
vim start1.job
type=command
command=echo 'start1 started'
vim start2.job
type=command
dependencies=start1
command=echo 'start2 started'
vim start1.job
type=command
command=echo 'start1 started'
vim start2.job
type=command
command=echo 'start2 started'
vim start3.job
type=command
dependencies=start1,start2
command=echo 'start3 started'
Note: start1 and start2 run in parallel.
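The fan-in flow above (start3 waiting on both start1 and start2) can be generated in one shot; a sketch using heredocs in place of vim:

```shell
# start1 and start2 have no dependencies, so Azkaban runs them in parallel
cat > start1.job <<'EOF'
type=command
command=echo 'start1 started'
EOF

cat > start2.job <<'EOF'
type=command
command=echo 'start2 started'
EOF

# start3 runs only after BOTH start1 and start2 finish
cat > start3.job <<'EOF'
type=command
dependencies=start1,start2
command=echo 'start3 started'
EOF
```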
Use the command.n form to add multiple commands
vim twoCommand.job
type=command
command=echo 'twoCommandJob started'
command.1=hdfs dfs -mkdir /test1
command.2=hdfs dfs -mkdir /test2
Note: the command= line must never be omitted; otherwise Azkaban reports an error that the command cannot be found.
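Since a missing command= line is the most common mistake with multi-command jobs, a quick sanity check before zipping can catch it; a minimal sketch:

```shell
cat > twoCommand.job <<'EOF'
type=command
command=echo 'twoCommandJob started'
command.1=hdfs dfs -mkdir /test1
command.2=hdfs dfs -mkdir /test2
EOF

# command.1/command.2 alone are not enough: Azkaban requires a plain
# command= entry as well, so verify the mandatory base key exists
if grep -q '^command=' twoCommand.job; then
  echo 'job file OK'
else
  echo 'ERROR: missing command= line' >&2
fi
```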
Here we use the word-count example that ships with Hadoop; the jar is at
/usr/local/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar
vim mrJob.job
type=command
command=hadoop jar hadoop-mapreduce-examples-2.6.4.jar wordcount /wordcount/input /wordcount/output
vim word.txt
hadoop spark
sqoop hadoop
storm python
hdfs dfs -put word.txt /wordcount/input
zip mrJob.zip mrJob.job hadoop-mapreduce-examples-2.6.4.jar
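The wordcount job itself needs a Hadoop cluster, but what it computes on word.txt above can be previewed locally; a sketch with awk (not part of the original notes, just an illustration of the expected counts):

```shell
cat > word.txt <<'EOF'
hadoop spark
sqoop hadoop
storm python
EOF

# Same counting the wordcount example performs:
# tally every whitespace-separated token, then print "word count" pairs
awk '{for (i = 1; i <= NF; i++) count[$i]++} END {for (w in count) print w, count[w]}' word.txt | sort
```

With the three lines above, the output lists hadoop with count 2 and each other word with count 1.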
vim hiveScript.job
type=command
command=hive -e 'use emp_db; select id,name,age from az_emp;'
vim hiveSql.sql
use emp_db;
drop table az_emp;
create table az_emp(id int,name string,age int) row format delimited fields terminated by ',';
load data local inpath '/usr/local/hivedata/az_emp.txt' into table az_emp;
drop table az_emp_test;
create table az_emp_test as select id,name,age from az_emp;
insert overwrite local directory '/usr/local/hivedata/az_emp_output' select id,name,age from az_emp_test;
vim executeSql.job
type=command
command=hive -f 'hiveSql.sql'
vim az_emp.txt
1,honghong,25
2,lanolin,22
3,juanjuan,20
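The table above is declared with three comma-delimited columns (id, name, age), so every row of az_emp.txt must split into exactly three fields; a quick format check before loading:

```shell
cat > az_emp.txt <<'EOF'
1,honghong,25
2,lanolin,22
3,juanjuan,20
EOF

# Each row must have exactly 3 comma-separated fields to match the
# "fields terminated by ','" table definition; print OK if all rows do
awk -F, 'NF != 3 {bad++} END {print (bad ? "BAD" : "OK")}' az_emp.txt
```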
zip executeSql.zip hiveSql.sql executeSql.job