A shell script can call kitchen and pan to run job and transformation files, respectively. There are two flavors: Windows (.bat) and Linux (.sh).
A simple example.
Shell script:
export JAVA_HOME=/usr/local/java/jdk
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/mysql-connector-java-5.1.18-bin.jar
export KETTLE_HOME=/home/www/allyes/a3tracker/bi/etl/kettle/kh_cloud/
export LC_ALL=en_US.UTF-8
echo "KETTLE_HOME=$KETTLE_HOME"
echo "starting..."
yesterday='2014-02-24'
vardate=`date +%Y%m%d%H%M%S`
yesterdayid=`date -d "$yesterday" +%Y%m%d`
/home/www/allyes/a3tracker/bi/etl/kettle/data-integration/kitchen.sh -param:Yesterday='2014-02-24' -file /home/www/allyes/a3tracker/bi/etl/kettle/etlscript/playdata_etl_day.kjb>/home/www/allyes/a3tracker/bi/etl/kettle/logs/a3tracker_cloud_etl_"$yesterdayid"_"$vardate".txt
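The example above ignores whether kitchen actually succeeded. A minimal sketch of checking the exit status follows. The helper names `kitchen_status_message` and `run_and_report` are hypothetical, and the meanings of the status codes are taken from the Kitchen user documentation as I recall it, so verify them against your Kettle version.

```shell
#!/bin/sh
# Map a kitchen exit status to a short message.
# Assumption: the codes follow the Kitchen user documentation.
kitchen_status_message() {
    case $1 in
        0) echo "job ran without a problem" ;;
        1) echo "errors occurred during processing" ;;
        2) echo "unexpected error during loading or running of the job" ;;
        7) echo "job could not be loaded from XML or the repository" ;;
        8) echo "error loading steps or plugins" ;;
        9) echo "command line usage printed" ;;
        *) echo "unknown exit status $1" ;;
    esac
}

# Hypothetical wrapper: run_and_report LOGFILE COMMAND [ARGS...]
# Sends stdout and stderr of the command to LOGFILE, then reports the status.
run_and_report() {
    log_file=$1
    shift
    "$@" > "$log_file" 2>&1
    status=$?
    echo "exit status $status: $(kitchen_status_message "$status")"
    return "$status"
}
```

Usage would look like `run_and_report /path/to/etl.log /path/to/kitchen.sh -file /path/to/job.kjb -param:Yesterday=2014-02-24`, after which the calling script can `exit` with the returned status instead of always printing a success message.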
The complete script:
#!/bin/sh
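# Return 0 if $1 is a valid date in yyyy/mm/dd form, 1 otherwise.
check_date()
{
    [ $# -ne 1 ] && return 1
    _lenStr=`expr length "$1"`
    [ "$_lenStr" -ne 10 ] && return 1
    # Round-trip through date: an invalid or reformatted date will not print back unchanged.
    if date -d "$1" "+%Y/%m/%d" | grep -qxF "$1"
    then
        return 0
    else
        return 1
    fi
}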
vardate=`date +%Y%m%d%H%M%S`
echo today is `date +%Y/%m/%d`
yesterday=`date -d "yesterday" +%Y/%m/%d`
while [ -n "$1" ]; do
    case $1 in
        -d)
            shift
            yesterday=$1
            echo "your input is $yesterday"
            shift;;
        *)
            echo "$1 is not a valid parameter"
            break;;
    esac
done
check_date "$yesterday"
if [ $? -eq 1 ]; then
    echo "date format error! date format:(<yyyy/mm/dd>)"
    exit 1
fi
echo "Data aggregation date : $yesterday"
export JAVA_HOME=/usr/local/java/jdk
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/mysql-connector-java-5.1.18-bin.jar
export KETTLE_HOME=/home/www/allyes/a3tracker/bi/etl/kettle/kh_cloud/
export LC_ALL=en_US.UTF-8
echo "KETTLE_HOME=$KETTLE_HOME"
echo "starting..."
yesterdayid=`date -d "$yesterday" +%Y%m%d`
/home/www/allyes/a3tracker/bi/etl/kettle/data-integration/kitchen.sh "-param:Yesterday=$yesterday" -file /home/www/allyes/a3tracker/bi/etl/kettle/etlscript/playdata_etl_day.kjb > /home/www/allyes/a3tracker/bi/etl/kettle/logs/a3tracker_cloud_etl_"$yesterdayid"_"$vardate".txt
echo "done!"
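The `check_date` function above relies on `expr length` and `date -d`, which are GNU extensions. As a hedged alternative, here is a sketch of the same yyyy/mm/dd check in plain POSIX shell. The function name `is_valid_date` is hypothetical and not part of the original script.

```shell
#!/bin/sh
# Hypothetical POSIX-only replacement for check_date.
# Returns 0 if $1 is a real calendar date written as yyyy/mm/dd, 1 otherwise.
is_valid_date() {
    # Shape check: exactly four digits, slash, two digits, slash, two digits.
    case $1 in
        [0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]) ;;
        *) return 1 ;;
    esac
    y=${1%%/*}
    rest=${1#*/}
    m=${rest%%/*}
    d=${rest#*/}
    # Strip leading zeros so the shell does not treat the numbers as octal.
    y=$(printf '%s' "$y" | sed 's/^0*//')
    y=${y:-0}
    m=${m#0}
    d=${d#0}
    [ "$m" -ge 1 ] && [ "$m" -le 12 ] || return 1
    # Work out the number of days in the month, including leap years.
    case $m in
        4|6|9|11) max=30 ;;
        2)
            if [ $((y % 4)) -eq 0 ] && { [ $((y % 100)) -ne 0 ] || [ $((y % 400)) -eq 0 ]; }; then
                max=29
            else
                max=28
            fi ;;
        *) max=31 ;;
    esac
    [ "$d" -ge 1 ] && [ "$d" -le "$max" ]
}
```

It can be called the same way as `check_date`, for example `is_valid_date "$yesterday" || exit 1`. Deriving `yesterday` itself still needs a date tool, so this only replaces the validation step.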
Passing parameters on the command line:
Some articles that explain it:
http://blog.csdn.net/john_f_lau/article/details/9260863
http://forums.pentaho.com/showthread.php?54423-Passing-parameters-to-jobs-on-kitchen-command-line
http://wiki.pentaho.com/display/EAI/Named+Parameters
http://wiki.pentaho.com/display/EAI/Kitchen+User+Documentation
http://blog.csdn.net/qqzyb/article/details/8939517
http://blog.sina.com.cn/s/blog_543e73a80100k0vz.html
http://www.cnblogs.com/wxjnew/p/3620792.html
Two examples that pass multiple parameters:
/home/www/allyes/aso/kettle/kitchen.sh -file /home/www/allyes/aso/etl/test.kjb -param:os='1' -param:appstore='all' -param:dt='2014-02-24' >/home/www/allyes/aso/etl/log.txt 2>/home/www/allyes/aso/etl/error.txt
/home/www/allyes/aso/kettle/kitchen.sh -file /home/www/allyes/aso/etl/test.kjb -param:os=1 -param:appstore='all' -param:dt='2014-02-24' -level=Detailed >/home/www/allyes/aso/etl/log.txt
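When the parameter list varies from run to run, the `-param:` arguments can be generated rather than typed out. The following is a minimal sketch under that assumption. `run_job` is a hypothetical wrapper, not something shipped with Kettle. It rewrites each NAME=VALUE argument as `-param:NAME=VALUE` while preserving quoting.

```shell
#!/bin/sh
# Hypothetical wrapper: run_job KITCHEN JOB NAME=VALUE...
# KITCHEN is the path to kitchen.sh, JOB is the .kjb file.
run_job() {
    kitchen=$1
    job=$2
    shift 2
    # The word list of the loop is expanded once, so it is safe to rebuild
    # the positional parameters inside it: drop the old form, append the new.
    for pair in "$@"; do
        shift
        case $pair in
            *=*) set -- "$@" "-param:$pair" ;;
            *)   echo "ignoring '$pair': expected NAME=VALUE" >&2 ;;
        esac
    done
    "$kitchen" -file "$job" "$@"
}
```

For example, `run_job /home/www/allyes/aso/kettle/kitchen.sh /home/www/allyes/aso/etl/test.kjb os=1 appstore=all dt=2014-02-24 > log.txt 2> error.txt` would run the same command as the first example above, with standard output and standard error redirected to separate files.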
On the command line, an option name may be followed by "=", ":", or a space; all three work. For example: kitchen.bat /file d:\ or -file=D:\ or /file:D:\
kitchen.bat /norep -file=D:/kettledata/mysal2orcle.kjb >> kitchen_%date:~0,10%.log
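For a passed parameter to be usable, the corresponding parameter must first be added in the transformation's settings. It can then be read with the Get Variables step.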
http://wiki.pentaho.com/display/EAI/Named+Parameters
http://type-exit.org/adventures-with-open-source-bi/2010/07/using-named-parameters-in-kettle/
Two formats (on Linux the double quotes may be omitted; Windows requires each parameter to be enclosed in double quotes):
1: kitchen /file:"MyJob.kjb" /param:ServerName=MyServer
Multiple params:
Linux: ./kitchen.sh -file:job.kjb -param:files.dir=/opt/files -param:max.date=2010-06-02
Windows: Kitchen.bat -file:job.kjb "-param:files.dir=/opt/files" "-param:max.date=2010-06-02"
2: kitchen /file:"your job name.kjb" "command line argument 1" "command line argument 2" "command line argument 3"....
-listparam lists the named parameters defined in the job or transformation, with their default values and descriptions, e.g.:
sh pan.sh -file:/tmp/foo.ktr -listparam
Parameter: MASTER_HOST=, default=localhost : The master slave server hostname to connect to
Parameter: MASTER_PORT=, default=8080 : The master slave server HTTP control port
The listed parameters can then be overridden on the command line, e.g.:
user@host:$ sh pan.sh -file:/tmp/foo.ktr -param:MASTER_HOST=192.168.1.3 -param:MASTER_PORT=8181
Windows requires you to use quotes around the parameter; otherwise the equals sign is treated as a space by the command interpreter:
c:\> pan.bat -file:/tmp/foo.ktr "-param:MASTER_HOST=192.168.1.3" "-param:MASTER_PORT=8181"
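A script can also read the `-listparam` output to discover which parameters a file expects. The sketch below assumes the output lines look exactly like the two `Parameter:` lines shown above. Real output may carry extra log lines or prefixes, so treat the pattern as an assumption to adjust. `parse_listparam` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads -listparam output on stdin and prints one
# NAME=DEFAULT line per parameter. Lines that do not match are skipped.
# Assumed input format: "Parameter: NAME=, default=VALUE : description"
parse_listparam() {
    sed -n 's/^Parameter: \([^=]*\)=.*, default=\([^ ]*\) : .*$/\1=\2/p'
}
```

A call such as `sh pan.sh -file:/tmp/foo.ktr -listparam | parse_listparam` would then, for the output shown above, yield `MASTER_HOST=localhost` and `MASTER_PORT=8080`.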
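Choosing the log level, and what each setting does:
-level sets the log level. (In the GUI's execution view, the last of the three small icons at the top left of the log pane, the wrench and hammer, sets the level.)
Rowlevel: prints all logging available in Kettle, including detailed information about complex steps.
Debugging: produces a large amount of log output, mainly for debugging, but not at row level.
Detailed: shows more than the Basic level; the extra information includes, for example, SQL queries and generated DDL.
Basic: the default log level; prints only the information that reflects each step or job entry.
Minimal: reports only information about the job or transformation as a whole.
Error logging only: shows error messages if an error occurs; otherwise shows nothing.
Nothing at all: produces no log output, even when errors occur.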
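On the command line the levels are given by short names rather than the GUI labels above. The sketch below assumes the accepted values are Nothing, Error, Minimal, Basic, Detailed, Debug and Rowlevel, as I understand them from the Kitchen documentation; check them against your version. `is_valid_level` and `level_arg` are hypothetical helper names.

```shell
#!/bin/sh
# Assumption: these are the values kitchen/pan accept for -level.
is_valid_level() {
    case $1 in
        Nothing|Error|Minimal|Basic|Detailed|Debug|Rowlevel) return 0 ;;
        *) return 1 ;;
    esac
}

# Print a -level option for $1, falling back to the default Basic level
# when the requested name is not recognised.
level_arg() {
    if is_valid_level "$1"; then
        printf '%s\n' "-level=$1"
    else
        echo "unknown log level '$1', using Basic" >&2
        printf '%s\n' "-level=Basic"
    fi
}
```

A wrapper script could then call `kitchen.sh -file job.kjb "$(level_arg Detailed)"`, so that a mistyped level falls back to the default instead of being passed through unchecked.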