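- Download the Kettle zip package
- Upload it to the target directory on Linux
- Copy every file in the .kettle directory on Windows to the corresponding directory on Linux
- Create a directory on Linux to serve as the resource directory for .ktr and .kjb files
- Update the repository path in repositories.xml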
<?xml version="1.0" encoding="UTF-8"?>
<repositories>
<repository>
<id>KettleFileRepository</id>
<name>product-id</name>
<description>product</description>
<base_directory>/usr/local/wonhigh/o2o/kettle-respository/</base_directory>
<read_only>N</read_only>
<hides_hidden_files>N</hides_hidden_files>
</repository>
</repositories>
- Put the required database driver packages in $KETTLE_HOME/lib
- Put the database connection configuration created on Windows in the repository root directory
- Run the commands:
Run a transformation: ./data-integration/pan.sh -rep:product-id -user:admin -pass:admin -file:/usr/local/wonhigh/o2o/kettle-respository/coloth_t-skap-i2.ktr -level:Debug -logfile:/data/wonhigh/kettle-test.log
Run a job: ./data-integration/kitchen.sh -rep:product-id -user:admin -pass:admin -file:/usr/local/wonhigh/o2o/kettle-respository/coloth_t-skap-i2.kjb -level:Debug -logfile:/data/wonhigh/kettle-test.log
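The directory and repositories.xml steps above can be sketched as a small shell function. This is only a sketch under assumptions: the function name `set_repo_base_dir` is hypothetical, and it assumes the file holds a single `<base_directory>` element on one line, as in the listing above, and that the new path contains no `|` or `&` characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the resource directory and point
# <base_directory> in a repositories.xml at it.
# Assumes a single-line <base_directory>...</base_directory> element.
set_repo_base_dir() {
  local xml_file="$1" new_dir="$2"
  mkdir -p "$new_dir"   # resource directory for .ktr and .kjb files
  # '|' is used as the sed delimiter because the path contains '/'.
  sed "s|<base_directory>.*</base_directory>|<base_directory>${new_dir}</base_directory>|" \
    "$xml_file" > "$xml_file.tmp" && mv "$xml_file.tmp" "$xml_file"
}
```

With the paths used in this article, the call would look like `set_repo_base_dir ~/.kettle/repositories.xml /usr/local/wonhigh/o2o/kettle-respository/`; the location of repositories.xml is an assumption and depends on where the .kettle directory was copied.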
How to specify the path of the system configuration files (kettle.properties, repositories.xml, ...)
vim /etc/profile
export KETTLE_HOME=/usr/local/wonhigh/o2o/data-integration
. /etc/profile
cd /usr/local/wonhigh/o2o/data-integration
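# Run any Kettle command here; it does not need to succeed
pan.sh -file= ## The .kettle directory is now generated under $KETTLE_HOME, e.g. /usr/local/wonhigh/o2o/data-integration/.kettle/kettle.properties. The other configuration files still have to be created by hand or copied from the development environment.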
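The KETTLE_HOME setup above can also be scripted so that repeated runs do not append the export line twice. A minimal sketch, assuming a Bourne-compatible shell; the function name `ensure_kettle_home` is hypothetical, and writing to /etc/profile requires root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add the KETTLE_HOME export to a profile file only once.
ensure_kettle_home() {
  local profile="$1" home_dir="$2"
  local line="export KETTLE_HOME=${home_dir}"
  # -x matches the whole line, -F treats the pattern as a fixed string.
  grep -qxF -- "$line" "$profile" 2>/dev/null || printf '%s\n' "$line" >> "$profile"
}
```

For the layout in this article: `ensure_kettle_home /etc/profile /usr/local/wonhigh/o2o/data-integration`, then `. /etc/profile`, run any pan.sh command once, and check the result with `ls -a "$KETTLE_HOME/.kettle"`.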
Add the files master1.ksl, slave1-6101.ksl, slave1-6102.ksl, master-slaver-cluster.kcs, and so on
master1.ksl:
<slaveserver><name>master1</name><hostname>${WONHIGH.MASTER.IP}</hostname><port>${WONHIGH.MASTER.PORT}</port><webAppName/><username>${WONHIGH.KETTLE.USER}</username><password>${WONHIGH.KETTLE.PASS}</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>Y</master></slaveserver>
slave1-6101.ksl:
<slaveserver><name>slave1-6101</name><hostname>${WONHIGH.SLAVE1.IP}</hostname><port>${WONHIGH.SLAVE1.PORT}</port><webAppName/><username>${WONHIGH.KETTLE.USER}</username><password>${WONHIGH.KETTLE.PASS}</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
slave1-6102.ksl:
<slaveserver><name>slave1-6102</name><hostname>${WONHIGH.SLAVE2.IP}</hostname><port>${WONHIGH.SLAVE2.PORT}</port><webAppName/><username>${WONHIGH.KETTLE.USER}</username><password>${WONHIGH.KETTLE.PASS}</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>N</master></slaveserver>
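The server definitions share one template and differ only in the server name, the property prefix, and the master flag, so they can be generated rather than hand-copied. The sketch below is an assumption-laden illustration: `slave_ksl` is a hypothetical function name, and it assumes every server keeps its address under `<PREFIX>.IP` and `<PREFIX>.PORT` in kettle.properties.

```shell
#!/usr/bin/env bash
# Hypothetical generator for the .ksl definitions shown above.
# $1 = server name, $2 = key prefix in kettle.properties (e.g. WONHIGH.SLAVE2), $3 = Y or N
slave_ksl() {
  local name="$1" prefix="$2" master="$3"
  # The single-quoted format keeps ${...} literal, so Kettle (not the shell) resolves it.
  printf '<slaveserver><name>%s</name><hostname>${%s.IP}</hostname><port>${%s.PORT}</port><webAppName/><username>${WONHIGH.KETTLE.USER}</username><password>${WONHIGH.KETTLE.PASS}</password><proxy_hostname/><proxy_port/><non_proxy_hosts/><master>%s</master></slaveserver>\n' \
    "$name" "$prefix" "$prefix" "$master"
}
```

For example, `slave_ksl slave1-6102 WONHIGH.SLAVE2 N > slave1-6102.ksl` reproduces the second slave definition.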
master-slaver-cluster.kcs:
<clusterschema>
<name>master-slaver-cluster</name>
<base_port>40000</base_port>
<sockets_buffer_size>2000</sockets_buffer_size>
<sockets_flush_interval>5000</sockets_flush_interval>
<sockets_compressed>Y</sockets_compressed>
<dynamic>N</dynamic>
<slaveservers>
<name>master1</name>
<name>slave1-6101</name>
<name>slave1-6102</name>
</slaveservers>
</clusterschema>
kettle.properties:
#repository
WONHIGH.KETTLE.USER = wonhigh
WONHIGH.KETTLE.PASS = test123
#WONHIGH.KETTLE.USER = cluster
#WONHIGH.KETTLE.PASS = cluster
#master
WONHIGH.MASTER.IP = 172.17.210.95
WONHIGH.MASTER.PORT = 6100
#slave1
WONHIGH.SLAVE1.IP = 172.17.210.95
WONHIGH.SLAVE1.PORT = 6101
#slave2
WONHIGH.SLAVE2.IP = 172.17.210.95
WONHIGH.SLAVE2.PORT = 6102
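When scripting around this file, for instance to confirm which host and port a server definition resolves to, the values can be read back with a small helper. This is a sketch, not part of Kettle: `get_prop` is a hypothetical name, and it only handles the simple `KEY = value` lines and `#` comments shown above, not the full Java properties syntax.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of one key from a kettle.properties-style file.
# Handles "KEY = value" lines and skips '#' comment lines.
get_prop() {
  local key="$1" file="$2"
  awk -v k="$key" '
    /^[ \t]*#/          { next }   # comment line
    index($0, "=") == 0 { next }   # not a key/value line
    {
      pos  = index($0, "=")
      name = substr($0, 1, pos - 1)
      val  = substr($0, pos + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim whitespace around the key
      gsub(/^[ \t]+|[ \t]+$/, "", val)    # trim whitespace around the value
      if (name == k) { print val; exit }
    }' "$file"
}
```

Usage: `get_prop WONHIGH.MASTER.PORT "$KETTLE_HOME/.kettle/kettle.properties"` prints the master port configured above.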
kettle.pwd:
Note: encrypt the password with encr.sh -carte test123
# Please note that the default password (cluster) is obfuscated using the Encr script provided in this release
# Passwords can also be entered in plain text as before
#
#cluster: OBF:1v8w1uh21z7k1ym71z7i1ugo1v9q
wonhigh: OBF:1m801j1u1lmp1z0f1lj11iz01m4e
kitchen.sh
Options:
-rep = Repository name
-user = Repository username
-pass = Repository password
-job = The name of the job to launch
-dir = The directory (dont forget the leading /)
-file = The filename (Job XML) to launch
-level = The logging level (Basic, Detailed, Debug, Rowlevel, Error, Nothing)
-logfile = The logging file to write to
-listdir = List the directories in the repository
-listjobs = List the jobs in the specified directory
-listrep = List the available repositories
-norep = Do not log into the repository
-version = show the version, revision and build date
-param = Set a named parameter <NAME>=<VALUE>. For example -param:FILE=customers.csv
-listparam = List information concerning the defined parameters in the specified job.
-export = Exports all linked resources of the specified job. The argument is the name of a ZIP file.
-custom = Set a custom plugin specific option as a String value in the job using <NAME>=<Value>, for example: -custom:COLOR=Red
-maxloglines = The maximum number of log lines that are kept internally by Kettle. Set to 0 to keep all rows (default)
-maxlogtimeout = The maximum age (in minutes) of a log line while being kept internally by Kettle. Set to 0 to keep all rows indefinitely (default)
[root@WL-APP-001 data-integration]# ./kitchen.sh -rep:product-id -job:./skap/test-skap -level:Basic -logfile:/data/wonhigh/kettle.log
2014/07/17 10:03:52 - Kitchen - Logging is at level : 基本日志
2014/07/17 10:03:52 - Kitchen - Start of run.
2014/07/17 10:03:52 - RepositoriesMeta - Reading repositories XML file: /usr/local/wonhigh/o2o/data-integration/.kettle/repositories.xml
2014/07/17 10:03:54 - ./skap/test-skap - 开始执行任务
2014/07/17 10:03:54 - ./skap/test-skap - 开始项[coloth_t-skap-i2]
2014/07/17 10:03:54 - coloth_t-skap-i2 - Loading transformation from repository [coloth_t-skap-i2] in directory [/skap]
2014/07/17 10:03:54 - coloth_t-skap-i2 - 为了转换解除补丁开始 [coloth_t-skap-i2]
2014/07/17 10:03:54 - coloth_t-output.0 - Connected to database [I2STG] (commit=1000)
2014/07/17 10:04:17 - coloth_t.0 - Finished reading query, closing connection.
2014/07/17 10:04:17 - coloth_t.0 - 完成处理 (I=34786, O=0, R=0, W=34786, U=0, E=0
2014/07/17 10:04:25 - coloth_t-output.0 - 完成处理 (I=0, O=34786, R=34786, W=34786, U=0, E=0
2014/07/17 10:04:25 - ./skap/test-skap - 完成作业项[coloth_t-skap-i2] (结果=[true])
2014/07/17 10:04:25 - ./skap/test-skap - 任务执行完毕
2014/07/17 10:04:25 - Kitchen - Finished!
2014/07/17 10:04:25 - Kitchen - Start=2014/07/17 10:03:52.683, Stop=2014/07/17 10:04:25.704
2014/07/17 10:04:25 - Kitchen - Processing ended after 33 seconds
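When the job is launched from cron or another script rather than by hand, the exit status of kitchen.sh is easier to act on than the log text. The wrapper below is a sketch: `run_job` and the `KITCHEN` override variable are hypothetical names, and the meanings attached to the exit codes follow the Pentaho documentation as I recall it, so they are worth checking against the version in use.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a repository job through kitchen.sh and report the outcome.
# KITCHEN can be overridden; it defaults to ./kitchen.sh in the current directory.
run_job() {
  local rep="$1" job="$2" logfile="$3"
  "${KITCHEN:-./kitchen.sh}" -rep:"$rep" -job:"$job" -level:Basic -logfile:"$logfile"
  local rc=$?
  case "$rc" in
    0) echo "job finished without errors" ;;
    1) echo "errors occurred during processing" ;;
    2) echo "unexpected error while loading or running the job" ;;
    7) echo "job could not be loaded from XML or the repository" ;;
    8) echo "error loading steps or plugins" ;;
    9) echo "command line usage printed" ;;
    *) echo "unrecognised exit code: $rc" ;;
  esac
  return "$rc"
}
```

The run shown above would correspond to `run_job product-id ./skap/test-skap /data/wonhigh/kettle.log`, executed from the data-integration directory.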
Note: without a UI, a cluster can only be used with a job as the entry point; the cluster is set in the job's configuration.