Kettle Learning Materials
Kettle 3.2 User Manual
Table of Contents
Overview
1. Kettle repository management
1.1 Creating a repository
1.2 Updating a repository
1.3 Repository login and user management
1.4 Differences between logging in with and without a repository
2. Menu bar overview
2.1 File
2.2 Edit
2.3 View
2.4 Repository
2.5 Transformation
2.6 Job
2.7 Wizard
2.8 Help
2.9 Variables
2.9.1 Using variables
2.9.2 Variable scope
2.9.2.1 Environment variables
2.9.2.2 Kettle variables
2.9.2.3 Internal variables
3. Toolbar overview
3.1 Transformation toolbar
3.2 Jobs toolbar
4. Main object tree
4.1 Transformation main object tree
4.1.1 Creating a transformation
4.1.2 Transformation settings
4.1.3 DB connections
4.1.4 Steps
4.1.5 Hops
4.1.5.1 Right-click the Hops node to create and sort hops
4.1.5.2 Right-click an individual hop to edit its properties or delete it
4.1.6 Database partition schemas
4.1.7 Slave servers
4.1.8 Kettle cluster schemas
4.2 Jobs main object tree
4.2.1 Creating a job
4.2.2 Setting job properties
4.2.3 DB connections
4.2.4 Job entries
4.2.5 Slave servers
5. Transformation core objects
5.1 Transform
5.2 Input
5.3 Input
5.3.1 Access Input
5.3.2 CSV file input
5.3.3 Cube input (multidimensional cube)
5.3.4 Excel input
5.3.5 Fixed file input
5.3.6 Generate random value
5.3.7 Get File Names
5.3.8 Get Files Rows Count
5.3.9 Get data from XML
5.3.10 LDAP Input
5.3.11 LDIF Input
5.3.12 Mondrian Input
5.3.13 Property Input
5.3.14 Streaming XML Input
5.3.15 XBase input
5.3.16 XML input
5.3.17 Text file input
5.3.18 Generate rows
5.3.19 Get system info
5.3.20 Table input
5.4 Output
5.4.1 Access Output
5.4.2 Cube output
5.4.3 Excel Output
5.4.4 Properties Output
5.4.5 SQL File Output
5.4.6 XML output
5.4.7 Delete
5.4.8 Insert/Update
5.4.9 Text file output
5.4.10 Update
5.4.11 Table output
5.5 Lookup
5.5.1 Check if a column exists
5.5.2 File Exists
5.5.3 HTTP client
5.5.4 Table exists
5.5.5 Web services lookup
5.5.6 Database lookup
5.5.7 Database join
5.5.8 Stream lookup
5.5.9 Call DB procedure
5.6 Transform
5.6.1 Abort
5.6.2 Add XML
5.6.3 Add a checksum
5.6.4 Analytic Query
5.6.5 Append Streams
5.6.6 Blocking Step
5.6.7 Clone row
5.6.8 Closure Generator
5.6.9 Data Validator
5.6.10 Delay row
5.6.11 Identify last row in a stream
5.6.12 Metadata structure of stream
5.6.13 Null if (set to null)
5.6.14 Row Normaliser
5.6.15 Split field to rows
5.6.16 Switch / case
5.6.17 XSD Validator
5.6.18 XSL Transformation
5.6.19 Value mapper
5.6.20 Group by
5.6.21 Unique rows
5.6.22 Add constants
5.6.23 Add sequence
5.6.24 Select values
5.6.25 Split fields
5.6.26 Sort rows
5.6.27 Dummy (do nothing)
5.6.28 Row flattener
5.6.29 Row denormaliser (rows to columns)
5.6.30 Calculator
5.6.31 Filter rows
5.7 Joins
5.7.1 Merge Join
5.7.2 Sorted Merge
5.7.3 XML Join
5.7.4 Merge rows
5.7.5 Join rows (Cartesian product)
5.8 Scripting
5.8.1 Modified Java Script Value
5.8.2 Regex Evaluation
5.8.3 Execute SQL script
5.9 Data warehouse
5.9.1 Dimension lookup/update
5.9.2 Combination lookup/update
5.10 Mapping
5.10.1 Mapping (sub-transformation)
5.10.2 Mapping input specification
5.10.2 Mapping output specification
5.11 Job
5.11.1 Get Variables
5.11.2 Get files from result
5.11.3 Set Variables
5.11.4 Set files in result
5.11.5 Get rows from result
5.11.6 Copy rows to result
5.12 Inline
5.12.1 Injector
5.12.2 Socket reader
5.12.3 Socket writer
5.13 Experimental
5.14 Deprecated
5.14.1 Aggregate rows
5.15 Bulk loading
5.16 History
6. Job core objects
6.1 General
6.1.1 Dummy Job
6.2 General
6.2.1 START
6.2.2 Dummy Job
6.2.3 Abort job
6.2.4 Display message box
6.2.5 Job
6.2.6 Ping a host
6.2.7 Success
6.2.8 Text output
6.2.9 Write to Log
6.3 Mail
6.3.1 Write to Log
6.3.2 Mail
6.4 File management
6.4.1 Add filenames to result
6.4.2 Compare folders
6.4.3 Copy files
6.4.4 Copy or move result filenames
6.4.5 Create a folder
6.4.6 Create a file
6.4.7 Delete file
6.4.8 Delete filenames from result
6.4.9 Delete file
6.4.10 Delete folders
6.4.11 File compare
6.4.12 HTTP
6.4.13 Move Files
6.4.14 Unzip file
6.4.15 Wait for file
6.4.16 Zip file
6.5 Conditions
6.5.1 Check if a folder is empty
6.5.2 Check if a file exists
6.5.3 Check if a column exists in a database table
6.5.4 Check file exists
6.5.5 Check if a table exists
6.5.6 Wait
6.6 Scripting
6.6.1 Mail
6.6.2 SQL
6.6.3 SHELL
6.7 Bulk loading
6.7.1 Bulk load from MySQL into a file
6.7.2 Bulk load from a file into MS SQL Server
6.7.3 Bulk load from a file into MySQL
6.8 XML
6.8.1 Check if XML File is well formed
6.8.2 DTD Validator
6.8.3 XSD Validator
6.8.4 XSL Transformation
6.9 File transfer
6.9.1 FTP
6.9.2 FTP Delete
6.9.3 Put a file with FTP
6.9.4 Put a file with SFTP
6.9.5 SSH2 Get
6.9.6 SSH2 Put
6.9.7 Secure FTP
6.10 Repository
6.10.1 Check if connected to repository
6.10.2 Export repository to XML file
6.11 Experimental
6.11.1 Evaluate rows number in a table
6.11.2 MS Access Bulk Load
6.11.3 Set variables
6.11.4 Simple evaluation
6.11.5 Truncate tables
6.11.6 Wait for SQL
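Attachments:
1. Kettle+3.2使用说明书.pdf (Kettle 3.2 user manual)
2. kettle初探--内含配置信息.pdf (a first look at Kettle, including configuration information)
3. 用Kettle的一套流程完成对整个数据库迁移.pdf (migrating an entire database with a single Kettle workflow)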