DataX Examples

DataX is ridiculously easy to use!
Well done, Alibaba!
Support homegrown software!

GitHub repository: https://github.com/alibaba/DataX

Quick Start: https://github.com/alibaba/DataX/blob/master/userGuid.md

Plugin development guide: https://github.com/alibaba/DataX/blob/master/dataxPluginDev.md


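Type                                  Data source                                        Docs (Reader, Writer)
RDBMS (relational databases)          MySQL                                              Read, Write
                                      Oracle                                             Read, Write
                                      SQLServer                                          Read, Write
                                      PostgreSQL                                         Read, Write
                                      DRDS                                               Read, Write
                                      Generic RDBMS (supports all relational databases)  Read, Write
Alibaba Cloud data warehouse storage  ODPS                                               Read, Write
                                      ADS
                                      OSS                                                Read, Write
                                      OCS                                                Read, Write
NoSQL data storage                    OTS                                                Read, Write
                                      Hbase0.94                                          Read, Write
                                      Hbase1.1                                           Read, Write
                                      Phoenix4.x                                         Read, Write
                                      Phoenix5.x                                         Read, Write
                                      MongoDB                                            Read, Write
                                      Hive                                               Read, Write
Unstructured data storage             TxtFile                                            Read, Write
                                      FTP                                                Read, Write
                                      HDFS                                               Read, Write
                                      Elasticsearch
Time-series databases                 OpenTSDB
                                      TSDB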

Examples

Running a job

cd /opt/app/datax/bin
python datax.py xxx.json
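
If jobs are launched from a scheduler or another script, the two commands above can be wrapped in a small Python helper. This is only a sketch: `build_datax_command` and `run_job` are hypothetical names made up for illustration, and the install path `/opt/app/datax` is simply the one used in this post. Pass an absolute job path, since the helper does not `cd` into the bin directory first.

```python
import subprocess


def build_datax_command(job_file, datax_home="/opt/app/datax", python_exe="python"):
    """Return the argv list for launching datax.py with one job file.

    datax_home is an assumption taken from the path used in this post;
    python_exe should be whichever interpreter your DataX release supports.
    """
    launcher = datax_home.rstrip("/") + "/bin/datax.py"
    return [python_exe, launcher, job_file]


def run_job(job_file, datax_home="/opt/app/datax"):
    """Run one DataX job and return the process exit code."""
    cmd = build_datax_command(job_file, datax_home)
    # subprocess.call waits for the job to finish and returns its exit status.
    return subprocess.call(cmd)
```

A non-zero return value from `run_job` can then be used by the calling script to flag the job as failed.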

stream2stream.json

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "streamreader", 
                    "parameter": {
                        "column": [
                            {
                                "type": "long",
                                "value": "10"
                            },
                            {
                                "type": "string",
                                "value": "hello,你好,世界-DataX"
                            }
                        ], 
                        "sliceRecordCount": 10
                    }
                }, 
                "writer": {
                    "name": "streamwriter", 
                    "parameter": {
                        "encoding": "UTF-8", 
                        "print": true
                    }
                }
            }
        ], 
        "setting": {
            "speed": {
                "channel": 5
            }
        }
    }
}
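
When several stream test jobs are needed with different record counts or channel settings, the same structure can be generated rather than copied by hand. The sketch below assumes Python 3; `make_stream_job` is a hypothetical helper, and the keys it emits are exactly the ones in stream2stream.json above.

```python
import json


def make_stream_job(columns, slice_record_count=10, channel=5):
    """Build a streamreader -> streamwriter job dict shaped like stream2stream.json."""
    return {
        "job": {
            "content": [
                {
                    "reader": {
                        "name": "streamreader",
                        "parameter": {
                            "column": columns,
                            "sliceRecordCount": slice_record_count,
                        },
                    },
                    "writer": {
                        "name": "streamwriter",
                        "parameter": {"encoding": "UTF-8", "print": True},
                    },
                }
            ],
            "setting": {"speed": {"channel": channel}},
        }
    }


job = make_stream_job(
    [
        {"type": "long", "value": "10"},
        {"type": "string", "value": "hello,你好,世界-DataX"},
    ]
)

# ensure_ascii=False keeps the Chinese characters readable in the JSON text.
job_text = json.dumps(job, indent=4, ensure_ascii=False)
print(job_text)
```

Writing `job_text` to a file with UTF-8 encoding gives a job file that can be passed to `datax.py` like any other.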

mysql2hdfs.json

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader", 
                    "parameter": {
                        "column": [
                            "user_id",
                            "user_name",
                            "trade_time"
                        ],
                        "connection": [
                            {
                                "jdbcUrl": ["jdbc:mysql://localhost:3306/test"], 
                                "table": ["user"]
                            }
                        ], 
                        "password": "Jingxuan@108", 
                        "username": "root"
                    }
                }, 
                "writer": {
                    "name": "hdfswriter",
                    "parameter": {
                        "defaultFS": "hdfs://hadoop02:9000",
                        "fileType": "orc",
                        "path": "/datax/mysql2hdfs/usertest1",
                        "fileName": "user01",
                        "column": [
                            {
                                "name": "user_id",
                                "type": "INT"
                            },
                            {
                                "name": "user_name",
                                "type": "STRING"
                            },
                            {
                                "name": "trade_time",
                                "type": "DATE"
                            }
                        ],
                        "writeMode": "append",
                        "fieldDelimiter": "\t"
                    }
                }
            }
        ], 
        "setting": {
            "speed": {
                "channel": "1"
            }
        }
    }
}
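
In mysql2hdfs.json the reader lists its columns as plain strings while the writer lists them as name/type objects, and the two lists have to describe the same fields in the same order. A quick check before running the job can catch ordering slips. This is a minimal sketch: `column_mismatches` is a hypothetical helper, and requiring identical names on both sides is a convention chosen here, not something this post shows DataX enforcing.

```python
def column_mismatches(job):
    """Compare reader and writer column names position by position.

    Returns a list of (index, reader_name, writer_name) tuples for every
    position where the names differ or one list is shorter than the other.
    """
    content = job["job"]["content"][0]
    reader_cols = content["reader"]["parameter"]["column"]
    writer_cols = content["writer"]["parameter"]["column"]

    def name_of(col):
        # Writer columns here are {"name": ..., "type": ...} dicts;
        # reader columns are plain strings.
        return col["name"] if isinstance(col, dict) else col

    problems = []
    for i in range(max(len(reader_cols), len(writer_cols))):
        r = name_of(reader_cols[i]) if i < len(reader_cols) else None
        w = name_of(writer_cols[i]) if i < len(writer_cols) else None
        if r != w:
            problems.append((i, r, w))
    return problems


# Trimmed-down copy of the column sections from mysql2hdfs.json.
job = {
    "job": {
        "content": [
            {
                "reader": {
                    "parameter": {"column": ["user_id", "user_name", "trade_time"]}
                },
                "writer": {
                    "parameter": {
                        "column": [
                            {"name": "user_id", "type": "INT"},
                            {"name": "user_name", "type": "STRING"},
                            {"name": "trade_time", "type": "DATE"},
                        ]
                    }
                },
            }
        ]
    }
}

print(column_mismatches(job))  # prints []
```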

hdfs2mysql.json

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "hdfsreader",
                    "parameter": {
                        "path": "/datax/mysql2hdfs/stutest1/*",
                        "defaultFS": "hdfs://hadoop02:9000",
                        "column": ["*"],
                        "fileType": "text",
                        "encoding": "UTF-8",
                        "fieldDelimiter": "\t"
                    }
                },
                "writer": {
                    "name": "mysqlwriter", 
                    "parameter": {
                        "column": [
                            "id",
                            "name",
                            "age"
                        ],
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:mysql://hadoop01:3306/urldb", 
                                "table": ["stu_copy"]
                            }
                        ], 
                        "password": "Jingxuan@108", 
                        "username": "root"
                    }
                }
            }
        ], 
        "setting": {
            "speed": {
                "channel": "1"
            }
        }
    }
}
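
One detail that is easy to miss when comparing the two MySQL jobs in this post: in mysql2hdfs.json the mysqlreader `jdbcUrl` is a JSON list, while in hdfs2mysql.json the mysqlwriter `jdbcUrl` is a plain string. The sketch below checks a connection entry against those two shapes. `jdbc_url_shape_ok` is a hypothetical helper, and it only encodes the shapes shown in this post, so treat it as a sanity check rather than a full validator.

```python
def jdbc_url_shape_ok(plugin_name, connection):
    """Check the JSON type of jdbcUrl against the shape used in this post's jobs."""
    url = connection.get("jdbcUrl")
    if plugin_name == "mysqlreader":
        # Reader example: "jdbcUrl": ["jdbc:mysql://localhost:3306/test"]
        return (
            isinstance(url, list)
            and len(url) > 0
            and all(isinstance(u, str) for u in url)
        )
    if plugin_name == "mysqlwriter":
        # Writer example: "jdbcUrl": "jdbc:mysql://hadoop01:3306/urldb"
        return isinstance(url, str)
    raise ValueError("unsupported plugin for this sketch: %s" % plugin_name)


reader_conn = {"jdbcUrl": ["jdbc:mysql://localhost:3306/test"], "table": ["user"]}
writer_conn = {"jdbcUrl": "jdbc:mysql://hadoop01:3306/urldb", "table": ["stu_copy"]}

print(jdbc_url_shape_ok("mysqlreader", reader_conn))
print(jdbc_url_shape_ok("mysqlwriter", writer_conn))
```

Both calls return `True` for the connection blocks used above; copying a reader connection into a writer unchanged makes the second check return `False`.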
