DataX: Transferring Data Between MySQL and Hive

@羲凡 - Just to live better

Official site

Prerequisites
a. Download and configure DataX; see the official site
b. A working Hive environment and a MySQL database
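
Before writing any jobs, you can verify the install with the stream-to-stream sample job that, as far as I recall, ships with the DataX tarball (a minimal sketch, assuming DataX is unpacked at $DATAX_HOME):

cd $DATAX_HOME
# bundled self-test job; a success summary at the end means DataX itself works
python bin/datax.py job/job.json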

0. Notes

a. When importing MySQL data into Hive, the jdbcUrl inside the mysqlreader connection must be a list, i.e. wrapped in square brackets
b. When exporting Hive data to MySQL, the jdbcUrl inside the mysqlwriter connection must be a plain string, i.e. without square brackets (see the snippet after this list)
c. When exporting Hive data to MySQL, hdfsreader converts data types: Hive's TINYINT, SMALLINT, INT and BIGINT all map to LONG
d. When exporting Hive data to MySQL, each entry in the hdfsreader column list must specify type, plus exactly one of index or value; by default all fields can be read as strings with "column": ["*"]
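
To make points a and b concrete, here are the two connection blocks side by side, excerpted (and slightly simplified, URL parameters omitted) from the full jobs below:

mysqlreader (MySQL into Hive) - jdbcUrl is a list:

"connection": [
    {
        "table": ["t_user_info"],
        "jdbcUrl": ["jdbc:mysql://10.218.223.96:3306/test"]
    }
]

mysqlwriter (Hive into MySQL) - jdbcUrl is a string:

"connection": [
    {
        "jdbcUrl": "jdbc:mysql://10.218.223.96:3306/test",
        "table": ["t_user_info"]
    }
]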

1. Import MySQL data into Hive (create the file script/mysql2hive.json)

{
    "job": {
        "setting": {
            "speed": {
                "channel": 2
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "column": ["id","name","age","city"],
                        "splitPk": "id",
                        "connection": [
                            {
                                "table": ["t_user_info"],
                                "jdbcUrl": ["jdbc:mysql://10.218.223.96:3306/test"]
                            }
                        ]
                    }
                },
               "writer": {
                    "name": "hdfswriter",
                    "parameter": {
                        "defaultFS": "hdfs://10.218.223.97:8020",
                        "fileType": "text",
                        "path": "/user/hive/warehouse/test.db/h_user_info",
                        "fileName": "mysql2hive",
                        "column": [
                            {"name": "id","type": "TINYINT"},
                            {"name": "name","type": "STRING"},
                            {"name": "age","type": "INT"},
                            {"name": "city","type": "STRING"}
                        ],
                        "writeMode": "nonConflict",
                        "fieldDelimiter": "\t",
						"compress":"gzip"
                    }
                }
            }
        ]
    }
}
cd $DATAX_HOME
python bin/datax.py script/mysql2hive.json
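
Note that hdfswriter only drops files into the given HDFS path; it does not create the Hive table. A minimal sketch of the DDL the job above assumes (column types and the tab delimiter mirror the writer config; Hive reads the gzip-compressed text files transparently):

CREATE TABLE IF NOT EXISTS test.h_user_info (
    id   TINYINT,
    name STRING,
    age  INT,
    city STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;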

2. Export Hive data to MySQL

{
    "job": {
        "setting": {
            "speed": {
                "channel": 2
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "hdfsreader",
                    "parameter": {
                        "defaultFS": "hdfs://10.218.223.97:8020",
                        "fileType": "text",
                        "encoding": "UTF-8",
                        "path": "/user/hive/warehouse/test.db/h_user_info/*",
                        "column": [
                            {"index": 0,"type": "LONG"},
                            {"index": 1,"type": "STRING"},
                            {"index": 2,"type": "LONG"},
                            {"value": "南京","type": "STRING"}
                        ],
                        "fieldDelimiter": "\t",
                        "compress":"gzip"
                    }
                },
               "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "writeMode": "insert",
                        "username": "root",
                        "password": "123456",
                        "column": ["id","name","age","city"],
                        "session": ["set session sql_mode='ANSI'"],
                        "preSql": ["delete from t_user_info"],
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:mysql://10.218.223.96:3306/test?useUnicode=true&characterEncoding=gbk",
                                "table": ["t_user_info"]
                            }
                        ]
                    }
                }
            }
        ]
    }
}
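
Run the export the same way as the import; the file name script/hive2mysql.json below is only an assumption, use whatever path you saved the job to. Because of the preSql, t_user_info is emptied before every run, and the table itself must already exist, for example (a sketch, adjust types and lengths to your data):

CREATE TABLE IF NOT EXISTS test.t_user_info (
    id   INT,
    name VARCHAR(64),
    age  INT,
    city VARCHAR(64)
);

cd $DATAX_HOME
python bin/datax.py script/hive2mysql.json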

====================================================================

@羲凡 - Just to live better

If you have any questions about this post, feel free to leave a comment.
