DataX example: reading a file from HDFS and writing it into MySQL

# Read data from HDFS and write it into MySQL. First generate a config template with:
# (the -r and -w plugin names can be found under the plugin directory)
python datax.py -r hdfsreader -w mysqlwriter

Detailed hdfsreader parameter reference:
https://github.com/alibaba/DataX/blob/master/hdfsreader/doc/hdfsreader.md

Detailed mysqlwriter parameter reference:
https://github.com/alibaba/DataX/blob/master/mysqlwriter/doc/mysqlwriter.md

Code example: write the file hdfs_mysql.json
Notes

  1. The reader settings (column delimiter, file type, compression format, etc.) must match how the HDFS file was originally written; otherwise the file cannot be parsed.
  2. If the job errors out, switch the writer to streamwriter first and print the rows to the console to inspect them.
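Note 1 can be checked locally before running the job. The sketch below (file path and sample data are made up for illustration) writes a gzip-compressed, comma-delimited text file and reads it back the way hdfsreader would, confirming every row has as many fields as the reader's "column" list declares:

```python
import csv
import gzip
import os
import tempfile

# Hypothetical sample row standing in for the real HDFS export:
# 11 comma-separated columns, gzip-compressed text -- matching the reader
# settings below (fieldDelimiter ",", fileType "text", compress "gzip").
rows = [
    ["1000", "ORG1", "INF001", "LIF001", "0", "12.5",
     "CNY", "20200101", "20200102", "1.0", "12.5"],
]

path = os.path.join(tempfile.mkdtemp(), "sample.gz")
with gzip.open(path, "wt", encoding="utf-8", newline="") as f:
    csv.writer(f, delimiter=",").writerows(rows)

# Read it back: split on the delimiter and check that every row has
# exactly as many fields as the reader config declares (indexes 0..10).
EXPECTED_COLUMNS = 11
with gzip.open(path, "rt", encoding="utf-8", newline="") as f:
    for row in csv.reader(f, delimiter=","):
        assert len(row) == EXPECTED_COLUMNS, row
print("delimiter/column check passed")
```

If the assertion fires, the delimiter or compression settings in the job file do not match how the file was produced.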
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "hdfsreader",
                    "parameter": {
                        "column": [
                            {"index": 0, "type": "string"},
                            {"index": 1, "type": "string"},
                            {"index": 2, "type": "string"},
                            {"index": 3, "type": "string"},
                            {"index": 4, "type": "string"},
                            {"index": 5, "type": "double"},
                            {"index": 6, "type": "string"},
                            {"index": 7, "type": "string"},
                            {"index": 8, "type": "string"},
                            {"index": 9, "type": "string"},
                            {"index": 10, "type": "double"}
                        ],
                        "defaultFS": "hdfs://kncloud02:8020",
                        "encoding": "UTF-8",
                        "fieldDelimiter": ",",
                        "compress": "gzip",
                        "fileType": "text",
                        "path": "/dir1/e_board/target_e_board__3516dda9_5168_4f2c_abd0_b003ec1295db.gz"
                    }
                },
                "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "column": [
                            "WERKS", "EKORG", "INFNR", "LIFNR", "ESOKZ", "NETPR",
                            "WAERS", "PRDAT", "ERDAT", "UKURS", "CNY_PRICE"
                        ],
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:mysql://18.18.4.2:3306/linshi_1",
                                "table": ["purchase"]
                            }
                        ],
                        "username": "root",
                        "password": "123456",
                        "preSql": ["truncate table purchase"],
                        "postSql": ["select count(*) from purchase"],
                        "writeMode": "insert"
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": "10"
            }
        }
    }
}
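As note 2 suggests, when parsing fails you can keep the same reader and temporarily replace the writer block with streamwriter, which prints the parsed rows to the console. A minimal writer block (parameters per the streamwriter plugin):

```json
"writer": {
    "name": "streamwriter",
    "parameter": {
        "encoding": "UTF-8",
        "print": true
    }
}
```

Once the rows print correctly, switch back to mysqlwriter.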

Enter the datax directory and run the datax command to start the job:
python bin/datax.py job/hdfs_mysql.json
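A common failure mode is a mismatch between the reader's and writer's column lists. Before submitting the job, a stdlib-only sketch can verify the two lists line up; here the config is inlined in abbreviated form so the check is self-contained (in practice you would `json.load` the real `job/hdfs_mysql.json`):

```python
import json

# Abbreviated, inlined copy of hdfs_mysql.json for a self-contained check.
job = json.loads("""
{
  "job": {
    "content": [{
      "reader": {"name": "hdfsreader",
                 "parameter": {"column": [
                   {"index": 0, "type": "string"}, {"index": 1, "type": "string"},
                   {"index": 2, "type": "string"}, {"index": 3, "type": "string"},
                   {"index": 4, "type": "string"}, {"index": 5, "type": "double"},
                   {"index": 6, "type": "string"}, {"index": 7, "type": "string"},
                   {"index": 8, "type": "string"}, {"index": 9, "type": "string"},
                   {"index": 10, "type": "double"}]}},
      "writer": {"name": "mysqlwriter",
                 "parameter": {"column": ["WERKS", "EKORG", "INFNR", "LIFNR",
                   "ESOKZ", "NETPR", "WAERS", "PRDAT", "ERDAT", "UKURS",
                   "CNY_PRICE"]}}
    }]
  }
}
""")

content = job["job"]["content"][0]
reader_cols = content["reader"]["parameter"]["column"]
writer_cols = content["writer"]["parameter"]["column"]

# hdfsreader declares columns by index/type, mysqlwriter by name;
# the counts must match or the job fails at runtime.
assert len(reader_cols) == len(writer_cols), (
    f"reader has {len(reader_cols)} columns, writer has {len(writer_cols)}"
)
print("column counts match:", len(reader_cols))
```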
