How to Read MySQL Data and Store It in HDFS (Big Data Training)

Big data training: reading data from MySQL and storing it in HDFS with DataX

1 View the official template

[atguigu@hadoop102 ~]$ python /opt/module/datax/bin/datax.py -r mysqlreader -w hdfswriter

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "column": [],
                        "connection": [
                            {
                                "jdbcUrl": [],
                                "table": []
                            }
                        ],
                        "password": "",
                        "username": "",
                        "where": ""
                    }
                },
                "writer": {
                    "name": "hdfswriter",
                    "parameter": {
                        "column": [],
                        "compress": "",
                        "defaultFS": "",
                        "fieldDelimiter": "",
                        "fileName": "",
                        "fileType": "",
                        "path": "",
                        "writeMode": ""
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": ""
            }
        }
    }
}

mysqlreader parameter reference:

[Figure 1: mysqlreader parameter descriptions]

hdfswriter parameter reference:

[Figure 2: hdfswriter parameter descriptions]


2 Prepare the data

1) Create the student table

mysql> create database datax;

mysql> use datax;

mysql> create table student(id int,name varchar(20));

2) Insert data

mysql> insert into student values(1001,'zhangsan'),(1002,'lisi'),(1003,'wangwu');


3 Write the configuration file

[atguigu@hadoop102 datax]$ vim /opt/module/datax/job/mysql2hdfs.json

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "column": [
                            "id",
                            "name"
                        ],
                        "connection": [
                            {
                                "jdbcUrl": [
                                    "jdbc:mysql://hadoop102:3306/datax"
                                ],
                                "table": [
                                    "student"
                                ]
                            }
                        ],
                        "username": "root",
                        "password": "000000"
                    }
                },
                "writer": {
                    "name": "hdfswriter",
                    "parameter": {
                        "column": [
                            {
                                "name": "id",
                                "type": "INT"
                            },
                            {
                                "name": "name",
                                "type": "STRING"
                            }
                        ],
                        "defaultFS": "hdfs://hadoop102:9000",
                        "fieldDelimiter": "\t",
                        "fileName": "student.txt",
                        "fileType": "text",
                        "path": "/",
                        "writeMode": "append"
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": "2"
            }
        }
    }
}
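If the same kind of job has to be written for several tables, the configuration above can also be generated rather than typed by hand. The following is only a minimal sketch: the function name build_mysql2hdfs_job and its parameters are hypothetical helpers introduced here, not part of DataX, and the values passed in are simply the ones used in this section.

```python
import json


def build_mysql2hdfs_job(jdbc_url, table, username, password,
                         columns, default_fs, path, file_name, channel=2):
    """Build a mysqlreader -> hdfswriter job as a Python dict.

    `columns` is a list of (column_name, hdfs_type) pairs. The reader gets
    the names only; the writer gets name/type objects in the same order,
    so the two column lists cannot drift apart.
    """
    reader = {
        "name": "mysqlreader",
        "parameter": {
            "column": [name for name, _ in columns],
            "connection": [{"jdbcUrl": [jdbc_url], "table": [table]}],
            "username": username,
            "password": password,
        },
    }
    writer = {
        "name": "hdfswriter",
        "parameter": {
            "column": [{"name": n, "type": t} for n, t in columns],
            "defaultFS": default_fs,
            "fieldDelimiter": "\t",
            "fileName": file_name,
            "fileType": "text",
            "path": path,
            "writeMode": "append",
        },
    }
    return {
        "job": {
            "content": [{"reader": reader, "writer": writer}],
            # The section's config stores the channel count as a string.
            "setting": {"speed": {"channel": str(channel)}},
        }
    }


# Values taken from the configuration shown in this section.
job = build_mysql2hdfs_job(
    jdbc_url="jdbc:mysql://hadoop102:3306/datax",
    table="student",
    username="root",
    password="000000",
    columns=[("id", "INT"), ("name", "STRING")],
    default_fs="hdfs://hadoop102:9000",
    path="/",
    file_name="student.txt",
)

# json.dumps escapes the tab delimiter as \t, matching the hand-written file.
job_text = json.dumps(job, indent=4)
print(job_text)
```

The printed text can be saved as /opt/module/datax/job/mysql2hdfs.json. Deriving both column lists from one input is the main reason to generate the file, since a mismatch between reader and writer columns is an easy mistake to make when editing by hand.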


4 Run the job

[atguigu@hadoop102 datax]$ bin/datax.py job/mysql2hdfs.json

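2019-05-17 16:02:16.581 [job-0] INFO  JobContainer -
任务启动时刻                    : 2019-05-17 16:02:04
任务结束时刻                    : 2019-05-17 16:02:16
任务总计耗时                    :                 12s
任务平均流量                    :                3B/s
记录写入速度                    :              0rec/s
读出记录总数                    :                   3
读写失败总数                    :                   0

(Fields, in order: job start time, job end time, total elapsed time, average throughput, record write speed, total records read, total read/write failures.)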
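When the job is run from a script, the summary above can be checked automatically instead of by eye. The sketch below is one possible way to do that: parse_summary, job_succeeded and the English key names are hypothetical helpers chosen for this example, and the parsing assumes the summary keeps the "label : value" layout shown in this run.

```python
# Mapping from the Chinese labels in the DataX summary to English keys.
# The key names are this sketch's own choice, not DataX identifiers.
LABELS = {
    "任务启动时刻": "start_time",
    "任务结束时刻": "end_time",
    "任务总计耗时": "elapsed",
    "任务平均流量": "avg_throughput",
    "记录写入速度": "write_speed",
    "读出记录总数": "records_read",
    "读写失败总数": "failures",
}

# The summary lines exactly as printed by the run in this section.
SUMMARY = """\
2019-05-17 16:02:16.581 [job-0] INFO  JobContainer -
任务启动时刻                    : 2019-05-17 16:02:04
任务结束时刻                    : 2019-05-17 16:02:16
任务总计耗时                    :                 12s
任务平均流量                    :                3B/s
记录写入速度                    :              0rec/s
读出记录总数                    :                   3
读写失败总数                    :                   0
"""


def parse_summary(text):
    """Return the known summary fields as a dict of strings."""
    stats = {}
    for line in text.splitlines():
        # Split on the first colon only, so timestamps stay intact.
        label, sep, value = line.partition(":")
        key = LABELS.get(label.strip())
        if sep and key:
            stats[key] = value.strip()
    return stats


def job_succeeded(stats):
    """Treat the job as successful only if the failure count is present and 0."""
    return "failures" in stats and int(stats["failures"]) == 0


stats = parse_summary(SUMMARY)
print(stats["records_read"])   # 3
print(job_succeeded(stats))    # True
```

Here the three records read match the three rows inserted into student earlier, and the failure count is 0.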

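5 Check HDFS

Note: at run time, HdfsWriter appends a random suffix to the configured file name (fileName) and uses the result as the actual name of the file that each thread writes.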
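Because of that suffix, the output cannot be looked up by its exact name; one option is to list the target directory with hadoop fs -ls and match on the configured prefix. The sketch below assumes the usual eight-column layout of hadoop fs -ls output; the helper names and the sample listing (including the suffix in it) are hypothetical and only illustrate the matching.

```python
import subprocess


def parse_ls_output(text):
    """Extract paths from `hadoop fs -ls` output (path is the 8th column)."""
    paths = []
    for line in text.splitlines():
        fields = line.split(None, 7)
        if len(fields) == 8:  # skips the "Found N items" header line
            paths.append(fields[7])
    return paths


def files_written_by_job(paths, directory, file_name):
    """Keep only paths that start with <directory>/<file_name>."""
    prefix = directory.rstrip("/") + "/" + file_name
    return [p for p in paths if p.startswith(prefix)]


def list_job_files(directory="/", file_name="student.txt"):
    """Run `hadoop fs -ls` on a cluster node and filter the result."""
    out = subprocess.run(
        ["hadoop", "fs", "-ls", directory],
        capture_output=True, text=True, check=True,
    ).stdout
    return files_written_by_job(parse_ls_output(out), directory, file_name)


# Hypothetical listing, used here so the parsing can be tried without a cluster.
SAMPLE_LS = """\
Found 2 items
-rw-r--r--   3 atguigu supergroup         36 2019-05-17 16:02 /student.txt__hypothetical_suffix
drwxr-xr-x   - atguigu supergroup          0 2019-05-17 15:00 /tmp
"""

matches = files_written_by_job(parse_ls_output(SAMPLE_LS), "/", "student.txt")
print(matches)
```

On a node where the hadoop command is available, list_job_files() performs the same filtering on the real listing of the path configured in the job.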
