Getting Started with Sqoop2: Importing Relational Database Data into HDFS (sqoop2-1.99.4)

sqoop2-1.99.4 differs slightly from sqoop2-1.99.3 in its commands: the new version uses "link" in place of the old "connection"; otherwise usage is similar.

For setting up the sqoop2-1.99.4 environment, see: Sqoop2 Environment Setup

For the sqoop2-1.99.3 version of this walkthrough, see: Getting Started with Sqoop2: Importing Relational Database Data into HDFS

 

Start the sqoop2-1.99.4 client and point it at the server:

$SQOOP2_HOME/bin/sqoop.sh client 

set server --host hadoop000 --port 12000 --webapp sqoop
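The two commands above can also be run non-interactively: the Sqoop2 shell accepts a script file as an argument instead of reading commands from the terminal. A minimal sketch, assuming the host/port from this article and a hypothetical script path of `/tmp/sqoop_session.sqoop`:

```shell
# Write the client commands to a script file (host and port follow this article's setup).
cat > /tmp/sqoop_session.sqoop <<'EOF'
set server --host hadoop000 --port 12000 --webapp sqoop
show connector --all
EOF

# Run it in batch mode instead of typing commands interactively:
#   $SQOOP2_HOME/bin/sqoop.sh client /tmp/sqoop_session.sqoop
cat /tmp/sqoop_session.sqoop
```

This is convenient for repeating the same session during testing, since every command is kept in one reviewable file.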

 

List all connectors:

show connector --all
2 connector(s) to show: 

        Connector with id 1:

            Name: hdfs-connector 

            Class: org.apache.sqoop.connector.hdfs.HdfsConnector

            Version: 1.99.4-cdh5.3.0



        Connector with id 2:

            Name: generic-jdbc-connector 

            Class: org.apache.sqoop.connector.jdbc.GenericJdbcConnector

            Version: 1.99.4-cdh5.3.0

 

List all links:

show link

Delete a specific link:

delete link --lid x

 

List all jobs:

show job

 

Delete a specific job:

delete job --jid 1

  

Create a link using the generic-jdbc-connector (connector id 2):

create link --cid 2

    Name: First Link

    JDBC Driver Class: com.mysql.jdbc.Driver

    JDBC Connection String: jdbc:mysql://hadoop000:3306/hive

    Username: root

    Password: ****

    JDBC Connection Properties: 

    There are currently 0 values in the map:

    entry# protocol=tcp

    There are currently 1 values in the map:

    protocol = tcp

    entry# 

    New link was successfully created with validation status OK and persistent id 3

 

show link
+----+-------------+-----------+---------+
| Id |    Name     | Connector | Enabled |
+----+-------------+-----------+---------+
| 3  | First Link  | 2         | true    |
+----+-------------+-----------+---------+

 

Create a link using the hdfs-connector (connector id 1):

create link --cid 1

    Name: Second Link

    HDFS URI: hdfs://hadoop000:8020

    New link was successfully created with validation status OK and persistent id 4

 

show link
+----+-------------+-----------+---------+
| Id |    Name     | Connector | Enabled |
+----+-------------+-----------+---------+
| 3  | First Link  | 2         | true    |
| 4  | Second Link | 1         | true    |
+----+-------------+-----------+---------+

 

show link --all
2 link(s) to show: 

    link with id 3 and name First Link (Enabled: true, Created by null at 15-2-2 11:28, Updated by null at 15-2-2 11:28)

    Using Connector id 2

      Link configuration

        JDBC Driver Class: com.mysql.jdbc.Driver

        JDBC Connection String: jdbc:mysql://hadoop000:3306/hive

        Username: root

        Password: 

        JDBC Connection Properties: 

          protocol = tcp

    link with id 4 and name Second Link (Enabled: true, Created by null at 15-2-2 11:32, Updated by null at 15-2-2 11:32)

    Using Connector id 1

      Link configuration

        HDFS URI: hdfs://hadoop000:8020

 

Create a job from the two link ids (-f is the from-link id, -t is the to-link id):

create job -f 3 -t 4

    Creating job for links with from id 3 and to id 4

    Please fill following values to create new job object

    Name: Sqoopy



    From database configuration



    Schema name: hive

    Table name: TBLS

    Table SQL statement: 

    Table column names: 

    Partition column name: 

    Null value allowed for the partition column: 

    Boundary query: 



    ToJob configuration



    Output format: 

      0 : TEXT_FILE

      1 : SEQUENCE_FILE

    Choose: 0

    Compression format: 

      0 : NONE

      1 : DEFAULT

      2 : DEFLATE

      3 : GZIP

      4 : BZIP2

      5 : LZO

      6 : LZ4

      7 : SNAPPY

      8 : CUSTOM

    Choose: 0

    Custom compression format: 

    Output directory: hdfs://hadoop000:8020/sqoop2/tbls_import_demo_sqoop1.99.4



    Throttling resources



    Extractors: 

    Loaders: 

    New job was successfully created with validation status OK  and persistent id 2

 

List all jobs:

show job
+----+--------+----------------+--------------+---------+
| Id |  Name  | From Connector | To Connector | Enabled |
+----+--------+----------------+--------------+---------+
| 2  | Sqoopy | 2              | 1            | true    |
+----+--------+----------------+--------------+---------+

 

Start the specified job. After it completes, check the imported files on HDFS (hdfs dfs -ls hdfs://hadoop000:8020/sqoop2/tbls_import_demo_sqoop1.99.4/):

start job --jid 2

 

Check the execution status of a job:

status job --jid 2

 

Stop a running job:

stop job --jid 2
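Since start job returns before the MapReduce job finishes, a script that wants to wait for completion has to poll the status command. The helper below is a hypothetical wrapper, not part of Sqoop2: it takes the status invocation as an argument (e.g. a batch-mode client call that runs `status job --jid 2`) and loops until the output mentions a terminal state.

```shell
# Poll a status command until its output reports SUCCEEDED or FAILED.
# Usage sketch (the script path is an assumption):
#   wait_for_job "$SQOOP2_HOME/bin/sqoop.sh client /tmp/status.sqoop"
wait_for_job() {
  local status_cmd="$1"
  local out
  while true; do
    out=$($status_cmd)
    case "$out" in
      *SUCCEEDED*) echo "job finished"; return 0 ;;
      *FAILED*)    echo "job failed";   return 1 ;;
    esac
    sleep 5   # poll interval in seconds
  done
}
```

The exact wording of the terminal states in the status output may vary by Sqoop2 version, so adjust the patterns to match what your client actually prints.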

 

A common error when starting a job (e.g. start job --jid 2):

Exception has occurred during processing command 

Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0001:Server has returned exception

To see the full job details (including the underlying stack trace) in the sqoop client, enable verbose mode:

set option --name verbose --value true

show job --jid 2

 
