Sqoop2: Importing Data from a Relational Database into HDFS

Start the Sqoop2 1.99.4 client:

$SQOOP2_HOME/bin/sqoop.sh client 
set server --host hadoop000 --port 12000 --webapp sqoop
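Instead of typing each command interactively, the Sqoop2 shell can also run commands from a script file in batch mode by passing its path to the client. A sketch (the file path and its contents are hypothetical; the commands mirror the ones used throughout this post):

```text
# /tmp/sqoop-import.script -- hypothetical batch script for the sqoop2 shell
set server --host hadoop000 --port 12000 --webapp sqoop
show connector --all
show link --all
```

Run it with: $SQOOP2_HOME/bin/sqoop.sh client /tmp/sqoop-import.script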

 

List all connectors:

show connector --all
2 connector(s) to show: 
        Connector with id 1:
            Name: hdfs-connector 
            Class: org.apache.sqoop.connector.hdfs.HdfsConnector
            Version: 1.99.4-cdh5.3.0

        Connector with id 2:
            Name: generic-jdbc-connector 
            Class: org.apache.sqoop.connector.jdbc.GenericJdbcConnector
            Version: 1.99.4-cdh5.3.0

 

List all links:

show link

Delete a specific link:

delete link --lid x

 

List all jobs:

show job

 

Delete a specific job:

delete job --jid 1

  

Create a link using the generic-jdbc-connector (connector id 2):

create link --cid 2
    Name: First Link
    JDBC Driver Class: oracle.jdbc.driver.OracleDriver
    JDBC Connection String: jdbc:oracle:thin:@192.168.24.150:1521:dt1
    Username: root
    Password: ****
    JDBC Connection Properties: 
    There are currently 0 values in the map:
    entry# protocol=tcp
    There are currently 1 values in the map:
    protocol = tcp
    entry# 
    New link was successfully created with validation status OK and persistent id 3

 

show link
+----+-------------+-----------+---------+
| Id |    Name     | Connector | Enabled |
+----+-------------+-----------+---------+
| 3  | First Link  | 2         | true    |
+----+-------------+-----------+---------+

 

Create a link using the hdfs-connector (connector id 1):

create link --cid 1
    Name: Second Link
    HDFS URI: hdfs://dtbigdata1:9000 
    New link was successfully created with validation status OK and persistent id 4

 

show link
+----+-------------+-----------+---------+
| Id |    Name     | Connector | Enabled |
+----+-------------+-----------+---------+
| 3  | First Link  | 2         | true    |
| 4  | Second Link | 1         | true    |
+----+-------------+-----------+---------+

 

show link --all
    2 link(s) to show: 
    link with id 3 and name First Link (Enabled: true, Created by null at 15-2-2 11:28, Updated by null at 15-2-2 11:28)
    Using Connector id 2
      Link configuration
        JDBC Driver Class: oracle.jdbc.driver.OracleDriver
        JDBC Connection String: jdbc:oracle:thin:@192.168.24.150:1521:dt1
        Username: root
        Password: 
        JDBC Connection Properties: 
          protocol = tcp
    link with id 4 and name Second Link (Enabled: true, Created by null at 15-2-2 11:32, Updated by null at 15-2-2 11:32)
    Using Connector id 1
      Link configuration
        HDFS URI: hdfs://dtbigdata1:9000 

 

Create a job from the two links above (using their link ids):

create job -f 3 -t 4
    Creating job for links with from id 3 and to id 4
    Please fill following values to create new job object
    Name: Sqoopy

    From database configuration

    Schema name: hive --the database username
    Table name: TBLS
    Table SQL statement: 
    Table column names: 
    Partition column name: 
    Null value allowed for the partition column: 
    Boundary query: 

    ToJob configuration

    Output format: 
      0 : TEXT_FILE
      1 : SEQUENCE_FILE
    Choose: 0
    Compression format: 
      0 : NONE
      1 : DEFAULT
      2 : DEFLATE
      3 : GZIP
      4 : BZIP2
      5 : LZO
      6 : LZ4
      7 : SNAPPY
      8 : CUSTOM
    Choose: 0
    Custom compression format: 
    Output directory: hdfs://dtbigdata1:9000/tmp/data

    Throttling resources

    Extractors: 1 --number of map tasks
    Loaders: 1 --number of reduce tasks

    New job was successfully created with validation status OK and persistent id 2

 

List all jobs:

show job
+----+--------+----------------+--------------+---------+
| Id |  Name  | From Connector | To Connector | Enabled |
+----+--------+----------------+--------------+---------+
| 2  | Sqoopy | 2              | 1            | true    |
+----+--------+----------------+--------------+---------+
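When scripting around the shell, its ASCII tables (like the show job output above) can be parsed into records. A minimal sketch; parse_table is a hypothetical helper, not part of Sqoop itself:

```python
# Parse an ASCII table printed by the sqoop2 shell into a list of dicts.
def parse_table(text):
    # Keep only the "|"-delimited rows; the "+---+" separator rows are dropped.
    rows = [line.strip() for line in text.splitlines()
            if line.strip().startswith("|")]
    # Split each row on "|" and strip the cell padding.
    cells = [[c.strip() for c in row.strip("|").split("|")] for row in rows]
    header, data = cells[0], cells[1:]
    return [dict(zip(header, row)) for row in data]
```

For the table above, parse_table(output)[0]["Name"] yields "Sqoopy".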

 

Start the specified job. After it finishes, check the resulting files on HDFS (hdfs dfs -ls hdfs://dtbigdata1:9000/tmp/data/):

start job --jid 2

 

Check the execution status of a job:

status job --jid 2

 

Stop a running job:

stop job --jid 2

 

A common error when starting a job (e.g. start job --jid 2):

Exception has occurred during processing command 
Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0001:Server has returned exception

To see the job details behind such an error, enable verbose mode in the sqoop client:

set option --name verbose --value true
show job --jid 2
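The server the shell talks to is an HTTP service under the "sqoop" webapp, so the same information can also be fetched over REST. A minimal sketch; the /sqoop/version endpoint follows the Sqoop 1.99.x REST documentation, but verify the paths against your installation:

```python
import json
from urllib.request import urlopen

def sqoop_url(host, port, path):
    # Build a URL against the Sqoop2 server's "sqoop" webapp, e.g.
    # sqoop_url("hadoop000", 12000, "version")
    #   -> "http://hadoop000:12000/sqoop/version"
    return "http://%s:%d/sqoop/%s" % (host, port, path.lstrip("/"))

def server_version(host="hadoop000", port=12000):
    # Assumed endpoint from the 1.99.x REST docs; returns the server's
    # version metadata as a parsed JSON object.
    with urlopen(sqoop_url(host, port, "version")) as resp:
        return json.load(resp)
```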
