Sqoop2学习(二)—将MySQL数据库中数据导入到HDFS中

问题导读:

        1、Sqoop客户端如何创建服务端连接?        

        Sqoop的安装此处不再介绍,此处接我上一篇博文:Sqoop2学习(一)—Sqoop1.99.3介绍与安装。

一、配置服务端地址及端口

        要想和服务端连接,需要在客户端配置服务端的信息,如下:

set server --host secondmgt --port 12000 --webapp sqoop
        secondmgt:是服务端的主机名,12000是Sqoop的端口号。

二、验证连接是否正常

sqoop:000> show version -all
client version:
  Sqoop 1.99.4 source revision 2475a76ef70a0660f381c75c3d47d0d24f00b57f
  Compiled by gshapira on Sun Nov 16 02:50:00 PST 2014
server version:
  Sqoop 1.99.4 source revision 2475a76ef70a0660f381c75c3d47d0d24f00b57f
  Compiled by gshapira on Sun Nov 16 02:50:00 PST 2014
API versions:
  [v1]
        结果显示客户端和服务端版本信息,说明,客户端已经连上服务端。

三、查看sqoop shell支持的命令

sqoop:000> help
For information about Sqoop, visit: http://sqoop.apache.org/

Available commands:
  exit    (\x  ) Exit the shell
  history (\H  ) Display, manage and recall edit-line history
  help    (\h  ) Display this help message
  set     (\st ) Configure various client options and settings
  show    (\sh ) Display various objects and configuration options
  create  (\cr ) Create new object in Sqoop repository
  delete  (\d  ) Delete existing object in Sqoop repository
  update  (\up ) Update objects in Sqoop repository
  clone   (\cl ) Create new object based on existing one
  start   (\sta) Start job
  stop    (\stp) Stop job
  status  (\stu) Display status of a job
  enable  (\en ) Enable object in Sqoop repository
  disable (\di ) Disable object in Sqoop repository
四、查看Sqoop服务器上注册的连接器

sqoop:000> show connector
+----+------------------------+---------+------------------------------------------------------+
| Id |          Name          | Version |                        Class                         |
+----+------------------------+---------+------------------------------------------------------+
| 1  | generic-jdbc-connector | 1.99.3  | org.apache.sqoop.connector.jdbc.GenericJdbcConnector |
+----+------------------------+---------+------------------------------------------------------+
五、创建连接
sqoop:000> create connection --cid 1
Creating connection for connector with id 1
Please fill following values to create new connection object
Name: My First

Connection configuration

JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://secondmgt:3306/sqoopdb
Username: hive
Password: ****
JDBC Connection Properties:
There are currently 0 values in the map:
entry#  //此处直接敲回车

Security related configuration options

Max connections: 100
New connection was successfully created with validation status FINE and persistent id 1
sqoop:000>
六、查看创建的连接

sqoop:000> show connection
+----+----------+-----------+---------+
| Id |   Name   | Connector | Enabled |
+----+----------+-----------+---------+
| 1  | My First | 1         | true    |
+----+----------+-----------+---------+
七、创建Job

sqoop:000> create job --xid 1 --type import
Creating job for connection with id 1
Please fill following values to create new job object
Name: Second Job

Database configuration

Schema name: sqoopdb
Table name: stu
Table SQL statement:
Table column names: id,name
Partition column name: name
Nulls in partition column:
Boundary query:

Output configuration

Storage type:
  0 : HDFS
Choose: 0
Output format:
  0 : TEXT_FILE
  1 : SEQUENCE_FILE
Choose: 1
Compression format:
  0 : NONE
  1 : DEFAULT
  2 : DEFLATE
  3 : GZIP
  4 : BZIP2
  5 : LZO
  6 : LZ4
  7 : SNAPPY
Choose: 0
Output directory: /output/sqoopsecond

Throttling resources

Extractors:
Loaders:
New job was successfully created with validation status FINE  and persistent id 3
八、提交Job

sqoop:000> start job --jid 3
Submission details
Job ID: 3
Server URL: http://localhost:12000/sqoop/
Created by: hadoopUser
Creation date: 2015-01-16 10:30:17 CST
Lastly updated by: hadoopUser
External ID: job_1421373857783_0001
        http://secondmgt:8088/proxy/application_1421373857783_0001/
2015-01-16 10:30:17 CST: BOOTING  - Progress is not available
九、查看Job执行状态

sqoop:000> status job --jid 3
Submission details
Job ID: 3
Server URL: http://localhost:12000/sqoop/
Created by: hadoopUser
Creation date: 2015-01-16 10:30:17 CST
Lastly updated by: hadoopUser
External ID: job_1421373857783_0001
        http://secondmgt:8088/proxy/application_1421373857783_0001/
2015-01-16 10:31:18 CST: RUNNING  - 45.00 %
sqoop:000> status job --jid 3
Submission details
Job ID: 3
Server URL: http://localhost:12000/sqoop/
Created by: hadoopUser
Creation date: 2015-01-16 10:30:17 CST
Lastly updated by: hadoopUser
External ID: job_1421373857783_0001
        http://secondmgt:8088/proxy/application_1421373857783_0001/
2015-01-16 10:31:24 CST: SUCCEEDED
Counters:
        org.apache.hadoop.mapreduce.JobCounter
                SLOTS_MILLIS_MAPS: 448084
                TOTAL_LAUNCHED_MAPS: 10
                SLOTS_MILLIS_REDUCES: 0
                OTHER_LOCAL_MAPS: 10
        org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
                BYTES_READ: 0
        org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter
                BYTES_WRITTEN: 0
        org.apache.hadoop.mapreduce.TaskCounter
                MAP_INPUT_RECORDS: 0
                MERGED_MAP_OUTPUTS: 0
                PHYSICAL_MEMORY_BYTES: 1562419200
                SPILLED_RECORDS: 0
                FAILED_SHUFFLE: 0
                CPU_MILLISECONDS: 28060
                COMMITTED_HEAP_BYTES: 846725120
                VIRTUAL_MEMORY_BYTES: 8838815744
                MAP_OUTPUT_RECORDS: 4
                SPLIT_RAW_BYTES: 1255
                GC_TIME_MILLIS: 596
        org.apache.hadoop.mapreduce.FileSystemCounter
                FILE_READ_OPS: 0
                FILE_WRITE_OPS: 0
                FILE_BYTES_READ: 0
                FILE_LARGE_READ_OPS: 0
                HDFS_BYTES_READ: 1255
                FILE_BYTES_WRITTEN: 919590
                HDFS_LARGE_READ_OPS: 0
                HDFS_BYTES_WRITTEN: 930
                HDFS_READ_OPS: 40
                HDFS_WRITE_OPS: 20
        org.apache.sqoop.submission.counter.SqoopCounters
                ROWS_READ: 4
Job executed successfully
十、显示创建的Job

sqoop:000> show job
+----+------------+--------+-----------+---------+
| Id |    Name    |  Type  | Connector | Enabled |
+----+------------+--------+-----------+---------+
| 2  | First Job  | IMPORT | 1         | true    |
| 3  | Second Job | IMPORT | 1         | true    |
+----+------------+--------+-----------+---------+
十一、删除创建的Job

sqoop:000> delete job --jid 3
sqoop:000> show job
+----+-----------+--------+-----------+---------+
| Id |   Name    |  Type  | Connector | Enabled |
+----+-----------+--------+-----------+---------+
| 2  | First Job | IMPORT | 1         | true    |
+----+-----------+--------+-----------+---------+
十二、查看HDFS,是否将MySQL中数据已导入

[hadoopUser@secondmgt ~]$ hadoop fs -ls /output/sqoopsecond/*
Found 1 items
-rw-r--r--   2 hadoopUser supergroup          0 2015-01-16 10:31 /output/sqoopsecond/_SUCCESS
Found 1 items
-rw-r--r--   2 hadoopUser supergroup        104 2015-01-16 10:30 /output/sqoopsecond/part-m-00000.seq
Found 1 items
-rw-r--r--   2 hadoopUser supergroup         86 2015-01-16 10:30 /output/sqoopsecond/part-m-00001.seq
Found 1 items
-rw-r--r--   2 hadoopUser supergroup         86 2015-01-16 10:30 /output/sqoopsecond/part-m-00002.seq
Found 1 items
-rw-r--r--   2 hadoopUser supergroup         86 2015-01-16 10:30 /output/sqoopsecond/part-m-00003.seq
Found 1 items
-rw-r--r--   2 hadoopUser supergroup         86 2015-01-16 10:31 /output/sqoopsecond/part-m-00004.seq
Found 1 items
-rw-r--r--   2 hadoopUser supergroup         86 2015-01-16 10:31 /output/sqoopsecond/part-m-00005.seq
Found 1 items
-rw-r--r--   2 hadoopUser supergroup         86 2015-01-16 10:31 /output/sqoopsecond/part-m-00006.seq
Found 1 items
-rw-r--r--   2 hadoopUser supergroup        105 2015-01-16 10:31 /output/sqoopsecond/part-m-00007.seq
Found 1 items
-rw-r--r--   2 hadoopUser supergroup         86 2015-01-16 10:31 /output/sqoopsecond/part-m-00008.seq
Found 1 items
-rw-r--r--   2 hadoopUser supergroup        119 2015-01-16 10:31 /output/sqoopsecond/part-m-00009.seq

你可能感兴趣的:(Sqoop2学习(二)—将MySQL数据库中数据导入到HDFS中)