


sqoop import --connect jdbc:mysql://xxx.xxx.xxx.xxx:3306/database --table tablename --hbase-table hbasetablename --column-family family --hbase-row-key ID --hbase-create-table --username 'root' -P
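-m: the number of map tasks that run the Sqoop import in parallel; if not specified, 4 map tasks are started by default
--split-by: the column used to divide the data into ranges among the map tasks during a parallel import. Choose a column whose values split the data fairly evenly, such as a creation time or an auto-increment ID
--hbase-row-key: the column to use as the HBase row key. If not specified, Sqoop uses the --split-by column, or failing that the primary key of the source table. You can name a single column, or build a composite row key by passing several columns as a comma-separated list enclosed in double quotes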
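The -m and --split-by options described above can be combined with the HBase import command. The following is only a dry-run sketch: it assembles the command in a bash array and prints it instead of running it. The value 8 for -m and the choice of ID as the split column are illustrative assumptions, not settings taken from these notes.

```shell
#!/usr/bin/env bash
# Dry-run sketch: build the import command and print it; nothing is executed.
# Assumed for illustration: ID is an evenly distributed column, 8 map tasks.
cmd=(
  sqoop import
  --connect jdbc:mysql://xxx.xxx.xxx.xxx:3306/database
  --table tablename
  --hbase-table hbasetablename
  --column-family family
  --hbase-row-key ID
  --hbase-create-table
  --split-by ID
  -m 8
  --username root -P
)

# Print the assembled command on one line for review.
echo "${cmd[*]}"
```

To actually run the import, replace the final echo with "${cmd[@]}".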


[hdfs@slave1 ~]$ sqoop import --connect jdbc:mysql://xxx.xxx.xxx.xxx:3306/database --table tablename --hbase-table hbasetablename --column-family family --hbase-row-key ID --hbase-create-table --username 'root' -P
Warning: /soft/bigdata/clouderamanager/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/04/28 15:54:37 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.10.0
Enter password: 
17/04/28 15:54:44 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/04/28 15:54:44 INFO tool.CodeGenTool: Beginning code generation
Fri Apr 28 15:54:44 CST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
17/04/28 15:54:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `COMP_DICT` AS t LIMIT 1
17/04/28 15:54:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `COMP_DICT` AS t LIMIT 1
17/04/28 15:54:45 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /soft/bigdata/clouderamanager/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-hdfs/compile/f5c3b693ffb26b66c554308ad32b2880/COMP_DICT.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/04/28 15:54:47 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hdfs/compile/f5c3b693ffb26b66c554308ad32b2880/COMP_DICT.jar
17/04/28 15:54:53 INFO mapreduce.Job: The url to track the job: http://master2:8088/proxy/application_1491881598805_0027/
17/04/28 15:54:53 INFO mapreduce.Job: Running job: job_1491881598805_0027
17/04/28 15:54:59 INFO mapreduce.Job: Job job_1491881598805_0027 running in uber mode : false
17/04/28 15:54:59 INFO mapreduce.Job:  map 0% reduce 0%
17/04/28 15:55:05 INFO mapreduce.Job:  map 20% reduce 0%
17/04/28 15:55:06 INFO mapreduce.Job:  map 60% reduce 0%
17/04/28 15:55:09 INFO mapreduce.Job:  map 100% reduce 0%
17/04/28 15:55:10 INFO mapreduce.Job: Job job_1491881598805_0027 completed successfully
17/04/28 15:55:10 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=925010
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=665
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=5
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Job Counters 
        Launched map tasks=5
        Other local map tasks=5
        Total time spent by all maps in occupied slots (ms)=25663
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=25663
        Total vcore-seconds taken by all map tasks=25663
        Total megabyte-seconds taken by all map tasks=26278912
    Map-Reduce Framework
        Map input records=10353
        Map output records=10353
        Input split bytes=665
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=586
        CPU time spent (ms)=17940
        Physical memory (bytes) snapshot=1619959808
        Virtual memory (bytes) snapshot=14046998528
        Total committed heap usage (bytes)=1686634496
    File Input Format Counters 
        Bytes Read=0
    File Output Format Counters 
        Bytes Written=0
17/04/28 15:55:10 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 20.0424 seconds (0 bytes/sec)
17/04/28 15:55:10 INFO mapreduce.ImportJobBase: Retrieved 10353 records.
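The last log line reports how many records were imported, which is handy to compare against a row count from the source table. A minimal sketch of pulling that number out of the log follows; the sample line is copied from the output above, and in practice the text would come from the captured Sqoop output (how it is captured is left open here).

```shell
#!/usr/bin/env bash
# Sample line copied from the log above.
log_line='17/04/28 15:55:10 INFO mapreduce.ImportJobBase: Retrieved 10353 records.'

# Extract the number between "Retrieved" and "records." using sed.
records=$(printf '%s\n' "$log_line" \
  | sed -n 's/.*Retrieved \([0-9][0-9]*\) records\..*/\1/p')

echo "$records"
```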


sqoop import -D sqoop.hbase.add.row.key=true --connect jdbc:mysql://xxx.xxx.xxx.xxx:3306/database --table tablename --hbase-table hbasetablename --column-family family --hbase-row-key "etl_date,APPLY_ID" --hbase-create-table --username 'root' -P
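A rough sketch of what the composite row key is expected to look like for the command above. It assumes that Sqoop 1.4.x joins the values of the listed columns with an underscore, which is worth confirming with a scan of the HBase table on your own cluster; the sample values are made up.

```shell
#!/usr/bin/env bash
# Hypothetical column values for one source row (not taken from real data).
etl_date='2017-04-28'
apply_id='10001'

# Assumption: the composite key is the column values joined with "_",
# in the order given to --hbase-row-key.
row_key="${etl_date}_${apply_id}"

echo "$row_key"
```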



sqoop create-hive-table --connect jdbc:mysql://xxx.xxx.xxx.xxx:3306/shiro --table UserInfo --hive-database shiro --hive-table userinfo --username root --password xxxxxx --fields-terminated-by "\0001" --lines-terminated-by "\n";
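--fields-terminated-by "\0001" sets the delimiter between columns. "\0001" is the character with ASCII code 1 (Ctrl-A), which is also Hive's default field delimiter, whereas Sqoop's default field delimiter is ","
--lines-terminated-by "\n" sets the delimiter between rows; here it is the newline character, which is also the default;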
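Because the Ctrl-A delimiter is invisible in a terminal, it can help to see it handled explicitly. The sketch below builds one sample row with byte 0x01 between the fields, makes the delimiter visible, and extracts a single field. The field values are invented for illustration.

```shell
#!/usr/bin/env bash
# One sample row with fields separated by byte 0x01 (Ctrl-A).
row=$(printf '1\001alice\001admin')

# Make the invisible delimiter visible by translating it to '|'.
visible=$(printf '%s' "$row" | tr '\001' '|')
echo "$visible"

# Extract the second field, using the same byte as the cut delimiter.
second=$(printf '%s\n' "$row" | cut -d $'\001' -f 2)
echo "$second"
```

The same tr trick can be applied to a file fetched from the Hive warehouse directory to check which delimiter was actually written.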


sqoop import --connect jdbc:mysql://xxx.xxx.xxx.xxx:3306/testSqoop --table dydata --hive-database testsqoop --hive-import --hive-table dydata --username root --password xxxxxx --fields-terminated-by "\0001";
-m 2, when added to the command, runs the import with two map tasks
--fields-terminated-by "\0001" must match the delimiter used when the Hive table was created;
