Environment versions: HDP 2.5.3, Hive 1.2.1, Sqoop 1.4.6, SQL Server 2012
[root@hqc-test-hdp1 ~]# sqoop list-databases --connect jdbc:sqlserver://10.35.xx.xx -username xx -password xx
Warning: /usr/hdp/2.5.3.0-37/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/10/29 16:13:24 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.3.0-37
19/10/29 16:13:24 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/10/29 16:13:24 INFO manager.SqlManager: Using default fetchSize of 1000
master
AI
Note that list-tables only shows tables in the dbo schema, and does not show views:
[root@hqc-test-hdp1 ~]# sqoop list-tables --connect "jdbc:sqlserver://10.35.xx.xx:1433;DatabaseName=AI" -username xx -password xx
Warning: /usr/hdp/2.5.3.0-37/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/10/30 08:52:21 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.3.0-37
19/10/30 08:52:21 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/10/30 08:52:21 INFO manager.SqlManager: Using default fetchSize of 1000
tt
table1
Dictionary
# SELECT TOP 1000 * FROM [dbo].[Dictionary]
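Since list-tables returns only tables, the views themselves never appear in that listing. If you want to see which views exist, one option is to query the SQL Server catalog through sqoop eval. A minimal sketch, assuming the same placeholder host, credentials, and database as above (INFORMATION_SCHEMA.VIEWS is standard SQL Server):

sqoop eval \
  --connect "jdbc:sqlserver://10.35.xx.xx:1433;DatabaseName=AI" \
  -username xx -P \
  --query "SELECT TABLE_SCHEMA, TABLE_NAME FROM INFORMATION_SCHEMA.VIEWS"

(-P prompts for the password instead of putting it on the command line, as the warning in the logs suggests.)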
[root@hqc-test-hdp1 ~]# sqoop eval --connect jdbc:sqlserver://10.35.xx.xx -username xx -password xx --query "SELECT TOP 5 * from [xx.xx]"
Warning: /usr/hdp/2.5.3.0-37/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/10/29 16:22:22 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.3.0-37
19/10/29 16:22:22 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/10/29 16:22:22 INFO manager.SqlManager: Using default fetchSize of 1000
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Name | Code | Expr1 | Sname | Cname | Aname | Longitude | Latitude | Position |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
(rows omitted)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Because a view is not a table, you cannot use --table; you have to use --query.
[root@hqc-test-hdp1 ~]# su hdfs
[hdfs@hqc-test-hdp1 root]$ sqoop import --connect "jdbc:sqlserver://10.35.xx.xx:1433;DatabaseName=ICP" -username xx -password xx --query "SELECT * from [xx.xx] WHERE \$CONDITIONS" --target-dir /apps/hive/warehouse/hqc.db/xx -m 1
Warning: /usr/hdp/2.5.3.0-37/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
find: failed to restore initial working directory: Permission denied
19/10/30 09:27:22 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.3.0-37
19/10/30 09:27:22 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/10/30 09:27:22 INFO manager.SqlManager: Using default fetchSize of 1000
19/10/30 09:27:22 INFO tool.CodeGenTool: Beginning code generation
19/10/30 09:27:23 INFO manager.SqlManager: Executing SQL statement: SELECT * from [xx.xx] WHERE 1=1 AND (1 = 0)
19/10/30 09:27:23 INFO manager.SqlManager: Executing SQL statement: SELECT * from [xx.xx] WHERE 1=1 AND (1 = 0)
19/10/30 09:27:23 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.5.3.0-37/hadoop-mapreduce
error: error reading /usr/hdp/2.5.3.0-37/sqoop/lib/mysql-connector-java.jar; cannot read zip file
error: error reading /usr/hdp/2.5.3.0-37/hive/lib/mysql-connector-java.jar; cannot read zip file
Note: /tmp/sqoop-hdfs/compile/3c7b7eafcd1020b0b4e6d390fb32265b/QueryResult.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
19/10/30 09:27:26 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hdfs/compile/3c7b7eafcd1020b0b4e6d390fb32265b/QueryResult.jar
19/10/30 09:27:26 INFO mapreduce.ImportJobBase: Beginning query import.
19/10/30 09:27:27 INFO impl.TimelineClientImpl: Timeline service address: http://hqc-test-hdp2:8188/ws/v1/timeline/
19/10/30 09:27:27 INFO client.RMProxy: Connecting to ResourceManager at hqc-test-hdp1/10.35:8050
19/10/30 09:27:28 INFO client.AHSProxy: Connecting to Application History server at hqc-test-hdp2/10.35:10200
19/10/30 09:27:30 INFO db.DBInputFormat: Using read commited transaction isolation
19/10/30 09:27:30 INFO mapreduce.JobSubmitter: number of splits:1
19/10/30 09:27:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1564035532438_0008
19/10/30 09:27:30 INFO impl.YarnClientImpl: Submitted application application_1564035532438_0008
19/10/30 09:27:30 INFO mapreduce.Job: The url to track the job: http://hqc-test-hdp1:8088/proxy/application_1564035532438_0008/
19/10/30 09:27:30 INFO mapreduce.Job: Running job: job_1564035532438_0008
19/10/30 09:27:46 INFO mapreduce.Job: Job job_1564035532438_0008 running in uber mode : false
19/10/30 09:27:46 INFO mapreduce.Job: map 0% reduce 0%
19/10/30 09:27:55 INFO mapreduce.Job: map 100% reduce 0%
19/10/30 09:27:55 INFO mapreduce.Job: Job job_1564035532438_0008 completed successfully
19/10/30 09:27:55 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=158794
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=87
HDFS: Number of bytes written=717075
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=6086
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=6086
Total vcore-milliseconds taken by all map tasks=6086
Total megabyte-milliseconds taken by all map tasks=31160320
Map-Reduce Framework
Map input records=5676
Map output records=5676
Input split bytes=87
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=133
CPU time spent (ms)=6280
Physical memory (bytes) snapshot=369975296
Virtual memory (bytes) snapshot=6399401984
Total committed heap usage (bytes)=329777152
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=717075
19/10/30 09:27:55 INFO mapreduce.ImportJobBase: Transferred 700.2686 KB in 28.168 seconds (24.8605 KB/sec)
19/10/30 09:27:55 INFO mapreduce.ImportJobBase: Retrieved 5676 records.
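Before involving Hive, you can sanity-check what landed under the target directory. A quick sketch, assuming the --target-dir used above and the single part file produced by the one-mapper run:

hdfs dfs -ls /apps/hive/warehouse/hqc.db/xx
hdfs dfs -cat /apps/hive/warehouse/hqc.db/xx/part-m-00000 | head -n 5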
You do NOT need to create the Hive table in advance (worth repeating: no manual table creation is needed). Just give --hive-table the table name. Different Sqoop versions may behave differently, so note the environment versions at the top of this article.
Because a view is not a table, you cannot use --table; you have to use --query. When using import --query, --split-by can be omitted (the run below uses -m 1, so no split column is needed), but --target-dir is required; any temporary path will do. From what I observed, Sqoop first extracts the data to that HDFS path, then loads it into Hive (the table's HDFS directory), and finally deletes the temporary directory and its files.
[hdfs@hqc-test-hdp1 root]$ sqoop import --connect "jdbc:sqlserver://10.35.xx.xx:1433;DatabaseName=ICP" -username xx -password xx --query "SELECT * from [xx.xx] WHERE \$CONDITIONS" --hive-import -hive-database hqc --hive-table xx --target-dir /apps/hive/warehouse/hqc.db/xx -m 1
Warning: /usr/hdp/2.5.3.0-37/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
find: failed to restore initial working directory: Permission denied
19/10/30 10:14:24 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.3.0-37
19/10/30 10:14:24 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/10/30 10:14:24 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
19/10/30 10:14:24 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
19/10/30 10:14:25 INFO manager.SqlManager: Using default fetchSize of 1000
19/10/30 10:14:25 INFO tool.CodeGenTool: Beginning code generation
19/10/30 10:14:25 INFO manager.SqlManager: Executing SQL statement: SELECT * from [xx.xx] WHERE (1 = 0)
19/10/30 10:14:26 INFO manager.SqlManager: Executing SQL statement: SELECT * from [xx.xx] WHERE (1 = 0)
19/10/30 10:14:26 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.5.3.0-37/hadoop-mapreduce
error: error reading /usr/hdp/2.5.3.0-37/sqoop/lib/mysql-connector-java.jar; cannot read zip file
error: error reading /usr/hdp/2.5.3.0-37/hive/lib/mysql-connector-java.jar; cannot read zip file
Note: /tmp/sqoop-hdfs/compile/f1f7d212fbef24d849cb7d0604d2b0e5/QueryResult.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
19/10/30 10:14:28 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hdfs/compile/f1f7d212fbef24d849cb7d0604d2b0e5/QueryResult.jar
19/10/30 10:14:29 INFO mapreduce.ImportJobBase: Beginning query import.
19/10/30 10:14:30 INFO impl.TimelineClientImpl: Timeline service address: http://hqc-test-hdp2:8188/ws/v1/timeline/
19/10/30 10:14:30 INFO client.RMProxy: Connecting to ResourceManager at hqc-test-hdp1/10.35:8050
19/10/30 10:14:31 INFO client.AHSProxy: Connecting to Application History server at hqc-test-hdp2/10.35:10200
19/10/30 10:14:33 INFO db.DBInputFormat: Using read commited transaction isolation
19/10/30 10:14:33 INFO mapreduce.JobSubmitter: number of splits:1
19/10/30 10:14:33 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1564035532438_0010
19/10/30 10:14:34 INFO impl.YarnClientImpl: Submitted application application_1564035532438_0010
19/10/30 10:14:34 INFO mapreduce.Job: The url to track the job: http://hqc-test-hdp1:8088/proxy/application_1564035532438_0010/
19/10/30 10:14:34 INFO mapreduce.Job: Running job: job_1564035532438_0010
19/10/30 10:14:44 INFO mapreduce.Job: Job job_1564035532438_0010 running in uber mode : false
19/10/30 10:14:44 INFO mapreduce.Job: map 0% reduce 0%
19/10/30 10:14:59 INFO mapreduce.Job: map 100% reduce 0%
19/10/30 10:14:59 INFO mapreduce.Job: Job job_1564035532438_0010 completed successfully
19/10/30 10:14:59 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=158786
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=87
HDFS: Number of bytes written=717075
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=12530
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=12530
Total vcore-milliseconds taken by all map tasks=12530
Total megabyte-milliseconds taken by all map tasks=64153600
Map-Reduce Framework
Map input records=5676
Map output records=5676
Input split bytes=87
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=99
CPU time spent (ms)=5950
Physical memory (bytes) snapshot=362852352
Virtual memory (bytes) snapshot=6400389120
Total committed heap usage (bytes)=336068608
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=717075
19/10/30 10:14:59 INFO mapreduce.ImportJobBase: Transferred 700.2686 KB in 29.3386 seconds (23.8685 KB/sec)
19/10/30 10:14:59 INFO mapreduce.ImportJobBase: Retrieved 5676 records.
19/10/30 10:14:59 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners
19/10/30 10:14:59 INFO manager.SqlManager: Executing SQL statement: SELECT * from [xx.xx] WHERE (1 = 0)
19/10/30 10:15:00 INFO manager.SqlManager: Executing SQL statement: SELECT * from [xx.xx] WHERE (1 = 0)
19/10/30 10:15:00 WARN hive.TableDefWriter: Column Longitude had to be cast to a less precise type in Hive
19/10/30 10:15:00 WARN hive.TableDefWriter: Column Latitude had to be cast to a less precise type in Hive
19/10/30 10:15:00 INFO hive.HiveImport: Loading uploaded data into Hive
Logging initialized using configuration in jar:file:/usr/hdp/2.5.3.0-37/hive/lib/hive-common-1.2.1000.2.5.3.0-37.jar!/hive-log4j.properties
OK
Time taken: 2.898 seconds
Loading data to table hqc.xx
Table hqc.xx stats: [numFiles=1, numRows=0, totalSize=717075, rawDataSize=0]
OK
Time taken: 0.627 seconds
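To confirm the Hive side of the import, a quick check from the Hive CLI (a sketch; hqc and xx are the placeholder database and table names used in the command above):

hive -e "USE hqc; SHOW TABLES; SELECT COUNT(*) FROM xx;"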
Now add --hive-overwrite:
[hdfs@hqc-test-hdp1 root]$ sqoop import --connect "jdbc:sqlserver://10.35.xx.xx:1433;DatabaseName=ICP" -username xx -password xx --query "SELECT * from [xx.xx] WHERE \$CONDITIONS" --hive-overwrite --hive-import -hive-database hqc --hive-table xx --target-dir /apps/hive/warehouse/hqc.db/xx -m 1
Warning: /usr/hdp/2.5.3.0-37/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
find: failed to restore initial working directory: Permission denied
19/10/30 16:12:59 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.3.0-37
19/10/30 16:12:59 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/10/30 16:12:59 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
19/10/30 16:12:59 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
19/10/30 16:13:00 INFO manager.SqlManager: Using default fetchSize of 1000
19/10/30 16:13:00 INFO tool.CodeGenTool: Beginning code generation
19/10/30 16:13:00 INFO manager.SqlManager: Executing SQL statement: SELECT * from [xx.xx] WHERE (1 = 0)
19/10/30 16:13:01 INFO manager.SqlManager: Executing SQL statement: SELECT * from [xx.xx] WHERE (1 = 0)
19/10/30 16:13:01 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.5.3.0-37/hadoop-mapreduce
error: error reading /usr/hdp/2.5.3.0-37/sqoop/lib/mysql-connector-java.jar; cannot read zip file
error: error reading /usr/hdp/2.5.3.0-37/hive/lib/mysql-connector-java.jar; cannot read zip file
Note: /tmp/sqoop-hdfs/compile/2dcbcc5ee20eac3b80e2f2109726b44c/QueryResult.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
19/10/30 16:13:03 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hdfs/compile/2dcbcc5ee20eac3b80e2f2109726b44c/QueryResult.jar
19/10/30 16:13:03 INFO mapreduce.ImportJobBase: Beginning query import.
19/10/30 16:13:05 INFO impl.TimelineClientImpl: Timeline service address: http://hqc-test-hdp2:8188/ws/v1/timeline/
19/10/30 16:13:05 INFO client.RMProxy: Connecting to ResourceManager at hqc-test-hdp1/10.35:8050
19/10/30 16:13:06 INFO client.AHSProxy: Connecting to Application History server at hqc-test-hdp2/10.35:10200
19/10/30 16:13:08 INFO db.DBInputFormat: Using read commited transaction isolation
19/10/30 16:13:08 INFO mapreduce.JobSubmitter: number of splits:1
19/10/30 16:13:08 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1564035532438_0023
19/10/30 16:13:09 INFO impl.YarnClientImpl: Submitted application application_1564035532438_0023
19/10/30 16:13:09 INFO mapreduce.Job: The url to track the job: http://hqc-test-hdp1:8088/proxy/application_1564035532438_0023/
19/10/30 16:13:09 INFO mapreduce.Job: Running job: job_1564035532438_0023
19/10/30 16:13:19 INFO mapreduce.Job: Job job_1564035532438_0023 running in uber mode : false
19/10/30 16:13:19 INFO mapreduce.Job: map 0% reduce 0%
19/10/30 16:13:29 INFO mapreduce.Job: map 100% reduce 0%
19/10/30 16:13:29 INFO mapreduce.Job: Job job_1564035532438_0023 completed successfully
19/10/30 16:13:29 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=158455
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=87
HDFS: Number of bytes written=5217236
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=7426
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=7426
Total vcore-milliseconds taken by all map tasks=7426
Total megabyte-milliseconds taken by all map tasks=38021120
Map-Reduce Framework
Map input records=16993
Map output records=16993
Input split bytes=87
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=108
CPU time spent (ms)=9300
Physical memory (bytes) snapshot=398262272
Virtual memory (bytes) snapshot=6417539072
Total committed heap usage (bytes)=415760384
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=5217236
19/10/30 16:13:29 INFO mapreduce.ImportJobBase: Transferred 4.9755 MB in 24.7647 seconds (205.7342 KB/sec)
19/10/30 16:13:29 INFO mapreduce.ImportJobBase: Retrieved 16993 records.
19/10/30 16:13:29 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners
19/10/30 16:13:30 INFO manager.SqlManager: Executing SQL statement: SELECT * from [xx.xx] WHERE (1 = 0)
19/10/30 16:13:30 INFO manager.SqlManager: Executing SQL statement: SELECT * from [xx.xx] WHERE (1 = 0)
19/10/30 16:13:30 WARN hive.TableDefWriter: Column AlarmTimeStamp had to be cast to a less precise type in Hive
19/10/30 16:13:30 WARN hive.TableDefWriter: Column HandlerTime had to be cast to a less precise type in Hive
19/10/30 16:13:30 INFO hive.HiveImport: Loading uploaded data into Hive
Logging initialized using configuration in jar:file:/usr/hdp/2.5.3.0-37/hive/lib/hive-common-1.2.1000.2.5.3.0-37.jar!/hive-log4j.properties
OK
Time taken: 2.809 seconds
Loading data to table hqc.xx
Table hqc.xx stats: [numFiles=1, numRows=0, totalSize=5217236, rawDataSize=0]
OK
Time taken: 1.321 seconds
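Every run above logs the warning that putting the password on the command line is insecure. One way to avoid it, sketched here with assumed file paths, is --password-file (the option appears in the help output further down); the file must contain the password with no trailing newline:

# store the password once, readable only by this user (illustrative paths)
echo -n 'xx' > /tmp/sqlserver.pwd
hdfs dfs -put /tmp/sqlserver.pwd /user/hdfs/sqlserver.pwd
hdfs dfs -chmod 400 /user/hdfs/sqlserver.pwd

sqoop import \
  --connect "jdbc:sqlserver://10.35.xx.xx:1433;DatabaseName=ICP" \
  --username xx \
  --password-file /user/hdfs/sqlserver.pwd \
  --query "SELECT * from [xx.xx] WHERE \$CONDITIONS" \
  --hive-import --hive-overwrite \
  --hive-database hqc --hive-table xx \
  --target-dir /apps/hive/warehouse/hqc.db/xx -m 1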
1. Difference between create-hive-table and hive-import
Question:
Can anyone tell the difference between create-hive-table & hive-import method? Both will create a hive table, but still what is the significance of each?
Answer:
The difference is that create-hive-table will create table in Hive based on the source table in database but will NOT transfer any data. Command "import --hive-import" will both create table in Hive and import data from the source table.
Create the table only:
sqoop create-hive-table --connect "jdbc:sqlserver://192.168.13.1:1433;username=root;password=12345;databasename=test" --table test --hive-table myhive2 --hive-partition-key partition_time --map-column-hive ID=String,name=String,addr=String
Full import (for SQL Server the $CONDITIONS clause is attached with WHERE; for Oracle use AND):
sqoop import --connect "jdbc:sqlserver://192.168.13.1:1433;username=root;password=12345;databasename=test" --query "select * from test i where \$CONDITIONS" --target-dir /user/hive/warehouse/myhive2/partition_time=20171023 --hive-import -m 5 --hive-table myhive2 --split-by ID --hive-partition-key partition_time --hive-partition-value 20171023
2. sqoop help
Warning: /usr/hdp/2.5.3.0-37/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/10/30 08:42:05 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.3.0-37
usage: sqoop COMMAND [ARGS]
Available commands:
codegen Generate code to interact with database records
create-hive-table Import a table definition into Hive
eval Evaluate a SQL statement and display the results
export Export an HDFS directory to a database table
help List available commands
import Import a table from a database to HDFS
import-all-tables Import tables from a database to HDFS
import-mainframe Import datasets from a mainframe server to HDFS
job Work with saved jobs
list-databases List available databases on a server
list-tables List available tables in a database
merge Merge results of incremental imports
metastore Run a standalone Sqoop metastore
version Display version information
3. sqoop export: exports data from HDFS/Hive (and similar sources) into a relational database, supporting three modes: full export, update with insert (allowinsert), and update only (updateonly).
--update-mode updateonly/allowinsert
Full export
HQL example: insert overwrite directory '/user/root/export/test' row format delimited fields terminated by ',' STORED AS textfile select F1,F2,F3 from <table>;
Sqoop command: sqoop export --connect jdbc:mysql://localhost:3306/wht --username root --password cloudera --table <table> --fields-terminated-by ',' --columns F1,F2,F3 --export-dir /user/root/export/test
--update-mode allowinsert (update existing rows and insert new ones)
HQL example: insert overwrite directory '/user/root/export/test' row format delimited fields terminated by ',' STORED AS textfile select F1,F2,F3 from <table> where <condition>;
Sqoop command: sqoop export --connect jdbc:mysql://localhost:3306/wht --username root --password cloudera --table <table> --fields-terminated-by ',' --columns F1,F2,F3 --update-key F4 --update-mode allowinsert --export-dir /user/root/export/test
--update-mode updateonly (update only)
HQL example: insert overwrite directory '/user/root/export/test' row format delimited fields terminated by ',' STORED AS textfile select F1,F2,F3 from <table> where <condition>;
Sqoop command: sqoop export --connect jdbc:mysql://localhost:3306/wht --username root --password cloudera --table <table> --fields-terminated-by ',' --columns F1,F2,F3 --update-key F4 --update-mode updateonly --export-dir /user/root/export/test
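To tie the export modes back to this article's SQL Server setup, here is a hedged sketch of a full export of the Hive table imported above (target_table is a hypothetical destination table that must already exist in SQL Server; '\001' is Hive's default field delimiter, which the --hive-import runs above used):

sqoop export \
  --connect "jdbc:sqlserver://10.35.xx.xx:1433;DatabaseName=ICP" \
  --username xx -P \
  --table target_table \
  --export-dir /apps/hive/warehouse/hqc.db/xx \
  --input-fields-terminated-by '\001' \
  -m 1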
usage: sqoop export [GENERIC-ARGS] [TOOL-ARGS]
Common arguments:
--connect Specify JDBC connect
string
--connection-manager Specify connection manager
class name
--connection-param-file Specify connection
parameters file
--driver Manually specify JDBC
driver class to use
--hadoop-home Override
$HADOOP_MAPRED_HOME_ARG
--hadoop-mapred-home Override
$HADOOP_MAPRED_HOME_ARG
--help Print usage instructions
-P Read password from console
--password Set authentication
password
--password-alias Credential provider
password alias
--password-file Set authentication
password file path
--relaxed-isolation Use read-uncommitted
isolation for imports
--skip-dist-cache Skip copying jars to
distributed cache
--temporary-rootdir Defines the temporary root
directory for the import
--username Set authentication
username
--verbose Print more information
while working
Export control arguments:
--batch Indicates
underlying
statements
to be
executed in
batch mode
--call Populate the
table using
this stored
procedure
(one call
per row)
--clear-staging-table Indicates
that any
data in
staging
table can be
deleted
--columns Columns to
export to
table
--direct Use direct
export fast
path
--export-dir HDFS source
path for the
export
-m,--num-mappers Use 'n' map
tasks to
export in
parallel
--mapreduce-job-name Set name for
generated
mapreduce
job
--staging-table Intermediate
staging
table
--table Table to
populate
--update-key Update
records by
specified
key column
--update-mode Specifies
how updates
are
performed
when new
rows are
found with
non-matching
keys in
database
--validate Validate the
copy using
the
configured
validator
--validation-failurehandler Fully
qualified
class name
for
ValidationFa
ilureHandler
--validation-threshold Fully
qualified
class name
for
ValidationTh
reshold
--validator Fully
qualified
class name
for the
Validator
Input parsing arguments:
--input-enclosed-by Sets a required field encloser
--input-escaped-by Sets the input escape
character
--input-fields-terminated-by Sets the input field separator
--input-lines-terminated-by Sets the input end-of-line
char
--input-optionally-enclosed-by Sets a field enclosing
character
Output line formatting arguments:
--enclosed-by Sets a required field enclosing
character
--escaped-by Sets the escape character
--fields-terminated-by Sets the field separator character
--lines-terminated-by Sets the end-of-line character
--mysql-delimiters Uses MySQL's default delimiter set:
fields: , lines: \n escaped-by: \
optionally-enclosed-by: '
--optionally-enclosed-by Sets a field enclosing character
Code generation arguments:
--bindir Output directory for compiled
objects
--class-name Sets the generated class name.
This overrides --package-name.
When combined with --jar-file,
sets the input class.
--input-null-non-string Input null non-string
representation
--input-null-string Input null string representation
--jar-file Disable code generation; use
specified jar
--map-column-java Override mapping for specific
columns to java types
--null-non-string Null non-string representation
--null-string Null string representation
--outdir Output directory for generated
code
--package-name Put auto-generated classes in
this package
HCatalog arguments:
--hcatalog-database HCatalog database name
--hcatalog-home Override $HCAT_HOME
--hcatalog-partition-keys Sets the partition
keys to use when
importing to hive
--hcatalog-partition-values Sets the partition
values to use when
importing to hive
--hcatalog-table HCatalog table name
--hive-home Override $HIVE_HOME
--hive-partition-key Sets the partition key
to use when importing
to hive
--hive-partition-value Sets the partition
value to use when
importing to hive
--map-column-hive Override mapping for
specific column to
hive types.
Generic Hadoop command-line arguments:
(must preceed any tool-specific arguments)
Generic options supported are
-conf specify an application configuration file
-D use value for given property
-fs specify a namenode
-jt specify a ResourceManager
-files specify comma separated files to be copied to the map reduce cluster
-libjars specify comma separated jar files to include in the classpath.
-archives specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
At minimum, you must specify --connect, --export-dir, and --table
4. sqoop import help
usage: sqoop import [GENERIC-ARGS] [TOOL-ARGS]
Common arguments:
--connect Specify JDBC connect
string
--connection-manager Specify connection manager
class name
--connection-param-file Specify connection
parameters file
--driver Manually specify JDBC
driver class to use
--hadoop-home Override
$HADOOP_MAPRED_HOME_ARG
--hadoop-mapred-home Override
$HADOOP_MAPRED_HOME_ARG
--help Print usage instructions
-P Read password from console
--password Set authentication
password
--password-alias Credential provider
password alias
--password-file Set authentication
password file path
--relaxed-isolation Use read-uncommitted
isolation for imports
--skip-dist-cache Skip copying jars to
distributed cache
--temporary-rootdir Defines the temporary root
directory for the import
--username Set authentication
username
--verbose Print more information
while working
Import control arguments:
--append Imports data
in append
mode
--as-avrodatafile Imports data
to Avro data
files
--as-parquetfile Imports data
to Parquet
files
--as-sequencefile Imports data
to
SequenceFile
s
--as-textfile Imports data
as plain
text
(default)
--autoreset-to-one-mapper Reset the
number of
mappers to
one mapper
if no split
key
available
--boundary-query Set boundary
query for
retrieving
max and min
value of the
primary key
--columns Columns to
import from
table
--compression-codec Compression
codec to use
for import
--delete-target-dir Imports data
in delete
mode
--direct Use direct
import fast
path
--direct-split-size Split the
input stream
every 'n'
bytes when
importing in
direct mode
-e,--query Import
results of
SQL
'statement'
--fetch-size Set number
'n' of rows
to fetch
from the
database
when more
rows are
needed
--inline-lob-limit Set the
maximum size
for an
inline LOB
-m,--num-mappers Use 'n' map
tasks to
import in
parallel
--mapreduce-job-name Set name for
generated
mapreduce
job
--merge-key Key column
to use to
join results
--split-by Column of
the table
used to
split work
units
--split-limit Upper Limit
of rows per
split for
split
columns of
Date/Time/Ti
mestamp and
integer
types. For
date or
timestamp
fields it is
calculated
in seconds.
split-limit
should be
greater than
0
--table Table to
read
--target-dir HDFS plain
table
destination
--validate Validate the
copy using
the
configured
validator
--validation-failurehandler Fully
qualified
class name
for
ValidationFa
ilureHandler
--validation-threshold Fully
qualified
class name
for
ValidationTh
reshold
--validator Fully
qualified
class name
for the
Validator
--warehouse-dir HDFS parent
for table
destination
--where WHERE clause
to use
during
import
-z,--compress Enable
compression
Incremental import arguments:
--check-column Source column to check for incremental
change
--incremental Define an incremental import of type
'append' or 'lastmodified'
--last-value Last imported value in the incremental
check column
Output line formatting arguments:
--enclosed-by Sets a required field enclosing
character
--escaped-by Sets the escape character
--fields-terminated-by Sets the field separator character
--lines-terminated-by Sets the end-of-line character
--mysql-delimiters Uses MySQL's default delimiter set:
fields: , lines: \n escaped-by: \
optionally-enclosed-by: '
--optionally-enclosed-by Sets a field enclosing character
Input parsing arguments:
--input-enclosed-by Sets a required field encloser
--input-escaped-by Sets the input escape
character
--input-fields-terminated-by Sets the input field separator
--input-lines-terminated-by Sets the input end-of-line
char
--input-optionally-enclosed-by Sets a field enclosing
character
Hive arguments:
--create-hive-table Fail if the target hive
table exists
--hive-compute-stats Overwrite existing data in
the Hive table
--hive-database Sets the database name to
use when importing to hive
--hive-delims-replacement Replace Hive record \0x01
and row delimiters (\n\r)
from imported string fields
with user-defined string
--hive-drop-import-delims Drop Hive record \0x01 and
row delimiters (\n\r) from
imported string fields
--hive-home Override $HIVE_HOME
--hive-import Import tables into Hive
(Uses Hive's default
delimiters if none are
set.)
--hive-overwrite Overwrite existing data in
the Hive table
--hive-partition-key Sets the partition key to
use when importing to hive
--hive-partition-value Sets the partition value to
use when importing to hive
--hive-table Sets the table name to use
when importing to hive
--map-column-hive Override mapping for
specific column to hive
types.
HBase arguments:
--column-family Sets the target column family for the
import
--hbase-bulkload Enables HBase bulk loading
--hbase-create-table If specified, create missing HBase tables
--hbase-row-key Specifies which input column to use as the
row key
--hbase-table Import to in HBase
HCatalog arguments:
--hcatalog-database HCatalog database name
--hcatalog-home Override $HCAT_HOME
--hcatalog-partition-keys Sets the partition
keys to use when
importing to hive
--hcatalog-partition-values Sets the partition
values to use when
importing to hive
--hcatalog-table HCatalog table name
--hive-home Override $HIVE_HOME
--hive-partition-key Sets the partition key
to use when importing
to hive
--hive-partition-value Sets the partition
value to use when
importing to hive
--map-column-hive Override mapping for
specific column to
hive types.
HCatalog import specific options:
--create-hcatalog-table Create HCatalog before import
--drop-and-create-hcatalog-table Drop and Create HCatalog before
import
--hcatalog-storage-stanza HCatalog storage stanza for table
creation
Accumulo arguments:
--accumulo-batch-size Batch size in bytes
--accumulo-column-family Sets the target column family for
the import
--accumulo-create-table If specified, create missing
Accumulo tables
--accumulo-instance Accumulo instance name.
--accumulo-max-latency Max write latency in milliseconds
--accumulo-password Accumulo password.
--accumulo-row-key Specifies which input column to
use as the row key
--accumulo-table Import to in Accumulo
--accumulo-user Accumulo user name.
--accumulo-visibility Visibility token to be applied to
all rows imported
--accumulo-zookeepers Comma-separated list of
zookeepers (host:port)
Code generation arguments:
--bindir Output directory for compiled
objects
--class-name Sets the generated class name.
This overrides --package-name.
When combined with --jar-file,
sets the input class.
--input-null-non-string Input null non-string
representation
--input-null-string Input null string representation
--jar-file Disable code generation; use
specified jar
--map-column-java Override mapping for specific
columns to java types
--null-non-string Null non-string representation
--null-string Null string representation
--outdir Output directory for generated
code
--package-name Put auto-generated classes in
this package
Generic Hadoop command-line arguments:
(must preceed any tool-specific arguments)
Generic options supported are
-conf specify an application configuration file
-D use value for given property
-fs specify a namenode
-jt specify a ResourceManager
-files specify comma separated files to be copied to the map reduce cluster
-libjars specify comma separated jar files to include in the classpath.
-archives specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
At minimum, you must specify --connect and --table
Arguments to mysqldump and other subprograms may be supplied
after a '--' on the command line.