3、运行opt 文件
sqoop --options-file m2h-d-acounttype.opt
4、启动hive
hive
5、查询tables
select * from d_accounttype;
6、show date
遇到问题:
1、没有在hive创建mysql中得相同的表,就直接导入数据。系统自动生成输出目录就不在指定hive路径了。
再次运行就会报 目录以存在 error。
2、sqoop 所有参数说明
usage: sqoop import [GENERIC-ARGS] [TOOL-ARGS]
Common arguments:
--connect <jdbc-uri> Specify JDBC connect string
--connection-manager <class-name> Specify connection manager class name
--connection-param-file <properties-file> Specify connection parameters file
--driver <class-name> Manually specify JDBC driver class to use
--hadoop-mapred-home <dir> Override
$HADOOP_MAPRED_HOME_ARG
--help Print usage instructions
-P Read password from console
--password <password> Set authentication password
--username <username> Set authentication username
--verbose Print more information while working
Import control arguments:
--append Imports data in append mode
--as-avrodatafile Imports data to Avro data files
--as-sequencefile Imports data to SequenceFile
--as-textfile Imports data as plain text (default)
--boundary-query <statement> Set boundary
query for retrieving max and min value of the primary key
--columns <col,col,col...> Columns to import from table
--compression-codec <codec> Compression codec to use for import
--direct Use direct import fast path
--direct-split-size <n> Split the input stream every 'n' bytes when importing in direct mode
-e,--query <statement> Import results of SQL 'statement'
--fetch-size <n> Set number 'n' of rows to fetch from the database when more rows are needed
--inline-lob-limit <n> Set the maximum size for an inline LOB
-m,--num-mappers <n> Use 'n' map tasks to import in parallel
--mapreduce-job-name <name> Set name for generated mapreduce job
--split-by <column-name> Column of the table used to split work units
--table <table-name> Table to read
--target-dir <dir> HDFS plain table destination
--validate Validate the copy using the configured validator
--validation-failurehandler <validation-failurehandler> Fully qualified class name for ValidationFa
ilureHandler
--validation-threshold <validation-threshold> Fully qualified class name for ValidationTh reshold
--validator <validator> Fully qualified class name for the Validator
--warehouse-dir <dir> HDFS parent for table destination
--where <where clause> WHERE clause to use during import
-z,--compress Enable compression
Incremental import arguments:
--check-column <column> Source column to check for incremental change
--incremental <import-type> Define an incremental import of type 'append' or 'lastmodified'
--last-value <value> Last imported value in the incremental check column
Output line formatting arguments:
--enclosed-by <char> Sets a required field enclosing character
--escaped-by <char> Sets the escape character
--fields-terminated-by <char> Sets the field separator character
--lines-terminated-by <char> Sets the end-of-line character
--mysql-delimiters Uses MySQL's default delimiter set:
fields: , lines: \n escaped-by: \optionally-enclosed-by: '
--optionally-enclosed-by <char> Sets a field enclosing character
Input parsing arguments:
--input-enclosed-by <char> Sets a required field encloser
--input-escaped-by <char> Sets the input escape character
--input-fields-terminated-by <char> Sets the input field separator
--input-lines-terminated-by <char> Sets the input end-of-line char
--input-optionally-enclosed-by <char> Sets a field enclosing character
Hive arguments:
--create-hive-table Fail if the target hive table exists
--hive-delims-replacement <arg> Replace Hive record \0x01 and row delimiters (\n\r) from imported string fields with user-defined string
--hive-drop-import-delims Drop Hive record \0x01 and row delimiters (\n\r) from imported string fields
--hive-home <dir> Override $HIVE_HOME
--hive-import Import tables into Hive (Uses Hive's default delimiters if none are set.)
--hive-overwrite Overwrite existing data in the Hive table
--hive-partition-key <partition-key> Sets the partition key to use when importing to hive
--hive-partition-value <partition-value> Sets the partition value to use when importing to hive
--hive-table <table-name> Sets the table name to use when importing to hive
--map-column-hive <arg> Override mapping for specific column to hive types.
HBase arguments:
--column-family <family> Sets the target column family for the import
--hbase-create-table If specified, create missing HBase tables
--hbase-row-key <col> Specifies which input column to use as the row key
--hbase-table <table> Import to <table> in HBase
Code generation arguments:
--bindir <dir> Output directory for compiled objects
--class-name <name> Sets the generated class name. This overrides --package-name.
When combined with --jar-file, sets the input class.
--input-null-non-string <null-str> Input null non-string representation
--input-null-string <null-str> Input null string representation
--jar-file <file> Disable code generation; use specified jar
--map-column-java <arg> Override mapping for specific columns to java types
--null-non-string <null-str> Null non-string representation
--null-string <null-str> Null string representation
--outdir <dir> Output directory for generated code
--package-name <name> Put auto-generated classes in this package
3、增量导入 怎样动态修改 last-value ?
另外一个问题,如果我们要导入某个表中的新加入的部分,可以使用--check-column、--incremental、--last-value三个参数,--check-column指定要检查的列,--incremental指定某种增加的模式,只有两个合法的值,append 和lastmodified。如果--incremental为append,则Sqoop会导入--check-column指定的列的值大于--last-value所指定的值的记录。如果--incremental为lastmodified,则Sqoop会导入--check-column指定的列的值(这是一个时间戳)近于--last-value所指定的时间戳的记录。例如下面的命令可以导入列ID的值大于4的所有记录。
$ sqoop import --hive-import --connect jdbc:oracle:thin:@192.168.56.111:1521:DBNAME --username USERNAME --password PASSWORD --verbose -m 1 --table TABLENAME --check-column ID --incremental append --last-value 4