使用sqoop将us_order表中的数据导入到hive中,hive的库名为exam_ods,表名叫ods_us_order,根据order_date的日期来实现分区导入数据,形成脚本。
#!/bin/bash
#先删除hive中的此表信息
/usr/local/hive/bin/hive -e "drop table exam_ods.ods_us_order" > /dev/null 2>&1
ALL_DATE=`/usr/bin/mysql -h mypc01 -uroot -p123456 -s -e 'select date(order_date) from sz.us_order group by date(order_date)'`
# 设置一些下面用到的变量
SQOOP_HOME=/usr/local/sqoop
MYSQL_CONNECT=jdbc:mysql://mypc01:3306/sz
MYSQL_USERNAME=root
MYSQL_PWD=123456
for ONE_DAY in ${ALL_DATE}
do
${SQOOP_HOME}/bin/sqoop import \
--connect ${MYSQL_CONNECT} \
--username ${MYSQL_USERNAME} \
--password ${MYSQL_PWD} \
--table 'us_order' \
--where "date(order_date)='${ONE_DAY}'" \
--hive-import \
--hive-overwrite \
--hive-table 'exam_ods.ods_us_order' \
--fields-terminated-by ',' \
--hive-partition-key 'dt' \
--hive-partition-value "${ONE_DAY}" \
--num-mappers 3
done
usage: hive
-d,--define <key=value> Variable subsitution to apply to hive
commands. e.g. -d A=B or --define A=B
--database <databasename> Specify the database to use
-e <quoted-query-string> SQL from command line
-f <filename> SQL from files
-H,--help Print help information
--hiveconf <property=value> Use value for given property
--hivevar <key=value> Variable subsitution to apply to hive
commands. e.g. --hivevar A=B
-i <filename> Initialization SQL file
-S,--silent Silent mode in interactive shell
-v,--verbose Verbose mode (echo executed SQL to the
console)
> /dev/null 2>&1
表示将输出送入黑洞,就是不在console上面显示.具体原因不深究了./usr/bin/mysql -h mypc01 -uroot -p123456 -s -e 'select date(order_date) from sz.us_order group by date(order_date)
表示远程启动mysql并执行一条sql语句.# mysql --help
Usage: mysql [OPTIONS] [database]
-?, --help Display this help and exit.
-I, --help Synonym for -?
--auto-rehash Enable automatic rehashing. One doesn't need to use
'rehash' to get table and field completion, but startup
and reconnecting may take a longer time. Disable with
--disable-auto-rehash.
(Defaults to on; use --skip-auto-rehash to disable.)
-A, --no-auto-rehash
No automatic rehashing. One has to use 'rehash' to get
table and field completion. This gives a quicker start of
mysql and disables rehashing on reconnect.
--auto-vertical-output
Automatically switch to vertical output mode if the
result is wider than the terminal width.
-B, --batch Don't use history file. Disable interactive behavior.
(Enables --silent.)
--bind-address=name IP address to bind to.
--binary-as-hex Print binary data as hex
--character-sets-dir=name
Directory for character set files.
--column-type-info Display column type information.
-c, --comments Preserve comments. Send comments to the server. The
default is --skip-comments (discard comments), enable
with --comments.
-C, --compress Use compression in server/client protocol.
-#, --debug[=#] This is a non-debug version. Catch this and exit.
--debug-check This is a non-debug version. Catch this and exit.
-T, --debug-info This is a non-debug version. Catch this and exit.
-D, --database=name Database to use.
--default-character-set=name
Set the default character set.
--delimiter=name Delimiter to be used.
--enable-cleartext-plugin
Enable/disable the clear text authentication plugin.
-e, --execute=name Execute command and quit. (Disables --force and history
file.)
-E, --vertical Print the output of a query (rows) vertically.
-f, --force Continue even if we get an SQL error.
--histignore=name A colon-separated list of patterns to keep statements
from getting logged into syslog and mysql history.
-G, --named-commands
Enable named commands. Named commands mean this program's
internal commands; see mysql> help . When enabled, the
named commands can be used from any line of the query,
otherwise only from the first line, before an enter.
Disable with --disable-named-commands. This option is
disabled by default.
-i, --ignore-spaces Ignore space after function names.
--init-command=name SQL Command to execute when connecting to MySQL server.
Will automatically be re-executed when reconnecting.
--local-infile Enable/disable LOAD DATA LOCAL INFILE.
-b, --no-beep Turn off beep on error.
-h, --host=name Connect to host.
-H, --html Produce HTML output.
-X, --xml Produce XML output.
--line-numbers Write line numbers for errors.
(Defaults to on; use --skip-line-numbers to disable.)
-L, --skip-line-numbers
Don't write line number for errors.
-n, --unbuffered Flush buffer after each query.
--column-names Write column names in results.
(Defaults to on; use --skip-column-names to disable.)
-N, --skip-column-names
Don't write column names in results.
--sigint-ignore Ignore SIGINT (CTRL-C).
-o, --one-database Ignore statements except those that occur while the
default database is the one named at the command line.
--pager[=name] Pager to use to display results. If you don't supply an
option, the default pager is taken from your ENV variable
PAGER. Valid pagers are less, more, cat [> filename],
etc. See interactive help (\h) also. This option does not
work in batch mode. Disable with --disable-pager. This
option is disabled by default.
-p, --password[=name]
Password to use when connecting to server. If password is
not given it's asked from the tty.
-P, --port=# Port number to use for connection or 0 for default to, in
order of preference, my.cnf, $MYSQL_TCP_PORT,
/etc/services, built-in default (3306).
--prompt=name Set the mysql prompt to this value.
--protocol=name The protocol to use for connection (tcp, socket, pipe,
memory).
-q, --quick Don't cache result, print it row by row. This may slow
down the server if the output is suspended. Doesn't use
history file.
-r, --raw Write fields without conversion. Used with --batch.
--reconnect Reconnect if the connection is lost. Disable with
--disable-reconnect. This option is enabled by default.
(Defaults to on; use --skip-reconnect to disable.)
-s, --silent Be more silent. Print results with a tab as separator,
each row on new line.
-S, --socket=name The socket file to use for connection.
--ssl-mode=name SSL connection mode.
--ssl Deprecated. Use --ssl-mode instead.
(Defaults to on; use --skip-ssl to disable.)
--ssl-verify-server-cert
Deprecated. Use --ssl-mode=VERIFY_IDENTITY instead.
--ssl-ca=name CA file in PEM format.
--ssl-capath=name CA directory.
--ssl-cert=name X509 cert in PEM format.
--ssl-cipher=name SSL cipher to use.
--ssl-key=name X509 key in PEM format.
--ssl-crl=name Certificate revocation list.
--ssl-crlpath=name Certificate revocation list path.
--tls-version=name TLS version to use, permitted values are: TLSv1, TLSv1.1,
TLSv1.2
--server-public-key-path=name
File path to the server public RSA key in PEM format.
--get-server-public-key
Get server public key
-t, --table Output in table format.
--tee=name Append everything into outfile. See interactive help (\h)
also. Does not work in batch mode. Disable with
--disable-tee. This option is disabled by default.
-u, --user=name User for login if not current user.
-U, --safe-updates Only allow UPDATE and DELETE that uses keys.
-U, --i-am-a-dummy Synonym for option --safe-updates, -U.
-v, --verbose Write more. (-v -v -v gives the table output format).
-V, --version Output version information and exit.
-w, --wait Wait and retry if connection is down.
--connect-timeout=# Number of seconds before connection timeout.
--max-allowed-packet=#
The maximum packet length to send to or receive from
server.
--net-buffer-length=#
The buffer size for TCP/IP and socket communication.
--select-limit=# Automatic limit for SELECT when using --safe-updates.
--max-join-size=# Automatic limit for rows in a join when using
--safe-updates.
--secure-auth Refuse client connecting to server if it uses old
(pre-4.1.1) protocol. Deprecated. Always TRUE
--server-arg=name Send embedded server this as a parameter.
--show-warnings Show warnings after every statement.
-j, --syslog Log filtered interactive commands to syslog. Filtering of
commands depends on the patterns supplied via histignore
option besides the default patterns.
--plugin-dir=name Directory for client-side plugins.
--default-auth=name Default authentication client-side plugin to use.
--binary-mode By default, ASCII '\0' is disallowed and '\r\n' is
translated to '\n'. This switch turns off both features,
and also turns off parsing of all clientcommands except
\C and DELIMITER, in non-interactive mode (for input
piped to mysql or loaded using the 'source' command).
This is necessary when processing output from mysqlbinlog
that may contain blobs.
--connect-expired-password
Notify the server that this client is prepared to handle
expired password sandbox mode.
example
[root@mypc01 openresty]# /usr/bin/mysql -h mypc01 -uroot -p123456 -e 'select date(order_date) from sz.us_order group by date(order_date)'
+------------------+
| date(order_date) |
+------------------+
| 2019-07-14 |
| 2019-07-15 |
| 2019-07-16 |
+------------------+
如何把命令的查询结果赋给一个shell变量呢?需要给命令加反单引号
ALL_DATE=`/usr/bin/mysql -h mypc01 -uroot -p123456 -e 'select date(order_date) from sz.us_order group by date(order_date)'`
echo $ALL_DATE
执行结果如下
date(order_date) 2019-07-14 2019-07-15 2019-07-16
如果要去掉第一个,需要加上-s
或者如下写法也可以
ALL_DATE=$(/usr/bin/mysql -h mypc01 -uroot -p123456 -e 'select date(order_date) from sz.us_order group by date(order_date)')
echo $ALL_DATE
接下来为shell循环的写法
流类比
for i in $(seq 1 10)
do
echo $i
done