Deploying and Using Sqoop 1.4.7

Why use Sqoop?

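  1. Processing data in Hadoop sometimes requires data from a relational database (MySQL, Oracle) for cleansing, or the processed results need to be loaded into a relational database.

  2. Without a tool for moving data between HDFS and a database, you have to hand-write MapReduce jobs for the transfer, which is tedious, complex, and costly to maintain.

  3. Sqoop is the bridge between relational databases and Hadoop, and it works in two directions: import and export.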
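The two directions can be sketched as follows. This is a minimal illustration, not part of the original setup: the HDFS paths and the helper names import_cmd and export_cmd are hypothetical, and the functions only print the commands so nothing runs against a cluster by accident. The flags used (--table, --target-dir, --export-dir, --num-mappers, -P) are standard Sqoop 1.4.x options.

```shell
#!/usr/bin/env bash
# Placeholder connection values: replace with your own host, database and user.
JDBC_URL="jdbc:mysql://ip:3306/hadoop_mr"
DB_USER="kfyw"

# Print (not run) an import command: MySQL table -> HDFS directory.
# -P prompts for the password instead of putting it on the command line.
import_cmd() {
  local table="$1" target_dir="$2"
  echo sqoop import --connect "$JDBC_URL" --username "$DB_USER" -P \
    --table "$table" --target-dir "$target_dir" --num-mappers 1
}

# Print (not run) an export command: HDFS directory -> existing MySQL table.
export_cmd() {
  local table="$1" export_dir="$2"
  echo sqoop export --connect "$JDBC_URL" --username "$DB_USER" -P \
    --table "$table" --export-dir "$export_dir"
}

# Hypothetical HDFS paths, for illustration only.
import_cmd user_info /user/hadoop/user_info
export_cmd global_user_day /user/hadoop/global_user_day
```

To actually run a command, remove the echo inside the function. With --num-mappers 1 the import runs as a single map task, so the table does not need a split column.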

Deploying Sqoop

1. Download sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz for sqoop-1.4.7

      Location: https://downloads.apache.org/sqoop/1.4.7/

2. Deploy sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

    a. Extract the Sqoop archive with tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz, then rename the extracted directory to sqoop-1.4.7

    b. Edit the Sqoop configuration: go to the sqoop-1.4.7/conf directory, copy sqoop-env-template.sh to sqoop-env.sh, and set HADOOP_COMMON_HOME, HADOOP_MAPRED_HOME, and HIVE_HOME in that file:

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# included in all the hadoop scripts with source command
# should not be executable directly
# also should not be passed any arguments, since we need original $*

# Set Hadoop-specific environment variables here.

#Set path to where bin/hadoop is available
# Hadoop installation directory
export HADOOP_COMMON_HOME=/usr/local/hadoop

#Set path to where hadoop-*-core.jar is available
# Hadoop installation directory
export HADOOP_MAPRED_HOME=/usr/local/hadoop

#set the path to where bin/hbase is available
#export HBASE_HOME=

#Set the path to where bin/hive is available
# Hive installation directory
export HIVE_HOME=/usr/local/hive-3.1.2
#Set the path for where zookeper config dir is
#export ZOOCFGDIR=

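   c. Add the following to the system profile /etc/profile (run source /etc/profile afterwards to apply it to the current shell)

       export SQOOP_HOME=/usr/local/sqoop-1.4.7

       export PATH=$PATH:$SQOOP_HOME/bin

   d. Upload the MySQL JDBC driver jar to the sqoop-1.4.7/lib directory; I used mysql-connector-java-5.1.30.jar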
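Before testing, it can help to confirm that steps a through d left the expected files in place. The following is a small sketch of such a check, assuming the directory layout used in this post; the function names check_file and check_sqoop_install are made up for illustration and are not part of Sqoop.

```shell
#!/usr/bin/env bash
# Defaults to the install path used in this post if SQOOP_HOME is not set.
SQOOP_HOME="${SQOOP_HOME:-/usr/local/sqoop-1.4.7}"

# Report whether a required file exists; returns non-zero if it is missing.
check_file() {
  local label="$1" path="$2"
  if [ -e "$path" ]; then
    echo "OK      $label: $path"
  else
    echo "MISSING $label: $path"
    return 1
  fi
}

# Run all checks against one Sqoop install directory.
check_sqoop_install() {
  local home="$1" status=0
  check_file "sqoop-env.sh" "$home/conf/sqoop-env.sh" || status=1
  check_file "sqoop launcher" "$home/bin/sqoop" || status=1
  # Any mysql-connector jar in lib/ is accepted as the JDBC driver.
  if ls "$home"/lib/mysql-connector-*.jar >/dev/null 2>&1; then
    echo "OK      MySQL JDBC driver found in $home/lib"
  else
    echo "MISSING MySQL JDBC driver in $home/lib"
    status=1
  fi
  return "$status"
}

check_sqoop_install "$SQOOP_HOME" || echo "Fix the items marked MISSING, then retry."
```

The check only looks for files; it does not verify that the paths set in sqoop-env.sh are correct, which the list-databases test in the next step will reveal.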

3. Test Sqoop:

       a. List the databases: sqoop list-databases --connect jdbc:mysql://ip:3306/ --username <username> --password <password>

       b. List the tables in a database: sqoop list-tables --connect jdbc:mysql://ip:3306/hadoop_mr --username <username> --password <password>

The Linux commands used are shown below:

# Extract the archive
[root@hadoop02 local]# tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
# Rename the directory
[root@hadoop02 local]# mv sqoop-1.4.7.bin__hadoop-2.6.0 sqoop-1.4.7
[root@hadoop02 local]# cd sqoop-1.4.7
[root@hadoop02 sqoop-1.4.7]# cd conf/
[root@hadoop02 conf]# ls
oraoop-site-template.xml  sqoop-env-template.cmd  sqoop-env-template.sh  sqoop-site-template.xml  sqoop-site.xml
[root@hadoop02 conf]# cp sqoop-env-template.sh sqoop-env.sh
[root@hadoop02 conf]# vim sqoop-env.sh   # after editing, remember to upload the MySQL driver jar
# Test
[root@hadoop02 bin]# sqoop list-databases --connect jdbc:mysql://ip:3306/ --username kfyw --password ***
Warning: /usr/local/sqoop-1.4.7/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /usr/local/sqoop-1.4.7/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop-1.4.7/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
2020-03-31 16:44:42,941 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2020-03-31 16:44:42,967 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
2020-03-31 16:44:43,052 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
content_center_manager
content_manager
content_manager_t
hadoop_mr
hive_metadata
mysql
performance_schema
permissions
user_center_manager
[root@hadoop02 bin]# sqoop list-tables --connect jdbc:mysql://ip:3306/hadoop_mr --username kfyw --password ***
Warning: /usr/local/sqoop-1.4.7/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /usr/local/sqoop-1.4.7/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop-1.4.7/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
2020-03-31 16:46:15,435 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2020-03-31 16:46:15,458 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
2020-03-31 16:46:15,545 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
album_play_log
album_play_log_20191104_20191110
album_play_log_20191113_20191115
album_play_log_20191209_20191215
auth_suc_user
clean_user
detail_page_log
detail_page_log_20191104_20191110
detail_page_log_20191113_20191115
export_table_info
first_page_log
first_page_log_20191104_20191110
first_page_log_20191113_20191115
first_page_log_20191209_20191215
global_user_day
global_user_day_copy
global_user_week
persist_column
run_class
table_column_info
table_info
user_info
user_stay_duration_day
user_stay_duration_week

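4. Sqoop documentation; for details see: http://sqoop.apache.org/docs/

5. Note: when using --password-file to point to a file holding the password, write the password without a trailing newline (hence echo -n), and use > rather than >> so that rerunning the command does not append to an existing password:

echo -n "password"  > /home/app/mr_script/passwd.pwd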
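A slightly more defensive way to create the password file is sketched below. The helper name write_password_file is made up for illustration; it uses printf (which never adds a newline), overwrites instead of appending, and restricts the file to owner read-only, which the Sqoop user guide recommends for password files.

```shell
#!/usr/bin/env bash
# Write a password file with no trailing newline and owner-only read access.
write_password_file() {
  local password="$1" file="$2"
  # Remove any earlier copy so a read-only file from a previous run does not block the write.
  rm -f "$file"
  # umask 077 keeps the file private from the moment it is created.
  ( umask 077; printf '%s' "$password" > "$file" )
  chmod 400 "$file"
}

# Demo with a temporary file; in practice use a path such as /home/app/mr_script/passwd.pwd.
PASS_FILE="$(mktemp)"
write_password_file "password" "$PASS_FILE"
wc -c < "$PASS_FILE"   # the byte count should equal the password length
```

The file is then passed as, for example, sqoop list-databases --connect jdbc:mysql://ip:3306/ --username kfyw --password-file file:///home/app/mr_script/passwd.pwd. The file:// prefix is an assumption about your setup: depending on the Hadoop configuration, a bare path may be resolved against HDFS rather than the local filesystem.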

 
