Learning Sqoop2 (Part 1): Introduction to and Installation of Sqoop 1.99.3

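Guiding questions:

         1. What does installing the Sqoop2 server depend on?

         2. How should the common.loader value be configured?

         3. Which two ports does the Sqoop server use by default? (12000 and 12001)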

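I. Introduction to Sqoop2

         Sqoop2 (the 1.99.x release line) is a tool for efficient bulk data transfer between Hadoop and structured data stores such as relational databases. Sqoop 1.99.3 currently supports importing data into HDFS; more features are expected in later releases.

         The Sqoop2 distribution has two parts, a server and a client. Install the server on one node of the Hadoop cluster (usually the master node); the client can be installed on any machine and does not depend on Hadoop. The server is the single entry point for all client requests and also acts as a MapReduce client, so Hadoop must be installed and configured on the server node.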

II. Downloading the Sqoop2 Package

         1. Download sqoop-1.99.3-bin-hadoop200.tar.gz and extract it to a directory of your choice. Mine is:

[hadoopUser@secondmgt sqoop-1.99.3-bin-hadoop200]$ pwd
/home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200

         2. Edit .bashrc and add the following settings:

#Sqoop Configure
export SQOOP_HOME=/home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200
export PATH=$PATH:$SQOOP_HOME/bin
export CATALINA_HOME=$SQOOP_HOME/server
export LOGDIR=$SQOOP_HOME/logs
          Run source ~/.bashrc to apply the changes to the current shell.
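
A quick sanity check after reloading .bashrc can catch a mistyped path early. The sketch below is only an illustration: check_sqoop_env is a hypothetical helper name, not something shipped with Sqoop, and it tests nothing beyond the two variables set above.

```shell
# check_sqoop_env is a hypothetical helper, not part of Sqoop.
# It returns 0 when SQOOP_HOME points at a directory and sqoop.sh is on PATH.
check_sqoop_env() {
    if [ -z "${SQOOP_HOME:-}" ] || [ ! -d "$SQOOP_HOME" ]; then
        echo "SQOOP_HOME is unset or not a directory" >&2
        return 1
    fi
    if ! command -v sqoop.sh >/dev/null 2>&1; then
        echo "sqoop.sh not found on PATH" >&2
        return 1
    fi
    # Report what the shell actually resolved.
    echo "SQOOP_HOME=$SQOOP_HOME"
    echo "sqoop.sh=$(command -v sqoop.sh)"
}
```

Calling check_sqoop_env in a fresh shell should print both values; a non-zero exit status means .bashrc was not sourced or one of the paths is wrong.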

III. Installing the Server

        1. Change to the conf directory under server in the extracted package

[hadoopUser@secondmgt ~]$ cd cloud/sqoop/sqoop-1.99.3-bin-hadoop200/server/conf/
[hadoopUser@secondmgt conf]$ pwd
/home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200/server/conf
[hadoopUser@secondmgt conf]$ ls
catalina.policy  catalina.properties  context.xml  logging.properties  server.xml  sqoop_bootstrap.properties  sqoop.properties  tomcat-users.xml  web.xml
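               As the listing shows, server/conf holds the server's configuration files, which need to be adjusted to the local environment.

        2. First, edit catalina.properties

               Change the value of common.loader so that the Hadoop jars are loaded. The original value is commented out and replaced with paths into the local Hadoop installation: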

# common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar,/usr/lib/hadoop/*.jar,/usr/lib/hadoop/lib/*.jar,/usr/lib/hadoop-hdfs/*.jar,/usr/lib/hadoop-hdfs/lib/*.jar,/usr/lib/hadoop-mapreduce/*.jar,/usr/lib/hadoop-mapreduce/lib/*.jar,/usr/lib/hadoop-yarn/*.jar,/usr/lib/hadoop-yarn/lib/*.jar

common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar,/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/common/*.jar,/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/common/lib/*.jar,/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/hdfs/*.jar,/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/hdfs/lib/*.jar,/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/mapreduce/*.jar,/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/mapreduce/lib/*.jar,/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/tools/*.jar,/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/tools/lib/*.jar,/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/yarn/*.jar,/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/yarn/lib/*.jar,/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/httpfs/tomcat/lib/*.jar
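
Typing this long value by hand is error-prone, so it can help to generate it. The sketch below assumes the Hadoop 2.x tarball layout used above, with jars under share/hadoop/<module>; build_common_loader is a hypothetical helper name, not part of Sqoop or Tomcat.

```shell
# build_common_loader is a hypothetical helper, not part of Sqoop or Tomcat.
# It prints a common.loader line for a Hadoop home laid out like the 2.2.0 tarball.
build_common_loader() {
    hadoop_home=$1
    # Tomcat's own entries; single quotes keep the ${...} placeholders literal.
    loader='${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar'
    # Each Hadoop module contributes its own jars plus its lib directory.
    for module in common hdfs mapreduce tools yarn; do
        base="$hadoop_home/share/hadoop/$module"
        loader="$loader,$base/*.jar,$base/lib/*.jar"
    done
    loader="$loader,$hadoop_home/share/hadoop/httpfs/tomcat/lib/*.jar"
    printf 'common.loader=%s\n' "$loader"
}
```

For example, build_common_loader /home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0 prints a line equivalent to the one above, ready to paste into catalina.properties. A packaged distribution that installs jars under /usr/lib/hadoop uses different paths, as the commented-out default shows.
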
        3. Next, leave sqoop_bootstrap.properties at its default value:

sqoop.config.provider=org.apache.sqoop.core.PropertiesConfigurationProvider
        4. Finally, edit sqoop.properties

              JDBC settings for the Derby repository database:

# JDBC repository provider configuration
org.apache.sqoop.repository.jdbc.handler=org.apache.sqoop.repository.derby.DerbyRepositoryHandler
org.apache.sqoop.repository.jdbc.transaction.isolation=READ_COMMITTED
org.apache.sqoop.repository.jdbc.maximum.connections=10
org.apache.sqoop.repository.jdbc.url=jdbc:derby:@BASEDIR@/repository/sqoopdb;create=true
org.apache.sqoop.repository.jdbc.driver=org.apache.derby.jdbc.EmbeddedDriver
org.apache.sqoop.repository.jdbc.user=sa
org.apache.sqoop.repository.jdbc.password=
               Also set the org.apache.sqoop.submission.engine.mapreduce.configuration.directory property to the configuration directory of your own Hadoop installation:
# Hadoop configuration directory
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/etc/hadoop/
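
If the server is set up on more than one machine, the same change can be scripted. This is a minimal sketch under the assumption that the property already exists on a line of its own in sqoop.properties; set_hadoop_conf_dir is a hypothetical helper name.

```shell
# set_hadoop_conf_dir is a hypothetical helper, not part of Sqoop.
# It rewrites the MapReduce configuration directory in a sqoop.properties file.
set_hadoop_conf_dir() {
    props_file=$1
    conf_dir=$2
    key='org.apache.sqoop.submission.engine.mapreduce.configuration.directory'
    tmp_file=$(mktemp)
    # awk matches the key as a plain string, so its dots are not regex wildcards.
    awk -v key="$key" -v val="$conf_dir" '
        index($0, key "=") == 1 { print key "=" val; next }
        { print }
    ' "$props_file" > "$tmp_file" && cat "$tmp_file" > "$props_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}
```

A call such as set_hadoop_conf_dir "$SQOOP_HOME/server/conf/sqoop.properties" /home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/etc/hadoop/ would produce the line shown above. If the key is missing from the file, the helper leaves the file unchanged.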

         5. Copy the MySQL JDBC driver jar into the server/lib directory

[hadoopUser@secondmgt lib]$ pwd
/home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200/server/lib
[hadoopUser@secondmgt lib]$ ls
annotations-api.jar  catalina.jar         el-api.jar     jsp-api.jar                          tomcat-coyote.jar   tomcat-i18n-fr.jar
catalina-ant.jar     catalina-tribes.jar  jasper-el.jar  mysql-connector-java-5.1.18-bin.jar  tomcat-dbcp.jar     tomcat-i18n-ja.jar
catalina-ha.jar      ecj-3.7.2.jar        jasper.jar     servlet-api.jar                      tomcat-i18n-es.jar
          If the data to be transferred lives in Oracle, copy the Oracle JDBC driver jar into the same directory instead.

IV. Starting and Stopping the Sqoop2 Server

       1. Start

[hadoopUser@secondmgt sqoop-1.99.3-bin-hadoop200]$ sqoop.sh server start
Sqoop home directory: /home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200
Setting SQOOP_HTTP_PORT:     12000
Setting SQOOP_ADMIN_PORT:     12001
Using   CATALINA_OPTS:       
Adding to CATALINA_OPTS:    -Dsqoop.http.port=12000 -Dsqoop.admin.port=12001
Using CATALINA_BASE:   /home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200/server
Using CATALINA_HOME:   /home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200/server
Using CATALINA_TMPDIR: /home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200/server/temp
Using JRE_HOME:        /home/hadoopUser/cloud/jdk/jdk1.7.0_60
Using CLASSPATH:       /home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200/server/bin/bootstrap.jar

        Run jps to list the Java processes. A new Bootstrap process has appeared, which means the server started successfully.

[hadoopUser@secondmgt ~]$ jps
19877 ResourceManager
19637 SecondaryNameNode
16591 org.eclipse.equinox.launcher_1.3.0.v20130327-1440.jar
27234 Jps
19424 NameNode
27149 Bootstrap
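
The same check can be automated. The sketch below only looks for a Bootstrap entry in the jps listing; bootstrap_running is a hypothetical helper name, and since any Tomcat instance shows up under that name, a match is a hint rather than proof that the Sqoop server is up.

```shell
# bootstrap_running is a hypothetical helper. It reads `jps` output on stdin
# and succeeds when a process whose main class is Bootstrap is listed.
bootstrap_running() {
    awk '$2 == "Bootstrap" { found = 1 } END { exit(found ? 0 : 1) }'
}

# Only query the JVMs when jps is available on this machine.
if command -v jps >/dev/null 2>&1; then
    if jps | bootstrap_running; then
        echo "Sqoop server (Tomcat Bootstrap) appears to be running"
    else
        echo "No Bootstrap process found"
    fi
fi
```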

      2. Stop

[hadoopUser@secondmgt sqoop-1.99.3-bin-hadoop200]$ sqoop.sh server stop
Sqoop home directory: /home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200
Setting SQOOP_HTTP_PORT:     12000
Setting SQOOP_ADMIN_PORT:     12001
Using   CATALINA_OPTS:       
Adding to CATALINA_OPTS:    -Dsqoop.http.port=12000 -Dsqoop.admin.port=12001
Using CATALINA_BASE:   /home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200/server
Using CATALINA_HOME:   /home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200/server
Using CATALINA_TMPDIR: /home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200/server/temp
Using JRE_HOME:        /home/hadoopUser/cloud/jdk/jdk1.7.0_60
Using CLASSPATH:       /home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200/server/bin/bootstrap.jar
Jan 16, 2015 12:30:24 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
WARNING: Problem with directory [/home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200/lib], exists: [false], isDirectory: [false], canRead: [false]

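V. Installing the Sqoop Client

        The Sqoop client needs no extra configuration or installation steps. Extract the package to a directory of your choice, configure .bashrc as described earlier, and run the following command: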

[hadoopUser@secondmgt ~]$ sqoop.sh client
Sqoop home directory: /home/hadoopUser/cloud/sqoop/sqoop-1.99.3-bin-hadoop200
Sqoop Shell: Type 'help' or '\h' for help.

sqoop:000>
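
Besides the interactive prompt, the client can also be given a script file to run in batch mode. The sketch below is an assumption-laden example: write_client_script is a hypothetical helper, the host localhost is a placeholder for your server node, and the two shell commands (set server, show version) are the ones described in the Sqoop2 client documentation.

```shell
# write_client_script is a hypothetical helper. It writes a small Sqoop shell
# script that points the client at a server and prints version information.
write_client_script() {
    script_path=$1
    host=${2:-localhost}
    port=${3:-12000}
    cat > "$script_path" <<EOF
set server --host $host --port $port --webapp sqoop
show version --all
EOF
}

script=$(mktemp)
write_client_script "$script" localhost 12000

# Run in batch mode only when the Sqoop launcher is installed.
if command -v sqoop.sh >/dev/null 2>&1; then
    sqoop.sh client "$script"
fi
rm -f "$script"
```

Port 12000 is the HTTP port reported by the server at startup. If the server runs on another node, pass that node's hostname as the second argument.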
