The Hadoop environment is Hadoop 2.2.0.

The Sqoop package used is sqoop-1.99.3-bin-hadoop200. Download URL:

http://apache.fayea.com/apache-mirror/sqoop/1.99.3/sqoop-1.99.3-bin-hadoop200.tar.gz

1. Extract the archive into the working directory:

tar -xzvf sqoop-1.99.3-bin-hadoop200.tar.gz

mv sqoop-1.99.3-bin-hadoop200 /usr/app/sqoop


2. Set the environment variables:

vim /etc/profile

Add the Sqoop home directory:

export SQOOP_HOME=/usr/app/sqoop

export PATH=$PATH:$SQOOP_HOME/bin

export CATALINA_BASE=$SQOOP_HOME/server

export LOGDIR=$SQOOP_HOME/logs/


Apply the changes immediately:

source /etc/profile
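To confirm that the variables from this step are visible in the current shell, a small check along the following lines may help. This is only a sketch: `check_var` is a hypothetical helper name introduced here and is not part of Sqoop.

```shell
#!/bin/sh
# check_var is a hypothetical helper used only in this sketch.
# It prints NAME=value when the variable is set and non-empty,
# otherwise it prints a notice and returns 1.
check_var() {
    name="$1"
    # Indirectly read the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
        echo "$name=$value"
    else
        echo "$name is not set"
        return 1
    fi
}

# Check the three variables exported in /etc/profile above.
for v in SQOOP_HOME CATALINA_BASE LOGDIR; do
    check_var "$v" || true
done
```

If any of the three is reported as not set, the profile was probably not sourced in the shell you are using.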


3. Edit the Sqoop configuration:

vi server/conf/sqoop.properties

Change the Hadoop location that follows org.apache.sqoop.submission.engine.mapreduce.configuration.directory to the Hadoop configuration directory of your own installation. Mine is: /usr/app/hadoop-2.2.0/etc/hadoop/
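The same edit can be scripted instead of being made by hand in vi. The following is a minimal sketch: `set_hadoop_conf_dir` is a hypothetical helper name, and the sketch assumes the property appears on a single `key=value` line in sqoop.properties.

```shell
#!/bin/sh
# set_hadoop_conf_dir is a hypothetical helper, not part of Sqoop.
# $1: path to sqoop.properties, $2: Hadoop configuration directory.
set_hadoop_conf_dir() {
    props="$1"
    conf_dir="$2"
    key='org.apache.sqoop.submission.engine.mapreduce.configuration.directory'
    # Keep a backup copy before rewriting the file.
    cp "$props" "$props.bak"
    # Replace everything after "key=" with the new directory.
    # '|' is used as the sed delimiter because the value contains slashes.
    sed "s|^\($key\)=.*|\1=$conf_dir|" "$props.bak" > "$props"
}

# Example call with the paths used in this guide; skipped if the file is absent.
PROPS=/usr/app/sqoop/server/conf/sqoop.properties
if [ -f "$PROPS" ]; then
    set_hadoop_conf_dir "$PROPS" /usr/app/hadoop-2.2.0/etc/hadoop/
fi
```

The original file is kept next to it as sqoop.properties.bak, so the change is easy to undo.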


4. Set the path from which Sqoop loads the Hadoop jars:

vi $SQOOP_HOME/server/conf/catalina.properties

On the common.loader line, replace /usr/lib/hadoop/lib/*.jar with the Hadoop jar directories of your own installation. Mine are:

/usr/app/hadoop-2.2.0/share/hadoop/common/*.jar,

/usr/app/hadoop-2.2.0/share/hadoop/common/lib/*.jar,

/usr/app/hadoop-2.2.0/share/hadoop/hdfs/*.jar,

/usr/app/hadoop-2.2.0/share/hadoop/hdfs/lib/*.jar,

/usr/app/hadoop-2.2.0/share/hadoop/mapreduce/*.jar,

/usr/app/hadoop-2.2.0/share/hadoop/mapreduce/lib/*.jar,

/usr/app/hadoop-2.2.0/share/hadoop/tools/*.jar,

/usr/app/hadoop-2.2.0/share/hadoop/tools/lib/*.jar,

/usr/app/hadoop-2.2.0/share/hadoop/yarn/*.jar,

/usr/app/hadoop-2.2.0/share/hadoop/yarn/lib/*.jar
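Typing ten paths by hand is error-prone, so the list above can also be generated. The sketch below prints the same entries as one comma-separated value for a given Hadoop install directory. `hadoop_loader_paths` is a hypothetical helper name, and the output is meant to replace only the Hadoop entry on the common.loader line, leaving the other entries on that line in place.

```shell
#!/bin/sh
# hadoop_loader_paths is a hypothetical helper used only in this sketch.
# $1: Hadoop install directory. Prints the jar globs for each module
# (and its lib subdirectory) joined by commas, in the order listed above.
hadoop_loader_paths() {
    share="$1/share/hadoop"
    out=""
    for module in common hdfs mapreduce tools yarn; do
        for sub in "" "/lib"; do
            entry="$share/$module$sub/*.jar"
            # Add a comma separator only between entries.
            out="${out:+$out,}$entry"
        done
    done
    printf '%s\n' "$out"
}

# Example call with the Hadoop directory used in this guide.
hadoop_loader_paths /usr/app/hadoop-2.2.0
```

The globs are printed literally and are not expanded by the shell, which is the form catalina.properties expects.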

At this point the basic Sqoop configuration is complete, and Sqoop can be started directly with ./bin/sqoop.sh server start.


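5. Copy the mysql-connector-java jar into the lib directory under the Sqoop root directory (note: the root directory has no lib directory by default, so create one).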
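This step can be sketched as follows. `install_jdbc_driver` is a hypothetical helper name, and the jar file name in the example call is a placeholder, since the connector version is not given here.

```shell
#!/bin/sh
# install_jdbc_driver is a hypothetical helper used only in this sketch.
# $1: path to the JDBC driver jar, $2: Sqoop root directory.
install_jdbc_driver() {
    jar="$1"
    sqoop_home="$2"
    # Create the lib directory if it does not exist yet.
    mkdir -p "$sqoop_home/lib"
    cp "$jar" "$sqoop_home/lib/"
}

# Example call; the jar name is a placeholder for the file you downloaded.
JAR=./mysql-connector-java.jar
if [ -f "$JAR" ]; then
    install_jdbc_driver "$JAR" /usr/app/sqoop
fi
```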


6. Using Sqoop:


Start Sqoop:

$./bin/sqoop.sh server start

Stop Sqoop:

$./bin/sqoop.sh server stop


Enter the interactive client mode:

$./bin/sqoop.sh client

Sqoop home directory: /usr/lib/sqoop

Sqoop Shell: Type 'help' or '\h' for help.


sqoop:000>


Configure the server for the client:

sqoop:000> set server --host master --port 12000 --webapp sqoop

Server is set successfully


Show version information:

sqoop:000> show version --all

client version:

 Sqoop 1.99.3 revision 2404393160301df16a94716a3034e31b03e27b0b

 Compiled by mengweid on Fri Oct 18 14:15:53 EDT 2013

server version:

 Sqoop 1.99.3 revision 2404393160301df16a94716a3034e31b03e27b0b

 Compiled by mengweid on Fri Oct 18 14:15:53 EDT 2013

Protocol version:

 [1]


List the connectors:

sqoop:000> show connector --all

1 connector(s) to show:

Connector with id 1:

 Name: generic-jdbc-connector

 Class: org.apache.sqoop.connector.jdbc.GenericJdbcConnector

 Supported job types: [EXPORT, IMPORT]

...

Create a database connection:

sqoop:000> create connection --cid 1  

Creating connection for connector with id 1  

Please fill following values to create new connection object  

Name: Mysql-H216  


Connection configuration  


JDBC Driver Class: com.mysql.jdbc.Driver  

JDBC Connection String: jdbc:mysql://192.168.1.120:3306/test

Username: admin  

Password: *****  

JDBC Connection Properties:  

There are currently 0 values in the map:  

entry#  


Security related configuration options  


Max connections: 100  


New connection was successfully created with validation status FINE and persistent id 1


Create an import job:

sqoop:000> create job --xid 1 --type import  

Creating job for connection with id 1  

Please fill following values to create new job object  

Name: HeartBeat  


Database configuration  


Schema name: mic_db_out  

Table name: t_heart_beat  

Table SQL statement:  

Table column names:  

Partition column name:  

Nulls in partition column:  

Boundary query:  


Output configuration  


Storage type:  

 0 : HDFS  

Choose: 0  

Output format:  

 0 : TEXT_FILE  

 1 : SEQUENCE_FILE  

Choose: 0  

Compression format:  

 0 : NONE  

 1 : DEFAULT  

 2 : DEFLATE  

 3 : GZIP  

 4 : BZIP2  

 5 : LZO  

 6 : LZ4  

 7 : SNAPPY  

Choose: 0  

Output directory: /user/jarcec/users


Throttling resources  


Extractors:  

Loaders:  

New job was successfully created with validation status FINE  and persistent id 1  

sqoop:000>


Check the import status:

sqoop:000> status job --jid 1  

Submission details  

Job ID: 1  

Server URL: http://master:12000/sqoop/  

Created by: dev  

Creation date: 2014-04-19 18:54:25 CST  

Lastly updated by: dev  

External ID: job_local1638775039_0002  

2014-04-19 18:54:50 CST: UNKNOWN  


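Related commands:

Start a job and run it synchronously: start job --jid 1 -s
Show the status of a job: status job --jid 1
Show all jobs: show job -a
Stop a job: stop job --jid 1
Clone a connection: clone connection --xid 1
Clone a job: clone job --jid 1
Delete a connection: delete connection --xid 1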
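The commands above can also be collected in a file rather than typed interactively. As far as I know, the Sqoop 1.99.x client accepts a script file as an argument to `sqoop.sh client`, with `#` lines treated as comments; treat that as an assumption and check it against your version. The file name job1.sqoop is an arbitrary choice for this sketch.

```shell
#!/bin/sh
# Sketch: write a few of the client commands from this guide into a script file.
# The file name job1.sqoop is an arbitrary choice for this example.
SCRIPT=./job1.sqoop
cat > "$SCRIPT" <<'EOF'
# Point the client at the server, then run job 1 synchronously
set server --host master --port 12000 --webapp sqoop
start job --jid 1 -s
status job --jid 1
EOF

# Run the script only if the Sqoop launcher is available.
# Assumes SQOOP_HOME from step 2, falling back to /usr/app/sqoop.
LAUNCHER="${SQOOP_HOME:-/usr/app/sqoop}/bin/sqoop.sh"
if [ -x "$LAUNCHER" ]; then
    "$LAUNCHER" client "$SCRIPT"
fi
```

Keeping the commands in a file makes a job easy to rerun, for example from cron.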