Installing Sqoop on CentOS and Importing Data into HDFS

I. Preparation

1. Install Hadoop (covered in the previous post, so not repeated here)
2. Download and extract Sqoop (since my Hadoop is 2.3, I use sqoop-1.99.3-bin-hadoop200)

II. Basic Configuration

1. Configure environment variables

sudo vi /etc/profile

#set sqoop Environment
export SQOOP_HOME=/home/dev/sqoop
export HADOOP_COMMON_HOME=/home/dev/hadoop
export HADOOP_MAPRED_HOME=/home/dev/hadoop
export PATH=${SQOOP_HOME}/bin:$PATH
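
After saving, reload the profile so the variables take effect in the current shell, and sanity-check the result (expected values follow from the exports above):

source /etc/profile
echo $SQOOP_HOME   # should print /home/dev/sqoop
which sqoop.sh     # should resolve under ${SQOOP_HOME}/bin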
2. Add the Hadoop jars to the server classpath

vi ${SQOOP_HOME}/server/conf/catalina.properties

common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../server/lib/*.jar,/home/dev/hadoop/share/hadoop/common/*.jar,/home/dev/hadoop/share/hadoop/common/lib/*.jar,/home/dev/hadoop/share/hadoop/yarn/*.jar,/home/dev/hadoop/share/hadoop/hdfs/*.jar,/home/dev/hadoop/share/hadoop/mapreduce/*.jar
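
Before restarting anything, it is worth confirming that the wildcard paths above actually contain jars; an empty or misspelled directory is skipped silently. A quick check:

ls /home/dev/hadoop/share/hadoop/common/*.jar | head -3
ls /home/dev/hadoop/share/hadoop/mapreduce/*.jar | head -3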
3. Point Sqoop at the Hadoop configuration directory

vi ${SQOOP_HOME}/server/conf/sqoop.properties

org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/home/dev/hadoop/etc/hadoop
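
This directory should contain the usual Hadoop 2.x client configs; a quick way to confirm:

ls /home/dev/hadoop/etc/hadoop/*-site.xml
# expect core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml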
4. Download the MySQL JDBC driver jar (mysql-connector-5.1.8.jar) into ${SQOOP_HOME}/server/lib
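
For reference, one way to fetch the driver; the Maven Central URL and the artifact name there (mysql-connector-java-5.1.8.jar) are assumptions, and any 5.1.x copy of the driver works as long as it ends up in the server lib directory:

cd ${SQOOP_HOME}/server/lib
wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.8/mysql-connector-java-5.1.8.jar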

5. Upload the jar directories to HDFS (the HDFS paths mirror the local ones)

./bin/hadoop fs -mkdir -p /home/dev/sqoop/server/
./bin/hadoop fs -put -p /home/dev/sqoop/server/lib /home/dev/sqoop/server/

./bin/hadoop fs -mkdir -p /home/dev/sqoop/server/webapps/sqoop/WEB-INF/
./bin/hadoop fs -put -p /home/dev/sqoop/server/webapps/sqoop/WEB-INF/lib  /home/dev/sqoop/server/webapps/sqoop/WEB-INF/

./bin/hadoop fs -mkdir -p /home/dev/hadoop/share/hadoop/common/
./bin/hadoop fs -put -p /home/dev/hadoop/share/hadoop/common/lib  /home/dev/hadoop/share/hadoop/common/
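
You can verify the uploads landed where expected before starting any jobs:

./bin/hadoop fs -ls /home/dev/sqoop/server/lib | head -3
./bin/hadoop fs -ls /home/dev/hadoop/share/hadoop/common/lib | head -3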

III. Usage

1. First, start the bundled web server

[dev@master sqoop]$ sqoop.sh server start
Sqoop home directory: /home/dev/sqoop
Setting SQOOP_HTTP_PORT:     12000
Setting SQOOP_ADMIN_PORT:     12001
Using   CATALINA_OPTS:       
Adding to CATALINA_OPTS:    -Dsqoop.http.port=12000 -Dsqoop.admin.port=12001
Using CATALINA_BASE:   /home/dev/sqoop/server
Using CATALINA_HOME:   /home/dev/sqoop/server
Using CATALINA_TMPDIR: /home/dev/sqoop/server/temp
Using JRE_HOME:        /usr/local/jdk
Using CLASSPATH:       /home/dev/sqoop/server/bin/bootstrap.jar

Note: if log4j-related errors show up at startup, you can comment out the log4j entries in ${SQOOP_HOME}/server/conf/sqoop.properties
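
To confirm the server is actually up, you can hit its REST endpoint; the /sqoop/version path is an assumption based on the --webapp sqoop setting used below:

curl http://master:12000/sqoop/version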

2. Then open the client shell

[dev@master sqoop]$ sqoop.sh client
Sqoop home directory: /home/dev/sqoop
Sqoop Shell: Type 'help' or '\h' for help.

sqoop:000> 
3. From here on it is all Sqoop shell commands; type help to see detailed usage

a. Point the client at the server

sqoop:000> set server --host master --port 12000 --webapp sqoop
Server is set successfully
b. Check version information
sqoop:000> show version --all
client version:
  Sqoop 1.99.3 revision 2404393160301df16a94716a3034e31b03e27b0b 
  Compiled by mengweid on Fri Oct 18 14:15:53 EDT 2013
server version:
  Sqoop 1.99.3 revision 2404393160301df16a94716a3034e31b03e27b0b 
  Compiled by mengweid on Fri Oct 18 14:15:53 EDT 2013
Protocol version:
  [1]
sqoop:000> 
c. Create a database connection
sqoop:000> create connection --cid 1
Creating connection for connector with id 1
Please fill following values to create new connection object
Name: Mysql-H216

Connection configuration

JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://172.16.10.216:3306/mic_db_out?characterEncoding=UTF-8
Username: admin
Password: *****
JDBC Connection Properties: 
There are currently 0 values in the map:
entry# 

Security related configuration options

Max connections: 100
New connection was successfully created with validation status FINE and persistent id 1
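
To review what was stored, the shell can display the connection just created; the --xid flag mirrors the one used when creating a job below (a sketch, check help show for the exact options in your build):

sqoop:000> show connection --xid 1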
d. Create an import job
sqoop:000> create job --xid 1 --type import
Creating job for connection with id 1
Please fill following values to create new job object
Name: HeartBeat

Database configuration

Schema name: mic_db_out
Table name: t_heart_beat
Table SQL statement: 
Table column names: 
Partition column name: 
Nulls in partition column: 
Boundary query: 

Output configuration

Storage type: 
  0 : HDFS
Choose: 0
Output format: 
  0 : TEXT_FILE
  1 : SEQUENCE_FILE
Choose: 0
Compression format: 
  0 : NONE
  1 : DEFAULT
  2 : DEFLATE
  3 : GZIP
  4 : BZIP2
  5 : LZO
  6 : LZ4
  7 : SNAPPY
Choose: 0
Output directory: /mysql/h216/t_heart_beat

Throttling resources

Extractors: 
Loaders: 
New job was successfully created with validation status FINE  and persistent id 1
sqoop:000> 
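
Before running it, show job lets you review the definition; again a sketch, with the --jid flag matching the one used with start job below:

sqoop:000> show job --jid 1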

e. Now start the job to import the data

sqoop:000> start job --jid 1
Submission details
Job ID: 1
Server URL: http://localhost:12000/sqoop/
Created by: dev
Creation date: 2014-04-19 18:54:25 CST
Lastly updated by: dev
External ID: job_local1638775039_0002
2014-04-19 18:54:50 CST: UNKNOWN 
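
The submission returns asynchronously (hence the UNKNOWN status line above); progress can be polled from the shell, roughly like this in the 1.99.x client:

sqoop:000> status job --jid 1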
f. Check the data in HDFS
[dev@master hadoop]$ ./bin/hadoop fs -ls /mysql/h216/t_heart_beat
Found 11 items
-rw-r--r--   2 dev supergroup          0 2014-04-19 18:54 /mysql/h216/t_heart_beat/_SUCCESS
-rw-r--r--   2 dev supergroup     666901 2014-04-19 18:54 /mysql/h216/t_heart_beat/part-m-00000
-rw-r--r--   2 dev supergroup     664052 2014-04-19 18:54 /mysql/h216/t_heart_beat/part-m-00001
-rw-r--r--   2 dev supergroup     663652 2014-04-19 18:54 /mysql/h216/t_heart_beat/part-m-00002
-rw-r--r--   2 dev supergroup     663113 2014-04-19 18:54 /mysql/h216/t_heart_beat/part-m-00003
-rw-r--r--   2 dev supergroup     669380 2014-04-19 18:54 /mysql/h216/t_heart_beat/part-m-00004
-rw-r--r--   2 dev supergroup     664585 2014-04-19 18:54 /mysql/h216/t_heart_beat/part-m-00005
-rw-r--r--   2 dev supergroup     657891 2014-04-19 18:54 /mysql/h216/t_heart_beat/part-m-00006
-rw-r--r--   2 dev supergroup     662798 2014-04-19 18:54 /mysql/h216/t_heart_beat/part-m-00007
-rw-r--r--   2 dev supergroup     660777 2014-04-19 18:54 /mysql/h216/t_heart_beat/part-m-00008
-rw-r--r--   2 dev supergroup     641260 2014-04-19 18:54 /mysql/h216/t_heart_beat/part-m-00009
[dev@master hadoop]$ 
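
To eyeball a few imported rows, cat the head of one part file; the output is plain delimited text since TEXT_FILE was chosen above:

./bin/hadoop fs -cat /mysql/h216/t_heart_beat/part-m-00000 | head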


OK, the import succeeded.

