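(I): Introduction and Environment
Sqoop is an Apache tool for transferring data between Hadoop and relational database servers.
Import: load data from relational databases such as MySQL and Oracle into Hadoop storage systems such as HDFS, Hive, and HBase.
Export: write data from the Hadoop file system out to a relational database.
Environment: CentOS 7, hadoop-2.7.6, hbase-1.3.2, zookeeper-3.4.10
Three Linux virtual machines:
# hostname, IP address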
master: 192.168.61.132
slaver1: 192.168.61.130
slaver2: 192.168.61.131
Download: http://mirrors.hust.edu.cn/apache/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
(II): Installation and Deployment
1. Extract the archive
cd /usr/local/services/
tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
mv sqoop-1.4.7.bin__hadoop-2.6.0 sqoop
2. Set the environment variables
sudo vim /etc/profile
Add the following lines:
export SQOOP_HOME=/usr/local/services/sqoop
export PATH=$SQOOP_HOME/bin:$PATH
Save the file, then run source /etc/profile so the new environment variables take effect.
To verify:
sqoop version
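If this step is repeated, the same export lines can end up in the profile more than once. A minimal sketch of a guard against that, assuming a helper named append_once (hypothetical, not part of Sqoop) that appends a line only when the exact line is not already in the file:

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
# $1 = line to add, $2 = target file
append_once() {
    local line="$1" file="$2"
    # -x matches whole lines, -F treats the text as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For example, append_once 'export SQOOP_HOME=/usr/local/services/sqoop' ~/.bash_profile followed by append_once 'export PATH=$SQOOP_HOME/bin:$PATH' ~/.bash_profile. Using a per-user profile here is an assumption; writing to /etc/profile as above needs root.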
3. Edit the sqoop-env.sh configuration file
cd /usr/local/services/sqoop/conf/
cp sqoop-env-template.sh sqoop-env.sh
vi sqoop-env.sh
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/usr/local/services/hadoop
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/usr/local/services/hadoop
#set the path to where bin/hbase is available
export HBASE_HOME=/usr/local/services/hbase
#Set the path to where bin/hive is available
#export HIVE_HOME=
#Set the path for where zookeper config dir is
export ZOOCFGDIR=/usr/local/services/zookeeper
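A wrong path in sqoop-env.sh usually only shows up later as a warning or a failed job, so it can help to check the directories right away. The sketch below uses a hypothetical helper, check_dirs, and the directory values from the settings above; adjust them if your layout differs.

```shell
# Hypothetical helper: report which of the given directories exist.
# Returns non-zero if at least one is missing.
check_dirs() {
    local dir missing=0
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "OK: $dir"
        else
            echo "MISSING: $dir"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Directories taken from the sqoop-env.sh settings above.
check_dirs /usr/local/services/hadoop \
           /usr/local/services/hbase \
           /usr/local/services/zookeeper \
    || echo "Fix the paths in sqoop-env.sh before continuing."
```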
4. Edit the configure-sqoop file
cd /usr/local/services/sqoop/bin/
vi configure-sqoop
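Comment out the following blocks, which look for HCatalog and Accumulo and print warnings when they are not installed: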
#if [ -z "${HCAT_HOME}" ]; then
# if [ -d "/usr/lib/hive-hcatalog" ]; then
# HCAT_HOME=/usr/lib/hive-hcatalog
# elif [ -d "/usr/lib/hcatalog" ]; then
# HCAT_HOME=/usr/lib/hcatalog
# else
# HCAT_HOME=${SQOOP_HOME}/../hive-hcatalog
# if [ ! -d ${HCAT_HOME} ]; then
# HCAT_HOME=${SQOOP_HOME}/../hcatalog
# fi
# fi
#fi
#if [ -z "${ACCUMULO_HOME}" ]; then
# if [ -d "/usr/lib/accumulo" ]; then
# ACCUMULO_HOME=/usr/lib/accumulo
# else
# ACCUMULO_HOME=${SQOOP_HOME}/../accumulo
# fi
#fi
## Moved to be a runtime check in sqoop.
#if [ ! -d "${HCAT_HOME}" ]; then
# echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
# echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
#fi
#if [ ! -d "${ACCUMULO_HOME}" ]; then
# echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
# echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
#fi
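Editing these blocks by hand is error-prone, so here is a rough sketch of doing it with a script. It assumes that in the shipped configure-sqoop each of these blocks starts with an unindented if line and ends with an unindented fi, with everything in between indented. The helper name comment_if_block is hypothetical. Check the result against the listing above before relying on it.

```shell
# Hypothetical helper: comment out one top-level "if ... fi" block.
# $1 = file, $2 = exact text the opening "if" line starts with.
# Assumes the closing "fi" of the block is the first unindented "fi" after it.
comment_if_block() {
    local file="$1" start="$2"
    awk -v start="$start" '
        !inside && index($0, start) == 1 { inside = 1 }   # opening line found
        inside { print "#" $0; if ($0 == "fi") inside = 0; next }
        { print }
    ' "$file" > "$file.tmp" && cat "$file.tmp" > "$file" && rm -f "$file.tmp"
}

CONFIGURE_SQOOP=/usr/local/services/sqoop/bin/configure-sqoop
if [ -f "$CONFIGURE_SQOOP" ]; then
    # Keep one backup of the untouched file.
    [ -f "$CONFIGURE_SQOOP.bak" ] || cp -p "$CONFIGURE_SQOOP" "$CONFIGURE_SQOOP.bak"
    comment_if_block "$CONFIGURE_SQOOP" 'if [ -z "${HCAT_HOME}" ]; then'
    comment_if_block "$CONFIGURE_SQOOP" 'if [ -z "${ACCUMULO_HOME}" ]; then'
    comment_if_block "$CONFIGURE_SQOOP" 'if [ ! -d "${HCAT_HOME}" ]; then'
    comment_if_block "$CONFIGURE_SQOOP" 'if [ ! -d "${ACCUMULO_HOME}" ]; then'
fi
```

Lines that are already commented no longer start with the search text, so running the script a second time leaves them unchanged.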
5. Add the JDBC driver JARs
(1) Oracle: copy ojdbc6.jar into /usr/local/services/sqoop/lib/
cp ojdbc6.jar /usr/local/services/sqoop/lib/
(2) MySQL: copy mysql-connector-java-5.1.34.jar into /usr/local/services/sqoop/lib/
cp mysql-connector-java-5.1.34.jar /usr/local/services/sqoop/lib/
6. Test Sqoop
sqoop list-tables --connect jdbc:oracle:thin:@192.168.61.1:1521:orcl --username user --password password
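list-tables: lists all tables in the database
orcl: your database (the Oracle SID in this URL format)
user: the database username
password: the database password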
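The connection string differs between the two drivers installed in step 5, and passing the password on the command line leaves it in the shell history. The sketch below shows both URL formats and uses Sqoop's -P option, which prompts for the password on the console. The function names are hypothetical, and any MySQL host, port (3306 is the usual default), and database name you pass in are assumptions, since only an Oracle database appears above.

```shell
# Build JDBC connection strings in the formats the two drivers expect.
oracle_jdbc_url() {   # args: host port sid
    printf 'jdbc:oracle:thin:@%s:%s:%s\n' "$1" "$2" "$3"
}
mysql_jdbc_url() {    # args: host port database
    printf 'jdbc:mysql://%s:%s/%s\n' "$1" "$2" "$3"
}

# Hypothetical wrapper: list tables and prompt for the password (-P)
# instead of passing it with --password.
list_tables() {       # args: jdbc_url username
    sqoop list-tables --connect "$1" --username "$2" -P
}

# Prints the same connection string used in the test command above.
oracle_jdbc_url 192.168.61.1 1521 orcl
```

With these in place, list_tables "$(oracle_jdbc_url 192.168.61.1 1521 orcl)" user runs the same check as the command above but asks for the password interactively.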
7. Import from Oracle into HBase with Sqoop
Prerequisite: Hadoop, HBase, and ZooKeeper are all running.
sqoop import --append --connect jdbc:oracle:thin:@192.168.61.1:1521:orcl --username user --password password -m 1 --table BAYONET --columns ID,BAYONETID,BAYONETNAME --hbase-create-table --hbase-table Bayonet --hbase-row-key ID --column-family BayonetInfo
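One way to confirm that the import wrote rows is to scan the target table from the HBase shell. The sketch below pipes a scan statement into hbase shell. The helper name hbase_scan_stmt and the limit of 5 rows are illustrative choices, and Bayonet is the table name from the import command above.

```shell
# Print an HBase shell statement that scans the first rows of a table.
hbase_scan_stmt() {   # args: table [limit, default 5]
    printf "scan '%s', {LIMIT => %s}\n" "$1" "${2:-5}"
}

# Run the scan only when the hbase launcher is on PATH;
# otherwise just show the statement that would be sent.
if command -v hbase >/dev/null 2>&1; then
    hbase_scan_stmt Bayonet 5 | hbase shell
else
    echo "hbase not found on PATH; statement: $(hbase_scan_stmt Bayonet 5)"
fi
```

If the import succeeded, the scan should list rows keyed by ID with cells under the BayonetInfo column family.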