0. Preparation: Hadoop servers
10.156.50.35 yanfabu2-35.base.app.dev.yf zk1 hadoop1 master1 master
10.156.50.36 yanfabu2-36.base.app.dev.yf zk2 hadoop2 master2
10.156.50.37 yanfabu2-37.base.app.dev.yf zk3 hadoop3 slaver1
1. Unpack Sqoop
tar xf sqoop-1.99.7-bin-hadoop200.tar.gz
rm -rf sqoop-1.99.7-bin-hadoop200.tar.gz
mv sqoop-1.99.7-bin-hadoop200 sqoop2
2. Create directories
mkdir -p /home/zkkafka/sqoop2/extra
mkdir -p /home/zkkafka/sqoop2/logs
3. Configure environment variables
vi ~/.bash_profile

export SQOOP_HOME=/home/zkkafka/sqoop2
export PATH=$PATH:$SQOOP_HOME/bin
export SQOOP_SERVER_EXTRA_LIB=$SQOOP_HOME/extra
export CATALINA_BASE=$SQOOP_HOME/server
export LOGDIR=$SQOOP_HOME/logs/

source ~/.bash_profile
For reference, the complete ~/.bash_profile on this machine now looks like this:

=================================================================================
PATH=$PATH:$HOME/.local/bin:$HOME/bin
export PATH
export LANG="zh_CN.utf8"

export JAVA_HOME=/home/zkkafka/jdk1.8.0_151
export ZOOKEEPER_HOME=/home/zkkafka/zookeeper-3.4.6
export CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:
export PATH=$JAVA_HOME/bin:$PATH
export PATH=$PATH:$ZOOKEEPER_HOME/bin:$ZOOKEEPER_HOME/conf

export KAFKA_HOME=/home/zkkafka/kafka_2.11-2.1.1
export PATH=$KAFKA_HOME/bin:$PATH

export HADOOP_HOME=/home/zkkafka/hadoop
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH

export HBASE_HOME=/home/zkkafka/hbase
export PATH=$HBASE_HOME/bin:$PATH

export HIVE_HOME=/home/zkkafka/hive
export PATH=$HIVE_HOME/bin:$PATH
export HIVE_CONF_DIR=$HIVE_HOME/conf

#export SQOOP_HOME=/home/zkkafka/sqoop
#export PATH=$PATH:$SQOOP_HOME/bin

export SQOOP_HOME=/home/zkkafka/sqoop2
export PATH=$PATH:$SQOOP_HOME/bin
export SQOOP_SERVER_EXTRA_LIB=$SQOOP_HOME/extra
export CATALINA_BASE=$SQOOP_HOME/server
export LOGDIR=$SQOOP_HOME/logs/
=================================================================================
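A quick sanity check that the new variables are visible in the current shell (expected values shown as comments; adjust if your paths differ):

source ~/.bash_profile
echo $SQOOP_HOME               # /home/zkkafka/sqoop2
echo $SQOOP_SERVER_EXTRA_LIB   # /home/zkkafka/sqoop2/extra
which sqoop2-tool              # should resolve to $SQOOP_HOME/bin/sqoop2-tool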
4. Edit the Sqoop configuration files
vim /home/zkkafka/sqoop2/conf/sqoop.properties

Point the MapReduce submission engine at the Hadoop configuration directory:

org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/home/zkkafka/hadoop/etc/hadoop

vim /home/zkkafka/sqoop2/conf/catalina.properties

Extend common.loader so the server can see the Hadoop jars:

common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar,${HADOOP_HOME}/share/hadoop/common/*.jar,${HADOOP_HOME}/share/hadoop/common/lib/*.jar,${HADOOP_HOME}/share/hadoop/hdfs/*.jar,${HADOOP_HOME}/share/hadoop/hdfs/lib/*.jar,${HADOOP_HOME}/share/hadoop/mapreduce/*.jar,${HADOOP_HOME}/share/hadoop/mapreduce/lib/*.jar,${HADOOP_HOME}/share/hadoop/tools/lib/*.jar,${HADOOP_HOME}/share/hadoop/yarn/*.jar,${HADOOP_HOME}/share/hadoop/yarn/lib/*.jar,${HADOOP_HOME}/share/hadoop/httpfs/tomcat/lib/*.jar
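If you prefer to script the sqoop.properties change rather than editing by hand, a minimal sketch (assumes GNU sed; the backup copy is a precaution, not part of the original steps):

cp /home/zkkafka/sqoop2/conf/sqoop.properties /home/zkkafka/sqoop2/conf/sqoop.properties.bak
sed -i 's|^org.apache.sqoop.submission.engine.mapreduce.configuration.directory=.*|org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/home/zkkafka/hadoop/etc/hadoop|' /home/zkkafka/sqoop2/conf/sqoop.properties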
5. Verify
sqoop2-tool verify

[zkkafka@yanfabu2-37 ~]$ sqoop2-tool verify
Setting conf dir: /home/zkkafka/sqoop2/bin/../conf
Sqoop home directory: /home/zkkafka/sqoop2
Sqoop tool executor:
    Version: 1.99.7
    Revision: 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb
    Compiled on Tue Jul 19 16:08:27 PDT 2016 by abefine
Running tool: class org.apache.sqoop.tools.tool.VerifyTool
0    [main] INFO  org.apache.sqoop.core.SqoopServer  - Initializing Sqoop server.
7    [main] INFO  org.apache.sqoop.core.PropertiesConfigurationProvider  - Starting config file poller thread
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/zkkafka/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/zkkafka/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Verification was successful.
Tool class org.apache.sqoop.tools.tool.VerifyTool has finished correctly.
6. Start the Sqoop2 server
sqoop.sh server start

[zkkafka@yanfabu2-37 ~]$ sqoop.sh server start
Setting conf dir: /home/zkkafka/sqoop2/bin/../conf
Sqoop home directory: /home/zkkafka/sqoop2
Starting the Sqoop2 server...
0    [main] INFO  org.apache.sqoop.core.SqoopServer  - Initializing Sqoop server.
7    [main] INFO  org.apache.sqoop.core.PropertiesConfigurationProvider  - Starting config file poller thread
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/zkkafka/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/zkkafka/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Sqoop2 server started.
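To confirm the server is actually listening on its default port 12000, you can query its REST API; the /version endpoint should return a small JSON document (exact fields vary by build):

curl http://localhost:12000/sqoop/version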
7. Start the Sqoop2 client
sqoop.sh client

[zkkafka@yanfabu2-37 ~]$ sqoop.sh client
Setting conf dir: /home/zkkafka/sqoop2/bin/../conf
Sqoop home directory: /home/zkkafka/sqoop2
May 30, 2019 5:05:35 PM java.util.prefs.FileSystemPreferences$1 run
INFO: Created user preferences directory.
Sqoop Shell: Type 'help' or '\h' for help.
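The shell assumes a server on localhost:12000. If the client runs on a different host than the server, point it at the server first, for example (the host name here is this cluster's master1; substitute your own):

sqoop:000> set server --host master1 --port 12000 --webapp sqoop
sqoop:000> show version --all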
8. Prepare MySQL
/*
Navicat MySQL Data Transfer

Source Server         : 10.156.50.36 hive
Source Server Version : 50717
Source Host           : 10.156.50.36:3306
Source Database       : mqh

Target Server Type    : MYSQL
Target Server Version : 50717
File Encoding         : 65001

Date: 2019-05-29 11:48:53
*/

SET FOREIGN_KEY_CHECKS=0;

-- ----------------------------
-- Table structure for `app`
-- ----------------------------
DROP TABLE IF EXISTS `app`;
CREATE TABLE `app` (
  `appid` int(11) DEFAULT NULL,
  `appname` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- ----------------------------
-- Records of app
-- ----------------------------
INSERT INTO `app` VALUES ('1', 'kaxinxiaoxiaole');
INSERT INTO `app` VALUES ('2', 'wangzherognyao');
INSERT INTO `app` VALUES ('3', 'cijizhanchang');

-- ----------------------------
-- Table structure for `device`
-- ----------------------------
DROP TABLE IF EXISTS `device`;
CREATE TABLE `device` (
  `deviceid` int(11) DEFAULT NULL,
  `appid` int(11) DEFAULT NULL,
  `register_time` datetime DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- ----------------------------
-- Records of device
-- ----------------------------
INSERT INTO `device` VALUES ('1', '1', '2019-05-29 11:25:54');
INSERT INTO `device` VALUES ('2', '1', '2019-05-29 11:26:04');
INSERT INTO `device` VALUES ('3', '1', '2019-05-30 11:26:14');
INSERT INTO `device` VALUES ('4', '1', '2019-05-29 11:26:30');
INSERT INTO `device` VALUES ('5', '2', '2019-05-29 11:26:38');
INSERT INTO `device` VALUES ('6', '3', '2019-05-29 11:26:45');
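Before involving Sqoop, it is worth confirming that the mqh database is reachable from the Sqoop host (this assumes the mysql command-line client is installed and root may connect remotely):

mysql -h 10.156.50.36 -P 3306 -u root -p -e "SELECT COUNT(*) FROM mqh.device;"
# Expected: 6, given the inserts above.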
9. Working with Sqoop2
9.1. List connectors
sqoop:000> show connector
0    [main] WARN  org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
+------------------------+---------+------------------------------------------------------------+----------------------+
|          Name          | Version |                           Class                            | Supported Directions |
+------------------------+---------+------------------------------------------------------------+----------------------+
| generic-jdbc-connector | 1.99.7  | org.apache.sqoop.connector.jdbc.GenericJdbcConnector       | FROM/TO              |
| kite-connector         | 1.99.7  | org.apache.sqoop.connector.kite.KiteConnector              | FROM/TO              |
| oracle-jdbc-connector  | 1.99.7  | org.apache.sqoop.connector.jdbc.oracle.OracleJdbcConnector | FROM/TO              |
| ftp-connector          | 1.99.7  | org.apache.sqoop.connector.ftp.FtpConnector                | TO                   |
| hdfs-connector         | 1.99.7  | org.apache.sqoop.connector.hdfs.HdfsConnector              | FROM/TO              |
| kafka-connector        | 1.99.7  | org.apache.sqoop.connector.kafka.KafkaConnector            | TO                   |
| sftp-connector         | 1.99.7  | org.apache.sqoop.connector.sftp.SftpConnector              | TO                   |
+------------------------+---------+------------------------------------------------------------+----------------------+
9.2. Create the MySQL link
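The generic JDBC connector needs the MySQL JDBC driver on the server classpath; the extra directory created in step 2 and exported as SQOOP_SERVER_EXTRA_LIB in step 3 exists for exactly this. A sketch, assuming a mysql-connector-java 5.1.x jar has already been downloaded (the file name below is illustrative), followed by a server restart so it is picked up:

cp mysql-connector-java-5.1.47.jar $SQOOP_SERVER_EXTRA_LIB/
sqoop.sh server stop
sqoop.sh server start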
sqoop:000> create link --connector generic-jdbc-connector
Creating link for connector with name generic-jdbc-connector
Please fill following values to create new link object
Name: mysql-link

Database connection

Driver class: com.mysql.jdbc.Driver
Connection String: jdbc:mysql://10.156.50.36:3306/mqh?useSSL=false
Username: root
Password: ****
Fetch Size:
Connection Properties:
There are currently 0 values in the map:
entry#

SQL Dialect

Identifier enclose:
Thu May 30 17:50:03 CST 2019 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
New link was successfully created with validation status OK and name mysql-link
9.3. Create the HDFS link
sqoop:000> create link --connector hdfs-connector
Creating link for connector with name hdfs-connector
Please fill following values to create new link object
Name: hdfs-link

HDFS cluster

URI: hdfs://master:9000
Conf directory: /home/zkkafka/hadoop/etc/hadoop
Additional configs::
There are currently 0 values in the map:
entry#
New link was successfully created with validation status OK and name hdfs-link
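If link validation fails here, first check that the cluster URI resolves and is reachable from the Sqoop server host:

hdfs dfs -ls hdfs://master:9000/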
9.4. show link
sqoop:000> show link
+------------+------------------------+---------+
| Name       | Connector Name         | Enabled |
+------------+------------------------+---------+
| mysql-link | generic-jdbc-connector | true    |
| hdfs-link  | hdfs-connector         | true    |
+------------+------------------------+---------+
9.5. Create the job
sqoop:000> create job -f mysql-link -t hdfs-link
Creating job for links with from name mysql-link and to name hdfs-link
Please fill following values to create new job object
Name: job-m2h

Database source

Schema name: mqh
Table name: device
SQL statement:
Column names:
There are currently 0 values in the list:
element#
Partition column:
Partition column nullable:
Boundary query:

Incremental read

Check column: deviceid
Last value: 6

Target configuration

Override null value:
Null value:
File format:
  0 : TEXT_FILE
  1 : SEQUENCE_FILE
  2 : PARQUET_FILE
Choose: 0
Compression codec:
  0 : NONE
  1 : DEFAULT
  2 : DEFLATE
  3 : GZIP
  4 : BZIP2
  5 : LZO
  6 : LZ4
  7 : SNAPPY
  8 : CUSTOM
Choose: 0
Custom codec:
Output directory: /hdfs/sqoop/device
Append mode:

Throttling resources

Extractors: 2
Loaders: 2

Classpath configuration

Extra mapper jars:
There are currently 0 values in the list:
element#
New job was successfully created with validation status OK and name job-m2h
9.6. show job
sqoop:000> show job
+----+---------+-------------------------------------+----------------------------+---------+
| Id | Name    | From Connector                      | To Connector               | Enabled |
+----+---------+-------------------------------------+----------------------------+---------+
| 4  | job-m2h | mysql-link (generic-jdbc-connector) | hdfs-link (hdfs-connector) | true    |
+----+---------+-------------------------------------+----------------------------+---------+
9.7. Start the job
Commonly used job and link management commands:

start job -name job-m2h
status job -name job-m2h
update job -name job-m2h
delete job -name job-m2h
update link -name mysql-link
delete link -name mysql-link

sqoop:000> start job -name job-m2h
Submission details
Job Name: job-m2h
Server URL: http://localhost:12000/sqoop/
Created by: zkkafka
Creation date: 2019-06-01 11:06:26 CST
Lastly updated by: zkkafka
External ID: job_1558676658010_0014
    http://master1:8088/proxy/application_1558676658010_0014/
2019-06-01 11:06:26 CST: BOOTING  - Progress is not available

While the job is still booting, the target directory does not exist yet:

[zkkafka@yanfabu2-35 ~]$ hdfs dfs -ls /hdfs
ls: `/hdfs': No such file or directory
[zkkafka@yanfabu2-35 ~]$ hdfs dfs -ls /
Found 3 items
drwxr-xr-x   - zkkafka supergroup          0 2019-05-24 14:09 /home
drwx-wx-wx   - zkkafka supergroup          0 2019-05-24 14:03 /tmp
drwx------   - zkkafka supergroup          0 2019-05-29 11:36 /user

Once the job completes, the output directory appears:

[zkkafka@yanfabu2-35 ~]$ hdfs dfs -ls /
Found 4 items
drwxr-xr-x   - zkkafka supergroup          0 2019-06-01 11:07 /hdfs
drwxr-xr-x   - zkkafka supergroup          0 2019-05-24 14:09 /home
drwx-wx-wx   - zkkafka supergroup          0 2019-05-24 14:03 /tmp
drwx------   - zkkafka supergroup          0 2019-05-29 11:36 /user
[zkkafka@yanfabu2-35 ~]$ hdfs dfs -ls /hdfs
Found 1 items
drwxr-xr-x   - zkkafka supergroup          0 2019-06-01 11:07 /hdfs/sqoop
[zkkafka@yanfabu2-35 ~]$ hdfs dfs -lsr /hdfs/sqoop
lsr: DEPRECATED: Please use 'ls -R' instead.
drwxr-xr-x   - zkkafka supergroup          0 2019-06-01 11:07 /hdfs/sqoop/device
-rw-r--r--   2 zkkafka supergroup         30 2019-06-01 11:07 /hdfs/sqoop/device/59327654-418e-44c7-b046-2bf231136301.txt
-rw-r--r--   2 zkkafka supergroup          0 2019-06-01 11:07 /hdfs/sqoop/device/7dc2e869-348b-4af3-8900-7825b1bbead3.txt
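To inspect what actually landed in HDFS (the 30-byte file above should contain the imported row(s) as plain text):

hdfs dfs -cat /hdfs/sqoop/device/*.txt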
sqoop:000> status job -name job-m2h
Submission details
Job Name: job-m2h
Server URL: http://localhost:12000/sqoop/
Created by: zkkafka
Creation date: 2019-06-01 14:38:52 CST
Lastly updated by: zkkafka
External ID: job_1559370613628_0001
    http://master1
2019-06-01 14:42:24 CST: SUCCEEDED
Counters:
    org.apache.hadoop.mapreduce.FileSystemCounter
        FILE_LARGE_READ_OPS: 0
        FILE_WRITE_OPS: 0
        HDFS_READ_OPS: 1
        HDFS_BYTES_READ: 154
        HDFS_LARGE_READ_OPS: 0
        FILE_READ_OPS: 0
        FILE_BYTES_WRITTEN: 791854
        FILE_BYTES_READ: 45
        HDFS_WRITE_OPS: 2
        HDFS_BYTES_WRITTEN: 30
    org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter
        BYTES_WRITTEN: 0
    org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
        BYTES_READ: 0
    org.apache.hadoop.mapreduce.JobCounter
        TOTAL_LAUNCHED_MAPS: 1
        VCORES_MILLIS_REDUCES: 17952
        MB_MILLIS_MAPS: 5934080
        TOTAL_LAUNCHED_REDUCES: 2
        SLOTS_MILLIS_REDUCES: 17952
        VCORES_MILLIS_MAPS: 5795
        MB_MILLIS_REDUCES: 18382848
        SLOTS_MILLIS_MAPS: 5795
        MILLIS_REDUCES: 17952
        OTHER_LOCAL_MAPS: 1
        MILLIS_MAPS: 5795
    org.apache.sqoop.submission.counter.SqoopCounters
        ROWS_READ: 1
        ROWS_WRITTEN: 1
    org.apache.hadoop.mapreduce.TaskCounter
        MAP_OUTPUT_MATERIALIZED_BYTES: 45
        REDUCE_INPUT_RECORDS: 1
        SPILLED_RECORDS: 2
        MERGED_MAP_OUTPUTS: 2
        VIRTUAL_MEMORY_BYTES: 6397804544
        MAP_INPUT_RECORDS: 0
        SPLIT_RAW_BYTES: 154
        FAILED_SHUFFLE: 0
        MAP_OUTPUT_BYTES: 31
        REDUCE_SHUFFLE_BYTES: 45
        PHYSICAL_MEMORY_BYTES: 837005312
        GC_TIME_MILLIS: 1845
        REDUCE_INPUT_GROUPS: 1
        COMBINE_OUTPUT_RECORDS: 0
        SHUFFLED_MAPS: 2
        REDUCE_OUTPUT_RECORDS: 1
        MAP_OUTPUT_RECORDS: 1
        COMBINE_INPUT_RECORDS: 0
        CPU_MILLISECONDS: 12850
        COMMITTED_HEAP_BYTES: 567803904
    Shuffle Errors
        CONNECTION: 0
        WRONG_LENGTH: 0
        BAD_ID: 0
        WRONG_MAP: 0
        WRONG_REDUCE: 0
        IO_ERROR: 0
Job executed successfully
Support the developer
I wrote this free piece out of personal interest; there was joy in it, and sweat too. I hope you like my work, and that you might support it as well. If you can, chip in (Alipay, WeChat, and QQ group are supported); if not, your encouragement is just as welcome. Thank you all.
Personal homepage: http://knight-black-bob.iteye.com/