Sqoop2: MySQL -> HDFS & MySQL -> Kafka

I. MySQL to HDFS

1. Start sqoop2-shell and point it at the Sqoop server

[sivan@sivan ~]$ sqoop2-shell 
Setting conf dir: /home/sivan/software/sqoop/sqoop-1.99.7-bin-hadoop200/bin/../conf
Sqoop home directory: /home/sivan/software/sqoop/sqoop-1.99.7-bin-hadoop200
Sqoop Shell: Type 'help' or '\h' for help.

sqoop:000> set server --host localhost --port 12000 --webapp sqoop
Server is set successfully
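
Two optional sanity checks at this point, both standard sqoop2-shell commands (output omitted): printing the client and server versions confirms the shell can actually reach the server, and enabling verbose mode makes later errors show full stack traces.

# optional: confirm the shell can reach the server
sqoop:000> show version --all

# optional: print full stack traces on errors
sqoop:000> set option --name verbose --value true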

2. List the available connectors

sqoop:000> show connector 
+------------------------+---------+------------------------------------------------------------+----------------------+
|          Name          | Version |                           Class                            | Supported Directions |
+------------------------+---------+------------------------------------------------------------+----------------------+
| generic-jdbc-connector | 1.99.7  | org.apache.sqoop.connector.jdbc.GenericJdbcConnector       | FROM/TO              |
| kite-connector         | 1.99.7  | org.apache.sqoop.connector.kite.KiteConnector              | FROM/TO              |
| oracle-jdbc-connector  | 1.99.7  | org.apache.sqoop.connector.jdbc.oracle.OracleJdbcConnector | FROM/TO              |
| ftp-connector          | 1.99.7  | org.apache.sqoop.connector.ftp.FtpConnector                | TO                   |
| hdfs-connector         | 1.99.7  | org.apache.sqoop.connector.hdfs.HdfsConnector              | FROM/TO              |
| kafka-connector        | 1.99.7  | org.apache.sqoop.connector.kafka.KafkaConnector            | TO                   |
| sftp-connector         | 1.99.7  | org.apache.sqoop.connector.sftp.SftpConnector              | TO                   |
+------------------------+---------+------------------------------------------------------------+----------------------+
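
Before creating the MySQL link in the next step, the Sqoop server needs the MySQL JDBC driver on its classpath. A minimal sketch, assuming the directory and jar version below (adjust to your installation); the SQOOP_SERVER_EXTRA_LIB approach follows the Sqoop2 installation docs:

# on the Sqoop server host (directory and jar version are assumptions)
mkdir -p /home/sivan/software/sqoop/extra-lib
cp mysql-connector-java-5.1.47.jar /home/sivan/software/sqoop/extra-lib/
export SQOOP_SERVER_EXTRA_LIB=/home/sivan/software/sqoop/extra-lib

# restart the server so the driver is picked up
sqoop2-server stop
sqoop2-server start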

3. Create the MySQL link

sqoop:000> create link -connector generic-jdbc-connector
Creating link for connector with name generic-jdbc-connector
Please fill following values to create new link object
Name: mysql_link

Database connection

Driver class: com.mysql.jdbc.Driver
Connection String: jdbc:mysql://172.19.225.156:3306/test
Username: root
Password: ****
Fetch Size: 
Connection Properties: 
There are currently 0 values in the map:
entry# protocol=tcp
There are currently 1 values in the map:
protocol = tcp
entry# 

SQL Dialect

Identifier enclose:  # enter a single space here; the default double quote is not a valid identifier quote for MySQL
New link was successfully created with validation status OK and name mysql_link

4. Create the HDFS link

sqoop:000> create link -connector hdfs-connector
Creating link for connector with name hdfs-connector
Please fill following values to create new link object
Name: hdfs_link

HDFS cluster

URI: hdfs://172.19.225.156:9000
Conf directory: /home/sivan/bigdata/softwares/hadoop/etc/hadoop
Additional configs:: 
There are currently 0 values in the map:
entry# 
New link was successfully created with validation status OK and name hdfs_link
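
With both links in place, they can be listed to confirm the server persisted them; mysql_link and hdfs_link should both appear in the table that show link prints.

sqoop:000> show link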

5. Create the job

sqoop:000> create job -f "mysql_link" -t "hdfs_link"
Creating job for links with from name mysql_link and to name hdfs_link
Please fill following values to create new job object
Name: mysql_to_hdfs

Database source

Schema name: test
Table name: user
SQL statement: 
Column names: 
There are currently 0 values in the list:
element# 
Partition column: 
Partition column nullable: 
Boundary query: 

Incremental read

Check column: 
Last value: 

Target configuration

Override null value: 
Null value: 
File format: 
  0 : TEXT_FILE
  1 : SEQUENCE_FILE
  2 : PARQUET_FILE
Choose: 0
Compression codec: 
  0 : NONE
  1 : DEFAULT
  2 : DEFLATE
  3 : GZIP
  4 : BZIP2
  5 : LZO
  6 : LZ4
  7 : SNAPPY
  8 : CUSTOM
Choose: 0
Custom codec: 
Output directory: hdfs://172.19.225.156:9000/user/sivan/sqoop
Append mode: 

Throttling resources

Extractors: 
Loaders: 

Classpath configuration

Extra mapper jars: 
There are currently 0 values in the list:
element# 
New job was successfully created with validation status OK  and name mysql_to_hdfs
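
The new job can be verified the same way, and re-opened later if a value was mistyped; both commands are part of the standard shell.

# list jobs and their from/to links
sqoop:000> show job

# re-run the wizard for an existing job to change a value
sqoop:000> update job -n mysql_to_hdfs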

6. Run the job

# start the job
sqoop:000> start job -n mysql_to_hdfs

# check the job status
sqoop:000> status job -n mysql_to_hdfs
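
Once the status reaches SUCCEEDED, the imported rows land as text files under the configured output directory and can be checked from HDFS (a sketch; the part-file names vary per run):

hdfs dfs -ls /user/sivan/sqoop
hdfs dfs -cat /user/sivan/sqoop/*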

II. MySQL to Kafka

1. Create the MySQL link (same as above)

2. Create the Kafka link

sqoop:000> create link -c kafka-connector
Creating link for connector with name kafka-connector
Please fill following values to create new link object
Name: kafka_link

Kafka cluster

Kafka brokers: sivan:9092 # Kafka broker host:port; separate multiple brokers with commas
Zookeeper quorum: sivan:2181 # ZooKeeper host:port; separate multiple nodes with commas
New link was successfully created with validation status OK and name kafka_link
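
Unless the broker auto-creates topics (auto.create.topics.enable), it is worth making sure the test topic used in the next step exists before running the job. A sketch using the standard Kafka CLI of this era (the partition and replication values are assumptions):

# create the topic if it does not exist yet
kafka-topics.sh --create --zookeeper sivan:2181 \
  --replication-factor 1 --partitions 1 --topic test

# confirm it is there
kafka-topics.sh --list --zookeeper sivan:2181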

3. Create the job

sqoop:000> create job -f "mysql_link" -t "kafka_link"
Creating job for links with from name mysql_link and to name kafka_link
Please fill following values to create new job object
Name: mysql_to_kafka

Database source

Schema name: test
Table name: user
SQL statement: 
Column names: 
There are currently 0 values in the list:
element# 
Partition column: 
Partition column nullable: 
Boundary query: 

Incremental read

Check column: 
Last value: 

Output configuration

Topic: test # Kafka topic to write to

Throttling resources

Extractors: 
Loaders: 

Classpath configuration

Extra mapper jars: 
There are currently 0 values in the list:
element# 
New job was successfully created with validation status OK  and name mysql_to_kafka

4. Run the job

sqoop:000> start job -n mysql_to_kafka
Submission details
Job Name: mysql_to_kafka
Server URL: http://localhost:12000/sqoop/
Created by: sivan
Creation date: 2019-01-02 16:25:36 CST
Lastly updated by: sivan
External ID: job_1546391357312_0009
	http://sivan:8088/proxy/application_1546391357312_0009/
2019-01-02 16:25:36 CST: BOOTING  - Progress is not available


sqoop:000> status job -n mysql_to_kafka
Submission details
Job Name: mysql_to_kafka
Server URL: http://localhost:12000/sqoop/
Created by: sivan
Creation date: 2019-01-02 16:25:36 CST
Lastly updated by: sivan
External ID: job_1546391357312_0009
	http://sivan:8088/proxy/application_1546391357312_0009/
2019-01-02 16:25:49 CST: RUNNING  - 16.67 %


sqoop:000> status job -n mysql_to_kafka
Submission details
Job Name: mysql_to_kafka
Server URL: http://localhost:12000/sqoop/
Created by: sivan
Creation date: 2019-01-02 16:25:36 CST
Lastly updated by: sivan
External ID: job_1546391357312_0009
	http://sivan:8088/proxy/application_1546391357312_0009/
2019-01-02 16:25:53 CST: SUCCEEDED 
Counters:
	org.apache.hadoop.mapreduce.FileSystemCounter
		FILE_LARGE_READ_OPS: 0
		FILE_WRITE_OPS: 0
		HDFS_READ_OPS: 3
		HDFS_BYTES_READ: 424
		HDFS_LARGE_READ_OPS: 0
		FILE_READ_OPS: 0
		FILE_BYTES_WRITTEN: 523026
		FILE_BYTES_READ: 0
		HDFS_WRITE_OPS: 0
		HDFS_BYTES_WRITTEN: 0
	org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter
		BYTES_WRITTEN: 0
	org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
		BYTES_READ: 0
	org.apache.hadoop.mapreduce.JobCounter
		TOTAL_LAUNCHED_MAPS: 3
		MB_MILLIS_MAPS: 11265024
		SLOTS_MILLIS_REDUCES: 0
		VCORES_MILLIS_MAPS: 11001
		SLOTS_MILLIS_MAPS: 11001
		OTHER_LOCAL_MAPS: 3
		MILLIS_MAPS: 11001
	org.apache.sqoop.submission.counter.SqoopCounters
		ROWS_READ: 4
		ROWS_WRITTEN: 4
	org.apache.hadoop.mapreduce.TaskCounter
		SPILLED_RECORDS: 0
		MERGED_MAP_OUTPUTS: 0
		VIRTUAL_MEMORY_BYTES: 6396149760
		MAP_INPUT_RECORDS: 0
		SPLIT_RAW_BYTES: 424
		MAP_OUTPUT_RECORDS: 4
		FAILED_SHUFFLE: 0
		PHYSICAL_MEMORY_BYTES: 770457600
		GC_TIME_MILLIS: 528
		CPU_MILLISECONDS: 7140
		COMMITTED_HEAP_BYTES: 521142272
Job executed successfully
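
ROWS_READ and ROWS_WRITTEN both report 4, so the four rows of test.user should now be on the topic. They can be read back with the console consumer (a sketch; assumes the Kafka scripts are on PATH):

kafka-console-consumer.sh --bootstrap-server sivan:9092 \
  --topic test --from-beginning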

 
