Importing a MySQL Table into HBase with Sqoop

1. In MySQL, create a test table card containing ten million rows
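
The post does not show the table definition. A minimal sketch, assuming the table has only the card_id primary key (confirmed by the import command later) plus one hypothetical payload column, and using the MyISAM engine (implied by the .MYI error further down):

```sql
-- Hypothetical schema: only card_id is confirmed by the sqoop command below.
CREATE TABLE card (
  card_id BIGINT NOT NULL PRIMARY KEY,
  card_no VARCHAR(32)            -- assumed payload column
) ENGINE=MyISAM;                 -- the later '.MYI' error implies MyISAM

-- One way to bulk-generate rows: doubling by self-insertion.
INSERT INTO card VALUES (1, 'seed');
-- Repeat the next two statements until the row count reaches 10,000,000:
SET @m = (SELECT MAX(card_id) FROM card);
INSERT INTO card (card_id, card_no)
  SELECT card_id + @m, card_no FROM card;
```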

2. In HBase, create the corresponding test table with a single column family, info

hbase shell
create 'test','info'
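
For a ten-million-row import, it can help to pre-split the table into several regions instead of starting with one, so the parallel map tasks are not all writing to a single region server. A hedged sketch; note that HBase compares row keys as bytes, so string split points like these only partition evenly if the card_id values are zero-padded to a fixed width:

```
create 'test', 'info', SPLITS => ['2500000', '5000000', '7500000']
```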

3. Import the MySQL data into HBase

sqoop import \
--connect jdbc:mysql://192.168.20.160/test \
--username root \
--password 111111 \
--table card \
--hbase-table 'test' \
--hbase-row-key card_id \
--column-family 'info' \
--hbase-create-table

Option notes:

--hbase-table: name of the target HBase table
--hbase-row-key: use the MySQL primary key card_id as the HBase row key
--column-family: column family to write the columns into
--hbase-create-table: create the "test" table in HBase automatically; omit this flag if the table already exists

Partway through I hit an error:

 tool.ImportTool: Import failed: java.io.IOException: java.sql.SQLException: Incorrect key file for table './test/card.MYI'; try to repair it

This means the index file of the MySQL table is corrupted (the .MYI extension shows the table uses the MyISAM engine, whose index files REPAIR TABLE can rebuild). Running the repair command fixes it:

repair table card;

4. The import succeeds

19/05/15 15:08:25 INFO mapreduce.Job: Job job_1557888023370_0004 completed successfully
19/05/15 15:08:26 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=747592
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=476
		HDFS: Number of bytes written=0
		HDFS: Number of read operations=4
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=0
	Job Counters 
		Launched map tasks=4
		Other local map tasks=4
		Total time spent by all maps in occupied slots (ms)=1347125
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=1347125
		Total vcore-milliseconds taken by all map tasks=2694250
		Total megabyte-milliseconds taken by all map tasks=1379456000
	Map-Reduce Framework
		Map input records=10000000
		Map output records=10000000
		Input split bytes=476
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=3329
		CPU time spent (ms)=1170200
		Physical memory (bytes) snapshot=1686634496
		Virtual memory (bytes) snapshot=11169898496
		Total committed heap usage (bytes)=1444937728
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=0
19/05/15 15:08:26 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 486.5645 seconds (0 bytes/sec)
19/05/15 15:08:26 INFO mapreduce.ImportJobBase: Retrieved 10000000 records.

The log shows that ten million rows were successfully imported into the HBase test table. (The "Transferred 0 bytes" line is expected for an HBase import: rows are written through the HBase client API rather than to HDFS output files, so the HDFS byte counters stay at 0.)
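
The "Launched map tasks=4" counter reflects Sqoop's default of four parallel mappers (tunable with -m). Sqoop queries the MIN and MAX of the split column, here the primary key card_id, and divides that range into one interval per mapper. A minimal Python sketch of that splitting idea (an approximation of Sqoop's integer split generator, not its exact code):

```python
def split_ranges(lo, hi, num_mappers):
    """Divide the key range [lo, hi] into num_mappers contiguous
    (lower, upper) intervals, one per map task."""
    size = (hi - lo + 1) / num_mappers
    splits = []
    start = lo
    for i in range(num_mappers):
        # Last mapper takes the remainder so the full range is covered.
        end = hi if i == num_mappers - 1 else int(lo + size * (i + 1)) - 1
        splits.append((start, end))
        start = end + 1
    return splits

# With card_id spanning 1..10_000_000 and the default 4 mappers,
# each map task imports roughly 2.5 million rows:
print(split_ranges(1, 10_000_000, 4))
```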

5. Enter HBase and verify that the data was imported

Use the row-counting tool shipped in the HBase jar to count the total rows in the test table:

hbase org.apache.hadoop.hbase.mapreduce.RowCounter 'test'

The result shows that the test table now contains ten million new rows.
You can also run

scan 'test'

to inspect the data in the table.
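
A bare scan of ten million rows will flood the console. In the HBase shell you can cap the output, and count gives a slower, single-threaded alternative to the RowCounter MapReduce job:

```
scan 'test', {LIMIT => 5}          # show only the first five rows
count 'test', INTERVAL => 1000000  # print progress every million rows
```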
