Notes on a Sqoop error while exporting Hive data to MySQL

Here is the schema of my Hive table:

+------------+------------+----------+--+
|  col_name  | data_type  | comment  |
+------------+------------+----------+--+
| ipcount    | bigint     |          |
| pv         | bigint     |          |
| jump_num   | bigint     |          |
| jump_rate  | double     |          |
| reg_num    | bigint     |          |
+------------+------------+----------+--+

Here is the schema of the MySQL table I created:

+-----------+-------------+------+-----+---------+-------+
| Field     | Type        | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| ipcount   | varchar(20) | YES  |     | NULL    |       |
| pv        | varchar(20) | YES  |     | NULL    |       |
| jump_num  | varchar(20) | YES  |     | NULL    |       |
| jump_rate | varchar(20) | YES  |     | NULL    |       |
| reg_num   | varchar(20) | YES  |     | NULL    |       |
+-----------+-------------+------+-----+---------+-------+
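
For reference, a DDL statement along these lines would produce the schema above (a sketch reconstructed from the table output; the statement I actually ran is not shown):

-- MySQL: target table in the weblog database
CREATE TABLE weblog_anlay (
  ipcount   VARCHAR(20),
  pv        VARCHAR(20),
  jump_num  VARCHAR(20),
  jump_rate VARCHAR(20),
  reg_num   VARCHAR(20)
);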

Here is my Sqoop command:

bin/sqoop export \
--connect jdbc:mysql://Maricle05:3306/weblog \
--username root \
--password 123456 \
--table weblog_anlay \
--num-mappers 1 \
--input-fields-terminated-by "\t" \
--export-dir /user/hive/warehouse/weblog.db/weblog_anlay

Here is the error message:

Error: java.io.IOException: Can't export data, please check failed map task logs
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.RuntimeException: Can't parse input data: '317301948789105300.005403355622389083335'
        at weblog_anlay.__loadFromFields(weblog_anlay.java:378)
        at weblog_anlay.parse(weblog_anlay.java:306)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
        ... 10 more
Caused by: java.util.NoSuchElementException
        at java.util.ArrayList$Itr.next(ArrayList.java:834)
        at weblog_anlay.__loadFromFields(weblog_anlay.java:358)
        ... 12 more

19/08/07 19:05:36 INFO mapreduce.Job: Task Id : attempt_1565174983974_0002_m_000000_1, Status : FAILED
19/08/07 19:05:40 INFO mapreduce.Job: Task Id : attempt_1565174983974_0002_m_000000_2, Status : FAILED

Both retry attempts failed with the same Can't parse input data stack trace.

The message clearly indicates a parsing error, as if the two tables' columns don't line up. After carefully confirming that both tables have the same fields, I went back over the error and noticed this line:

Can't parse input data: '317301948789105300.005403355622389083335'

Looking closely, the "input data" that Sqoop cannot parse appears to be an entire row: \001 is a non-printing character, so the column values run together in the log. Splitting the line on \t therefore yields only one field, and the generated parser throws NoSuchElementException when it asks for the next one. So I changed my Sqoop command:

bin/sqoop export \
--connect jdbc:mysql://Maricle05:3306/weblog \
--username root \
--password 123456 \
--table weblog_anlay \
--num-mappers 1 \
--input-fields-terminated-by "\001" \
--export-dir /user/hive/warehouse/weblog.db/weblog_anlay

The export succeeded.
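
To double-check, the rows can be queried on the MySQL side:

-- MySQL: verify that the exported rows arrived in the target table
SELECT * FROM weblog.weblog_anlay;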

Cause: I had created the Hive table with a create table tablename as select … statement without specifying a field delimiter, and Hive's default field delimiter is \001. Sqoop was told to split each row on \t, so the field count came out wrong and the rows could not be imported.
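
A minimal sketch of the kind of statement involved (my exact CTAS is not shown here, and the source table name is a placeholder), along with a way to inspect how an existing Hive table is delimited:

-- Hive: a CTAS without ROW FORMAT falls back to the default field delimiter \001 (Ctrl-A)
CREATE TABLE weblog_anlay AS
SELECT ipcount, pv, jump_num, jump_rate, reg_num
FROM weblog_source;          -- hypothetical source table

-- Show the table's SerDe/storage properties; field.delim appears here when it was set explicitly
DESCRIBE FORMATTED weblog_anlay;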

Once more for emphasis: Hive's default field delimiter is \001.
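
Alternatively, if you would rather keep --input-fields-terminated-by "\t" in the Sqoop command, the delimiter can be set explicitly when the Hive table is created. A sketch under the same assumption as above (weblog_source is a placeholder):

-- Hive: make the table's on-disk delimiter match what Sqoop will split on
CREATE TABLE weblog_anlay
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
AS
SELECT ipcount, pv, jump_num, jump_rate, reg_num
FROM weblog_source;          -- hypothetical source table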
