Sqoop2 Errors: Analysis and Solutions

Project scenario:

Hadoop version: 3.1.2

Sqoop2 version: 1.99.7

Relational database: MariaDB 10.3.15

Goal: use Sqoop2 to import data from a relational database into HDFS.

 


Problem description:

Problem 1:

Exception has occurred during processing command 
Exception: org.apache.sqoop.common.SqoopException Message: GENERIC_JDBC_CONNECTOR_0016:Can't fetch schema - 

Problem 2:

Exception has occurred during processing command 
Exception: org.apache.sqoop.common.SqoopException Message: GENERIC_HDFS_CONNECTOR_0007:Invalid input/output directory - Unexpected exception

Problem 3:

2021-06-05 20:30:47,895 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1622895077117_0002 failed 2 times due to AM Container for appattempt_1622895077117_0002_000002 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2021-06-05 20:30:47.886]Exception from container-launch.
Container id: container_e05_1622895077117_0002_02_000001
Exit code: 1

[2021-06-05 20:30:47.892]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :

Root cause analysis:

Enable verbose (debug) output in the Sqoop2 shell:

sqoop:000> set option --name verbose --value true
Verbose option was changed to true

Problem 1, detailed error:

Exception: org.apache.sqoop.common.SqoopException Message: GENERIC_JDBC_CONNECTOR_0016:Can't fetch schema - 
Stack trace:
         ...
         at  org.codehaus.groovy.runtime.callsite.AbstractCallSite (AbstractCallSite.java:112)  
         at  org.codehaus.groovy.tools.shell.Groovysh (Groovysh.groovy:585)  
         at  org.apache.sqoop.shell.SqoopShell (SqoopShell.java:156)  
Caused by: Exception: java.lang.Throwable Message: You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '"mydevops"."useripinfo"' at line 1
Stack trace:

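Cause:

If nothing is entered at the Identifier enclose: prompt, the connector falls back to double quotes as the identifier delimiter when it generates SQL, and MariaDB rejects the resulting statement. So when running create link, do not just press Enter at Identifier enclose:; enter a space or a backtick (`) as the delimiter instead.

For example: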

sqoop:001> update link -n mysql-test
Updating link with name mysql-test
Please update link:
Name: mysql-test

Database connection

Driver class: com.mysql.jdbc.Driver
Connection String: jdbc:mysql://192.168.0.114/mydevops
Username: lostar
Password: ************
Fetch Size: 
Connection Properties: 
There are currently 0 values in the map:
entry# 

SQL Dialect

Identifier enclose: `
link was successfully updated with status OK

Solution:

Identifier enclose: `    # or a single space
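
To make the effect of the enclose character concrete, the sketch below mimics how a connector might wrap the schema and table name in the configured character. The helper qualified_name is hypothetical and is not Sqoop's actual code; it only reproduces the quoting visible in the error message above. Unless the ANSI_QUOTES SQL mode is enabled, MariaDB reads double quotes as string delimiters, which is why the first form fails where a table name is expected.

```python
def qualified_name(schema: str, table: str, enclose: str) -> str:
    """Hypothetical helper: wrap each identifier in the enclose character."""
    return f"{enclose}{schema}{enclose}.{enclose}{table}{enclose}"


# Double quotes, as seen in the failing statement from the stack trace.
print("SELECT * FROM " + qualified_name("mydevops", "useripinfo", '"'))

# Backticks, which MariaDB accepts as identifier quotes.
print("SELECT * FROM " + qualified_name("mydevops", "useripinfo", "`"))

# A single space leaves the identifiers effectively unquoted.
print("SELECT * FROM " + qualified_name("mydevops", "useripinfo", " "))
```

The SELECT text here is only for illustration; the statement Sqoop2 actually issues to fetch the schema may differ.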

 

Problem 2, detailed error:

sqoop:006> start job -n mysql2hdfs
Exception has occurred during processing command 
Exception: org.apache.sqoop.common.SqoopException Message: GENERIC_HDFS_CONNECTOR_0007:Invalid input/output directory - Unexpected exception
Stack trace:
         ...
         (AbstractCallSite.java:112)  
         at  org.codehaus.groovy.tools.shell.Groovysh (Groovysh.groovy:585)  
         at  org.apache.sqoop.shell.SqoopShell (SqoopShell.java:156)  
Caused by: Exception: java.lang.Throwable Message: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:108)

Cause:

The URI in the HDFS link pointed to the standby NameNode. It must be the URL of the active NameNode.

Solution:

sqoop:000> update link -n hdfs-test
Updating link with name hdfs-test
Please update link:
Name: hdfs-test

HDFS cluster

URI: hdfs://master01:9000
Conf directory: /home/hadoop/hadoop-3.1.4/etc/hadoop
Additional configs:: 
There are currently 0 values in the map:
entry# 
link was successfully updated with status OK

URI: hdfs://master-ip:9000

Summary: the URI must be the address of the active master (NameNode).
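
One way to find out which NameNode is currently active is to query each one with hdfs haadmin -getServiceState. The following is a minimal sketch, assuming the hdfs command is on the PATH and HA is configured; the function names are illustrative, and the NameNode IDs passed in must come from your own hdfs-site.xml.

```python
import subprocess
from typing import Callable, Iterable, Optional


def get_state(namenode_id: str) -> str:
    """Ask HDFS for the HA state of one NameNode ("active" or "standby")."""
    result = subprocess.run(
        ["hdfs", "haadmin", "-getServiceState", namenode_id],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def find_active(namenode_ids: Iterable[str],
                state_of: Callable[[str], str] = get_state) -> Optional[str]:
    """Return the first NameNode ID whose state is "active", else None."""
    for namenode_id in namenode_ids:
        if state_of(namenode_id) == "active":
            return namenode_id
    return None
```

Calling find_active(["nn1", "nn2"]) would query two NameNodes; nn1 and nn2 are assumed IDs, so substitute the ones listed under dfs.ha.namenodes.<nameservice>. The state_of parameter exists only so the lookup can be replaced in tests. Keep in mind that the active NameNode can change after a failover, so a hard-coded URI may need updating again.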

 

Problem 3, cause analysis:

Newer Hadoop versions need the following parameters in yarn-site.xml:

 
    
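    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>master01</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address.rm2</name>
        <value>master02</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>master02</value>
    </property>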
    

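At first, the material I found suggested a Hadoop classpath problem, but the same error appeared after adding the classpath. After adding the configuration above and restarting HDFS and YARN, the problem was resolved.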
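
Before restarting, it can help to confirm that the properties really are present in the file YARN reads. Here is a small sketch using only the Python standard library; the REQUIRED list simply repeats the three property names from this post, and missing_properties is an illustrative name rather than part of any Hadoop tooling.

```python
import xml.etree.ElementTree as ET

# Property names taken from the yarn-site.xml settings listed in this post.
REQUIRED = [
    "yarn.resourcemanager.webapp.address.rm1",
    "yarn.resourcemanager.scheduler.address.rm2",
    "yarn.resourcemanager.webapp.address.rm2",
]


def missing_properties(xml_text, required=REQUIRED):
    """Return the required property names absent from a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    present = {
        prop.findtext("name", default="").strip()
        for prop in root.iter("property")
    }
    return [name for name in required if name not in present]
```

Pass the function the contents of your yarn-site.xml; an empty list means all three names were found. It only checks that the names exist and does not validate the values.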

 

Summary:

When Sqoop2 reports an error, first enable verbose output with the following command:

set option --name verbose --value true

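Then use the detailed error to track down the problem. When creating the relational database link, do not just press Enter at the Identifier enclose: prompt; enter a space or a backtick (`) to override the default. An error that mentions standby means the URI in the HDFS link points to a non-active NameNode and should be changed to the active NameNode's address. For the Problem 3 error, add the parameters above to yarn-site.xml; once they were in place, the YARN containers could run the job. Finally, if other problems remain and Sqoop2 itself shows no obvious error, check the ResourceManager log.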
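
When reading the ResourceManager log, the failure lines can be picked out mechanically. The sketch below assumes the RMAppImpl message format shown under Problem 3; the regular expression and the function name are illustrative and may need adjusting for other Hadoop versions.

```python
import re

# Matches lines such as:
#   "... RMAppImpl: Application application_1622895077117_0002 failed 2 times ..."
FAILED_APP = re.compile(r"Application (application_\d+_\d+) failed (\d+) times")


def failed_applications(log_lines):
    """Yield (application_id, attempts) for each matching failure line."""
    for line in log_lines:
        match = FAILED_APP.search(line)
        if match:
            yield match.group(1), int(match.group(2))
```

Each application ID found this way can then be passed to yarn logs -applicationId to fetch the container logs, provided log aggregation is enabled on the cluster.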

 
