Hadoop version: 3.1.2
Sqoop2 version: 1.99.7
Relational database: MariaDB 10.3.15
Goal: use Sqoop2 to import data from a relational database into HDFS.
Problem 1:
Exception has occurred during processing command
Exception: org.apache.sqoop.common.SqoopException Message: GENERIC_JDBC_CONNECTOR_0016:Can't fetch schema -
Problem 2:
Exception has occurred during processing command
Exception: org.apache.sqoop.common.SqoopException Message: GENERIC_HDFS_CONNECTOR_0007:Invalid input/output directory - Unexpected exception
Problem 3:
2021-06-05 20:30:47,895 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1622895077117_0002 failed 2 times due to AM Container for appattempt_1622895077117_0002_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2021-06-05 20:30:47.886]Exception from container-launch.
Container id: container_e05_1622895077117_0002_02_000001
Exit code: 1
[2021-06-05 20:30:47.892]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Enable verbose (debug) output:
sqoop:000> set option --name verbose --value true
Verbose option was changed to true
Problem 1, detailed error:
Exception: org.apache.sqoop.common.SqoopException Message: GENERIC_JDBC_CONNECTOR_0016:Can't fetch schema -
Stack trace:
...
at org.codehaus.groovy.runtime.callsite.AbstractCallSite (AbstractCallSite.java:112)
at org.codehaus.groovy.tools.shell.Groovysh (Groovysh.groovy:585)
at org.apache.sqoop.shell.SqoopShell (SqoopShell.java:156)
Caused by: Exception: java.lang.Throwable Message: You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '"mydevops"."useripinfo"' at line 1
Stack trace:
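Cause:
If Identifier enclose is left empty, Sqoop2 falls back to double quotes as the identifier delimiter when it builds the SQL statement, and MariaDB rejects the resulting syntax. So when running create link, do not just press Enter at the Identifier enclose: prompt; enter a space or a backtick (`) as the delimiter instead.
For example: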
sqoop:001> update link -n mysql-test
Updating link with name mysql-test
Please update link:
Name: mysql-test
Database connection
Driver class: com.mysql.jdbc.Driver
Connection String: jdbc:mysql://192.168.0.114/mydevops
Username: lostar
Password: ************
Fetch Size:
Connection Properties:
There are currently 0 values in the map:
entry#
SQL Dialect
Identifier enclose: `
link was successfully updated with status OK
Solution:
Identifier enclose: ` # or a single space
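To see why the default breaks on MariaDB, here is a minimal Python sketch of how an enclose character ends up in the generated table reference. The function name qualified_table is hypothetical; this only illustrates the quoting and is not Sqoop2's actual implementation.

```python
def qualified_table(schema: str, table: str, enclose: str) -> str:
    """Wrap each identifier in the enclose string and join them with a dot."""
    return ".".join(f"{enclose}{name}{enclose}" for name in (schema, table))


# Default when the prompt is left empty: double quotes.
# Unless ANSI_QUOTES is set in sql_mode, MariaDB reads "..." as a string
# literal, hence the syntax error near '"mydevops"."useripinfo"'.
print(qualified_table("mydevops", "useripinfo", '"'))
# → "mydevops"."useripinfo"

# The backtick is MariaDB's native identifier quote.
print(qualified_table("mydevops", "useripinfo", "`"))
# → `mydevops`.`useripinfo`
```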
Problem 2, detailed error:
sqoop:006> start job -n mysql2hdfs
Exception has occurred during processing command
Exception: org.apache.sqoop.common.SqoopException Message: GENERIC_HDFS_CONNECTOR_0007:Invalid input/output directory - Unexpected exception
Stack trace:
...
(AbstractCallSite.java:112)
at org.codehaus.groovy.tools.shell.Groovysh (Groovysh.groovy:585)
at org.apache.sqoop.shell.SqoopShell (SqoopShell.java:156)
Caused by: Exception: java.lang.Throwable Message: Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:108)
Cause:
The URI in the HDFS link points to the standby NameNode. It must be the URL of the active NameNode.
Solution:
sqoop:000> update link -n hdfs-test
Updating link with name hdfs-test
Please update link:
Name: hdfs-test
HDFS cluster
URI: hdfs://master01:9000
Conf directory: /home/hadoop/hadoop-3.1.4/etc/hadoop
Additional configs::
There are currently 0 values in the map:
entry#
link was successfully updated with status OK
URI: hdfs://master-ip:9000
Summary: the URI must be the address of the active master (NameNode).
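One way to find out which NameNode is currently active is the hdfs haadmin -getServiceState command. The sketch below wraps it in Python. The helper names are hypothetical, and the NameNode IDs you pass in (for example nn1 and nn2) are assumptions that should match dfs.ha.namenodes.<nameservice> in your hdfs-site.xml.

```python
import subprocess
from typing import Dict, Optional


def get_service_state(namenode_id: str) -> str:
    """Ask HDFS for the HA state ("active" or "standby") of one NameNode."""
    result = subprocess.run(
        ["hdfs", "haadmin", "-getServiceState", namenode_id],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def find_active(states: Dict[str, str]) -> Optional[str]:
    """Return the ID of the NameNode whose state is "active", if any."""
    for namenode_id, state in states.items():
        if state.strip().lower() == "active":
            return namenode_id
    return None
```

On a cluster node this could be called as find_active({nn: get_service_state(nn) for nn in ("nn1", "nn2")}). Since the link stores a fixed host, the URI would need to be updated again after a failover.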
Problem 3, root cause analysis:
Newer Hadoop versions require the following parameters in yarn-site.xml:
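<property>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <value>master01</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.rm2</name>
  <value>master02</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>master02</value>
</property>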
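At first, the material I found pointed to a Hadoop classpath problem, but adding the classpath produced the same error. After adding the configuration above and restarting HDFS and YARN, the problem was resolved.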
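Before restarting, it can help to confirm that yarn-site.xml really contains these properties. The following is a minimal sketch using only Python's standard library. The helper names are hypothetical, and the REQUIRED list is simply copied from the parameter names above.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

REQUIRED = [
    "yarn.resourcemanager.webapp.address.rm1",
    "yarn.resourcemanager.scheduler.address.rm2",
    "yarn.resourcemanager.webapp.address.rm2",
]


def read_properties(xml_text: str) -> Dict[str, str]:
    """Parse Hadoop-style <configuration> XML into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_properties(xml_text: str, required: List[str] = REQUIRED) -> List[str]:
    """Return the required property names that are absent or empty."""
    props = read_properties(xml_text)
    return [name for name in required if not props.get(name)]
```

To use it, read the yarn-site.xml from your Hadoop conf directory into a string and pass it to missing_properties; an empty list means all three names are present.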
Summary:
When Sqoop2 reports an error, first enable verbose output with the following command:
set option --name verbose --value true
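Then use the detailed error to track down the problem. When creating the relational database link, do not just press Enter after Identifier enclose:; enter a space or a backtick (`) to override the default. An error that mentions standby means the URI in the HDFS link points to a non-active NameNode and should be changed to the active NameNode's address. For the Problem 3 error, add the parameters above to yarn-site.xml; these parameters allow YARN containers to run the job. Finally, if other problems remain and Sqoop2 itself shows no obvious error, check the ResourceManager log.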
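When scanning the ResourceManager log, failed applications can be pulled out with a small regular expression. This is a sketch based only on the log line shown under Problem 3. The pattern and the function name are assumptions, and other Hadoop versions may word the message differently.

```python
import re
from typing import List, Tuple

# Matches lines such as:
# "Application application_..._0002 failed 2 times due to AM Container
#  for appattempt_... exited with exitCode: 1"
FAILED_APP = re.compile(
    r"Application (application_\d+_\d+) failed \d+ times due to .*?"
    r"exited with\s+exitCode: (-?\d+)"
)


def failed_applications(log_text: str) -> List[Tuple[str, int]]:
    """Return (application id, exit code) for each failure found in the log."""
    return [(m.group(1), int(m.group(2))) for m in FAILED_APP.finditer(log_text)]
```

If log aggregation is enabled, the application ID found this way can be passed to yarn logs -applicationId <id> to fetch the container logs.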