前提:公司决定将Oracle版本从现在的11g升级到19c,中间涉及到sqoop的数据抽取,所以在测试环境中进行测试使用最简单的sqoop命令测试时没有发现任何问题,但是在脚本中sqoop使用了–direct参数时发现了一个Error
Unable to obtain the Oracle database version
因为银行要求比较高,所以开始解决这个error
首先考虑驱动与版本之间的关系,因为之前oracle是11g版本,使用的驱动是ojdbc5.jar和ojdbc6.jar,查看了oracle版本对应的驱动,19c对应的是ojdbc8.jar
下载了ojdbc8.jar后,出现了新的报错
java.sql.SQLException: Non supported character set (add orai18n.jar in your classpath): ZHS16GBK
这个问题需要下载orai18n.jar文件放到ojdbc8.jar同级目录下即可
都处理完成后执行sqoop export命令后发现问题依然存在,于是开始了下一步的排查
通过更新驱动并没有解决这个问题,于是开始排查这个报错出现的原因以及产生的影响
通过报错发现在
org.apache.sqoop.manager.oracle.OraOopManagerFactory.java中调用了
org.apache.sqoop.manager.oracle.OraOopOracleQueries.getOracleVersion是使用一段sql去oracle中查询oracle版本
public static OracleVersion getOracleVersion(Connection connection)
throws SQLException {
String sql =
"SELECT \n"
+ " v.banner, \n"
+ " rtrim(v.version) full_version, \n"
+ " rtrim(v.version_bit) version_bit, \n"
+ " SUBSTR(v.version, 1, INSTR(v.version, '.', 1, 1)-1) major, \n"
+ " SUBSTR(v.version, INSTR(v.version, '.', 1, 1) + 1, "
+ " INSTR(v.version, '.', 1, 2) - INSTR(v.version, '.', 1, 1) - 1) "
+ " minor, \n"
+ " SUBSTR(v.version, INSTR(v.version, '.', 1, 2) + 1, "
+ " INSTR(v.version, '.', 1, 3) - INSTR(v.version, '.', 1, 2) - 1) "
+ " version, \n"
+ " SUBSTR(v.version, INSTR(v.version, '.', 1, 3) + 1, "
+ " INSTR(v.version, '.', 1, 4) - INSTR(v.version, '.', 1, 3) - 1) "
+ " patch, \n"
+ " DECODE(instr(v.banner, '64bit'), 0, 'False', 'True') isDb64bit, \n"
+ " DECODE(instr(b.banner, 'HPUX'), 0, 'False', 'True') isHPUX \n"
+ "FROM (SELECT rownum row_num, \n"
+ " banner,\n"
+ " SUBSTR(SUBSTR(banner,INSTR(banner,'Release ')+8), 1) version_bit,\n"
+ " SUBSTR(SUBSTR(banner,INSTR(banner,'Release ')+8), 1,\n"
+ " INSTR(SUBSTR(banner,INSTR(banner,'Release ')+8),' ')) version\n"
+ "FROM v$version\n" + " WHERE banner LIKE 'Oracle%'\n"
+ " OR banner LIKE 'Personal Oracle%') v,\n" + "v$version b\n"
+ " WHERE v.row_num = 1\n" + " and b.banner like 'TNS%'\n";
Statement statement = connection.createStatement();
ResultSet resultSet = statement.executeQuery(sql);
resultSet.next();
OracleVersion result =
new OracleVersion(resultSet.getInt("major"), resultSet.getInt("minor"),
resultSet.getInt("version"), resultSet.getInt("patch"), resultSet
.getString("banner"));
resultSet.close();
statement.close();
return result;
}
在OraOopManagerFactory中这个Error是自定义的日志,对程序执行并没有产生影响
// Get the Oracle database version...
try {
OracleVersion oracleVersion =
OraOopOracleQueries.getOracleVersion(result.getConnection());
LOG.info(String.format("Oracle Database version: %s",
oracleVersion.getBanner()));
sqoopOptions.getConf().setInt(
OraOopConstants.ORAOOP_ORACLE_DATABASE_VERSION_MAJOR,
oracleVersion.getMajor());
sqoopOptions.getConf().setInt(
OraOopConstants.ORAOOP_ORACLE_DATABASE_VERSION_MINOR,
oracleVersion.getMinor());
} catch (SQLException ex) {
LOG.error("Unable to obtain the Oracle database version.", ex);
}
于是查看
ORAOOP_ORACLE_DATABASE_VERSION_MAJOR
ORAOOP_ORACLE_DATABASE_VERSION_MINOR
这两个参数在程序运行中是否有被使用过,结果发现只有一个函数使用过这两个参数
protected boolean canUseOracleAppendValuesHint(TaskAttemptContext context) {
Configuration conf = context.getConfiguration();
// Should we use the APPEND_VALUES Oracle hint?...
// (Yes, if this is Oracle 11.2 or above)...
OracleVersion oracleVersion =
new OracleVersion(conf.getInt(
OraOopConstants.ORAOOP_ORACLE_DATABASE_VERSION_MAJOR, 0), conf
.getInt(OraOopConstants.ORAOOP_ORACLE_DATABASE_VERSION_MINOR, 0),
0, 0, "");
boolean result = oracleVersion.isGreaterThanOrEqualTo(11, 2, 0, 0);
// If there is a BINARY_DOUBLE or BINARY_FLOAT column, then we'll avoid
// using
// the APPEND_VALUES hint. If there is a NULL in the HDFS file, then we'll
// encounter
// "ORA-12838: cannot read/modify an object after modifying it in parallel"
// due to the JDBC driver issuing the INSERT statement twice to the database
// without a COMMIT in between (as was observed via WireShark).
// We're not sure why this happens - we just know how to avoid it.
if (result) {
boolean binaryDoubleColumnExists =
conf.getBoolean(OraOopConstants.TABLE_CONTAINS_BINARY_DOUBLE_COLUMN,
false);
boolean binaryFloatColumnExists =
conf.getBoolean(OraOopConstants.TABLE_CONTAINS_BINARY_FLOAT_COLUMN,
false);
if (binaryDoubleColumnExists || binaryFloatColumnExists) {
result = false;
LOG.info("The APPEND_VALUES Oracle hint will not be used for the "
+ "INSERT SQL statement, as the Oracle table "
+ "contains either a BINARY_DOUBLE or BINARY_FLOAT column.");
}
}
return result;
}
而这个函数的主要目的是用来判断是否对Oracle 11.2以上版本使用APPEND_VALUES提示,这个对于我们来说是可有可无的,完全不会影响我们整个数据抽取的流程,于是最终决定
无视这个Error