A bug in Spring's SQLErrorCodeSQLExceptionTranslator with coexisting MySQL/Oracle data sources

Discovering the problem

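Recently our company decided to migrate its Oracle databases to MySQL. The switch needs a transition period, so Oracle and MySQL are used side by side in the project, which calls for multiple data sources. Configuring multiple data sources is fairly simple in itself, but one scenario produced a small surprise. Consider the following code: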

    // Implement insertOrUpdate semantics ourselves with try-catch
    Data data = new Data();
    try {
        dataMapper.insert(data);
    } catch (DuplicateKeyException e) {
        dataMapper.update(data);
    }

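But something unexpected happened: the DuplicateKeyException was not caught. More precisely, the exception thrown was not the one we wanted to catch but a DataAccessResourceFailureException. A fragment of the stack trace follows.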

org.springframework.dao.DataAccessResourceFailureException: 
### Error updating database.  Cause: java.sql.SQLIntegrityConstraintViolationException: ORA-00001: unique constraint (...) violated

### The error may involve DataMapper.insert-Inline
### The error occurred while setting parameters
### SQL: INSERT INTO ... VALUES (?, ?, ?, ?, ?)
### Cause: java.sql.SQLIntegrityConstraintViolationException: ORA-00001: unique constraint (PBOC.PK_PB_NDES_DATA_RELATION) violated

; SQL []; ORA-00001: unique constraint (...) violated
; nested exception is java.sql.SQLIntegrityConstraintViolationException: ORA-00001: unique constraint (...) violated

    at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:251) ~[spring-jdbc-4.2.0.RELEASE.jar:4.2.0.RELEASE]
    at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73) ~[spring-jdbc-4.2.0.RELEASE.jar:4.2.0.RELEASE]
    at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:73) ~[mybatis-spring-1.2.2.jar:1.2.2]
    at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:371) ~[mybatis-spring-1.2.2.jar:1.2.2]
    at com.sun.proxy.$Proxy29.insert(Unknown Source) ~[na:na]
    at org.mybatis.spring.SqlSessionTemplate.insert(SqlSessionTemplate.java:240) ~[mybatis-spring-1.2.2.jar:1.2.2]

Investigation

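I first opened the DataAccessResourceFailureException class and glanced at its Javadoc, which does not match what we expected at all. The stack trace above already shows that the driver raised java.sql.SQLIntegrityConstraintViolationException: ORA-00001: unique constraint (…) violated, so this really is a unique-key conflict; the exception type went wrong on the Spring side.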

/**
 * Data access exception thrown when a resource fails completely:
 * for example, if we can't connect to a database using JDBC.
 *
 * @author Rod Johnson
 * @author Thomas Risberg
 */
@SuppressWarnings("serial")
public class DataAccessResourceFailureException extends NonTransientDataAccessResourceException {

There was no way around reading Spring's exception translation logic. Following the stack trace to SQLErrorCodeSQLExceptionTranslator, I quickly found this fragment:

    else if (Arrays.binarySearch(this.sqlErrorCodes.getDuplicateKeyCodes(), errorCode) >= 0) {
        logTranslation(task, sql, sqlEx, false);
        return new DuplicateKeyException(buildMessage(task, sql, sqlEx), sqlEx);
    }
    else if (Arrays.binarySearch(this.sqlErrorCodes.getDataIntegrityViolationCodes(), errorCode) >= 0) {
        logTranslation(task, sql, sqlEx, false);
        return new DataIntegrityViolationException(buildMessage(task, sql, sqlEx), sqlEx);
    }

Spring chooses the exception type based on sqlErrorCodes, so the next step is to see how sqlErrorCodes is defined. Tracing the code leads to sql-error-codes.xml; the relevant part is:

    <bean id="MySQL" class="org.springframework.jdbc.support.SQLErrorCodes">
        <property name="badSqlGrammarCodes">
            <value>1054,1064,1146</value>
        </property>
        <property name="duplicateKeyCodes">
            <value>1062</value>
        </property>
        <property name="dataIntegrityViolationCodes">
            <value>630,839,840,893,1169,1215,1216,1217,1364,1451,1452,1557</value>
        </property>
        <property name="dataAccessResourceFailureCodes">
            <value>1</value>
        </property>
        <property name="cannotAcquireLockCodes">
            <value>1205</value>
        </property>
        <property name="deadlockLoserCodes">
            <value>1213</value>
        </property>
    </bean>

    <bean id="Oracle" class="org.springframework.jdbc.support.SQLErrorCodes">
        <property name="badSqlGrammarCodes">
            <value>900,903,904,917,936,942,17006,6550</value>
        </property>
        <property name="invalidResultSetAccessCodes">
            <value>17003</value>
        </property>
        <property name="duplicateKeyCodes">
            <value>1</value>
        </property>
        <property name="dataIntegrityViolationCodes">
            <value>1400,1722,2291,2292</value>
        </property>
        <property name="dataAccessResourceFailureCodes">
            <value>17002,17447</value>
        </property>
        <property name="cannotAcquireLockCodes">
            <value>54,30006</value>
        </property>
        <property name="cannotSerializeTransactionCodes">
            <value>8177</value>
        </property>
        <property name="deadlockLoserCodes">
            <value>60</value>
        </property>
    </bean>

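DataAccessResourceFailureException corresponds to dataAccessResourceFailureCodes, and the table below gives a strong hint. I executed SQL against Oracle and got error code ORA-00001, but it was translated with the MySQL error codes, which yields DataAccessResourceFailureException.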

    DB type   dataAccessResourceFailureCodes   duplicateKeyCodes
    MySQL     1                                1062
    Oracle    17002,17447                      1
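
The collision in the table can be reproduced without Spring. The following is a minimal sketch under stated assumptions: the class ErrorCodeTableSketch and its classify method are hypothetical names, not Spring APIs, and the codes are modeled as int values for brevity. Only the numeric codes and the Arrays.binarySearch checks are taken from the excerpts above.

```java
import java.util.Arrays;

// Hypothetical stand-in for Spring's SQLErrorCodes; only the numeric codes
// come from the sql-error-codes.xml excerpt above.
final class ErrorCodeTableSketch {
    private final int[] duplicateKeyCodes;
    private final int[] dataAccessResourceFailureCodes;

    ErrorCodeTableSketch(int[] duplicateKeyCodes, int[] dataAccessResourceFailureCodes) {
        // Arrays.binarySearch requires sorted input, so sort defensive copies.
        this.duplicateKeyCodes = duplicateKeyCodes.clone();
        this.dataAccessResourceFailureCodes = dataAccessResourceFailureCodes.clone();
        Arrays.sort(this.duplicateKeyCodes);
        Arrays.sort(this.dataAccessResourceFailureCodes);
    }

    static ErrorCodeTableSketch mysql() {
        return new ErrorCodeTableSketch(new int[] {1062}, new int[] {1});
    }

    static ErrorCodeTableSketch oracle() {
        return new ErrorCodeTableSketch(new int[] {1}, new int[] {17002, 17447});
    }

    // Returns the simple name of the exception type this table selects
    // for the given vendor error code.
    String classify(int errorCode) {
        if (Arrays.binarySearch(duplicateKeyCodes, errorCode) >= 0) {
            return "DuplicateKeyException";
        }
        if (Arrays.binarySearch(dataAccessResourceFailureCodes, errorCode) >= 0) {
            return "DataAccessResourceFailureException";
        }
        return "unclassified";
    }

    public static void main(String[] args) {
        int oraUniqueViolation = 1; // vendor code of ORA-00001
        System.out.println("Oracle table: " + oracle().classify(oraUniqueViolation));
        System.out.println("MySQL table:  " + mysql().classify(oraUniqueViolation));
    }
}
```

The same vendor code 1 lands in a different category depending on which table is consulted, which is why choosing the wrong table changes the exception type rather than just losing detail.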

Confirming the problem

That is only a hint, though; the goal is to confirm the bug and fix it. So we keep reading. The next question is how the sqlErrorCodes field in SQLErrorCodeSQLExceptionTranslator is initialized. Here is a trimmed-down version of the class:

    /** Error codes used by this translator */
    private SQLErrorCodes sqlErrorCodes;

    public SQLErrorCodeSQLExceptionTranslator(DataSource dataSource) {
        this();
        setDataSource(dataSource);
    }

    public void setDataSource(DataSource dataSource) {
        this.sqlErrorCodes = SQLErrorCodesFactory.getInstance().getErrorCodes(dataSource);
    }

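The logic error is already visible here: sqlErrorCodes is derived from the dataSource, but our dataSource is a DynamicDataSource, from which the database type cannot be known directly. Let's look further at what SQLErrorCodesFactory.getInstance().getErrorCodes does. Here is the code: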

    public SQLErrorCodes getErrorCodes(DataSource dataSource) {
        Assert.notNull(dataSource, "DataSource must not be null");
        if (logger.isDebugEnabled()) {
            logger.debug("Looking up default SQLErrorCodes for DataSource [" + dataSource + "]");
        }

        synchronized (this.dataSourceCache) {
            // Let's avoid looking up database product info if we can.
            SQLErrorCodes sec = this.dataSourceCache.get(dataSource);
            if (sec != null) {
                if (logger.isDebugEnabled()) {
                    logger.debug("SQLErrorCodes found in cache for DataSource [" +
                            dataSource.getClass().getName() + '@' + Integer.toHexString(dataSource.hashCode()) + "]");
                }
                return sec;
            }
            // We could not find it - got to look it up.
            try {
                String dbName = (String) JdbcUtils.extractDatabaseMetaData(dataSource, "getDatabaseProductName");
                if (dbName != null) {
                    if (logger.isDebugEnabled()) {
                        logger.debug("Database product name cached for DataSource [" +
                                dataSource.getClass().getName() + '@' + Integer.toHexString(dataSource.hashCode()) +
                                "]: name is '" + dbName + "'");
                    }
                    sec = getErrorCodes(dbName);
                    this.dataSourceCache.put(dataSource, sec);
                    return sec;
                }
            }
            catch (MetaDataAccessException ex) {
                logger.warn("Error while extracting database product name - falling back to empty error codes", ex);
            }
        }

        // Fallback is to return an empty SQLErrorCodes instance.
        return new SQLErrorCodes();
    }

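There are two key points here: JdbcUtils.extractDatabaseMetaData and dataSourceCache. Look at dataSourceCache first: the result is cached with the DataSource object itself as the key, and that settles the matter. The SQLErrorCodes needed for each call must be determined from the Connection the dataSource actually hands out at that moment, so a cache keyed by the dataSource cannot work with a routing data source. The cause is now confirmed; the next step is to decide how to fix it.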
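
The effect of that cache can be shown with a small model. This is a sketch, not Spring code: StaleCacheSketch, RoutingSource, cachedLookup and freshLookup are hypothetical names, and the product name string stands in for the SQLErrorCodes that Spring would resolve from it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a routing data source and of a cache keyed by it.
final class StaleCacheSketch {

    // Stands in for DynamicDataSource: one object whose effective database
    // can change between calls.
    static final class RoutingSource {
        private String currentProduct;

        RoutingSource(String initialProduct) {
            this.currentProduct = initialProduct;
        }

        void routeTo(String product) {
            this.currentProduct = product;
        }

        // Plays the role of DatabaseMetaData.getDatabaseProductName().
        String productName() {
            return currentProduct;
        }
    }

    // Plays the role of dataSourceCache: keyed by the data source object itself.
    private final Map<RoutingSource, String> cache = new HashMap<>();

    // Mirrors getErrorCodes(DataSource): the first answer is stored and reused.
    String cachedLookup(RoutingSource source) {
        return cache.computeIfAbsent(source, RoutingSource::productName);
    }

    // Mirrors the idea of the fix: ask the source on every call.
    String freshLookup(RoutingSource source) {
        return source.productName();
    }

    public static void main(String[] args) {
        StaleCacheSketch sketch = new StaleCacheSketch();
        RoutingSource dynamic = new RoutingSource("MySQL");
        sketch.cachedLookup(dynamic);   // first lookup happens while routed to MySQL
        dynamic.routeTo("Oracle");      // later the same object routes to Oracle
        System.out.println("cached: " + sketch.cachedLookup(dynamic));
        System.out.println("fresh:  " + sketch.freshLookup(dynamic));
    }
}
```

Because the routing object never changes identity, whichever database it pointed to on the first lookup decides the cached answer for every later call.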

Fixing the problem

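Patching the Spring source directly is not an option. Since SqlSessionTemplate (from mybatis-spring) is the tool we use to work with the database, it is the natural place to hook in. It has the following constructor; by using it, we can simply pass in our own PersistenceExceptionTranslator.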

public SqlSessionTemplate(SqlSessionFactory sqlSessionFactory, ExecutorType executorType,
      PersistenceExceptionTranslator exceptionTranslator)

The relevant code follows:
spring_dataSource.xml

    <bean id="sqlSessionFactory" class="org.mybatis.spring.SqlSessionFactoryBean">
        <property name="dataSource" ref="dynamicDataSource"/>
        <property name="mapperLocations" value="classpath:mapper/*.xml"/>
    </bean>

    <bean id="myBatisExceptionTranslator" class="xxx.multidatasource.plugin.MyBatisExceptionTranslator">
        <constructor-arg ref="sqlSessionFactory"/>
    </bean>

    <bean id="sqlSessionTemplate" class="org.mybatis.spring.SqlSessionTemplate">
        <constructor-arg index="0" ref="sqlSessionFactory" />
        <constructor-arg index="1" value="SIMPLE" />
        <constructor-arg index="2" ref="myBatisExceptionTranslator" />
    </bean>

    <bean class="org.mybatis.spring.mapper.MapperScannerConfigurer">
        <property name="basePackage" value="..." />
        <property name="sqlSessionTemplateBeanName" value="sqlSessionTemplate"/>
    </bean>

MyBatisExceptionTranslator.java

// This class is copied directly from the mybatis-spring implementation
public class MyBatisExceptionTranslator implements PersistenceExceptionTranslator {

    private final DataSource dataSource;

    private SQLExceptionTranslator exceptionTranslator;


    public MyBatisExceptionTranslator(SqlSessionFactory sqlSessionFactory) {
        this.dataSource = sqlSessionFactory.getConfiguration().getEnvironment().getDataSource();

        this.initExceptionTranslator();
    }

    /**
     * {@inheritDoc}
     */
    public DataAccessException translateExceptionIfPossible(RuntimeException e) {
        if (e instanceof PersistenceException) {
            // Batch exceptions come inside another PersistenceException
            // recursion has a risk of infinite loop so better make another if
            if (e.getCause() instanceof PersistenceException) {
                e = (PersistenceException) e.getCause();
            }
            if (e.getCause() instanceof SQLException) {
                this.initExceptionTranslator();
                return this.exceptionTranslator.translate(e.getMessage() + "\n", null, (SQLException) e.getCause());
            }
            return new MyBatisSystemException(e);
        }
        return null;
    }

    /**
     * Initializes the internal translator reference.
     */
    private synchronized void initExceptionTranslator() {
        if (this.exceptionTranslator == null) {
            // Changed to use our custom DacSQLErrorCodeSQLExceptionTranslator
            this.exceptionTranslator = new DacSQLErrorCodeSQLExceptionTranslator(this.dataSource);
        }
    }

}

DacSQLErrorCodeSQLExceptionTranslator.java

// The idea is simple: resolve sqlErrorCodes on every translate call, then delegate to the original SQLErrorCodeSQLExceptionTranslator logic
public class DacSQLErrorCodeSQLExceptionTranslator implements SQLExceptionTranslator {

    protected final Log logger = LogFactory.getLog(this.getClass());

    private DataSource dataSource;

    public DacSQLErrorCodeSQLExceptionTranslator(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public DataAccessException translate(String task, String sql, SQLException ex) {
        String dbName = null;
        try {
            dbName = (String) JdbcUtils.extractDatabaseMetaData(dataSource, "getDatabaseProductName");
            if (dbName != null) {
                if (logger.isDebugEnabled()) {
                    logger.debug("Database product name resolved for DataSource [" +
                            dataSource.getClass().getName() + '@' + Integer.toHexString(dataSource.hashCode()) +
                            "]: name is '" + dbName + "'");
                }
            }
        } catch (MetaDataAccessException mdaEx) {
            logger.warn("Error while extracting database product name - falling back to empty error codes", mdaEx);
        }

        SQLErrorCodes sqlErrorCodes = (dbName == null) ? new SQLErrorCodes()
                : SQLErrorCodesFactory.getInstance().getErrorCodes(dbName);

        return new SQLErrorCodeSQLExceptionTranslator(sqlErrorCodes).translate(task, sql, ex);
    }
}
