MySQL批处理之rewriteBatchedStatements

网上很多文章,都说MySQL驱动并没有实现"真正的"batchUpdate,执行的时候还是一条一条按顺序将SQL发送到MySQL服务器,其实这是错误的。

先贴一段源码(基于MySQL 5.1.40驱动),执行batchUpdate的时候最终执行如下方法:executeBatchInternal

protected long[] executeBatchInternal() throws SQLException {
    synchronized (checkClosed().getConnectionMutex()) {

        if (this.connection.isReadOnly()) {
            throw new SQLException(Messages.getString("PreparedStatement.25") + Messages.getString("PreparedStatement.26"),
                    SQLError.SQL_STATE_ILLEGAL_ARGUMENT);
        }

        if (this.batchedArgs == null || this.batchedArgs.size() == 0) {
            return new long[0];
        }

        // we timeout the entire batch, not individual statements
        int batchTimeout = this.timeoutInMillis;
        this.timeoutInMillis = 0;

        resetCancelledState();

        try {
            statementBegins();

            clearWarnings();

            if (!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()) {

                if (canRewriteAsMultiValueInsertAtSqlLevel()) {
                    return executeBatchedInserts(batchTimeout);
                }

                if (this.connection.versionMeetsMinimum(4, 1, 0) && !this.batchHasPlainStatements && this.batchedArgs != null
                        && this.batchedArgs.size() > 3 /* cost of option setting rt-wise */) {
                    return executePreparedBatchAsMultiStatement(batchTimeout);
                }
            }

            return executeBatchSerially(batchTimeout);
        } finally {
            this.statementExecuting.set(false);

            clearBatch();
        }
    }
}

为了测试网上文章说法的正误,本地测试写了一个batch批量执行,跟踪源码的时候发现最后进入到

return executeBatchSerially(batchTimeout);

该方法的javadoc 如是说:

Executes the current batch of statements by executing them one-by-one.

executeBatchSerially核心源码如下:

for (this.batchCommandIndex = 0; this.batchCommandIndex < nbrCommands; this.batchCommandIndex++) {
    Object arg = this.batchedArgs.get(this.batchCommandIndex);

    try {
        if (arg instanceof String) {
            updateCounts[this.batchCommandIndex] = executeUpdateInternal((String) arg, true, this.retrieveGeneratedKeys);

            // limit one generated key per OnDuplicateKey statement
            getBatchedGeneratedKeys(this.results.getFirstCharOfQuery() == 'I' && containsOnDuplicateKeyInString((String) arg) ? 1 : 0);
        } else {
            BatchParams paramArg = (BatchParams) arg;
            //核心代码,for循环执行每一条SQL
            updateCounts[this.batchCommandIndex] = executeUpdateInternal(paramArg.parameterStrings, paramArg.parameterStreams,
                    paramArg.isStream, paramArg.streamLengths, paramArg.isNull, true);

            // limit one generated key per OnDuplicateKey statement
            getBatchedGeneratedKeys(containsOnDuplicateKeyUpdateInSQL() ? 1 : 0);
        }
    } catch (SQLException ex) {
        updateCounts[this.batchCommandIndex] = EXECUTE_FAILED;

        if (this.continueBatchOnError && !(ex instanceof MySQLTimeoutException) && !(ex instanceof MySQLStatementCancelledException)
                && !hasDeadlockOrTimeoutRolledBackTx(ex)) {
            sqlEx = ex;
        } else {
            long[] newUpdateCounts = new long[this.batchCommandIndex];
            System.arraycopy(updateCounts, 0, newUpdateCounts, 0, this.batchCommandIndex);

            throw SQLError.createBatchUpdateException(ex, newUpdateCounts, getExceptionInterceptor());
        }
    }
}

通过代码分析,也确实是一条一条SQL执行,而不是把batch的SQL发送到服务器

但是

重点来了,执行executeBatchSerially是有条件的,再次贴一下源码:

if (!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()) {
    ...
}

return executeBatchSerially(batchTimeout);

也就是说,如果没做任何配置,默认情况下if条件是进不去的,会直接执行if块后边的

return executeBatchSerially(batchTimeout);

那,我们来看一下if条件是什么:

!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()

batchHasPlainStatements默认初始化就是false,可以不用管,重点是connection.getRewriteBatchedStatements()。这个是Connection的一个参数rewriteBatchedStatements,会在读取jdbcUrl的时候读取进来:jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true

我们将jdbcUrl添加一个rewriteBatchedStatements试试,即变成:jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&rewriteBatchedStatements=true

这时候就会进if块了。进入if块之后,再根据执行的是insert 还是update、 delete,会走不同方法

  • 如果是insert语句,满成条件情况下,会整合成形如:"insert into xxx_table values (xx),(yy),(zz)..."这样的语句
  • 如果是update\delete语句,满成条件情况下,会整合成形如:"update t set … where id = 1; update t set … where id = 2; update t set … where id = 3 ..."这样的语句

然后分批次发送给MySQL(会有一次发送的package大小限制,所以需要拆分批次)

int maxAllowedPacket = this.connection.getMaxAllowedPacket();

if (!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()) {
    //insert会在这里进行再次判断
    if (canRewriteAsMultiValueInsertAtSqlLevel()) {
        return executeBatchedInserts(batchTimeout);
    }
    //update、delete会在这里进行再次判断
    //1. mysql版本>=4.1.0
    //2. batchHasPlainStatements为false
    //3. batch的数量>3
    if (this.connection.versionMeetsMinimum(4, 1, 0) && !this.batchHasPlainStatements && this.batchedArgs != null
            && this.batchedArgs.size() > 3 /* cost of option setting rt-wise */) {
        return executePreparedBatchAsMultiStatement(batchTimeout);
    }
}

return executeBatchSerially(batchTimeout);

在这里总结一下,如果想要达到MySQL真正batchUpdate效果,需要有以下几个条件:

  1. 需要在jdbcUrl后添加参数rewriteBatchedStatements=true
  2. this.batchHasPlainStatements 为false
  3. 如果是update \ delete 语句,还需要mysql版本>=4.1.0,并且batch的数量>3

因此,如果可能的情况下,请在jdbcUrl后添加参数rewriteBatchedStatements=true,尽可能利用上MySQL给我们提供的便利,提高性能。

原文地址:
MySQL之rewriteBatchedStatements

你可能感兴趣的:(MySQL批处理之rewriteBatchedStatements)