数据库链接长时间无数据交互,发生线程阻塞情况

背景:

 在执行双机房部署的时候,因为应用长时间未访问数据库,导致后面访问的数据库的线程都被挂起。

 

现象分析:

"Thread-74" daemon prio=10 tid=0x00007f1840044000 nid=0x387b runnable [0x00007f18bdb27000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at oracle.net.ns.Packet.receive(Unknown Source)
        at oracle.net.ns.DataPacket.receive(Unknown Source)
        at oracle.net.ns.NetInputStream.getNextPacket(Unknown Source)
        at oracle.net.ns.NetInputStream.read(Unknown Source)
        at oracle.net.ns.NetInputStream.read(Unknown Source)
        at oracle.net.ns.NetInputStream.read(Unknown Source)
        at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1109)
        at oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1080)
        at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:485)
        at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:210)
        at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:804)
        at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1051)
        at oracle.jdbc.driver.T4CStatement.executeMaybeDescribe(T4CStatement.java:845)
        at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1156)
        at oracle.jdbc.driver.OracleStatement.executeQuery(OracleStatement.java:1315)
        - locked <0x00000000be528248> (a oracle.jdbc.driver.T4CStatement)
        - locked <0x00000000c4d1b000> (a oracle.jdbc.driver.T4CConnection)
        at oracle.jdbc.driver.PhysicalConnection.doPingDatabase(PhysicalConnection.java:4614)
        at oracle.jdbc.driver.PhysicalConnection$1.run(PhysicalConnection.java:4590)
        at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:

DubboServerHandler-172.21.55.25:20883-thread-13" daemon prio=10 tid=0x00007f1828012800 nid=0x3867 in Object.wait() [0x00007f18be305000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000be51e0b0> (a java.lang.Thread)
        at java.lang.Thread.join(Thread.java:1194)
        - locked <0x00000000be51e0b0> (a java.lang.Thread)
        at oracle.jdbc.driver.PhysicalConnection.pingDatabase(PhysicalConnection.java:4596)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at com.alibaba.druid.pool.vendor.OracleValidConnectionChecker.isValidConnection(OracleValidConnectionChecker.java:94)
        at com.alibaba.druid.pool.DruidAbstractDataSource.testConnectionInternal(DruidAbstractDataSource.java:1252)
        at com.alibaba.druid.pool.DruidDataSource.getConnectionDirect(DruidDataSource.java:954)
        at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:921)
        at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:911)
        at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:98)
        at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:113)
        at org.springframework.jdbc.datasource.TransactionAwareDataSourceProxy$TransactionAwareInvocationHandler.invoke(TransactionAwareDataSourceProxy.java:210)
        at $Proxy25.toString(Unknown Source)
        at java.lang.String.valueOf(String.java:2826)
        at java.lang.StringBuffer.append(StringBuffer.java:219)
        - locked <0x00000000be4cd210> (a java.lang.StringBuffer)
        at org.springframework.orm.ibatis.SqlMapClientTemplate.execute(SqlMapClientTemplate.java:194)
        at org.springframework.orm.ibatis.SqlMapClientTemplate.executeWithListResult(SqlMapClientTemplate.java:249)
        at org.springframework.orm.ibatis.SqlMapClientTemplate.queryForList(SqlMapClientTemplate.java:296)
        at com.lianpay.bank.mng.dao.impl.BaseIbatisDAOImpl.queryAll(BaseIbatisDAOImpl.java:123)
        at com.lianpay.bank.mng.service.impl.QueryBankListServiceImpl.queryPayTypeList(QueryBankListServiceImpl.java:319)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

    我们可以发现线程thread-13一直被阻塞在pingDatabase方法调用中,而线程Thread-74一直卡在socketRead0,初步分析是没有设置socket timeout时间。

解决:

参考文档:

http://agapple.iteye.com/blog/772507 

http://www.importnew.com/2466.html

http://agapple.iteye.com/blog/1024508

查看上面两个参考文档之后,发现需要配置validationQueryTimeout,oracle.net.CONNECT_TIMEOUT=3000&oracle.net.READ_TIMEOUT=60000这几个参数。

 

源码分析:

问题1:我们设置了maxWait参数,为什么获取链接的时候未超时,报异常

问题2:我们设置了validationQuery,为什么没有定期发送ping包过去保活

 

   一: maxWait参数主要作用在getConnectionInternal,而我们现在是卡在testConnectionInternal中,表示是已经获取到connection,但是卡在验证connection的有效性中。

  public DruidPooledConnection getConnectionDirect(long maxWaitMillis) throws SQLException {
        int notFullTimeoutRetryCnt = 0;
        for (;;) {
            // handle notFullTimeoutRetry
            DruidPooledConnection poolableConnection;
            try {
                poolableConnection = getConnectionInternal(maxWaitMillis);
            } catch (GetConnectionTimeoutException ex) {
                if (notFullTimeoutRetryCnt <= this.notFullTimeoutRetryCount && !isFull()) {
                    notFullTimeoutRetryCnt++;
                    if (LOG.isWarnEnabled()) {
                        LOG.warn("not full timeout retry : " + notFullTimeoutRetryCnt);
                    }
                    continue;
                }
                throw ex;
            }

            if (isTestOnBorrow()) {
                boolean validate = testConnectionInternal(poolableConnection.getConnection());
                if (!validate) {
                    if (LOG.isDebugEnabled()) {
                        LOG.debug("skip not validate connection.");
                    }

                    Connection realConnection = poolableConnection.getConnection();
                    discardConnection(realConnection);
                    continue;
                }
 }

   

    2. validationQuery不是用来保证socket长链接不断开的。validationQuery的调用有两处,一是创建链接的时候,二是获取到链接的时候,会验证有效性。

 protected boolean testConnectionInternal(Connection conn) {
        String sqlFile = JdbcSqlStat.getContextSqlFile();
        String sqlName = JdbcSqlStat.getContextSqlName();

        if (sqlFile != null) {
            JdbcSqlStat.setContextSqlFile(null);
        }
        if (sqlName != null) {
            JdbcSqlStat.setContextSqlName(null);
        }
        try {
            if (validConnectionChecker != null) {
                return validConnectionChecker.isValidConnection(conn, validationQuery, validationQueryTimeout);
            }

            if (conn.isClosed()) {
                return false;
            }

            if (null == validationQuery) {
                return true;
            }

            Statement stmt = null;
            ResultSet rset = null;
            try {
                stmt = conn.createStatement();
                if (getValidationQueryTimeout() > 0) {
                    stmt.setQueryTimeout(validationQueryTimeout);
                }
                rset = stmt.executeQuery(validationQuery);
                if (!rset.next()) {
                    return false;
                }
            } finally {
                JdbcUtils.close(rset);
                JdbcUtils.close(stmt);
            }

            return true;
        } catch (Exception ex) {
            // skip
            return false;
        } finally {
            if (sqlFile != null) {
                JdbcSqlStat.setContextSqlFile(sqlFile);
            }
            if (sqlName != null) {
                JdbcSqlStat.setContextSqlName(sqlName);
            }
        }
    }

    如果socket链接断开不可用,也不会导致应用异常,druid框架会不断的重试,抛弃不可用链接,直到拿到可用的链接

 public DruidPooledConnection getConnectionDirect(long maxWaitMillis) throws SQLException {
        int notFullTimeoutRetryCnt = 0;
        for (;;) {
            // handle notFullTimeoutRetry
            DruidPooledConnection poolableConnection;
            try {
                poolableConnection = getConnectionInternal(maxWaitMillis);
            } catch (GetConnectionTimeoutException ex) {
                if (notFullTimeoutRetryCnt <= this.notFullTimeoutRetryCount && !isFull()) {
                    notFullTimeoutRetryCnt++;
                    if (LOG.isWarnEnabled()) {
                        LOG.warn("not full timeout retry : " + notFullTimeoutRetryCnt);
                    }
                    continue;
                }
                throw ex;
            }

            if (isTestOnBorrow()) {
                boolean validate = testConnectionInternal(poolableConnection.getConnection());
                if (!validate) {
                    if (LOG.isDebugEnabled()) {
                        LOG.debug("skip not validate connection.");
                    }

                    Connection realConnection = poolableConnection.getConnection();
                    discardConnection(realConnection);
                    continue;
                }
            } else {
                Connection realConnection = poolableConnection.getConnection();
                if (realConnection.isClosed()) {
                    discardConnection(null); // 传入null,避免重复关闭
                    continue;
                }

                if (isTestWhileIdle()) {
                    final long currentTimeMillis = System.currentTimeMillis();
                    final long lastActiveTimeMillis = poolableConnection.getConnectionHolder().getLastActiveTimeMillis();
                    final long idleMillis = currentTimeMillis - lastActiveTimeMillis;
                    long timeBetweenEvictionRunsMillis = this.getTimeBetweenEvictionRunsMillis();
                    if (timeBetweenEvictionRunsMillis <= 0) {
                        timeBetweenEvictionRunsMillis = DEFAULT_TIME_BETWEEN_EVICTION_RUNS_MILLIS;
                    }

                    if (idleMillis >= timeBetweenEvictionRunsMillis) {
                        boolean validate = testConnectionInternal(poolableConnection.getConnection());
                        if (!validate) {
                            if (LOG.isDebugEnabled()) {
                                LOG.debug("skip not validate connection.");
                            }

                            discardConnection(realConnection);
                            continue;
                        }
                    }
                }
            }

            if (isRemoveAbandoned()) {
                StackTraceElement[] stackTrace = Thread.currentThread().getStackTrace();
                poolableConnection.setConnectStackTrace(stackTrace);
                poolableConnection.setConnectedTimeNano();
                poolableConnection.setTraceEnable(true);

                synchronized (activeConnections) {
                    activeConnections.put(poolableConnection, PRESENT);
                }
            }

            if (!this.isDefaultAutoCommit()) {
                poolableConnection.setAutoCommit(false);
            }

            return poolableConnection;
        }
    }

 

 

 

 

 

你可能感兴趣的:(数据库链接长时间无数据交互,发生线程阻塞情况)