Failed to validate connection com.mysql.cj.jdbc.ConnectionImpl问题排查

整个问题排查过程中基本把Hikari连接池源码看完了,也发现了Hikari内部的一些实现细节,更多细节参见:Hikari 数据库连接池内部源码实现细节

问题描述及排查

最近发布项目测试环境后,发现后台经常隔28分钟时间就打印mysql 连接已经被关闭警告。(应用程序每28分钟访问一次数据库)

数据库: mysql 5.7
数据库驱动:mysql-connector-java-8.0.23
连接池:HikariCP-3.4.5
SpringBoot 2.4.3

连接池配置如下:

spring.datasource.username=${name}
spring.datasource.password=${pwd}
spring.datasource.url=jdbc:mysql://${ip}:3306/${dbname}?useUnicode=true&characterEncoding=utf-8&useSSL=false&serverTimezone=Asia/Shanghai

spring.datasource.type=com.zaxxer.hikari.HikariDataSource
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.datasource.hikari.minimum-idle=10
spring.datasource.hikari.maximum-pool-size=20
spring.datasource.hikari.connection-test-query=SELECT 1 FROM DUAL
spring.datasource.hikari.validation-timeout=10000
spring.datasource.hikari.auto-commit=true

警告信息如下:

2021-03-19 13:25:58.486  WARN 21806 --- [scheduling-1] com.zaxxer.hikari.pool.PoolBase          : 
slave - Failed to validate connection com.mysql.cj.jdbc.ConnectionImpl@2e2ce118 (No operations allowed after connection closed.). 
Possibly consider using a shorter maxLifetime value.

虽然这只是一个警告,并不影响程序执行,但还是很好奇想要排查一下具体原因。

warn信息的描述,程序通过已经关闭的连接来访问数据数据库,这是不允许的操作,并且警告中提醒减小maxLifetime的值来解决这个问题。

看到这个警告之后,笔者首先想到的就是mysql 8小时主动断开问题:如果hikari的maxLifetime的值超过了mysql默认的wait_timeout设置的时间28800s (8小时),实际上mysql已经主动关闭了连接。但示例中并没有配置hikari的maxLifetime选项,采用的是默认。查看hikari的源码发现了如下几个默认配置
com.zaxxer.hikari.HikariConfig源码中的类变量

private static final long CONNECTION_TIMEOUT = SECONDS.toMillis(30);  30s
private static final long VALIDATION_TIMEOUT = SECONDS.toMillis(5);   5s
private static final long IDLE_TIMEOUT = MINUTES.toMillis(10);        20分钟
private static final long MAX_LIFETIME = MINUTES.toMillis(30);        30分钟
private static final int DEFAULT_POOL_SIZE = 10;

hikari中默认连接存活的最大时间是30分钟,小于mysql默认的wait_timeout28800s (8小时),按道理不会是服务端主动主动关闭连接导致。
那会不会是mysql的wait_timeout被修改过??? 通过如下语句查询发现并没有被修改。

SHOW VARIABLES LIKE '%timeout%'

Failed to validate connection com.mysql.cj.jdbc.ConnectionImpl问题排查_第1张图片
当排除这个常规问题后,笔者尝试搜索了有没有其它可能导致这个警告,最终还是有一个小发现:
搜索到的案例是:虽然mysql服务端通过SHOW VARIABLES LIKE '%timeout%'查出来的时间28800s,但是实际上,/etc/my.cnf中配置的超时时间更小,导致mysql服务端主动关闭连接。
但是经过和dba确认,也排除了这个情况。
Failed to validate connection com.mysql.cj.jdbc.ConnectionImpl问题排查_第2张图片

既然以上两种情况都排除了,那只能深入源码,在测试环境远程断点调试。先搜索下警告打印的位置,只有一处在isConnectionAlive方法中.

源码如下: com.zaxxer.hikari.pool.PoolBase#isConnectionAlive

boolean isConnectionAlive(final Connection connection)
{
  try {
	 try {
		setNetworkTimeout(connection, validationTimeout);

		final int validationSeconds = (int) Math.max(1000L, validationTimeout) / 1000;

		if (isUseJdbc4Validation) {
		   return connection.isValid(validationSeconds);
		}

		try (Statement statement = connection.createStatement()) {
		   if (isNetworkTimeoutSupported != TRUE) {
			  setQueryTimeout(statement, validationSeconds);
		   }

		   statement.execute(config.getConnectionTestQuery());
		}
	 }
	 finally {
		setNetworkTimeout(connection, networkTimeout);

		if (isIsolateInternalQueries && !isAutoCommit) {
		   connection.rollback();
		}
	 }

	 return true;
  }
  catch (Exception e) {
	 lastConnectionFailure.set(e);
	 logger.warn("{} - Failed to validate connection {} ({}). Possibly consider using a shorter maxLifetime value.",
				 poolName, connection, e.getMessage());
	 return false;
  }
}

该方法用于检测connection的有效性,catch到异常后,只是简单打印文章开头所提的异常警告信息,并返回false,表示连接失效。然后会继续从连接池中获取有效的连接。

获取连接的代码如下: com.zaxxer.hikari.pool.HikariPool#getConnection(long hardTimeout)

public Connection getConnection(final long hardTimeout) throws SQLException
{
  suspendResumeLock.acquire();
  final long startTime = currentTime();

  try {
	 long timeout = hardTimeout;
	 do {
		PoolEntry poolEntry = connectionBag.borrow(timeout, MILLISECONDS);
		if (poolEntry == null) {
		   break; // We timed out... break and throw exception
		}

		final long now = currentTime();
		if (poolEntry.isMarkedEvicted() || (elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs && !isConnectionAlive(poolEntry.connection))) {
		   closeConnection(poolEntry, poolEntry.isMarkedEvicted() ? EVICTED_CONNECTION_MESSAGE : DEAD_CONNECTION_MESSAGE);
		   timeout = hardTimeout - elapsedMillis(startTime);
		}
		else {
		   metricsTracker.recordBorrowStats(poolEntry, startTime);
		   return poolEntry.createProxyConnection(leakTaskFactory.schedule(poolEntry), now);
		}
	 } while (timeout > 0L);

	 metricsTracker.recordBorrowTimeoutStats(startTime);
	 throw createTimeoutException(startTime);
  }
  catch (InterruptedException e) {
	 Thread.currentThread().interrupt();
	 throw new SQLException(poolName + " - Interrupted during connection acquisition", e);
  }
  finally {
	 suspendResumeLock.release();
  }
}

为了打印更多的异常栈信息,笔者修改了isConnectionAlive方法源码,在catch到异常后,打印整个异常栈信息,最终得到如下日志:

2021-03-19 13:25:58.486  WARN 21806 --- [scheduling-1] com.zaxxer.hikari.pool.PoolBase          : slave - Failed to validate connection com.mysql.cj.jdbc.ConnectionImpl@2e2ce118 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2021-03-19 13:25:58.489 ERROR 21806 --- [scheduling-1] com.zaxxer.hikari.pool.PoolBase          : poolBase:

java.sql.SQLNonTransientConnectionException: No operations allowed after connection closed.
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:110) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:89) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:63) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:73) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:73) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.jdbc.ConnectionImpl.setNetworkTimeout(ConnectionImpl.java:2495) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:563) ~[classes!/:na]
	at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:173) ~[classes!/:na]
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:186) [HikariCP-3.4.5.jar!/:na]
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162) [HikariCP-3.4.5.jar!/:na]
	at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:100) [HikariCP-3.4.5.jar!/:na]
	at org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource.getConnection(AbstractRoutingDataSource.java:194) [spring-jdbc-5.3.4.jar!/:5.3.4]
	at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:158) [spring-jdbc-5.3.4.jar!/:5.3.4]
	at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:116) [spring-jdbc-5.3.4.jar!/:5.3.4]
	at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:79) [spring-jdbc-5.3.4.jar!/:5.3.4]
	.........................
Caused by: com.mysql.cj.exceptions.ConnectionIsClosedException: No operations allowed after connection closed.
	at sun.reflect.GeneratedConstructorAccessor32.newInstance(Unknown Source) ~[na:na]
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_91]
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_91]
	at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:61) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:105) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:151) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.NativeSession.checkClosed(NativeSession.java:1171) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.jdbc.ConnectionImpl.checkClosed(ConnectionImpl.java:573) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.jdbc.ConnectionImpl.setNetworkTimeout(ConnectionImpl.java:2491) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	... 44 common frames omitted
Caused by: com.mysql.cj.exceptions.CJCommunicationsException: Communications link failure

The last packet successfully received from the server was 1,758,048 milliseconds ago. The last packet sent successfully to the server was 1,758,049 milliseconds ago.
	at sun.reflect.GeneratedConstructorAccessor31.newInstance(Unknown Source) ~[na:na]
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_91]
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_91]
	at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:61) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:105) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:151) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.exceptions.ExceptionFactory.createCommunicationsException(ExceptionFactory.java:167) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.protocol.a.NativeProtocol.send(NativeProtocol.java:577) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.protocol.a.NativeProtocol.sendCommand(NativeProtocol.java:632) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.protocol.a.NativeProtocol.sendQueryPacket(NativeProtocol.java:948) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.protocol.a.NativeProtocol.sendQueryString(NativeProtocol.java:894) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.NativeSession.execSQL(NativeSession.java:1073) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.jdbc.StatementImpl.executeInternal(StatementImpl.java:722) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.jdbc.StatementImpl.execute(StatementImpl.java:646) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:169) ~[classes!/:na]
	... 42 common frames omitted
Caused by: java.net.SocketException: Broken pipe
	at java.net.SocketOutputStream.socketWrite0(Native Method) ~[na:1.8.0_91]
	at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109) ~[na:1.8.0_91]
	at java.net.SocketOutputStream.write(SocketOutputStream.java:153) ~[na:1.8.0_91]
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[na:1.8.0_91]
	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) ~[na:1.8.0_91]
	at com.mysql.cj.protocol.a.SimplePacketSender.send(SimplePacketSender.java:55) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.protocol.a.TimeTrackingPacketSender.send(TimeTrackingPacketSender.java:50) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	at com.mysql.cj.protocol.a.NativeProtocol.send(NativeProtocol.java:568) ~[mysql-connector-java-8.0.23.jar!/:8.0.23]
	... 49 common frames omitted

从最后的Caused by: java.net.SocketException: Broken pipe 似乎只能看出在发送数据时socket已经被断开。

于是想到是不是socket的connectionTimesocketTimeout太短,网络较差导致发送sql请求时超时???

又是一通debug之后,深入到mysql-connector-java客户端源码中,最终发现了如下两段重要的代码,用于初始化mysql的连接。
其中成员变量propertySet则保存了全部的连接参数以及数据连接配置参数。

com.mysql.cj.NativeSession#connect源码
Failed to validate connection com.mysql.cj.jdbc.ConnectionImpl问题排查_第3张图片

com.mysql.cj.protocol.StandardSocketFactory#configureSocket源码
Failed to validate connection com.mysql.cj.jdbc.ConnectionImpl问题排查_第4张图片

com.mysql.cj.CoreSession#propertySet中保存的连接配置属性如下:

result = {Properties@10050}  size = 167
 "useCompression" -> "false"
 "maxRows" -> "-1"
 "tcpTrafficClass" -> "0"
 "allowNanAndInf" -> "false"
 "databaseTerm" -> "CATALOG"
 "sendFractionalSeconds" -> "true"
 "loadBalanceBlocklistTimeout" -> "0"
 "tcpKeepAlive" -> "true"
 "allowReplicaDownConnections" -> "false"
 "maxQuerySizeToLog" -> "2048"
 "verifyServerCertificate" -> "false"
 "cacheServerConfiguration" -> "false"
 "dontCheckOnDuplicateKeyUpdateInSQL" -> "false"
 "paranoid" -> "false"
 "allowSourceDownConnections" -> "false"
 "xdevapi.ssl-truststore-type" -> "JKS"
 "fallbackToSystemTrustStore" -> "true"
 "enableQueryTimeouts" -> "true"
 "blobSendChunkSize" -> "1048576"
 "useNanosForElapsedTime" -> "false"
 "readOnlyPropagatesToServer" -> "true"
 "profilerEventHandler" -> "com.mysql.cj.log.LoggingProfilerEventHandler"
 "secondsBeforeRetrySource" -> "30"
 "parseInfoCacheFactory" -> "com.mysql.cj.PerConnectionLRUFactory"
 "includeThreadNamesAsStatementComment" -> "false"
 "xdevapi.compression" -> "PREFERRED"
 "socketTimeout" -> "5000"
 "socksProxyPort" -> "1080"
 "requireSSL" -> "false"
 "largeRowSizeThreshold" -> "2048"
 "xdevapi.fallback-to-system-keystore" -> "true"
 "allowMultiQueries" -> "false"
 "rewriteBatchedStatements" -> "false"
 "enablePacketDebug" -> "false"
 "allowPublicKeyRetrieval" -> "false"
 "noDatetimeStringSync" -> "false"
 "includeThreadDumpInDeadlockExceptions" -> "false"
 "enableEscapeProcessing" -> "true"
 "failOverReadOnly" -> "true"
 "readFromSourceWhenNoReplicas" -> "false"
 "clientCertificateKeyStoreType" -> "JKS"
 "overrideSupportsIntegrityEnhancementFacility" -> "false"
 "disconnectOnExpiredPasswords" -> "true"
 "clobberStreamingResults" -> "false"
 "trustCertificateKeyStoreType" -> "JKS"
 "jdbcCompliantTruncation" -> "false"
 "useLocalTransactionState" -> "false"
 "cacheCallableStmts" -> "false"
 "rollbackOnPooledClose" -> "true"
 "cacheResultSetMetadata" -> "false"
 "loadBalanceValidateConnectionOnSwapServer" -> "false"
 "selfDestructOnPingMaxOperations" -> "0"
 "elideSetAutoCommits" -> "false"
 "loadBalanceHostRemovalGracePeriod" -> "15000"
 "serverConfigCacheFactory" -> "com.mysql.cj.util.PerVmServerConfigCacheFactory"
 "cacheDefaultTimeZone" -> "true"
 "allowLoadLocalInfile" -> "false"
 "loadBalanceAutoCommitStatementThreshold" -> "0"
 "getProceduresReturnsFunctions" -> "true"
 "forceConnectionTimeZoneToSession" -> "false"
 "useInformationSchema" -> "false"
 "reportMetricsIntervalMillis" -> "30000"
 "alwaysSendSetIsolation" -> "true"
 "tinyInt1isBit" -> "true"
 "xdevapi.ssl-keystore-type" -> "JKS"
 "xdevapi.auth" -> "PLAIN"
 "processEscapeCodesForPrepStmts" -> "true"
 "emptyStringsConvertToZero" -> "true"
 "sslMode" -> "DISABLED"
 "xdevapi.connect-timeout" -> "10000"
 "zeroDateTimeBehavior" -> "EXCEPTION"
 "sendFractionalSecondsForTime" -> "true"
 "maxReconnects" -> "3"
 "autoDeserialize" -> "false"
 "callableStmtCacheSize" -> "100"
 "prepStmtCacheSqlLimit" -> "256"
 "resultSetSizeThreshold" -> "100"
 "useSSL" -> "false"
 "metadataCacheSize" -> "50"
 "useHostsInPrivileges" -> "true"
 "tcpRcvBuf" -> "0"
 "dontTrackOpenResources" -> "false"
 "autoGenerateTestcaseScript" -> "false"
 "characterEncoding" -> "utf-8"
 "xdevapi.fallback-to-system-truststore" -> "true"
 "nullDatabaseMeansCurrent" -> "false"
 "useLocalSessionState" -> "false"
 "detectCustomCollations" -> "false"
 "useOldAliasMetadataBehavior" -> "false"
 "autoReconnect" -> "false"
 "autoReconnectForPools" -> "false"
 "traceProtocol" -> "false"
 "maxAllowedPacket" -> "16777216"
 "populateInsertRowWithDefaultValues" -> "false"
 "useUnbufferedInput" -> "true"
 "ignoreNonTxTables" -> "false"
 "useAffectedRows" -> "false"
 "ha.loadBalanceStrategy" -> "random"
 "loadBalancePingTimeout" -> "0"
 "compensateOnDuplicateKeyUpdateCounts" -> "false"

最后发现在验证connection有效性时,connectionTimesocketTimeout设置的就是数据源中配置的connection-timeoutvalidation-timeout,并通过ping命令检测发现,网络并不是很慢,所以网络请求超时这个可能也排除。

调试过程中发现mysql客户端和服务端之间Socket通信,心跳是通过TCP协议中默认的tcpKeepAlive参数来维护心跳的。
通过搜索得知,linux中默认是2小时。

会不会服务器tcp_keepalive_time时间被修改过,但因为网络缓慢,心跳未接收到,导致服务器关闭了连接???
可以通过linux的如下命令查看

cat /proc/sys/net/ipv4/tcp_keepalive_time

在笔者的应用服务器上发现这个时间确实被改成了较短的180s,那么mysql所在的机器,很可能也是被改过。
于是找到dba查看该项的配置。同样是180s,但这个时间也不至于让心跳超时啊!

最终原因: 好在dba询问了一下为什么要查这个参数,然后笔者将警告信息以及排查的过程描述了一下,结果dba那边竟然告诉我有定时任务, 删除4分钟内没有请求的连接,而我的程序访问数据库时间间隔是28分钟一次,所以早就在mysql服务端关闭了(捂脸哭的表情)

那如何让这个告警不打印呢?
既然数据库服务端会在4分钟后删除没有请求的连接,我们可以减小Hikari中maxLifetime的值小于该值,这样达到maxLifetime时Hikari连接池会通过软驱逐的方式来主动关闭该连接,下次就不会获取到因超时而关闭的连接,也就不会出现警告了。

彩蛋:Hikari中max-lifetime参数作用及实现

该参数就是用于控制连接池中一个connection的最大存活时间。

Hikari内部有一个ScheduledExecutorService对象houseKeepingExecutorService,每创建一个连接,就会被封装成一个PoolEntry对象,然后放在定时任务中,定时时间就是设置的max-lifetime。只要到达这个时间就会采取软驱逐的方式从池中移除。
除此之外,houseKeepingExecutorService还用于每隔30s来检查一次池中的空闲连接、最大连接情况,并维护池中连接数的平衡。

何谓软驱逐

  1. 如果这个连接正在被使用,则不立即关闭连接,但是通过PoolEntry对象中的private volatile boolean evict;字段来标记为需要关闭。下次有线程来获取到这个连接时,发现evict=true则调用异步关闭方法。重新获取池中其它的连接。
  2. 如果这个连接未被使用,则立即调用异步关闭连接的方法。

源码见: com.zaxxer.hikari.pool.HikariPool#createPoolEntry

private PoolEntry createPoolEntry()
{
  try {
	 final PoolEntry poolEntry = newPoolEntry();

	 final long maxLifetime = config.getMaxLifetime();
	 if (maxLifetime > 0) {
		// variance up to 2.5% of the maxlifetime
		final long variance = maxLifetime > 10_000 ? ThreadLocalRandom.current().nextLong( maxLifetime / 40 ) : 0;
		final long lifetime = maxLifetime - variance;
		poolEntry.setFutureEol(houseKeepingExecutorService.schedule(
		   () -> {
			  if (softEvictConnection(poolEntry, "(connection has passed maxLifetime)", false /* not owner */)) {
				 addBagItem(connectionBag.getWaitingThreadCount());
			  }
		   },
		   lifetime, MILLISECONDS));
	 }

	 return poolEntry;
  }
  catch (ConnectionSetupException e) {
	 if (poolState == POOL_NORMAL) { // we check POOL_NORMAL to avoid a flood of messages if shutdown() is running concurrently
		logger.error("{} - Error thrown while acquiring connection from data source", poolName, e.getCause());
		lastConnectionFailure.set(e);
	 }
  }
  catch (Exception e) {
	 if (poolState == POOL_NORMAL) { // we check POOL_NORMAL to avoid a flood of messages if shutdown() is running concurrently
		logger.debug("{} - Cannot acquire connection from data source", poolName, e);
	 }
  }

  return null;
}

总结

整个问题排查过程中基本把Hikari连接池源码看完了,也发现了Hikari内部的一些实现细节,更多细节参见:Hikari 数据库连接池内部源码实现细节
如果Hikari连接池遇到Failed to validate connection com.mysql.cj.jdbc.ConnectionImpl问题注意从下面三个方向排查。

  1. Hikari连接池的max-lifetime(默认30分钟)(对应springboot中的spring.datasource.hikari.max-lifetime)是否大于了MySQL中wait_timeoutinteractive_timeout(默认8小时)。
  2. 检查MySQL服务的/etc/my.cnf配置文件中wait_timeoutinteractive_timeout两个超时时间是否和通过sql SHOW VARIABLES LIKE '%timeout%'查出的结果一致。
  3. 检查是否和笔者的情况一下,dba为了节约连接资源,通过脚本定时清理了长时间没有请求的连接。

你可能感兴趣的:(故障排查,mysql,数据库,java)