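Calls to a third-party API were failing with request timeout errors, which caused business operations to fail. The project is a Spring MVC application using HttpClient 4, deployed on CentOS.
I started debugging. The detailed log is shown below. The key information in it is [total kept alive: 1; route allocated: 1 of 100; total allocated: 1 of 400] (explained below). This is HttpClient's connection pool debug output, and nothing in it points to a problem with the pool itself.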
[total kept alive: 1; route allocated: 1 of 100; total allocated: 1 of 400]
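Name | Description |
---|---|
total kept alive | Number of idle persistent connections currently kept alive in the pool |
route allocated | Connections currently allocated to this route (leased plus idle), out of the per-route maximum |
total allocated | Connections currently allocated across all routes, out of the pool-wide maximum |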
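When scanning many of these debug lines, it can help to pull the counters out programmatically. The following is a minimal sketch, assuming the log fragment has exactly the shape quoted above; `PoolStats` and its fields are hypothetical names for illustration, not part of HttpClient.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the pool counters from an HttpClient debug log line. */
final class PoolStats {
    // Matches e.g. "total kept alive: 1; route allocated: 1 of 100; total allocated: 1 of 400"
    private static final Pattern STATS = Pattern.compile(
            "total kept alive: (\\d+); route allocated: (\\d+) of (\\d+); total allocated: (\\d+) of (\\d+)");

    final int keptAlive;
    final int routeAllocated;
    final int routeMax;
    final int totalAllocated;
    final int totalMax;

    private PoolStats(int keptAlive, int routeAllocated, int routeMax, int totalAllocated, int totalMax) {
        this.keptAlive = keptAlive;
        this.routeAllocated = routeAllocated;
        this.routeMax = routeMax;
        this.totalAllocated = totalAllocated;
        this.totalMax = totalMax;
    }

    /** Parses the first statistics fragment found in the given log line. */
    static PoolStats parse(String logLine) {
        Matcher m = STATS.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("no pool statistics in: " + logLine);
        }
        return new PoolStats(
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)), Integer.parseInt(m.group(3)),
                Integer.parseInt(m.group(4)), Integer.parseInt(m.group(5)));
    }

    /** True when either the route or the whole pool has no free slot left. */
    boolean exhausted() {
        return routeAllocated >= routeMax || totalAllocated >= totalMax;
    }

    public static void main(String[] args) {
        PoolStats s = parse("[total kept alive: 1; route allocated: 1 of 100; total allocated: 1 of 400]");
        System.out.println("pool exhausted: " + s.exhausted());
    }
}
```

For the line quoted in this post, `exhausted()` is false, which matches the conclusion that the pool was not running out of connections.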
The production service hit the following error when calling the third-party API:
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:209)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
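From the above we can tell that the whole flow, from creating a connection to putting it back into the pool, works correctly. If it did not, there would be no idle connection kept alive in the pool.
So let's analyze the request flow first. When HttpClient receives a task, it checks the pool for an available connection on the same route. If there is one, it reuses it; if not, it creates a new one, and after use the connection is returned to the pool. Because no idle timeout was configured for the pool, connections stay in it forever by default. So could the server side have killed this TCP connection in the meantime?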
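That is the real question. Let's look at the connection information held in the pool:
timeToLive is the maximum lifetime of a connection; -1 means it lives forever.
expiry is the expiration time, roughly the connection's creation time plus timeToLive. The value here is obviously huge, because with timeToLive = -1 the connection never expires.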
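The relationship between timeToLive and expiry can be sketched as follows. This is a simplified model under my own assumptions, not the library's source: `ExpirySketch` and its methods are hypothetical names, and the real pool entry may also take keep-alive hints from the server into account.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical model of how a pooled connection's deadline follows from timeToLive. */
final class ExpirySketch {

    /**
     * Returns the deadline in epoch milliseconds.
     * A non-positive timeToLive (such as -1) means the connection never expires.
     */
    static long expiry(long createdMillis, long timeToLive, TimeUnit unit) {
        if (timeToLive <= 0) {
            return Long.MAX_VALUE;
        }
        long deadline = createdMillis + unit.toMillis(timeToLive);
        // Guard against arithmetic overflow for absurdly large values.
        return deadline > 0 ? deadline : Long.MAX_VALUE;
    }

    /** A connection counts as expired once the clock reaches its deadline. */
    static boolean isExpired(long expiryMillis, long nowMillis) {
        return nowMillis >= expiryMillis;
    }

    public static void main(String[] args) {
        long created = System.currentTimeMillis();
        System.out.println("ttl=-1 -> " + expiry(created, -1, TimeUnit.SECONDS));
        System.out.println("ttl=5s -> " + expiry(created, 5, TimeUnit.SECONDS));
    }
}
```

With timeToLive = -1 the deadline is `Long.MAX_VALUE`, which is why the expiry value seen in the pool looks so large.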
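Now that the problem is located, let's fix it.
The cause was more or less found, so I tried to limit how long idle connections may stay in the pool and changed the HttpClient configuration file. At first I only set the idle connection lifetime on the connection pool manager, but in practice it had no effect: debugging showed that closeExpiredConnections and closeIdleConnections were never called after the time was up. The official documentation shows that an eviction policy has to be implemented, because my version is HttpClient 4.
So the fix continues:
The policy provided officially: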
http.pool.request.timeToLive=5
http.pool.request.tunit=SECONDS
http.pool.request.idle.maxIdleTime=3
http.pool.request.idle.maxIdleTimeUnit=SECONDS
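As an illustration of how these four keys could be turned into durations, here is a minimal sketch. The keys are the ones listed above; `PoolTimingConfig` is a hypothetical class, and the post does not show how the real project reads its configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.concurrent.TimeUnit;

/** Hypothetical holder for the pool timing settings, normalized to milliseconds. */
final class PoolTimingConfig {
    final long timeToLiveMs;
    final long maxIdleMs;

    PoolTimingConfig(Properties p) {
        this.timeToLiveMs = millis(p, "http.pool.request.timeToLive", "http.pool.request.tunit");
        this.maxIdleMs = millis(p, "http.pool.request.idle.maxIdleTime",
                "http.pool.request.idle.maxIdleTimeUnit");
    }

    // Reads a numeric value and its TimeUnit name (e.g. SECONDS) and converts to milliseconds.
    // A missing key fails fast with a NullPointerException in this sketch.
    private static long millis(Properties p, String valueKey, String unitKey) {
        long value = Long.parseLong(p.getProperty(valueKey).trim());
        TimeUnit unit = TimeUnit.valueOf(p.getProperty(unitKey).trim());
        return unit.toMillis(value);
    }

    /** Convenience factory that parses properties from text. */
    static PoolTimingConfig parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new PoolTimingConfig(p);
    }

    public static void main(String[] args) {
        PoolTimingConfig c = parse(
                "http.pool.request.timeToLive=5\n"
                + "http.pool.request.tunit=SECONDS\n"
                + "http.pool.request.idle.maxIdleTime=3\n"
                + "http.pool.request.idle.maxIdleTimeUnit=SECONDS\n");
        System.out.println("timeToLive ms: " + c.timeToLiveMs + ", maxIdle ms: " + c.maxIdleMs);
    }
}
```

The two resulting values would then be handed to the connection manager and to the evictor shown next.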
/*
* ====================================================================
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
* ====================================================================
*
* This software consists of voluntary contributions made by many
* individuals on behalf of the Apache Software Foundation. For more
* information on the Apache Software Foundation, please see
* .
*
*/
package com.fhlkd.client.config;

import org.apache.http.conn.HttpClientConnectionManager;
import org.apache.http.util.Args;

import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

/**
 * This class maintains a background thread to enforce an eviction policy for expired / idle
 * persistent connections kept alive in the connection pool.
 *
 * @since 4.4
 */
public final class IdleConnectionEvictor {

    private final HttpClientConnectionManager connectionManager;
    private final ThreadFactory threadFactory;
    private final Thread thread;
    private final long sleepTimeMs;
    private final long maxIdleTimeMs;

    private volatile Exception exception;

    public IdleConnectionEvictor(
            final HttpClientConnectionManager connectionManager,
            final ThreadFactory threadFactory,
            final long sleepTime, final TimeUnit sleepTimeUnit,
            final long maxIdleTime, final TimeUnit maxIdleTimeUnit) {
        this.connectionManager = Args.notNull(connectionManager, "Connection manager");
        this.threadFactory = threadFactory != null ? threadFactory : new DefaultThreadFactory();
        this.sleepTimeMs = sleepTimeUnit != null ? sleepTimeUnit.toMillis(sleepTime) : sleepTime;
        this.maxIdleTimeMs = maxIdleTimeUnit != null ? maxIdleTimeUnit.toMillis(maxIdleTime) : maxIdleTime;
        this.thread = this.threadFactory.newThread(new Runnable() {
            @Override
            public void run() {
                try {
                    while (!Thread.currentThread().isInterrupted()) {
                        Thread.sleep(sleepTimeMs);
                        // Close connections whose expiry time has passed
                        connectionManager.closeExpiredConnections();
                        if (maxIdleTimeMs > 0) {
                            // Close connections that have been idle longer than the configured maximum
                            connectionManager.closeIdleConnections(maxIdleTimeMs, TimeUnit.MILLISECONDS);
                        }
                    }
                } catch (final Exception ex) {
                    exception = ex;
                }
            }
        });
    }

    public IdleConnectionEvictor(
            final HttpClientConnectionManager connectionManager,
            final long sleepTime, final TimeUnit sleepTimeUnit,
            final long maxIdleTime, final TimeUnit maxIdleTimeUnit) {
        this(connectionManager, null, sleepTime, sleepTimeUnit, maxIdleTime, maxIdleTimeUnit);
    }

    public IdleConnectionEvictor(
            final HttpClientConnectionManager connectionManager,
            final long maxIdleTime, final TimeUnit maxIdleTimeUnit) {
        this(connectionManager, null,
                maxIdleTime > 0 ? maxIdleTime : 5, maxIdleTimeUnit != null ? maxIdleTimeUnit : TimeUnit.SECONDS,
                maxIdleTime, maxIdleTimeUnit);
    }

    public void start() {
        thread.start();
    }

    public void shutdown() {
        thread.interrupt();
    }

    public boolean isRunning() {
        return thread.isAlive();
    }

    public void awaitTermination(final long time, final TimeUnit timeUnit) throws InterruptedException {
        thread.join((timeUnit != null ? timeUnit : TimeUnit.MILLISECONDS).toMillis(time));
    }

    static class DefaultThreadFactory implements ThreadFactory {
        @Override
        public Thread newThread(final Runnable r) {
            final Thread t = new Thread(r, "Connection evictor");
            t.setDaemon(true);
            return t;
        }
    }
}
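Result
timeToLive is now 5, which shows that the configuration has taken effect. The service has been running normally and the connection timeouts have not reappeared.
To wrap up:
The root cause is that the server-side firewall kills any connection that has been idle for more than 7 seconds. All we need to do is remove such a connection from the pool before the firewall kills it.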
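The reasoning can be checked with a small deterministic model. This is only a sketch under stated assumptions: `IdleEvictionModel` is a hypothetical class, time is passed in explicitly instead of using a real clock, and eviction runs inline on each request rather than on a background thread as the evictor above does. The 7 second limit is the firewall behavior described in this post.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical toy model of a pool whose idle connections are cut by a firewall. */
final class IdleEvictionModel {
    /** Idle time after which the firewall silently kills a connection (7 s, per the post). */
    static final long FIREWALL_IDLE_LIMIT_MS = 7_000;

    // Each entry records the time at which a connection was returned to the pool.
    private final Deque<Long> idleSince = new ArrayDeque<>();
    private final long maxIdleMs; // <= 0 disables eviction

    IdleEvictionModel(long maxIdleMs) {
        this.maxIdleMs = maxIdleMs;
    }

    /** Drops pooled connections that have been idle longer than maxIdleMs. */
    void evict(long nowMs) {
        if (maxIdleMs > 0) {
            idleSince.removeIf(since -> nowMs - since > maxIdleMs);
        }
    }

    /**
     * Simulates one request at time nowMs.
     * Returns false when a pooled connection is reused after the firewall already killed it.
     */
    boolean request(long nowMs) {
        evict(nowMs);
        Long since = idleSince.poll();
        if (since == null) {
            idleSince.push(nowMs); // fresh connection: works, then goes back to the pool
            return true;
        }
        boolean alive = nowMs - since <= FIREWALL_IDLE_LIMIT_MS;
        if (alive) {
            idleSince.push(nowMs); // still usable: return it to the pool
        }
        return alive;
    }

    public static void main(String[] args) {
        IdleEvictionModel withoutEviction = new IdleEvictionModel(0);
        withoutEviction.request(0);
        System.out.println("no eviction, request after 10 s ok: " + withoutEviction.request(10_000));

        IdleEvictionModel withEviction = new IdleEvictionModel(3_000);
        withEviction.request(0);
        System.out.println("3 s max idle, request after 10 s ok: " + withEviction.request(10_000));
    }
}
```

In this model, a pool without eviction hands out a dead connection after 10 seconds of silence, while a pool with a 3 second idle limit has already discarded it and opens a fresh one. Any idle limit safely below the firewall's cutoff has the same effect.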