A pitfall when adding feign-httpclient to Spring Cloud Feign

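Spring Cloud Feign communicates over HTTP/HTTPS. By default it uses java.net.HttpURLConnection, which gives little control over connection reuse, so for better performance you can plug in Apache HttpClient or OkHttp as the underlying HTTP client.

The Maven coordinates are: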

<dependency>
    <groupId>io.github.openfeign</groupId>
    <artifactId>feign-httpclient</artifactId>
    <version>9.5.0</version>
</dependency>

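Our project runs on the Spring Cloud (Dalston.SR4) stack, and we had added HttpClient support for Feign for performance reasons. During a load test we noticed that once TPS reached a certain level, overall request latency rose sharply. Log analysis showed that more than 5 s elapsed between the caller issuing a request and the server receiving it. After several detours (at one point we suspected the network itself), we narrowed the problem down to the client implementation underneath Feign. Reading the code, we found that Feign initializes its client as follows: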

@Configuration
@ConditionalOnClass(ApacheHttpClient.class)
@ConditionalOnProperty(value = "feign.httpclient.enabled", matchIfMissing = true)
class HttpClientFeignLoadBalancedConfiguration {

	@Autowired(required = false)
	private HttpClient httpClient;

	@Bean
	@ConditionalOnMissingBean(Client.class)
	public Client feignClient(CachingSpringLoadBalancerFactory cachingFactory,
							  SpringClientFactory clientFactory) {
		ApacheHttpClient delegate;
		if (this.httpClient != null) {
			delegate = new ApacheHttpClient(this.httpClient);
		} else {
			delegate = new ApacheHttpClient();
		}
		return new LoadBalancerFeignClient(delegate, cachingFactory, clientFactory);
	}
}

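We had not explicitly declared an org.apache.http.client.HttpClient bean in the project, so execution took the delegate = new ApacheHttpClient(); branch. Following that path further down leads to the constructor of org.apache.http.impl.conn.PoolingHttpClientConnectionManager and to the CPool constructor it calls: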

public PoolingHttpClientConnectionManager(
        final HttpClientConnectionOperator httpClientConnectionOperator,
        final HttpConnectionFactory<HttpRoute, ManagedHttpClientConnection> connFactory,
        final long timeToLive, final TimeUnit tunit) {
        super();
        this.configData = new ConfigData();
        this.pool = new CPool(new InternalConnectionFactory(
                this.configData, connFactory), 2, 20, timeToLive, tunit);
        this.pool.setValidateAfterInactivity(2000);
        this.connectionOperator = Args.notNull(httpClientConnectionOperator, "HttpClientConnectionOperator");
        this.isShutDown = new AtomicBoolean(false);
    }
    
public CPool(
            final ConnFactory<HttpRoute, ManagedHttpClientConnection> connFactory,
            final int defaultMaxPerRoute, final int maxTotal,
            final long timeToLive, final TimeUnit tunit) {
        super(connFactory, defaultMaxPerRoute, maxTotal);
        this.timeToLive = timeToLive;
        this.tunit = tunit;
    }

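At this point the problem is obvious: the defaults are defaultMaxPerRoute=2 and maxTotal=20, so at most 2 connections can be open to any one route (target host) and 20 in total, and any further requests have to wait for a free connection. That is the root cause. The fix is to stop relying on the default-constructed org.apache.http.client.HttpClient and declare our own HttpClient bean in the application: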

@Bean(destroyMethod = "close")
public CloseableHttpClient httpClient() {
    PoolingHttpClientConnectionManager connectionManager = new PoolingHttpClientConnectionManager();
    connectionManager.setMaxTotal(400);
    connectionManager.setDefaultMaxPerRoute(100);

    RequestConfig requestConfig = RequestConfig.custom()
            .setConnectionRequestTimeout(2000) // max wait (ms) to obtain a connection from the pool
            .setConnectTimeout(2000)           // max time (ms) to establish the connection
            .setSocketTimeout(15000)           // max wait (ms) for response data on the socket
            .build();
    HttpClientBuilder httpClientBuilder = HttpClientBuilder.create()
            .setConnectionManager(connectionManager)
            .setDefaultRequestConfig(requestConfig)
            // custom retry strategy: retry once on 502 and 503
            .setServiceUnavailableRetryStrategy(new CustomizedServiceUnavailableRetryStrategy())
            .evictExpiredConnections();
    return httpClientBuilder.build();
}
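The CustomizedServiceUnavailableRetryStrategy class used above is not shown in this post. As a rough sketch only, its decision logic might look like the following, based purely on the "retry once on 502 and 503" comment. The class name RetryOnceOn502And503 and the helper shouldRetry are hypothetical. A real implementation would implement org.apache.http.client.ServiceUnavailableRetryStrategy, read the status code in retryRequest(HttpResponse, int, HttpContext), and return a delay from getRetryInterval(). HttpClient passes an executionCount that starts at 1 for the first attempt.

```java
/** Hypothetical decision logic for a "retry once on 502/503" strategy. */
class RetryOnceOn502And503 {

    // Assumption: one retry, as stated in the comment in the bean definition.
    static final int MAX_RETRIES = 1;

    /**
     * @param statusCode     HTTP status of the response just received
     * @param executionCount 1 for the first attempt, 2 for the first retry, ...
     * @return true if the request should be sent again
     */
    static boolean shouldRetry(int statusCode, int executionCount) {
        boolean retryableStatus = statusCode == 502 || statusCode == 503;
        return retryableStatus && executionCount <= MAX_RETRIES;
    }

    public static void main(String[] args) {
        System.out.println(shouldRetry(503, 1)); // first attempt failed with 503
        System.out.println(shouldRetry(503, 2)); // the single retry also failed
        System.out.println(shouldRetry(500, 1)); // status not in the retry list
    }
}
```

Retrying on 502/503 is generally only safe when the calls are idempotent, so whether to apply it to every Feign client is a judgment call for each service.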

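With that, the problem was solved.

Summary:

When adopting a new technology stack, read the relevant documentation and understand each component's configurable parameters, because default values often cannot cope with high concurrency. This matters especially for applications built on Spring Boot, where auto-configuration makes it easy to overlook important parameters that should be set explicitly.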
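To see why a per-route limit of 2 could plausibly produce multi-second delays, here is a back-of-the-envelope estimate. It is a simplified model, not a measurement from the load test: it assumes all requests arrive at once for a single route and each holds its connection for the same fixed time. The class name PoolQueueEstimate and the numbers used (100 concurrent requests, 100 ms per request) are illustrative assumptions.

```java
/** Simplified model of requests queueing behind a per-route connection limit. */
class PoolQueueEstimate {

    /**
     * Worst-case wait (ms) for a pooled connection when concurrentRequests
     * arrive at once for one route, at most maxPerRoute connections may be
     * open, and every request holds its connection for serviceMillis.
     */
    static long worstCaseWaitMillis(int concurrentRequests, int maxPerRoute, long serviceMillis) {
        if (maxPerRoute <= 0) {
            throw new IllegalArgumentException("maxPerRoute must be positive");
        }
        if (concurrentRequests <= 0) {
            return 0;
        }
        // The last request waits for every full batch of maxPerRoute ahead of it.
        long batchesAhead = (concurrentRequests - 1) / maxPerRoute;
        return batchesAhead * serviceMillis;
    }

    public static void main(String[] args) {
        // Default pool: 2 connections per route.
        System.out.println(worstCaseWaitMillis(100, 2, 100));   // prints 4900
        // Tuned pool: 100 connections per route.
        System.out.println(worstCaseWaitMillis(100, 100, 100)); // prints 0
    }
}
```

Under these assumptions the last request waits about 4.9 s just to obtain a connection with the default limit, and does not wait at all with a limit of 100. That wait happens on the client before anything is sent, which is consistent with the gap seen between the call being issued and the server receiving it.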
