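In Spring Cloud projects, using Feign inevitably means using Ribbon, and both provide a retry feature. Retrying improves availability, but it also brings other problems, such as repeated non-idempotent operations or unnecessary retries. Many people confuse the two mechanisms, so this post analyzes how Feign's and Ribbon's retries are implemented and how they differ.
Feign's and Ribbon's retry mechanisms are independent of each other. If both are configured for a single HTTP request, the total number of requests n is the product of the Feign and Ribbon settings.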
Formula: n (total requests) = Feign maxAttempts (5 with Retryer.Default) * (MaxAutoRetries + 1) * (MaxAutoRetriesNextServer + 1)
Note: each +1 accounts for Ribbon's own initial request.
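As a worked example of the formula above, the following minimal sketch computes the worst-case request count, assuming every attempt fails. The class and method names (RetryRequestCount, totalRequests) are made up for illustration and are not part of Feign or Ribbon.

```java
// Illustrative only: RetryRequestCount and totalRequests are hypothetical names,
// not part of Feign or Ribbon.
final class RetryRequestCount {

    // Worst-case number of HTTP requests when every attempt fails.
    static int totalRequests(int feignMaxAttempts, int maxAutoRetries, int maxAutoRetriesNextServer) {
        // Requests Ribbon makes for one Feign attempt: the initial request plus retries,
        // on the first server and on each "next" server.
        int ribbonRequestsPerFeignAttempt = (maxAutoRetries + 1) * (maxAutoRetriesNextServer + 1);
        return feignMaxAttempts * ribbonRequestsPerFeignAttempt;
    }

    public static void main(String[] args) {
        // Retryer.Default (5 attempts), MaxAutoRetries=1, MaxAutoRetriesNextServer=1
        System.out.println(totalRequests(5, 1, 1)); // 5 * 2 * 2 = 20
        // Feign retry turned off (a single attempt), same Ribbon settings
        System.out.println(totalRequests(1, 1, 1)); // 1 * 2 * 2 = 4
    }
}
```

Even modest settings multiply quickly, which is why a failing downstream service can receive far more traffic than expected.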
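Feign's retry mechanism is of limited use, and it is usually turned off when using Feign. Ribbon's MaxAutoRetries defaults to 0, so it does not retry against the same instance unless configured to.
How can we verify this?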
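Feign configuration is not covered in detail here. Retry can be configured through FeignClientProperties. The retryer is instantiated in FeignClientFactoryBean, inside the configureFeign method: it first looks for a Retryer bean in the Spring IoC container and, if one exists, sets it on Feign.Builder; it then looks up the configuration in FeignClientProperties and, if present, sets it on Feign.Builder again. In short, the later value overrides the earlier one.
The logic that actually performs Feign's retries lives in the proxy handler class SynchronousMethodHandler. If you are not familiar with SynchronousMethodHandler, see my earlier posts on the topic.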
// Retry interface
public interface Retryer extends Cloneable {
void continueOrPropagate(RetryableException e);
Retryer clone();
// Default retryer: at most 5 attempts by default
public static class Default implements Retryer {
private final int maxAttempts;
private final long period;
private final long maxPeriod;
int attempt;
long sleptForMillis;
public Default() {
this(100, SECONDS.toMillis(1), 5);
}
public Default(long period, long maxPeriod, int maxAttempts) {
this.period = period;
this.maxPeriod = maxPeriod;
this.maxAttempts = maxAttempts;
this.attempt = 1;
}
// visible for testing;
protected long currentTimeMillis() {
return System.currentTimeMillis();
}
public void continueOrPropagate(RetryableException e) {
if (attempt++ >= maxAttempts) { // attempts have reached the maximum: rethrow
throw e;
}
long interval;
if (e.retryAfter() != null) {
interval = e.retryAfter().getTime() - currentTimeMillis();
if (interval > maxPeriod) {
interval = maxPeriod;
}
if (interval < 0) {
return;
}
} else {
interval = nextMaxInterval();
}
try {
Thread.sleep(interval);
} catch (InterruptedException ignored) {
Thread.currentThread().interrupt();
}
sleptForMillis += interval;
}
/**
* Calculates the time interval to a retry attempt. The interval increases exponentially
* with each attempt, at a rate of nextInterval *= 1.5 (where 1.5 is the backoff factor), to the
* maximum interval.
*
* @return time in nanoseconds from now until the next attempt.
*/
long nextMaxInterval() {
long interval = (long) (period * Math.pow(1.5, attempt - 1));
return interval > maxPeriod ? maxPeriod : interval;
}
@Override
public Retryer clone() {
return new Default(period, maxPeriod, maxAttempts);
}
}
/**
* Never retries
*/
Retryer NEVER_RETRY = new Retryer() {
@Override
public void continueOrPropagate(RetryableException e) {
throw e;
}
@Override
public Retryer clone() {
return this;
}
};
}
@Configuration
public class FeignClientsConfiguration {
// No retry by default
@Bean
@ConditionalOnMissingBean
public Retryer feignRetryer() {
return Retryer.NEVER_RETRY;
}
}
// Executes the request and retries it
final class SynchronousMethodHandler implements MethodHandler {
@Override
public Object invoke(Object[] argv) throws Throwable {
RequestTemplate template = buildTemplateFromArgs.create(argv);
Retryer retryer = this.retryer.clone();
while (true) {
try {
return executeAndDecode(template);
} catch (RetryableException e) {
// Retry decision: if no exception is thrown here, the loop retries the request
retryer.continueOrPropagate(e);
if (logLevel != Logger.Level.NONE) {
logger.logRetry(metadata.configKey(), logLevel);
}
continue;
}
}
}
Object executeAndDecode(RequestTemplate template) throws Throwable {
...
Response response;
long start = System.nanoTime();
try {
// Key point: client (feign.Client). How does it relate to Ribbon?
response = client.execute(request, options);
// ensure the request is set. TODO: remove in Feign 10
response.toBuilder().request(request).build();
...
}
}
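To make the backoff in Retryer.Default more concrete, here is a minimal standalone sketch that reproduces only the interval arithmetic from continueOrPropagate and nextMaxInterval shown above. It assumes no Retry-After value is present and never sleeps; BackoffSketch and sleepIntervals are hypothetical names, not Feign APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: mirrors the arithmetic of Retryer.Default without depending on Feign.
final class BackoffSketch {

    // Returns the sleep interval (ms) that would precede each retry if every request fails.
    static List<Long> sleepIntervals(long period, long maxPeriod, int maxAttempts) {
        List<Long> intervals = new ArrayList<>();
        int attempt = 1; // Retryer.Default starts at attempt 1
        while (true) {
            // Same check as continueOrPropagate: stop once the attempt budget is used up
            // (the real class rethrows the RetryableException at this point).
            if (attempt++ >= maxAttempts) {
                break;
            }
            // Same formula as nextMaxInterval: period * 1.5^(attempt - 1), capped at maxPeriod.
            long interval = (long) (period * Math.pow(1.5, attempt - 1));
            intervals.add(Math.min(interval, maxPeriod));
        }
        return intervals;
    }

    public static void main(String[] args) {
        // Defaults from Retryer.Default(): period=100 ms, maxPeriod=1000 ms, maxAttempts=5
        System.out.println(sleepIntervals(100, 1000, 5));
    }
}
```

With the defaults, 5 attempts means at most 4 sleeps, growing from 150 ms by a factor of 1.5 each time (roughly 150, 225, 337 and 506 ms) and never exceeding maxPeriod.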
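Here is a question: as mentioned earlier, the two retry mechanisms are unrelated, so which one retries first, Feign or Ribbon?
No rush; the answer comes at the end of the article.
Ribbon configuration is not covered in detail here either. Retry can be configured through CommonClientConfigKey. Under the hood the retry uses spring-retry; it is instantiated in the lbClient method of LoadBalancerFeignClient, and the retry logic is implemented by RetryableFeignLoadBalancer (FeignLoadBalancer was covered in an earlier post).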
// Retry configuration implementation
public class DefaultLoadBalancerRetryHandler implements RetryHandler {
public DefaultLoadBalancerRetryHandler(IClientConfig clientConfig) {
// Retries on the same instance
this.retrySameServer = clientConfig.get(CommonClientConfigKey.MaxAutoRetries, DefaultClientConfigImpl.DEFAULT_MAX_AUTO_RETRIES);
// Retries on the next instance
this.retryNextServer = clientConfig.get(CommonClientConfigKey.MaxAutoRetriesNextServer, DefaultClientConfigImpl.DEFAULT_MAX_AUTO_RETRIES_NEXT_SERVER);
// Whether to retry on all operations
this.retryEnabled = clientConfig.get(CommonClientConfigKey.OkToRetryOnAllOperations, false);
}
@Override
public boolean isRetriableException(Throwable e, boolean sameServer) {
if (retryEnabled) {
if (sameServer) {
return Utils.isPresentAsCause(e, getRetriableExceptions());
} else {
return true;
}
}
return false;
}
}
public class LoadBalancerFeignClient implements Client {
@Override
public Response execute(Request request, Request.Options options) throws IOException {
...
return lbClient(clientName).executeWithLoadBalancer(ribbonRequest,
requestConfig).toResponse();
}
// If retry is enabled (off by default), a RetryableFeignLoadBalancer is created
private FeignLoadBalancer lbClient(String clientName) {
return this.lbClientFactory.create(clientName);
}
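// Defined in AbstractLoadBalancerAwareClient (parent class of FeignLoadBalancer); shown here for context
public T executeWithLoadBalancer(final S request, final IClientConfig requestConfig) throws ClientException {
// Obtain the retry configuration: RequestSpecificRetryHandler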
LoadBalancerCommand command = buildLoadBalancerCommand(request, requestConfig);
try {
return command.submit(
new ServerOperation() {
@Override
public Observable call(Server server) {
URI finalUri = reconstructURIWithServer(server, request.getUri());
S requestForServer = (S) request.replaceUri(finalUri);
try {
return Observable.just(AbstractLoadBalancerAwareClient.this.execute(requestForServer, requestConfig));
}
catch (Exception e) {
return Observable.error(e);
}
}
})
.toBlocking()
.single();
} catch (Exception e) {
Throwable t = e.getCause();
if (t instanceof ClientException) {
throw (ClientException) t;
} else {
throw new ClientException(e);
}
}
}
@Deprecated
protected boolean isRetriableException(Throwable e) {
if (getRetryHandler() != null) {
// Delegates to DefaultLoadBalancerRetryHandler.isRetriableException
return getRetryHandler().isRetriableException(e, true);
}
return false;
}
}
// FeignLoadBalancer with retry support
public class RetryableFeignLoadBalancer extends FeignLoadBalancer implements ServiceInstanceChooser {
@Override
public RibbonResponse execute(final RibbonRequest request, IClientConfig configOverride)
throws IOException {
final Request.Options options;
if (configOverride != null) {
options = new Request.Options(
configOverride.get(CommonClientConfigKey.ConnectTimeout,
this.connectTimeout),
(configOverride.get(CommonClientConfigKey.ReadTimeout,
this.readTimeout)));
}
else {
options = new Request.Options(this.connectTimeout, this.readTimeout);
}
final LoadBalancedRetryPolicy retryPolicy = loadBalancedRetryPolicyFactory.create(this.getClientName(), this);
RetryTemplate retryTemplate = new RetryTemplate();
BackOffPolicy backOffPolicy = loadBalancedBackOffPolicyFactory.createBackOffPolicy(this.getClientName());
retryTemplate.setBackOffPolicy(backOffPolicy == null ? new NoBackOffPolicy() : backOffPolicy);
RetryListener[] retryListeners = this.loadBalancedRetryListenerFactory.createRetryListeners(this.getClientName());
if (retryListeners != null && retryListeners.length != 0) {
retryTemplate.setListeners(retryListeners);
}
retryTemplate.setRetryPolicy(retryPolicy == null ? new NeverRetryPolicy()
: new FeignRetryPolicy(request.toHttpRequest(), retryPolicy, this, this.getClientName()));
return retryTemplate.execute(new RetryCallback() {
// Executes the retry logic
@Override
public RibbonResponse doWithRetry(RetryContext retryContext) throws IOException {
Request feignRequest = null;
//on retries the policy will choose the server and set it in the context
//extract the server and update the request being made
if (retryContext instanceof LoadBalancedRetryContext) {
ServiceInstance service = ((LoadBalancedRetryContext) retryContext).getServiceInstance();
if (service != null) {
feignRequest = ((RibbonRequest) request.replaceUri(reconstructURIWithServer(new Server(service.getHost(), service.getPort()), request.getUri()))).toRequest();
}
}
if (feignRequest == null) {
feignRequest = request.toRequest();
}
Response response = request.client().execute(feignRequest, options);
if (retryPolicy.retryableStatusCode(response.status())) {
byte[] byteArray = response.body() == null ? new byte[]{} : StreamUtils.copyToByteArray(response.body().asInputStream());
response.close();
throw new RibbonResponseStatusCodeException(RetryableFeignLoadBalancer.this.clientName, response,
byteArray, request.getUri());
}
return new RibbonResponse(request.getUri(), response);
}
}, new RibbonRecoveryCallback() {
@Override
protected RibbonResponse createResponse(Response response, URI uri) {
return new RibbonResponse(uri, response);
}
});
}
}
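So which retries first, Feign or Ribbon? The question is not hard to answer.
The call chain looks like this: when a request is made, HystrixInvocationHandler (HystrixCommand, Hystrix) -> SynchronousMethodHandler (Feign) -> LoadBalancerFeignClient (Ribbon) -> FeignLoadBalancer (Ribbon). If you have read my earlier posts and have a similar picture in mind, you are on the right track. In other words, RetryableFeignLoadBalancer (Ribbon) retries first; if its retries do not succeed (an exception is thrown), SynchronousMethodHandler (Feign) then retries.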
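The ordering described above can be sketched as nested loops: Feign's loop on the outside, Ribbon's next-server and same-server retries on the inside. This is a simplified simulation under the assumption that every request fails; it does not call Feign or Ribbon, and RetryOrderSketch and simulate are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a simplified model of the retry nesting, not the real Feign/Ribbon code.
final class RetryOrderSketch {

    // Records every request that would be sent if all of them failed.
    static List<String> simulate(int feignMaxAttempts, int maxAutoRetries, int maxAutoRetriesNextServer) {
        List<String> log = new ArrayList<>();
        // Outer loop: SynchronousMethodHandler (Feign) retries only after Ribbon has given up.
        for (int feignAttempt = 1; feignAttempt <= feignMaxAttempts; feignAttempt++) {
            // Middle loop: Ribbon moves on to the next server (initial server + MaxAutoRetriesNextServer).
            for (int server = 1; server <= maxAutoRetriesNextServer + 1; server++) {
                // Inner loop: Ribbon retries the same server (initial request + MaxAutoRetries).
                for (int attempt = 1; attempt <= maxAutoRetries + 1; attempt++) {
                    log.add("feign#" + feignAttempt + " server#" + server + " try#" + attempt);
                }
            }
        }
        return log;
    }

    public static void main(String[] args) {
        // Feign: 2 attempts; Ribbon: MaxAutoRetries=1, MaxAutoRetriesNextServer=1
        simulate(2, 1, 1).forEach(System.out::println);
    }
}
```

In this model the second Feign attempt only starts after Ribbon has used up all four of its requests, and the log length matches the formula n = feign * (MaxAutoRetries + 1) * (MaxAutoRetriesNextServer + 1).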
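How can we verify this?
Verification is fairly simple. The entry point is the line marked with the "key point" comment below (the client.execute call). Client has two implementations, and the client held by SynchronousMethodHandler (Feign) is in fact LoadBalancerFeignClient (Ribbon).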
final class SynchronousMethodHandler implements MethodHandler {
Object executeAndDecode(RequestTemplate template) throws Throwable {
...
Response response;
long start = System.nanoTime();
try {
// Key point: client (feign.Client). How does it relate to Ribbon?
response = client.execute(request, options);
// ensure the request is set. TODO: remove in Feign 10
response.toBuilder().request(request).build();
...
}
}
public interface Client {
Response execute(Request request, Options options) throws IOException;
// Feign's default implementation, the object that actually performs the GET/POST request
public static class Default implements Client {
@Override
public Response execute(Request request, Options options) throws IOException {
HttpURLConnection connection = convertAndSend(request, options);
return convertResponse(connection).toBuilder().request(request).build();
}
}
}
// Client implementation backed by Ribbon
public class LoadBalancerFeignClient implements Client {
// In practice this is Client.Default
private final Client delegate;
@Override
public Response execute(Request request, Request.Options options) throws IOException {
...
IClientConfig requestConfig = getClientConfig(options, clientName);
return lbClient(clientName).executeWithLoadBalancer(ribbonRequest,
requestConfig).toResponse();
}
}
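In summary, I recommend turning off retries in both. If they are misconfigured, retrying non-idempotent requests can cause data problems.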