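Retryer interface
RetryableException exception
FeignException exception
As the source below shows, the Retryer interface extends the Cloneable interface.
The Retryer interface declares a method continueOrPropagate, which takes a RetryableException (the retry exception object) as its parameter and returns void.
The Retryer interface also declares a clone() method whose return type is Retryer.
The interface contains a static nested class Default, which implements Retryer.
The source is as follows: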
package feign;

import static java.util.concurrent.TimeUnit.SECONDS;

/**
 * Cloned for each invocation to Client.execute(Request, Request.Options). Implementations may
 * keep state to determine whether retry operations should continue.
 */
public interface Retryer extends Cloneable {

    /**
     * If retry is permitted, return (possibly after sleeping). Otherwise propagate the exception.
     */
    void continueOrPropagate(RetryableException e);

    Retryer clone();

    public static class Default implements Retryer {

        // Maximum number of attempts (the first call counts as attempt 1)
        private final int maxAttempts;
        // Base interval between retries, in milliseconds
        private final long period;
        // Upper bound on the interval between retries, in milliseconds
        private final long maxPeriod;
        int attempt;
        long sleptForMillis;

        // No-arg constructor of Default:
        // base interval 100 ms, maximum interval 1 s, at most 5 attempts
        public Default() {
            this(100, SECONDS.toMillis(1), 5);
        }

        // Base interval, maximum interval, maximum attempts; attempt starts at 1
        public Default(long period, long maxPeriod, int maxAttempts) {
            this.period = period;
            this.maxPeriod = maxPeriod;
            this.maxAttempts = maxAttempts;
            this.attempt = 1;
        }

        // visible for testing;
        protected long currentTimeMillis() {
            return System.currentTimeMillis();
        }

        // Implements Retryer.continueOrPropagate
        public void continueOrPropagate(RetryableException e) {
            // Once attempt has reached maxAttempts, rethrow the RetryableException
            if (attempt++ >= maxAttempts) {
                throw e;
            }

            long interval;
            if (e.retryAfter() != null) {
                interval = e.retryAfter().getTime() - currentTimeMillis();
                if (interval > maxPeriod) {
                    interval = maxPeriod;
                }
                if (interval < 0) {
                    return;
                }
            } else {
                interval = nextMaxInterval();
            }
            try {
                Thread.sleep(interval);
            } catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
            sleptForMillis += interval;
        }

        /**
         * Calculates the time interval to a retry attempt. The interval increases exponentially
         * with each attempt, at a rate of nextInterval *= 1.5 (where 1.5 is the backoff factor),
         * up to the maximum interval.
         *
         * @return time in milliseconds from now until the next attempt (the value is passed to
         *         Thread.sleep).
         */
        long nextMaxInterval() {
            long interval = (long) (period * Math.pow(1.5, attempt - 1));
            return interval > maxPeriod ? maxPeriod : interval;
        }

        @Override
        public Retryer clone() {
            return new Default(period, maxPeriod, maxAttempts);
        }
    }

    /**
     * Implementation that never retries a request. It propagates the RetryableException.
     */
    Retryer NEVER_RETRY = new Retryer() {

        @Override
        public void continueOrPropagate(RetryableException e) {
            throw e;
        }

        @Override
        public Retryer clone() {
            return this;
        }
    };
}
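To make the backoff arithmetic concrete, here is a minimal stand-alone sketch that reproduces the attempt counter and the nextMaxInterval() formula from Default without depending on Feign. The class name BackoffScheduleSketch and the helper sleepIntervals are made up for illustration, and the sketch only covers the case where the exception carries no retryAfter date.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-alone sketch of the backoff arithmetic in Retryer.Default (hypothetical helper class). */
final class BackoffScheduleSketch {

    /** Same formula as Default.nextMaxInterval(): period * 1.5^(attempt - 1), capped at maxPeriod. */
    static long nextMaxInterval(long period, long maxPeriod, int attempt) {
        long interval = (long) (period * Math.pow(1.5, attempt - 1));
        return interval > maxPeriod ? maxPeriod : interval;
    }

    /** Sleep intervals (ms) used before giving up, assuming no retryAfter date is supplied. */
    static List<Long> sleepIntervals(long period, long maxPeriod, int maxAttempts) {
        List<Long> intervals = new ArrayList<>();
        int attempt = 1; // Default starts counting at 1
        while (true) {
            // Mirrors "if (attempt++ >= maxAttempts) throw e;"
            if (attempt++ >= maxAttempts) {
                return intervals;
            }
            // attempt has already been incremented here, exactly as in Default
            intervals.add(nextMaxInterval(period, maxPeriod, attempt));
        }
    }

    public static void main(String[] args) {
        // Values of the no-arg Default constructor: 100 ms, 1 s, 5 attempts
        System.out.println(sleepIntervals(100, 1000, 5)); // prints [150, 225, 337, 506]
    }
}
```

With the default settings this gives four sleeps (150, 225, 337 and 506 ms, about 1.2 s in total) spread over five requests: the fifth failure is rethrown instead of retried.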
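RetryableException extends FeignException, so it is also a RuntimeException.
It has a field retryAfter of type Long.
It has two constructors: RetryableException(String message, Throwable cause, Date retryAfter) and RetryableException(String message, Date retryAfter).
It has a method retryAfter() that returns a Date.
The source is as follows: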
package feign;

import java.util.Date;

/**
 * This exception is raised when the Response is deemed to be retryable, typically via a
 * feign.codec.ErrorDecoder when the status is 503.
 */
public class RetryableException extends FeignException {

    private static final long serialVersionUID = 1L;

    private final Long retryAfter;

    /**
     * @param retryAfter usually corresponds to the Util.RETRY_AFTER header.
     */
    public RetryableException(String message, Throwable cause, Date retryAfter) {
        super(message, cause);
        this.retryAfter = retryAfter != null ? retryAfter.getTime() : null;
    }

    /**
     * @param retryAfter usually corresponds to the Util.RETRY_AFTER header.
     */
    public RetryableException(String message, Date retryAfter) {
        super(message);
        this.retryAfter = retryAfter != null ? retryAfter.getTime() : null;
    }

    /**
     * HTTP 503 means Service Unavailable.
     * Sometimes corresponds to the Util.RETRY_AFTER header present in a 503 response. Other
     * times it is parsed from an application-specific response. Null if unknown.
     */
    public Date retryAfter() {
        return retryAfter != null ? new Date(retryAfter) : null;
    }
}
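The retryAfter date matters because Retryer.Default prefers it over its own backoff when deciding how long to sleep. The following sketch isolates that decision as a pure function so the branches are easy to see. RetryAfterSketch, sleepMillis and the -1 sentinel are illustrative names and conventions, not part of Feign.

```java
import java.util.Date;

/** Stand-alone sketch of how Retryer.Default turns retryAfter into a sleep interval. */
final class RetryAfterSketch {

    /**
     * Returns the number of milliseconds to sleep, or -1 to signal "retry immediately
     * without sleeping" (Default simply returns in that case).
     *
     * @param retryAfter      value of RetryableException.retryAfter(), may be null
     * @param nowMillis       current time, passed in so the sketch is deterministic
     * @param maxPeriod       upper bound on the interval
     * @param backoffInterval what nextMaxInterval() would return for this attempt
     */
    static long sleepMillis(Date retryAfter, long nowMillis, long maxPeriod, long backoffInterval) {
        if (retryAfter == null) {
            return backoffInterval; // no hint from the server: use exponential backoff
        }
        long interval = retryAfter.getTime() - nowMillis;
        if (interval > maxPeriod) {
            interval = maxPeriod; // never wait longer than maxPeriod
        }
        if (interval < 0) {
            return -1; // the date is already in the past
        }
        return interval;
    }

    public static void main(String[] args) {
        long now = 1_000_000L;
        System.out.println(sleepMillis(new Date(now + 300), now, 1000, 150));  // prints 300
        System.out.println(sleepMillis(new Date(now + 5000), now, 1000, 150)); // prints 1000
        System.out.println(sleepMillis(null, now, 1000, 150));                 // prints 150
    }
}
```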
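FeignException extends RuntimeException.
It has a private int field status that holds the HTTP status code.
It has three static factory methods: errorReading(Request request, Response ignored, IOException cause), errorStatus(String methodKey, Response response) and errorExecuting(Request request, IOException cause).
I/O errors raised while executing a request are wrapped in a RetryableException by errorExecuting and can therefore be retried; an HTTP status error such as 404, which errorStatus turns into a plain FeignException, is not retried.
The source is as follows: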
package feign;

import java.io.IOException;

import static java.lang.String.format;

public class FeignException extends RuntimeException {

    private static final long serialVersionUID = 0;

    // HTTP status
    private int status;

    protected FeignException(String message, Throwable cause) {
        super(message, cause);
    }

    protected FeignException(String message) {
        super(message);
    }

    protected FeignException(int status, String message) {
        super(message);
        this.status = status;
    }

    public int status() {
        return this.status;
    }

    static FeignException errorReading(Request request, Response ignored, IOException cause) {
        return new FeignException(
            format("%s reading %s %s", cause.getMessage(), request.method(), request.url()),
            cause);
    }

    public static FeignException errorStatus(String methodKey, Response response) {
        String message = format("status %s reading %s", response.status(), methodKey);
        try {
            if (response.body() != null) {
                String body = Util.toString(response.body().asReader());
                message += "; content:\n" + body;
            }
        } catch (IOException ignored) { // NOPMD
        }
        return new FeignException(response.status(), message);
    }

    static FeignException errorExecuting(Request request, IOException cause) {
        return new RetryableException(
            format("%s executing %s %s", cause.getMessage(), request.method(), request.url()), cause,
            null);
    }
}
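Why only RetryableException leads to a retry becomes clearer when looking at how a caller drives the Retryer. The sketch below is a simplified model of that loop, with hypothetical stand-in types (RetryableError, CountingRetryer) instead of the real Feign classes and with sleeping left out. It is an illustration under those assumptions, not a copy of Feign's own handler.

```java
import java.util.function.Supplier;

/** Simplified model of a retry loop driven by continueOrPropagate (stand-in types only). */
final class RetryLoopSketch {

    /** Hypothetical stand-in for feign.RetryableException. */
    static final class RetryableError extends RuntimeException {
        RetryableError(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical stand-in for Retryer.Default, reduced to the attempt counter. */
    static final class CountingRetryer {
        private final int maxAttempts;
        private int attempt = 1;

        CountingRetryer(int maxAttempts) {
            this.maxAttempts = maxAttempts;
        }

        void continueOrPropagate(RetryableError e) {
            if (attempt++ >= maxAttempts) {
                throw e; // attempts exhausted: propagate
            }
            // the real Default would sleep here before returning
        }
    }

    /** Runs the call; retries only when a RetryableError is thrown and the retryer allows it. */
    static <T> T invoke(Supplier<T> call, CountingRetryer retryer) {
        while (true) {
            try {
                return call.get();
            } catch (RetryableError e) {
                // Returning normally means "try again"; throwing ends the loop.
                retryer.continueOrPropagate(e);
            }
            // Any other exception type is not caught and propagates after one attempt.
        }
    }
}
```

In this model a call that always fails with RetryableError runs maxAttempts times before the error propagates, whereas a call that fails with any other exception runs exactly once.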
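The sections above introduced the Retryer interface, its Default class, and the retry exception class RetryableException. We can customize retry behavior by overriding the continueOrPropagate method of the Retryer interface, for example: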
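import java.util.function.Supplier;
import java.util.stream.Stream;

import feign.RetryableException;
import feign.Retryer;
import lombok.extern.slf4j.Slf4j;

@Slf4j
public class ConnectTimeoutRetryer extends Retryer.Default {

    // Keywords that mark an exception cause as retryable
    Supplier<Stream<String>> streamSupplier = () -> Stream.of("connect timed out");

    public ConnectTimeoutRetryer() {
        super();
    }

    @Override
    public void continueOrPropagate(RetryableException e) {
        // Kibana analysis of production logs shows that Feign timeouts carry the keyword
        // "connect timed out" in the cause. If the cause message contains none of the keywords,
        // log "ConnectTimeoutRetryerFeign failed" and rethrow the RetryableException e.
        // A missing cause or message is treated as "no match" to avoid a NullPointerException.
        Throwable cause = e.getCause();
        String causeMessage = cause != null && cause.getMessage() != null ? cause.getMessage() : "";
        if (streamSupplier.get().noneMatch(causeMessage::contains)) {
            log.warn("ConnectTimeoutRetryerFeign failed", e);
            throw e;
        }
        // The trailing {} stays literal in the output: SLF4J treats a final Throwable argument
        // as the exception to log, not as a format argument.
        log.error("begin to retry:{} ,{}", e.getMessage(), e);
        super.continueOrPropagate(e);
    }

    // Override clone() so that each invocation gets a fresh ConnectTimeoutRetryer
    @Override
    public Retryer clone() {
        return new ConnectTimeoutRetryer();
    }
}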
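This approach mainly addresses timeouts in Feign calls between microservices, for example those caused by an unstable network.
Below is the log output, including the stack trace, from a retry: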
2020-05-28 21:17:08,954 [hystrix-zis-zzzz-193] ERROR [com.xxxx.common.service.share.feign.ConnectTimeoutRetryer] [?:?] [trace=xxx,span=xxx] - begin to retry:connect timed out executing POST http://xxx.com/search/rrr ,{}
feign.RetryableException: connect timed out executing POST http://xxx.com/search/rrr
    at feign.FeignException.errorExecuting(FeignException.java:67)
    at feign.SynchronousMethodHandler.executeAndDecode(SynchronousMethodHandler.java:104)
    at feign.SynchronousMethodHandler.invoke(SynchronousMethodHandler.java:76)
    at feign.hystrix.HystrixInvocationHandler$1.run(HystrixInvocationHandler.java:108)
    at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:302)
    at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:298)
    at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:46)
    at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
    at rx.Observable.unsafeSubscribe(Observable.java:10211)
    at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:51)
    at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35)
    at rx.Observable.unsafeSubscribe(Observable.java:10211)
    at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:41)
    at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:30)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
    at rx.Observable.unsafeSubscribe(Observable.java:10211)
    at rx.internal.operators.OperatorSubscribeOn$1.call(OperatorSubscribeOn.java:94)
    at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:56)
    at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:47)
    at org.springframework.cloud.sleuth.instrument.hystrix.SleuthHystrixConcurrencyStrategy$HystrixTraceCallable.call(SleuthHystrixConcurrencyStrategy.java:188)
    at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction.call(HystrixContexSchedulerAction.java:69)
    at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: connect timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
    at sun.net.www.http.HttpClient.New(HttpClient.java:339)
    at sun.net.www.http.HttpClient.New(HttpClient.java:357)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1220)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1156)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:984)
    at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1334)
    at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1309)
    at feign.Client$Default.convertAndSend(Client.java:133)
    at feign.Client$Default.execute(Client.java:73)
    at org.springframework.cloud.sleuth.instrument.web.client.feign.TraceFeignClient.execute(TraceFeignClient.java:92)
    at feign.SynchronousMethodHandler.executeAndDecode(SynchronousMethodHandler.java:97)
    ... 32 common frames omitted
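Drawback: this approach does resolve Feign call timeouts between microservices, but the Supplier is not flexible enough. A retry happens only when the message of the exception's cause contains "connect timed out"; in every other case the RetryableException is rethrown immediately and no retry takes place.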
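One possible way to loosen this restriction is to make the keyword list configurable and to inspect the whole cause chain instead of only the direct cause. The sketch below shows that idea in isolation, using only the JDK. The class RetryCauseMatcherSketch and its method shouldRetry are hypothetical names, and unlike the retryer above it also looks at the message of the exception itself and ignores case.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: decides whether an error looks retryable based on configurable keywords. */
final class RetryCauseMatcherSketch {

    private static final int MAX_DEPTH = 16; // defensive bound on the cause chain length

    private final List<String> keywords;

    RetryCauseMatcherSketch(String... keywords) {
        this.keywords = Arrays.asList(keywords);
    }

    /** True if any message in the cause chain contains one of the keywords (case-insensitive). */
    boolean shouldRetry(Throwable error) {
        int depth = 0;
        for (Throwable t = error; t != null && depth < MAX_DEPTH; t = t.getCause(), depth++) {
            String message = t.getMessage();
            if (message == null) {
                continue; // nothing to match on this level
            }
            String lower = message.toLowerCase(Locale.ROOT);
            for (String keyword : keywords) {
                if (lower.contains(keyword.toLowerCase(Locale.ROOT))) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

A retryer could then hold a matcher such as new RetryCauseMatcherSketch("connect timed out", "Read timed out") and call shouldRetry(e) in continueOrPropagate in place of the streamSupplier check. Which keywords are safe to retry on depends on whether the remote calls are idempotent, so the list here is only an example.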