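We recently introduced Hystrix into our project to handle service degradation, but we occasionally hit a HystrixRuntimeException, so I sat down to investigate it. Reading source code with a concrete question in mind tends to go faster.
This problem has been discussed before. The official API documentation, however, offers only one vague sentence about when the exception is thrown: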
RuntimeException that is thrown when a HystrixCommand
fails and does not have a fallback.
It was also discussed at length on GitHub: https://github.com/Netflix/Hystrix/issues/1357
One reply sums it up:
This means that too many fallbacks were executing concurrently. Since the goal of Hystrix is to protect application threads, and fallbacks may run on the caller thread, Hystrix needs to limit the amount of concurrent fallbacks. Whenever this situation occurs, the fallback is not run, and an exception is returned to the caller.
In general, the purpose of a fallback is to provide substitute work that is less costly than the original work. Here are 2 options for proceeding:
Use a non-network fallback. That will be fast and no fallback concurrency limits will be hit
Wrap the fallback in its own Hystrix command. In this case, you're still getting protection when the original run() fails and the fallback path gets entered. You will still need to define a fallback for this new command, and it should do little work to avoid problems as well.
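In other words, too many fallbacks were running concurrently. Hystrix's main goal is to protect application threads, and fallbacks may run on the caller's thread, so Hystrix limits how many fallbacks can run at once. Once that limit is reached, additional fallbacks are not run and a HystrixRuntimeException is returned to the caller immediately.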
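The limiting behaviour described above can be illustrated with a plain java.util.concurrent.Semaphore. This is only a minimal sketch of the idea, not Hystrix's actual implementation: the class name FallbackLimiterSketch, its execute method, and the use of IllegalStateException in place of HystrixRuntimeException are all assumptions made for illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Illustrative only: a hypothetical stand-in, NOT Hystrix's real code. */
final class FallbackLimiterSketch<T> {
    // One permit per fallback that is allowed to run at the same time.
    private final Semaphore fallbackPermits;

    FallbackLimiterSketch(int maxConcurrentFallbacks) {
        this.fallbackPermits = new Semaphore(maxConcurrentFallbacks);
    }

    T execute(Supplier<T> primary, Supplier<T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException primaryFailure) {
            // tryAcquire() never blocks: when no permit is free the fallback
            // is skipped and the caller gets an exception right away.
            if (!fallbackPermits.tryAcquire()) {
                throw new IllegalStateException(
                        "fallback rejected: too many concurrent fallbacks", primaryFailure);
            }
            try {
                return fallback.get();
            } finally {
                // Always hand the permit back, even if the fallback throws.
                fallbackPermits.release();
            }
        }
    }
}
```

With a limit of 1 and no contention, a failing primary call yields the fallback value. With a limit of 0, every fallback is rejected, which mirrors the "fallback is not run, and an exception is returned to the caller" behaviour quoted above.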
Another commenter proposed a brute-force fix: if the problem is a limit, set the limit so high that it is never reached:
.withFallbackIsolationSemaphoreMaxConcurrentRequests(Integer.MAX_VALUE)
.withExecutionIsolationSemaphoreMaxConcurrentRequests(Integer.MAX_VALUE)
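The author of the summary above disagreed that brute force was a good idea and suggested two alternatives:
1. Have the fallback return a stubbed object that does not depend on any other service. Put simply, create a new object, fill in the request parameters, and use default values for everything else.
2. If there is a primary/secondary relationship between data sources, wrap each one in its own HystrixCommand, and fall back to a simple stub only as a last resort.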
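The first suggestion might look roughly like the sketch below. It needs hystrix-core on the classpath. The class name PartnerListCommand, the group key "PartnerGroup", the poiId field, and the List<String> result type are all hypothetical and chosen only for illustration, and the body of run() is a placeholder for the real remote call.

```java
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

import java.util.Collections;
import java.util.List;

/** Hypothetical command whose fallback does no network I/O. */
final class PartnerListCommand extends HystrixCommand<List<String>> {
    private final long poiId;

    PartnerListCommand(long poiId) {
        // "PartnerGroup" is an assumed group name, not taken from the original post.
        super(HystrixCommandGroupKey.Factory.asKey("PartnerGroup"));
        this.poiId = poiId;
    }

    @Override
    protected List<String> run() throws Exception {
        // Placeholder: the real remote lookup for poiId would go here.
        throw new UnsupportedOperationException("remote call omitted in this sketch for poiId " + poiId);
    }

    @Override
    protected List<String> getFallback() {
        // Stubbed default: cheap, in-memory, and independent of any other service,
        // so it returns quickly and holds a fallback permit only briefly.
        return Collections.emptyList();
    }
}
```

Calling new PartnerListCommand(42L).execute() should take the fallback path and return the empty list, because run() always throws in this sketch. The second suggestion would replace the body of getFallback() with a call to another command's execute(), where that inner command has its own cheap stubbed fallback.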
Our team also ran into a HystrixRuntimeException in production. The stack trace is as follows:
RuntimeException com.netflix.hystrix.exception.HystrixRuntimeException ERROR SimpleMessage[message=poiInfoUtil.findPoiByThrift Exception] com.netflix.hystrix.exception.HystrixRuntimeException: CmdBatchQueryPartnerListByPoiId timed-out and fallback failed.
at com.netflix.hystrix.AbstractCommand$22.call(AbstractCommand.java:815)
at com.netflix.hystrix.AbstractCommand$22.call(AbstractCommand.java:790)
at rx.internal.operators.OperatorOnErrorResumeNextViaFunction$4.onError(OperatorOnErrorResumeNextViaFunction.java:139)
at rx.internal.operators.OperatorDoOnEach$1.onError(OperatorDoOnEach.java:71)
at rx.internal.operators.OperatorDoOnEach$1.onError(OperatorDoOnEach.java:71)
at com.netflix.hystrix.AbstractCommand$DeprecatedOnFallbackHookApplication$1.onError(AbstractCommand.java:1451)
at com.netflix.hystrix.AbstractCommand$FallbackHookApplication$1.onError(AbstractCommand.java:1376)
at rx.internal.operators.OperatorDoOnEach$1.onError(OperatorDoOnEach.java:71)
at rx.observers.Subscribers$5.onError(Subscribers.java:224)
at rx.Observable$ThrowObservable$1.call(Observable.java:10200)
at rx.Observable$ThrowObservable$1.call(Observable.java:10190)
at rx.Observable.unsafeSubscribe(Observable.java:8314)
at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:51)
at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35)
at rx.Observable$2.call(Observable.java:162)
at rx.Observable$2.call(Observable.java:154)
at rx.Observable$2.call(Observable.java:162)
at rx.Observable$2.call(Observable.java:154)
at rx.Observable$2.call(Observable.java:162)
at rx.Observable$2.call(Observable.java:154)
at rx.Observable$2.call(Observable.java:162)
at rx.Observable$2.call(Observable.java:154)
at rx.Observable$2.call(Observable.java:162)
at rx.Observable$2.call(Observable.java:154)
at rx.Observable$2.call(Observable.java:162)
at rx.Observable$2.call(Observable.java:154)
at rx.Observable$2.call(Observable.java:162)
at rx.Observable$2.call(Observable.java:154)
at rx.Observable$2.call(Observable.java:162)
at rx.Observable$2.call(Observable.java:154)
at rx.Observable.unsafeSubscribe(Observable.java:8314)
at rx.internal.operators.OperatorOnErrorResumeNextViaFunction$4.onError(OperatorOnErrorResumeNextViaFunction.java:141)
at rx.internal.operators.OperatorDoOnEach$1.onError(OperatorDoOnEach.java:71)
at rx.internal.operators.OperatorDoOnEach$1.onError(OperatorDoOnEach.java:71)
at com.netflix.hystrix.AbstractCommand$HystrixObservableTimeoutOperator$1.run(AbstractCommand.java:1121)
at com.netflix.hystrix.strategy.concurrency.HystrixContextRunnable$1.call(HystrixContextRunnable.java:41)
at com.netflix.hystrix.strategy.concurrency.HystrixContextRunnable$1.call(HystrixContextRunnable.java:37)
at com.meituan.service.hotel.api.command.hystrixConfig.HystrixCommandConfig$CatWrapHystrixConcurrencyStrategy$2.call(HystrixCommandConfig.java:62)
at com.netflix.hystrix.strategy.concurrency.HystrixContextRunnable.run(HystrixContextRunnable.java:57)
at com.netflix.hystrix.AbstractCommand$HystrixObservableTimeoutOperator$2.tick(AbstractCommand.java:1138)
at com.netflix.hystrix.util.HystrixTimer$1.run(HystrixTimer.java:99)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
at com.netflix.hystrix.AbstractCommand.handleTimeoutViaFallback(AbstractCommand.java:980)
at com.netflix.hystrix.AbstractCommand.access$500(AbstractCommand.java:59)
at com.netflix.hystrix.AbstractCommand$12.call(AbstractCommand.java:595)
at com.netflix.hystrix.AbstractCommand$12.call(AbstractCommand.java:587)
at rx.internal.operators.OperatorOnErrorResumeNextViaFunction$4.onError(OperatorOnErrorResumeNextViaFunction.java:139)
... 16 more