Spring Cloud - Fault Tolerance with Hystrix

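In a distributed system built with Spring Cloud, service calls will inevitably fail at times, for example through timeouts or exceptions. The question is how to keep the overall service from failing when a single dependency has problems. Hystrix provides service degradation (fallback), circuit breaking, thread isolation, request caching, request collapsing, and monitoring, so the system stays available when one or more dependencies fail.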

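Hystrix fault tolerance is likewise applied on the service consumer side, so only minor changes to the eureka-client-consumer from the previous article are needed.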

1. Using Hystrix directly

1.1 Add the Hystrix dependency
compile('org.springframework.cloud:spring-cloud-starter-netflix-eureka-client:2.0.0.RELEASE')
compile('org.springframework.cloud:spring-cloud-starter-netflix-ribbon:2.0.0.RELEASE')
compile('org.springframework.cloud:spring-cloud-starter-openfeign:2.0.0.RELEASE')
compile('org.springframework.cloud:spring-cloud-starter-netflix-hystrix:2.0.0.RELEASE')
1.2 Modify the Spring Boot application class and add the @EnableHystrix or @EnableCircuitBreaker annotation
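import org.springframework.boot.autoconfigure.SpringBootApplication
import org.springframework.boot.runApplication
import org.springframework.cloud.client.circuitbreaker.EnableCircuitBreaker
import org.springframework.cloud.openfeign.EnableFeignClients

@EnableCircuitBreaker
@EnableFeignClients
@SpringBootApplication
class Application

fun main(args: Array<String>) {
    // Boot the Spring context for the Application class
    runApplication<Application>(*args)
}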

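The annotations on the application class can in fact be replaced with @SpringCloudApplication, because @SpringCloudApplication bundles three annotations: @SpringBootApplication, @EnableDiscoveryClient (which can be omitted), and @EnableCircuitBreaker. This also shows that a standard Spring Cloud application is expected to include Hystrix fault tolerance.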

@Target({ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Inherited
@SpringBootApplication
@EnableDiscoveryClient
@EnableCircuitBreaker
public @interface SpringCloudApplication {
}
1.3 Modify TestController and use the @HystrixCommand annotation to specify the fallback method
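import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand
import org.springframework.beans.factory.annotation.Autowired
import org.springframework.web.bind.annotation.GetMapping
import org.springframework.web.bind.annotation.RestController

@RestController
class TestController {

    @Autowired
    private lateinit var testService: TestService

    // If the remote call fails or times out, Hystrix invokes fallback() instead
    @HystrixCommand(fallbackMethod = "fallback")
    @GetMapping("/test")
    fun test(): String? {
        return testService.test()
    }

    // Must live in the same class and take the same parameters as test()
    fun fallback(): String {
        return "fallback..."
    }
}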
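1.4 Test and verify

Start the eureka-server, eureka-client-provider, and eureka-client-consumer instances. A request to localhost:30001/test returns normally. Now stop eureka-client-provider and send the request again: the fallback result is returned. Restart eureka-client-provider and the request returns the normal result again.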
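The behaviour verified above can be pictured with a minimal plain-Kotlin sketch. It is not how Hystrix is implemented (Hystrix also adds timeouts, thread isolation, and the circuit breaker). The helper `callWithFallback`, the stand-in `remoteTest`, and the response text "hello from provider" are hypothetical names chosen for illustration only.

```kotlin
// Hypothetical helper, not part of Hystrix: illustrates the fallback idea only.
fun <T> callWithFallback(primary: () -> T, fallback: () -> T): T =
    try {
        primary()
    } catch (e: Exception) {
        // Any failure of the primary call (e.g. the provider being down) triggers the fallback.
        fallback()
    }

// Stand-in for the remote call to eureka-client-provider.
// `providerUp` simulates whether the provider is running; the response text is made up.
fun remoteTest(providerUp: Boolean): String =
    if (providerUp) "hello from provider" else throw IllegalStateException("provider unavailable")
```

With `providerUp = true` the caller sees the provider's response; with `providerUp = false` it sees whatever the fallback lambda returns, which mirrors the stop/restart experiment described above.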

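2. Integrating Hystrix with Feign

The steps above use Hystrix directly for fault tolerance. An earlier article introduced Feign's declarative service clients, and Feign also supports Hystrix, so using Hystrix on top of Feign is even simpler.

2.1 Add the dependencies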
compile('org.springframework.cloud:spring-cloud-starter-netflix-eureka-client:2.0.0.RELEASE')
compile('org.springframework.cloud:spring-cloud-starter-netflix-ribbon:2.0.0.RELEASE')
compile('org.springframework.cloud:spring-cloud-starter-openfeign:2.0.0.RELEASE')
compile('org.springframework.cloud:spring-cloud-starter-netflix-hystrix:2.0.0.RELEASE')
2.2 Add the @SpringCloudApplication and @EnableFeignClients annotations to the Spring Boot application class
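import org.springframework.boot.runApplication
import org.springframework.cloud.client.SpringCloudApplication
import org.springframework.cloud.openfeign.EnableFeignClients

@EnableFeignClients
@SpringCloudApplication
class Application

fun main(args: Array<String>) {
    runApplication<Application>(*args)
}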
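2.3 On the @FeignClient annotation, specify the fallback implementation class that Hystrix should use (it handles each method individually), and remove the original @HystrixCommand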

TestController.kt

@RestController
class TestController {

    @Autowired
    private lateinit var testService: TestService

    @GetMapping("/test")
    fun test(): String? {
        return testService.test()
    }

}

TestService.kt

@FeignClient(value = "eureka-client-provider", fallback = TestServiceFallback::class, configuration = [(FeignLogConfiguration::class)])
interface TestService {

    @GetMapping("/test")
    fun test(): String

}

TestServiceFallback.kt

@Component
class TestServiceFallback : TestService {
    override fun test(): String {
        return "fallback"
    }
}
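Conceptually, each call on the Feign interface is attempted first, and if it fails, the same method is invoked on the fallback class. The sketch below imitates that per-method routing with a JDK dynamic proxy. It is an illustration under assumptions, not Feign's actual integration code: `withFallback` is a hypothetical helper, and the two types are annotation-free stand-ins for the ones shown above.

```kotlin
import java.lang.reflect.InvocationTargetException
import java.lang.reflect.Proxy

// Annotation-free stand-ins for the interface and fallback class shown above.
interface TestService {
    fun test(): String
}

class TestServiceFallback : TestService {
    override fun test(): String = "fallback"
}

// Hypothetical helper: returns a proxy that calls `target` and, if that call throws,
// invokes the same method on `fallback`.
fun withFallback(target: TestService, fallback: TestService): TestService =
    Proxy.newProxyInstance(
        TestService::class.java.classLoader,
        arrayOf(TestService::class.java)
    ) { _, method, args ->
        val arguments = args ?: emptyArray<Any?>()
        try {
            method.invoke(target, *arguments)
        } catch (e: InvocationTargetException) {
            // The target method threw; route the call to the fallback implementation.
            method.invoke(fallback, *arguments)
        }
    } as TestService
```

Because the routing is done per method, a fallback class has to implement every method of the client interface, which is why TestServiceFallback implements TestService.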
2.4 Enable Hystrix support in Feign. This step is essential, because the support is disabled by default
feign:
  hystrix:
    enabled: true
2.5 The remaining steps are the same as in section 1 and the result is the same; Feign simply integrates Hystrix in a more uniform and convenient way.

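3. More Hystrix features

3.1 Dependency isolation

When developers use annotations such as @HystrixCommand, they are in fact using Hystrix's command pattern: the service call is wrapped in a command, and with the default thread isolation strategy the command executes on a separate thread.

Hystrix gives each dependency its own thread pool (by default one pool per command group), so even if one dependent service misbehaves, only calls to that service are affected and calls to other services are not.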
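The isolation idea can be sketched with plain `java.util.concurrent` thread pools. This is a simplified illustration, not Hystrix's implementation; the class `Bulkheads`, its parameters, and the dependency names used with it are hypothetical.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Hypothetical bulkhead: one small fixed thread pool per dependency name.
class Bulkheads(private val threadsPerDependency: Int) {
    private val pools = mutableMapOf<String, ExecutorService>()

    @Synchronized
    private fun poolFor(dependency: String): ExecutorService =
        pools.getOrPut(dependency) { Executors.newFixedThreadPool(threadsPerDependency) }

    // Runs `call` on the dependency's own pool and waits at most `timeoutMs` for the result.
    // Throws java.util.concurrent.TimeoutException if the result is not ready in time.
    fun <T> execute(dependency: String, timeoutMs: Long, call: () -> T): T =
        poolFor(dependency).submit(Callable { call() }).get(timeoutMs, TimeUnit.MILLISECONDS)

    // Stops all pools and interrupts tasks that are still running.
    @Synchronized
    fun shutdown() = pools.values.forEach { it.shutdownNow() }
}
```

If every thread of the pool for a slow dependency is busy, callers of that dependency time out, while calls to another dependency still run on their own pool and return normally.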

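3.2 Circuit breaker

When the error rate of a service passes a threshold, Hystrix can trip the circuit breaker and stop sending requests to that service for a while. Two conditions must both hold within the rolling statistics window (10 s by default): 1. the number of requests reaches a minimum volume (20 by default); 2. the percentage of failed requests reaches the error threshold (50% by default).

Once the circuit breaker for a service is open, Hystrix no longer sends requests to that service and goes straight to the fallback, so a failure that is already known is not retried for a period of time.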
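The trip decision can be written down as a small function. This is a sketch under assumptions rather than Hystrix's own code: the function name `shouldTrip` and its parameter names are hypothetical, the counts are assumed to cover one rolling window, and the defaults are the 20-request and 50% values that also appear as circuitBreakerRequestVolumeThreshold and circuitBreakerErrorThresholdPercentage in the monitoring data later in this article.

```kotlin
// Hypothetical function; requestCount and errorCount are the totals of one rolling window.
fun shouldTrip(
    requestCount: Int,
    errorCount: Int,
    requestVolumeThreshold: Int = 20,
    errorThresholdPercentage: Int = 50
): Boolean {
    // Too little traffic in the window: the error rate is not considered meaningful yet.
    if (requestCount == 0 || requestCount < requestVolumeThreshold) return false
    // Integer percentage of failed requests in the window.
    val errorPercentage = errorCount * 100 / requestCount
    // Assumption: the breaker trips when the percentage is at or above the threshold.
    return errorPercentage >= errorThresholdPercentage
}
```

For example, 3 requests with 1 error gives an error percentage of 33 but stays far below the request volume threshold, so the breaker remains closed.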

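3.3 Automatic recovery

After the circuit breaker has been open for a while (the sleep window, 5 s by default), Hystrix enters a "half-open" state in which the breaker lets a single request through to the service. If that call succeeds, the breaker closes; otherwise it stays open and a new countdown starts, after which recovery is attempted again.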
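The recovery cycle can be modelled as a three-state machine. The sketch below is a single-threaded illustration, not Hystrix's implementation. `RecoveringBreaker`, `BreakerState`, and all method names are hypothetical, and the clock is injectable so the cycle can be exercised without waiting. The 5000 ms default matches the circuitBreakerSleepWindowInMilliseconds value in the monitoring data later in this article.

```kotlin
enum class BreakerState { CLOSED, OPEN, HALF_OPEN }

// Hypothetical, single-threaded model of the recovery cycle.
// `now` is an injectable clock in milliseconds.
class RecoveringBreaker(
    private val sleepWindowMs: Long = 5000,
    private val now: () -> Long = System::currentTimeMillis
) {
    var state = BreakerState.CLOSED
        private set
    private var openedAt = 0L

    // Opens the breaker, e.g. after the error threshold was reached.
    fun trip() {
        state = BreakerState.OPEN
        openedAt = now()
    }

    // Returns true if a request may be sent to the dependency.
    fun allowRequest(): Boolean = when (state) {
        BreakerState.CLOSED -> true
        BreakerState.HALF_OPEN -> false // a trial request is already in flight
        BreakerState.OPEN ->
            if (now() - openedAt >= sleepWindowMs) {
                state = BreakerState.HALF_OPEN // let exactly one trial request through
                true
            } else {
                false
            }
    }

    // Reports the outcome of the trial request made in the half-open state.
    fun onTrialResult(success: Boolean) {
        if (state != BreakerState.HALF_OPEN) return
        if (success) state = BreakerState.CLOSED else trip() // failure restarts the sleep window
    }
}
```

A failed trial request calls `trip()` again, which resets the timestamp, so the next attempt happens only after another full sleep window has passed.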

4. Hystrix monitoring

Setting up Hystrix monitoring is straightforward. Add the spring-boot-starter-actuator dependency:

compile('org.springframework.boot:spring-boot-starter-actuator')

Then set management.endpoints.web.exposure.include to hystrix.stream:

management:
  endpoints:
    web:
      exposure:
        include: hystrix.stream

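Opening http://localhost:30001/actuator/hystrix.stream shows data like the following; this monitors a single application instance. Note that the statistics only appear after the Hystrix-protected endpoints have actually been called.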

data: {"type":"HystrixCommand","name":"test","group":"TestController","currentTime":1513135304152,"isCircuitBreakerOpen":false,"errorPercentage":33,"errorCount":1,"requestCount":3,"rollingCountBadRequests":0,"rollingCountCollapsedRequests":0,"rollingCountEmit":0,"rollingCountExceptionsThrown":0,"rollingCountFailure":0,"rollingCountFallbackEmit":0,"rollingCountFallbackFailure":0,"rollingCountFallbackMissing":0,"rollingCountFallbackRejection":0,"rollingCountFallbackSuccess":1,"rollingCountResponsesFromCache":0,"rollingCountSemaphoreRejected":0,"rollingCountShortCircuited":0,"rollingCountSuccess":2,"rollingCountThreadPoolRejected":0,"rollingCountTimeout":1,"currentConcurrentExecutionCount":0,"rollingMaxConcurrentExecutionCount":1,"latencyExecute_mean":0,"latencyExecute":{"0":0,"25":0,"50":0,"75":0,"90":0,"95":0,"99":0,"99.5":0,"100":0},"latencyTotal_mean":0,"latencyTotal":{"0":0,"25":0,"50":0,"75":0,"90":0,"95":0,"99":0,"99.5":0,"100":0},"propertyValue_circuitBreakerRequestVolumeThreshold":20,"propertyValue_circuitBreakerSleepWindowInMilliseconds":5000,"propertyValue_circuitBreakerErrorThresholdPercentage":50,"propertyValue_circuitBreakerForceOpen":false,"propertyValue_circuitBreakerForceClosed":false,"propertyValue_circuitBreakerEnabled":true,"propertyValue_executionIsolationStrategy":"THREAD","propertyValue_executionIsolationThreadTimeoutInMilliseconds":1000,"propertyValue_executionTimeoutInMilliseconds":1000,"propertyValue_executionIsolationThreadInterruptOnTimeout":true,"propertyValue_executionIsolationThreadPoolKeyOverride":null,"propertyValue_executionIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_fallbackIsolationSemaphoreMaxConcurrentRequests":10,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000,"propertyValue_requestCacheEnabled":true,"propertyValue_requestLogEnabled":true,"reportingHosts":1,"threadPool":"TestController"}

data: {"type":"HystrixThreadPool","name":"TestController","currentTime":1513135304152,"currentActiveCount":0,"currentCompletedTaskCount":3,"currentCorePoolSize":10,"currentLargestPoolSize":3,"currentMaximumPoolSize":10,"currentPoolSize":3,"currentQueueSize":0,"currentTaskCount":3,"rollingCountThreadsExecuted":2,"rollingMaxActiveThreads":1,"rollingCountCommandRejections":0,"propertyValue_queueSizeRejectionThreshold":5,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000,"reportingHosts":1}

ping: 
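Each `data:` line in the stream carries one JSON object, so a client can pick out the fields it cares about. The sketch below extracts a few of them with regular expressions to stay dependency-free; a real consumer would more likely use a JSON library. `CommandSnapshot` and `parseCommandLine` are hypothetical names, and the field names are taken from the sample output above.

```kotlin
// Hypothetical holder for a few fields of one HystrixCommand entry.
data class CommandSnapshot(
    val name: String,
    val circuitOpen: Boolean,
    val errorPercentage: Int,
    val requestCount: Int
)

// Returns a snapshot for a HystrixCommand `data:` line, or null for any other line
// (thread pool entries, `ping:` keep-alives, or lines missing one of the fields).
fun parseCommandLine(line: String): CommandSnapshot? {
    if (!line.startsWith("data:") || !line.contains("\"type\":\"HystrixCommand\"")) return null

    // Finds the first "key":value pair and returns the value without surrounding quotes.
    fun field(key: String): String? =
        Regex("\"" + Regex.escape(key) + "\":\"?([^\",}]+)\"?").find(line)?.groupValues?.get(1)

    return CommandSnapshot(
        name = field("name") ?: return null,
        circuitOpen = field("isCircuitBreakerOpen")?.toBoolean() ?: return null,
        errorPercentage = field("errorPercentage")?.toIntOrNull() ?: return null,
        requestCount = field("requestCount")?.toIntOrNull() ?: return null
    )
}
```

Applied to the first sample line above, this would yield the command name test, a closed circuit, an error percentage of 33, and a request count of 3.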

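5. Hystrix Dashboard

The raw monitoring data that Hystrix emits is not very readable, so Hystrix Dashboard provides a visual interface for the statistics.

Hystrix Dashboard supports three monitoring modes: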

  • Default cluster monitoring: http://turbine-hostname:port/turbine.stream
  • Monitoring of a specified cluster: http://turbine-hostname:port/turbine.stream?cluster=[clusterName]
  • Single application monitoring: http://hystrix-app:port/hystrix.stream
5.1 Monitoring a single application

1. Create a new Spring Boot application named hystrix-dashboard and add the Hystrix Dashboard dependency

compile('org.springframework.cloud:spring-cloud-starter-netflix-hystrix-dashboard:2.0.0.RELEASE')

2. Add the @EnableHystrixDashboard annotation to the Spring Boot application class

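import org.springframework.boot.autoconfigure.SpringBootApplication
import org.springframework.boot.runApplication
import org.springframework.cloud.netflix.hystrix.dashboard.EnableHystrixDashboard

@EnableHystrixDashboard
@SpringBootApplication
class Application

fun main(args: Array<String>) {
    runApplication<Application>(*args)
}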

3. Add the configuration to application.yml (the snippet below is in YAML format)

spring:
  application:
    name: hystrix-dashboard

4. Start the application and open http://localhost:30001/hystrix to reach the Hystrix Dashboard page

[Figure 1: Hystrix Dashboard.png, the Hystrix Dashboard start page]

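5. In the first input field on the Hystrix Dashboard page, enter the address mentioned earlier, http://localhost:30001/actuator/hystrix.stream, and click the "Monitor Stream" button to open the detailed statistics page (see the figure below; it is an older screenshot, so the port number differs slightly). The page refreshes its monitoring data after the corresponding service has been called.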

[Figure 2: hystrix-dashboard.png, the detailed statistics page]
5.2 Cluster monitoring

To Be Continued...
