Distributed Tracing with Spring Cloud Sleuth and Zipkin

Sleuth

Reference: https://spring.io/projects/spring-cloud-sleuth
Spring Cloud Sleuth provides Spring Boot auto-configuration for distributed tracing.

Sleuth Maven Configuration

The Maven dependency for Sleuth is shown below.


<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>

Key Features of Sleuth

Sleuth configures everything you need to get started. This includes where trace data (spans) are reported to, how many traces to keep (sampling), if remote fields (baggage) are sent, and which libraries are traced.

Specifically, Spring Cloud Sleuth…

Adds trace and span ids to the Slf4J MDC, so you can extract all the logs from a given trace or span in a log aggregator.
Instruments common ingress and egress points from Spring applications (servlet filter, rest template, scheduled actions, message channels, feign client).
If spring-cloud-sleuth-zipkin is available then the app will generate and report Zipkin-compatible traces via HTTP. By default it sends them to a Zipkin collector service on localhost (port 9411). Configure the location of the service using spring.zipkin.baseUrl.

Main Sleuth Configuration

sleuth:
    enabled: true
    sampler:
      probability: 1.0
    http:
      legacy:
        enabled: true

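In practice you may call an endpoint 10 times yet see only one trace in Zipkin. This is not a bug: trace data is collected by sampling. By default the ratio of traces recorded in Zipkin to requests made is 0.1, and the probability setting above changes that ratio (1.0 reports every request). Sampling exists because, under high concurrency, collecting everything would produce too much data. Sampling cuts the volume, which matters most when the data is sent over HTTP, where reporting has a noticeable performance cost.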
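The effect of the sampling probability can be illustrated with a small standalone sketch. This is not Sleuth's or Brave's actual sampler; the class name ProbabilitySamplerSketch, the method names, and the fixed random seed are all assumptions made for the illustration. It only shows why a probability of 0.1 leaves roughly one request in ten visible in Zipkin.

```java
import java.util.Random;

/** Illustrative probability sampler; not Sleuth's or Brave's actual implementation. */
final class ProbabilitySamplerSketch {
    private final double probability;
    private final Random random;

    ProbabilitySamplerSketch(double probability, long seed) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be within [0.0, 1.0]");
        }
        this.probability = probability;
        this.random = new Random(seed); // fixed seed keeps the demo repeatable
    }

    /** Decides for one new trace whether it would be reported. */
    boolean isSampled() {
        // nextDouble() is uniform in [0.0, 1.0), so 1.0 always samples and 0.0 never does.
        return random.nextDouble() < probability;
    }

    /** Simulates a number of requests and counts how many would be reported. */
    static int countSampled(double probability, int requests, long seed) {
        ProbabilitySamplerSketch sampler = new ProbabilitySamplerSketch(probability, seed);
        int sampled = 0;
        for (int i = 0; i < requests; i++) {
            if (sampler.isSampled()) {
                sampled++;
            }
        }
        return sampled;
    }

    public static void main(String[] args) {
        System.out.println("probability 0.1: " + countSampled(0.1, 10_000, 42L) + " of 10000 sampled");
        System.out.println("probability 1.0: " + countSampled(1.0, 10_000, 42L) + " of 10000 sampled");
    }
}
```

With a probability of 1.0 every simulated request is counted, which corresponds to the setting used in the configuration above. With 0.1 the count comes out near one tenth of the requests.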

Sleuth Log Format

2021-01-27 14:03:35.632 DEBUG [gateway-web,793bc0eca122f81d,793bc0eca122f81d,true] 4012 --- [ioEventLoop-4-2] Limiter$$EnhancerBySpringCGLIB$$1d620e4c : 
2021-01-27 14:03:35.633 DEBUG [gateway-web,793bc0eca122f81d,793bc0eca122f81d,true] 4012 --- [ioEventLoop-4-2] c.s.c.g.filter.AccessGatewayFilter       :

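In the log output above, an extra section appears near the start of each line. This is the trace information that Sleuth adds for calls between services.

It has the form [appname,traceId,spanId,exportable], where:
appname: the service name, i.e. the value of spring.application.name (e.g. gateway-web).
traceId: the unique ID of the whole request; it identifies the request's entire call chain (e.g. 793bc0eca122f81d).
spanId: the ID of a span, the basic unit of work; each remote call is one span (e.g. 793bc0eca122f81d).
exportable: whether the data is exported to Zipkin (e.g. true).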
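The four fields can be pulled out of a log line with a regular expression, which is handy when filtering logs by trace ID outside a log aggregator. The following is a minimal sketch that assumes the prefix format shown above. The class name SleuthLogPrefix and its fields are hypothetical and not part of Sleuth.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that extracts the [appname,traceId,spanId,exportable] prefix. */
final class SleuthLogPrefix {
    // appname: anything except ',' or ']'; the IDs are hex strings; exportable is true or false.
    private static final Pattern PREFIX =
            Pattern.compile("\\[([^,\\]]*),([0-9a-fA-F]*),([0-9a-fA-F]*),(true|false)\\]");

    final String appName;
    final String traceId;
    final String spanId;
    final boolean exportable;

    private SleuthLogPrefix(String appName, String traceId, String spanId, boolean exportable) {
        this.appName = appName;
        this.traceId = traceId;
        this.spanId = spanId;
        this.exportable = exportable;
    }

    /** Returns the first Sleuth-style prefix found in the line, or empty if there is none. */
    static Optional<SleuthLogPrefix> parse(String logLine) {
        Matcher m = PREFIX.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new SleuthLogPrefix(
                m.group(1), m.group(2), m.group(3), Boolean.parseBoolean(m.group(4))));
    }

    public static void main(String[] args) {
        String line = "2021-01-27 14:03:35.633 DEBUG "
                + "[gateway-web,793bc0eca122f81d,793bc0eca122f81d,true] 4012 --- "
                + "[ioEventLoop-4-2] c.s.c.g.filter.AccessGatewayFilter       :";
        parse(line).ifPresent(p -> System.out.println(
                p.appName + " trace=" + p.traceId + " span=" + p.spanId + " exportable=" + p.exportable));
    }
}
```

The thread name in square brackets later in the line is not mistaken for the prefix, because the pattern requires the three commas and a trailing true or false.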

Zipkin

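Reference: http://c.biancheng.net/view/5496.html
Zipkin is an open-source project from Twitter: a distributed tracing system that collects monitoring data from all services and exposes two main interfaces, one for collecting data and one for querying it. With Zipkin we can inspect a call chain visually and easily see how services call one another and how long each call takes.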

Zipkin Maven Configuration

Note: this starter already includes Sleuth, so there is no need to add the Sleuth dependency explicitly.


<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>

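How Sleuth and Zipkin Relate

Sleuth and Zipkin are usually used together. Once Zipkin is configured, Sleuth sends its trace data to the Zipkin server. When Sleuth decides to export a span to Zipkin, the exportable field in the log prefix is true.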

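Main Zipkin Configuration

By default, Sleuth sends trace data to Zipkin over HTTP. If sender.type is set to web, HTTP is used and the Zipkin server address must be given with the base-url parameter.
Even with sampling, sending the data over HTTP still affects performance, and if the Zipkin server restarts or goes down, part of the collected data is lost. To address both problems, we integrate RabbitMQ to send the data: the message queue improves sending performance and keeps data from being lost while Zipkin is unavailable (Kafka can be used as well).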

 zipkin:
    # base-url: http://10.106.11.151:9411
    enabled: true
    sender:
      type: rabbit

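Zipkin Storage

Zipkin supports in-memory, MySQL, and Elasticsearch (ES) storage. Distributed tracing data is usually large, so ES is recommended: it makes partitioning easy and leaves room for later extension, for example storing unstructured data in certain fields.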

Installing Zipkin

Docker is the recommended way to install Zipkin. The example below uses RabbitMQ as the transport and ES as the data store.

docker run -d -p 9411:9411 --restart=always -e STORAGE_TYPE=elasticsearch \
-e ES_HOSTS=*.*.*.*:9200 -e RABBIT_ADDRESSES=*.*.*.*:5672 -e RABBIT_USER=admin \
-e RABBIT_PASSWORD=admin openzipkin/zipkin

Alternatively, add the Maven dependencies below and deploy Zipkin as a microservice of your own.


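        <dependency>
            <groupId>io.zipkin.java</groupId>
            <artifactId>zipkin-server</artifactId>
            <version>2.12.3</version>
            <exclusions>
                <exclusion>
                    <groupId>org.apache.logging.log4j</groupId>
                    <artifactId>log4j-slf4j-impl</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>io.zipkin.java</groupId>
            <artifactId>zipkin-autoconfigure-ui</artifactId>
            <version>2.12.3</version>
        </dependency>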

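import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;
// Assumed package for zipkin-server 2.12.x; other versions place this annotation elsewhere.
import zipkin2.server.internal.EnableZipkinServer;

// Starts an embedded Zipkin server and registers it with service discovery.
@SpringBootApplication
@EnableDiscoveryClient
@EnableZipkinServer
public class ZipkinApplication {
    public static void main(String[] args) {
        SpringApplication.run(ZipkinApplication.class, args);
    }
}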

spring:
  application:
    name: zipkin
  sleuth:
    sampler:
      probability: 1.0
  profiles: test
  main:
    allow-bean-definition-overriding: true
  rabbitmq:
    host: ${RABBIT_MQ_HOST:*.*.*.*}
    port: ${RABBIT_MQ_PORT:5672}
    username: ${RABBIT_MQ_USERNAME:admin}
    password: ${RABBIT_MQ_PASSWORD:admin}
management:
  metrics:
    web:
      server:
        auto-time-requests: false

Once deployed, open http://{host}:9411 in a browser to view or search Zipkin's trace data.
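Besides the browser UI, a trace can also be fetched programmatically through Zipkin's v2 HTTP API, which serves a single trace at /api/v2/trace/{traceId}. The sketch below assumes Java 9 or later and a Zipkin server reachable at the given base URL. The class name ZipkinTraceLookup and its methods are hypothetical, and the trace ID is the one from the sample log lines.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for looking up one trace through Zipkin's v2 HTTP API. */
final class ZipkinTraceLookup {

    /** Builds the API address of a single trace under the given Zipkin base URL. */
    static URI traceUri(String baseUrl, String traceId) {
        // Drop a trailing slash so the result never contains "//api".
        String base = baseUrl.endsWith("/") ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
        return URI.create(base + "/api/v2/trace/" + traceId);
    }

    /** Fetches the trace as raw JSON; throws if the server does not answer with HTTP 200. */
    static String fetchTraceJson(String baseUrl, String traceId) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) traceUri(baseUrl, traceId).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(2_000);
        conn.setReadTimeout(5_000);
        try {
            int status = conn.getResponseCode();
            if (status != 200) {
                throw new IOException("Zipkin returned HTTP " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Only prints the address; call fetchTraceJson(...) when a Zipkin server is running.
        System.out.println(traceUri("http://localhost:9411/", "793bc0eca122f81d"));
    }
}
```

The main method only builds and prints the address so that the example runs without a server. Pair the trace ID taken from a log line with fetchTraceJson to jump from a log entry to its full trace.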