Source code: https://github.com/simba1949/springcloud-learn
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
spring.application.name=spring-cloud-sleuth-service
server.port=7000
eureka.instance.hostname=localhost
eureka.client.service-url.defaultZone=http://${eureka.instance.hostname}:8761/eureka
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-openfeign</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
spring.application.name=spring-cloud-sleuth-client
server.port=8081
eureka.instance.hostname=localhost
eureka.client.service-url.defaultZone=http://${eureka.instance.hostname}:8761/eureka
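# Timeouts must be configured; otherwise the call is sent to more than one service instance and returns an error
ribbon.ConnectTimeout=5000
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=5000
# Enable the circuit breaker
feign.hystrix.enabled=true
# Client request log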
2019-08-03 12:01:19.352 INFO [spring-cloud-sleuth-client,b5e10e3c4bf53798,b5e10e3c4bf53798,false] 31604 --- [nio-8081-exec-9] top.simba1949.controller.UserController : the quest coming
# Server request log
2019-08-03 12:01:19.356 WARN [spring-cloud-sleuth-service,b5e10e3c4bf53798,c8c228cc3683ecdb,false] 8316 --- [nio-7000-exec-2] top.simba1949.controller.UserController : the request's data of UserController-getUser is User(id=null, username=null, birthday=null)
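Each log line above carries four bracketed values: the application name, the Trace ID, the Span ID, and a flag showing whether the span is exported to an external collector. Of these, the Trace ID and the Span ID are the core of how Spring Cloud Sleuth implements distributed tracing. Throughout the call chain of a single service request, the same Trace ID is kept and passed along, which ties together the tracing information scattered across different microservice processes. In the output above, spring-cloud-sleuth-client and spring-cloud-sleuth-service are handling the same front-end request, so their Trace IDs are identical and they sit on the same request chain.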
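To make the bracketed values concrete, here is a minimal Java sketch that pulls them out of the two log lines above and compares the IDs. The class name `SleuthLogFields` and the regular expression are assumptions made for this illustration and are not part of Sleuth; the log lines are shortened to the part that matters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Holds the four values Sleuth prints as [application,traceId,spanId,exportable]. */
class SleuthLogFields {
    // Matches e.g. [spring-cloud-sleuth-client,b5e10e3c4bf53798,b5e10e3c4bf53798,false]
    private static final Pattern BRACKET =
            Pattern.compile("\\[([^,\\]]*),([0-9a-f]*),([0-9a-f]*),(true|false)\\]");

    final String application;
    final String traceId;
    final String spanId;
    final boolean exportable;

    private SleuthLogFields(String application, String traceId, String spanId, boolean exportable) {
        this.application = application;
        this.traceId = traceId;
        this.spanId = spanId;
        this.exportable = exportable;
    }

    /** Parses the first Sleuth bracket found in a log line. */
    static SleuthLogFields parse(String logLine) {
        Matcher m = BRACKET.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("no Sleuth fields in: " + logLine);
        }
        return new SleuthLogFields(m.group(1), m.group(2), m.group(3),
                Boolean.parseBoolean(m.group(4)));
    }

    public static void main(String[] args) {
        SleuthLogFields client = parse("2019-08-03 12:01:19.352 INFO "
                + "[spring-cloud-sleuth-client,b5e10e3c4bf53798,b5e10e3c4bf53798,false] 31604");
        SleuthLogFields service = parse("2019-08-03 12:01:19.356 WARN "
                + "[spring-cloud-sleuth-service,b5e10e3c4bf53798,c8c228cc3683ecdb,false] 8316");

        // Both services handled the same request, so the trace IDs match.
        System.out.println("same trace: " + client.traceId.equals(service.traceId));
        // Each service did its own unit of work, so the span IDs differ.
        System.out.println("same span: " + client.spanId.equals(service.spanId));
    }
}
```

Run against the two lines above, this prints `same trace: true` and `same span: false`.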
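Service tracing in a distributed system is not complicated in theory. It mainly comes down to two key points.
In the hands-on example, before a request goes from spring-cloud-sleuth-client to spring-cloud-sleuth-service, Sleuth adds the information needed for tracing to the request headers. The main headers are the ones read below: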
log.info("X-B3-TraceId is {}, X-B3-SpanId is {}, X-B3-ParentSpanId is {}, X-B3-Sampled is {}, X-Span-Name is {}",
request.getHeader("X-B3-TraceId"),
request.getHeader("X-B3-SpanId"),
request.getHeader("X-B3-ParentSpanId"),
request.getHeader("X-B3-Sampled"),
request.getHeader("X-Span-Name")
);
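The relationship between these headers can be sketched in plain Java. This is a simplified illustration of B3-style propagation, not Sleuth's actual implementation: the class `B3HeaderSketch` and its methods are hypothetical, and `X-Span-Name` is left out. The idea shown is that an outgoing call keeps the caller's trace ID, gets a fresh span ID, and records the caller's span ID as its parent.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper that derives B3 headers for an outgoing call. */
class B3HeaderSketch {

    /** Generates a random 64-bit ID rendered as 16 lowercase hex characters. */
    static String newId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    /**
     * Builds the headers for a call made while handling the span identified by
     * traceId and currentSpanId.
     */
    static Map<String, String> childHeaders(String traceId, String currentSpanId, boolean sampled) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("X-B3-TraceId", traceId);            // unchanged along the whole chain
        headers.put("X-B3-SpanId", newId());             // new unit of work
        headers.put("X-B3-ParentSpanId", currentSpanId); // links the child to its caller
        headers.put("X-B3-Sampled", sampled ? "1" : "0");
        return headers;
    }

    public static void main(String[] args) {
        // Trace and span IDs taken from the client log line shown earlier.
        Map<String, String> headers =
                childHeaders("b5e10e3c4bf53798", "b5e10e3c4bf53798", false);
        headers.forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```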
# Sampling rate; 0.1 means 10% of requests are traced
spring.sleuth.sampler.probability=0.1
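What a sampling probability means can be shown with a small Java sketch. It is a simplified model that assumes an independent random draw per trace; Sleuth's own sampler may decide differently internally. The class `ProbabilitySamplerSketch` is hypothetical.

```java
import java.util.Random;

/** Hypothetical sampler: each new trace is kept with the configured probability. */
class ProbabilitySamplerSketch {
    private final double probability;
    private final Random random;

    ProbabilitySamplerSketch(double probability, Random random) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
        }
        this.probability = probability;
        this.random = random;
    }

    /** Decides once, when a trace starts, whether its spans should be exported. */
    boolean isSampled() {
        // nextDouble() is in [0.0, 1.0), so 1.0 keeps everything and 0.0 keeps nothing.
        return random.nextDouble() < probability;
    }

    public static void main(String[] args) {
        ProbabilitySamplerSketch sampler = new ProbabilitySamplerSketch(0.1, new Random());
        int kept = 0;
        int total = 10_000;
        for (int i = 0; i < total; i++) {
            if (sampler.isSampled()) {
                kept++;
            }
        }
        // With probability 0.1, roughly one request in ten is kept.
        System.out.println("kept " + kept + " of " + total + " traces");
    }
}
```

A low rate keeps the tracing overhead small in production, while `1.0` is convenient for demos because every request shows up.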
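Because the log files are scattered across the file systems of the individual service instances, analyzing a request chain just by reading log files is still quite tedious. Tools are needed to collect, store, and search this tracing information centrally, and a log-based analysis system such as ELK is a good choice.
The ELK platform consists of three open-source tools: ElasticSearch, Logstash, and Kibana.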
<dependency>
    <groupId>net.logstash.logback</groupId>
    <artifactId>logstash-logback-encoder</artifactId>
    <version>6.1</version>
</dependency>
<configuration scan="true" scanPeriod="60 seconds" debug="false">
    <include resource="org/springframework/boot/logging/logback/defaults.xml"/>
    <springProperty scope="context" name="springAppName" source="spring.application.name"/>
    <property name="LOG_FILE" value="${BUILD_FOLDER:-build}/${springAppName}"/>
    <property name="CONSOLE_LOG_PATTERN"
              value="%clr(%d{yyyy-MM-dd HH:mm:ss.SSS}){faint} %clr(${LOG_LEVEL_PATTERN:-%5p}) %clr([${springAppName:-},%X{X-B3-TraceId:-},%X{X-B3-SpanId:-},%X{X-Span-Export:-}]){yellow} %clr(${PID:- }){magenta} %clr(---){faint} %clr([%15.15t]){faint} %clr(%-40.40logger{39}){cyan} %clr(:){faint} %m%n${LOG_EXCEPTION_CONVERSION_WORD:-%wEx}"/>
    <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
        <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
            <level>INFO</level>
        </filter>
        <encoder>
            <pattern>${CONSOLE_LOG_PATTERN}</pattern>
            <charset>utf8</charset>
        </encoder>
    </appender>
    <appender name="logstash" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>${LOG_FILE}.json</file>
        <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <fileNamePattern>${LOG_FILE}.json.%d{yyyy-MM-dd}.gz</fileNamePattern>
            <maxHistory>7</maxHistory>
        </rollingPolicy>
        <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
            <providers>
                <timestamp>
                    <timeZone>UTC</timeZone>
                </timestamp>
                <pattern>
                    <pattern>
                        {
                        "severity": "%level",
                        "service": "${springAppName:-}",
                        "trace": "%X{X-B3-TraceId:-}",
                        "span": "%X{X-B3-SpanId:-}",
                        "exportable": "%X{X-Span-Export:-}",
                        "pid": "${PID:-}",
                        "thread": "%thread",
                        "class": "%logger{40}",
                        "rest": "%message"
                        }
                    </pattern>
                </pattern>
            </providers>
        </encoder>
    </appender>
    <root level="INFO">
        <appender-ref ref="console"/>
        <appender-ref ref="logstash"/>
    </root>
</configuration>
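With the collection, storage, and search features that the ELK platform provides, managing and using tracing information becomes very convenient. However, the analysis dimensions in ELK pay little attention to the latency of each stage in a request chain. Often the reason for tracing a request chain is to find the bottleneck where latency is too high, or to meet other time-related needs such as latency monitoring for a distributed system. Zipkin covers these needs and is easy to integrate.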
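The kind of latency question described above can be sketched in Java: given the spans of one trace and how long each took, find the slowest one. The `Span` class, the span names, and the durations below are hypothetical values made up for this illustration; they are not Zipkin's data model or measured results.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical example: locate the slowest span in a single trace. */
class SlowestSpanSketch {

    /** Minimal stand-in for a recorded span: which service did the work and for how long. */
    static final class Span {
        final String service;
        final long durationMicros;

        Span(String service, long durationMicros) {
            this.service = service;
            this.durationMicros = durationMicros;
        }
    }

    /** Returns the span with the largest duration. */
    static Span slowest(List<Span> spans) {
        return spans.stream()
                .max(Comparator.comparingLong((Span s) -> s.durationMicros))
                .orElseThrow(() -> new IllegalArgumentException("trace has no spans"));
    }

    public static void main(String[] args) {
        // Durations are invented for the example.
        List<Span> trace = Arrays.asList(
                new Span("spring-cloud-sleuth-client", 4_000),
                new Span("spring-cloud-sleuth-service", 12_500));

        Span bottleneck = slowest(trace);
        System.out.println("slowest: " + bottleneck.service
                + " (" + bottleneck.durationMicros + " us)");
    }
}
```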
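Zipkin's basic architecture consists of four core components: the collector, storage, the query API, and the web UI.
Spring Cloud does not recommend building the Zipkin Server yourself with Spring Cloud and Spring Boot.
Setup methods recommended by the official site: https://zipkin.io/pages/quickstart
Run with Docker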
docker run -d -p 9411:9411 openzipkin/zipkin
Run with Java (Java 8 or later)
curl -sSL https://zipkin.io/quickstart.sh | bash -s
java -jar zipkin.jar
Build from source
# get the latest source
git clone https://github.com/openzipkin/zipkin
cd zipkin
# Build the server and also make its dependencies
./mvnw -DskipTests --also-make -pl zipkin-server clean install
# Run the server
java -jar ./zipkin-server/target/zipkin-server-*exec.jar
pom
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-sleuth-zipkin</artifactId>
</dependency>
Configuration files
application.properties
spring.application.name=spring-cloud-sleuth-zipkin-service
server.port=7000
eureka.instance.hostname=localhost
eureka.client.service-url.defaultZone=http://${eureka.instance.hostname}:8761/eureka
# Sampling rate; 1.0 traces every request (0.1 would trace 10%)
spring.sleuth.sampler.probability=1.0
# Zipkin server address
spring.zipkin.base-url=http://192.168.128.5:9411
logback-spring.xml
<configuration scan="true" scanPeriod="60 seconds" debug="false">
    <include resource="org/springframework/boot/logging/logback/defaults.xml"/>
    <springProperty scope="context" name="springAppName" source="spring.application.name"/>
    <property name="CONSOLE_LOG_PATTERN"
              value="%clr(%d{yyyy-MM-dd HH:mm:ss.SSS}){faint} %clr(${LOG_LEVEL_PATTERN:-%5p}) %clr([${springAppName:-},%X{X-B3-TraceId:-},%X{X-B3-SpanId:-},%X{X-Span-Export:-}]){yellow} %clr(${PID:- }){magenta} %clr(---){faint} %clr([%15.15t]){faint} %clr(%-40.40logger{39}){cyan} %clr(:){faint} %m%n${LOG_EXCEPTION_CONVERSION_WORD:-%wEx}"/>
    <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
        <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
            <level>INFO</level>
        </filter>
        <encoder>
            <pattern>${CONSOLE_LOG_PATTERN}</pattern>
            <charset>utf8</charset>
        </encoder>
    </appender>
    <root level="INFO">
        <appender-ref ref="console"/>
    </root>
</configuration>
pom
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-openfeign</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-sleuth-zipkin</artifactId>
</dependency>
Configuration files
application.properties
server.port=8081
eureka.instance.hostname=localhost
eureka.client.service-url.defaultZone=http://${eureka.instance.hostname}:8761/eureka
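# Timeouts must be configured; otherwise the call is sent to more than one service instance and returns an error
ribbon.ConnectTimeout=5000
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=5000
# Enable the circuit breaker
feign.hystrix.enabled=true
# Sampling rate; 1.0 traces every request (0.1 would trace 10%)
spring.sleuth.sampler.probability=1.0
# Zipkin server address
spring.zipkin.base-url=http://192.168.128.5:9411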
bootstrap.properties
spring.application.name=spring-cloud-sleuth-zipkin-client
logback-spring.xml
<configuration scan="true" scanPeriod="60 seconds" debug="false">
    <include resource="org/springframework/boot/logging/logback/defaults.xml"/>
    <springProperty scope="context" name="springAppName" source="spring.application.name"/>
    <property name="CONSOLE_LOG_PATTERN"
              value="%clr(%d{yyyy-MM-dd HH:mm:ss.SSS}){faint} %clr(${LOG_LEVEL_PATTERN:-%5p}) %clr([${springAppName:-},%X{X-B3-TraceId:-},%X{X-B3-SpanId:-},%X{X-Span-Export:-}]){yellow} %clr(${PID:- }){magenta} %clr(---){faint} %clr([%15.15t]){faint} %clr(%-40.40logger{39}){cyan} %clr(:){faint} %m%n${LOG_EXCEPTION_CONVERSION_WORD:-%wEx}"/>
    <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
        <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
            <level>INFO</level>
        </filter>
        <encoder>
            <pattern>${CONSOLE_LOG_PATTERN}</pattern>
            <charset>utf8</charset>
        </encoder>
    </appender>
    <root level="INFO">
        <appender-ref ref="console"/>
    </root>
</configuration>