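In large web clusters and microservice environments, a single client request may pass through many modules, middleware components, and machines that cooperate to complete it, and the steps along the way may run serially or in parallel. How can we determine which applications and modules a request invoked from start to finish, which nodes it passed through, in what order the modules were called, and how each module performed? As the number of business systems grows and the processing logic becomes more complex, a distributed system urgently needs a tracing system to answer these questions, so that operations staff have a clear view of the entire business system.
A distributed tracing system follows the complete path of a user request through the distributed system. It covers data collection, data transport, data storage, data analysis, and data visualization. Capturing, storing, and sharing these traces gives operations staff a clear picture of the call relationships in the whole call chain behind a user request. A tracing system is an indispensable aid for debugging and monitoring microservices.
Dapper is the tracing system that Google has used internally since 2008.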
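Distributed tracing approaches:
Black-box
No code changes are needed, which is its advantage. Its drawbacks are that the records are not very precise and that a large volume of data is needed to infer the relationships between services.
Annotation-based
Every request is tagged, and a global identifier ties together the information from every service the request passes through so that the whole chain can be reconstructed. The annotation-based approach records accurately, but its drawback is just as clear: the tagging code has to be injected into every service.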
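As a rough illustration of the annotation-based idea, the sketch below tags each hop with one shared trace ID and later rebuilds the chain by filtering on that ID. The names `RECORDS`, `handle`, and `services_for` are hypothetical and exist only for this example; a real tracer would ship the records to a collector instead of keeping them in memory.

```python
import uuid
from typing import Dict, List, Optional

# Hypothetical in-memory collector; a real system would send records elsewhere.
RECORDS: List[Dict[str, str]] = []


def handle(service: str,
           trace_id: Optional[str] = None,
           downstream: Optional[List[str]] = None) -> str:
    """Tag the request with a global trace ID and pass it downstream."""
    # The entry service creates the ID; every later hop reuses it.
    trace_id = trace_id or uuid.uuid4().hex
    RECORDS.append({"trace_id": trace_id, "service": service})
    if downstream:
        handle(downstream[0], trace_id, downstream[1:])
    return trace_id


def services_for(trace_id: str) -> List[str]:
    """Rebuild the call chain by filtering on the shared identifier."""
    return [r["service"] for r in RECORDS if r["trace_id"] == trace_id]
```

Calling `handle("gateway", downstream=["orders", "inventory"])` and then `services_for` with the returned ID gives the services in call order, which is the "replay the whole chain" step described above.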
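A span represents a unit of work in the system that has a start time and a duration. Spans are related causally by nesting or by sequential ordering.
A span can contain information from different hosts, and this also has to be recorded. In fact, every RPC span can carry annotations from both the client and the server process. Because the client and server timestamps come from different hosts, clock skew must be taken into account. The analysis tools rely on the fact that an RPC client always sends a request before the server receives it, and that the same holds in reverse for the response. This gives the server-side timestamps of the RPC a lower and an upper bound, from which the time spent can be computed.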
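The bound described above can be made concrete with a small sketch. It assumes four timestamps per RPC span, two taken on the client clock and two on the server clock; the class and field names are illustrative and are not Dapper's actual data model.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RpcSpan:
    client_send: float  # client clock: request leaves the client
    client_recv: float  # client clock: response arrives at the client
    server_recv: float  # server clock: request arrives at the server
    server_send: float  # server clock: response leaves the server

    def server_duration(self) -> float:
        # Both values come from the same clock, so skew cancels out.
        return self.server_send - self.server_recv

    def network_time(self) -> float:
        # Round-trip time seen by the client minus the server's own work.
        return (self.client_recv - self.client_send) - self.server_duration()

    def skew_bounds(self) -> Tuple[float, float]:
        """Range of possible (server clock - client clock) offsets.

        The server cannot receive before the client sends, and cannot
        respond after the client has already received the response.
        """
        return (self.server_send - self.client_recv,
                self.server_recv - self.client_send)
```

For example, `RpcSpan(100.0, 160.0, 1010.0, 1050.0)` has a server duration of 40, a network time of 20, and a clock offset somewhere between 890 and 910.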
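Dapper's trace logging and collection pipeline has three stages: span data is first written to local log files, then pulled from the hosts by the Dapper daemons and collection infrastructure, and finally written to a Bigtable repository.
A trace is laid out as a single Bigtable row, with each column corresponding to a span. Bigtable's support for sparse table layouts suits this well, because a trace can have an arbitrary number of spans.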
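The row-per-trace layout can be pictured with a nested dictionary: the outer key is the trace (the row) and the inner keys are span IDs (the columns), so a trace only stores the columns it actually has. This is a toy model of the idea, not Bigtable's API; `table`, `write_span`, and `read_trace` are made-up names.

```python
from collections import defaultdict
from typing import Dict

# trace_id (row key) -> {span_id (column) -> span payload}
table: Dict[str, Dict[str, dict]] = defaultdict(dict)


def write_span(trace_id: str, span_id: str, payload: dict) -> None:
    """Store one span as one column in the row of its trace."""
    table[trace_id][span_id] = payload


def read_trace(trace_id: str) -> Dict[str, dict]:
    """Return every span of a trace; rows may have any number of columns."""
    return dict(table.get(trace_id, {}))
```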
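Dapper's resource footprint is small
The Dapper daemon never used more than 0.3% of one CPU core and needs only a small amount of memory. It is also restricted to the lowest priority in the kernel scheduler, so that it does not compete for CPU on a heavily loaded host.
A span in the repository occupies 426 bytes on average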
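APM (Application Performance Management) systems
Early APM mainly monitored resources such as CPU, memory, I/O, and network.
With the rise of microservices, system functionality was split into modules. Together with the rise of Kubernetes and containers and the explosive growth in application data volume, the call chains, response times, and load across modules and services became increasingly hard to monitor and measure with traditional tools. This is where APM systems came in.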
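Complete recording across request tracing, metrics collection, and logging
Automatic probes (agents) for multiple languages, including Java, Go, Python, PHP, Node.js, Lua, and Rust
Built-in service mesh observability, with support for collecting and analyzing data from an Istio + Envoy service mesh
Modular architecture: storage, cluster management, and the set of plugins in use can all be chosen freely.
Alerting support.
Excellent visualization.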
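The OAP (Observability Analysis Platform), or OAP Server, is a highly componentized, lightweight analysis program made up of three parts: receivers compatible with the various probes, a streaming analysis core, and a query core.
Probes: collect data non-intrusively and send it to the OAP Server over HTTP or gRPC
Storage implementations (Storage Implementors): the SkyWalking OAP Server supports multiple storage implementations and provides a standard interface, so different storage backends can be used.
UI module (SkyWalking UI): queries and displays statistics through the standard GraphQL protocol (developed at Facebook in 2012, open-sourced in 2015).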
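Protocol-oriented design: since 5.x, protocol-oriented design has been the first design principle that SkyWalking strictly follows; components exchange data through standard protocols
The protocols fall into probe protocols and query protocols
Probe protocols:
Probe reporting protocol: covers the standards for language probe registration, metrics reporting, tracing data reporting, and so on. The Java, Go, and other probes must all strictly follow it.
Probe interaction protocol: in a distributed tracing environment, probes need to communicate across applications through HTTP headers and MQ headers; the probe interaction protocol defines the data format of that exchange
Service Mesh protocol: SkyWalking's own abstraction for service meshes. Any mesh-type service can upload metrics directly through this protocol; the data is used to compute service metrics and draw the topology.
Third-party protocols: core protocol adapters for large third-party open-source projects, in particular the service mesh platforms Istio and Envoy, allowing seamless integration with an Istio + Envoy service mesh.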
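To give a feel for what the probe interaction protocol carries, here is a sketch of encoding and decoding a propagation header. It is modeled on my understanding of SkyWalking's `sw8` header (eight dash-separated fields, with the string fields Base64-encoded); treat the field list and order as an assumption and check them against the official protocol document before relying on them. `Sw8Context`, `encode_sw8`, and `decode_sw8` are names invented for this example.

```python
import base64
from typing import NamedTuple


class Sw8Context(NamedTuple):
    sampled: bool
    trace_id: str
    parent_segment_id: str
    parent_span_id: int
    parent_service: str
    parent_instance: str
    parent_endpoint: str
    target_address: str


def _b64(text: str) -> str:
    # Standard Base64 never produces "-", so "-" is safe as a separator.
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def _unb64(text: str) -> str:
    return base64.b64decode(text).decode("utf-8")


def encode_sw8(ctx: Sw8Context) -> str:
    """Build the header value that the calling side attaches to a request."""
    return "-".join([
        "1" if ctx.sampled else "0",
        _b64(ctx.trace_id),
        _b64(ctx.parent_segment_id),
        str(ctx.parent_span_id),
        _b64(ctx.parent_service),
        _b64(ctx.parent_instance),
        _b64(ctx.parent_endpoint),
        _b64(ctx.target_address),
    ])


def decode_sw8(value: str) -> Sw8Context:
    """Parse the header value on the receiving side."""
    parts = value.split("-")
    if len(parts) != 8:
        raise ValueError("expected 8 dash-separated fields")
    return Sw8Context(
        parts[0] == "1",
        _unb64(parts[1]),
        _unb64(parts[2]),
        int(parts[3]),
        *[_unb64(p) for p in parts[4:]],
    )
```

The receiving probe decodes the header, continues the same trace, and records the sender as its parent, which is how spans from different applications end up in one call chain.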
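Query protocols:
Metadata queries: query metadata registered in SkyWalking, such as services, service instances, and endpoints.
Topology queries: query the topology and dependencies globally or for a single service or endpoint.
Metrics queries: averages over a time range, Top N rankings, and so on.
Trace queries: detailed queries of trace data.
Alarm queries: determine, based on expressions, whether metric data exceeds a threshold.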
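Modular design:
- Probes collect data
- The frontend displays data
- The OAP Server reads and writes data in the backend storage
- The backend storage persists data
Lightweight design:
SkyWalking was designed around a lightweight philosophy from the start. It uses the most lightweight JAR packaging while still delivering strong data processing and analysis, extensibility, and modularity
SkyWalking advantages: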
Hostname | IP address | Service port | Role |
---|---|---|---|
skywalking-oap | 192.168.31.232 | 11800 (write), 12800 (read) | OAP (Observability Analysis Platform) server |
es-1 | 192.168.31.41 | 9200 | Elasticsearch, version 7.12.1 |
es-2 | 192.168.31.42 | 9200 | Elasticsearch, version 7.12.1 |
es-3 | 192.168.31.43 | 9200 | Elasticsearch, version 7.12.1 |
zookeeper-1 | 192.168.31.121 | 2181 | ZooKeeper node |
zookeeper-2 | 192.168.31.122 | 2181 | ZooKeeper node |
zookeeper-3 | 192.168.31.123 | 2181 | ZooKeeper node |
django | 192.168.31.231 | 80 | Django server |
Create the SkyWalking working directory and download SkyWalking
mkdir /apps && cd /apps
wget https://dlcdn.apache.org/skywalking/9.2.0/apache-skywalking-apm-9.2.0.tar.gz
tar xf apache-skywalking-apm-9.2.0.tar.gz
apt install openjdk-11-jdk -y
## Confirm that the correct JDK version is installed
root@skywalking-oap:/apps# java --version
openjdk 11.0.16 2022-07-19
OpenJDK Runtime Environment (build 11.0.16+8-post-Ubuntu-0ubuntu120.04)
OpenJDK 64-Bit Server VM (build 11.0.16+8-post-Ubuntu-0ubuntu120.04, mixed mode, sharing)
cd /apps/apache-skywalking-apm-bin
vim config/application.yml
Edit lines 133 and 136 to select Elasticsearch as the storage and to set the Elasticsearch cluster addresses
storage:
  selector: ${SW_STORAGE:elasticsearch}
  elasticsearch:
    namespace: ${SW_NAMESPACE:""}
    ## Single Elasticsearch node
    ## clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:192.168.31.41:9200}
    ## Elasticsearch cluster: separate multiple addresses with commas
    clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:192.168.31.41:9200,192.168.31.42:9200,192.168.31.43:9200}
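Other important parameters
# IP and port of your Elasticsearch service; separate multiple cluster nodes with commas
# Single node:
# clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:192.168.31.41:9200}
# Cluster:
clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:192.168.31.41:9200,192.168.31.42:9200,192.168.31.43:9200}
# Keep record data for at most 7 days; expired data is cleaned up. Adjust to your needs
recordDataTTL: ${SW_STORAGE_ES_RECORD_DATA_TTL:7} # Unit is day
# Flush buffered bulk data to Elasticsearch every 10 seconds
flushInterval: ${SW_STORAGE_ES_FLUSH_INTERVAL:10}
# Allow 2 concurrent requests; if traffic is heavy and data is produced quickly, tune this to your situation
concurrentRequests: ${SW_STORAGE_ES_CONCURRENT_REQUESTS:2}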
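The `${NAME:default}` values in these settings read an environment variable and fall back to the text after the first colon when the variable is not set. The sketch below imitates that rule in Python to make it concrete. It is not the OAP's own resolver, and the `resolve` function and its regular expression are assumptions made for illustration.

```python
import os
import re
from typing import Mapping, Optional

# ${NAME:default} -> NAME is the variable, everything up to "}" is the default.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z0-9_]+):([^}]*)\}")


def resolve(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace each placeholder with the variable's value, else its default."""
    env = os.environ if env is None else env
    return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(2)), value)
```

With an empty environment, `resolve("${SW_STORAGE:elasticsearch}", {})` yields `elasticsearch`; with `{"SW_STORAGE": "h2"}` it yields `h2`. This is why the same `application.yml` can be reused across hosts by exporting variables instead of editing the file.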
Start the SkyWalking services
# /apps/apache-skywalking-apm-bin/bin/startup.sh
SkyWalking OAP started successfully!
SkyWalking Web Application started successfully!
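Confirm that SkyWalking started correctly; if it did, it creates data in Elasticsearch.
After a normal start, the SkyWalking UI listens on port 8080
SkyWalking can then be reached at 192.168.31.232:8080
Halo requires JDK 11 or later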
yum install -y java-11-openjdk
mkdir /apps/halo -p && cd /apps/halo
curl -L https://github.com/halo-dev/halo/releases/download/v1.5.4/halo-1.5.4.jar --output halo.jar
cd /apps
wget https://dlcdn.apache.org/skywalking/java-agent/8.12.0/apache-skywalking-java-agent-8.12.0.tgz
tar xf apache-skywalking-java-agent-8.12.0.tgz
vi /apps/skywalking-agent/config/agent.config
Edit /apps/skywalking-agent/config/agent.config
# The group name is optional only.
# Service name shown in the UI
agent.service_name=${SW_AGENT_NAME:Halo}
# The agent namespace
# Not shown in the UI; usually the owning project
agent.namespace=${SW_AGENT_NAMESPACE:Qiu}
# Backend service addresses.
# SkyWalking server address
collector.backend_service=${SW_AGENT_COLLECTOR_BACKEND_SERVICES:192.168.31.232:11800}
Start the service
java -javaagent:/apps/skywalking-agent/skywalking-agent.jar -jar /apps/halo/halo.jar
In a container, the same settings can be passed on the command line:
java -javaagent:/apps/skywalking-agent/skywalking-agent.jar \
  -DSW_AGENT_NAMESPACE=Qiu \
  -DSW_AGENT_NAME=Halo \
  -DSW_AGENT_COLLECTOR_BACKEND_SERVICES=192.168.31.232:11800 \
  -jar /apps/halo/halo.jar
As a test, publish a few blog posts at nodeip:8090/admin/
SkyWalking now shows traffic data
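Apdex (Application Performance Index) is an open standard developed by the Apdex Alliance for evaluating application performance. The Apdex Alliance dates back to 2004. The standard starts from the user's point of view and provides a uniform way to measure and report user experience, quantifying it as a satisfaction score between 0 and 1, so that end-user experience and application performance are measured together as one metric.
For an application service running on a network, response time determines user satisfaction: how long a user waits for an interaction to complete directly affects how satisfied they are with the application. That is the "response time" that really matters to users, and Apdex calls the time taken to complete such a task the application's "responsiveness".
Apdex defines an optimal response-time threshold T for an application and, based on where a response time falls relative to T, three performance zones: satisfied (at most T), tolerating (more than T but at most 4T), and frustrated (more than 4T).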
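The score itself can be sketched in a few lines. This assumes the standard Apdex formula, (satisfied + tolerating / 2) / total samples, where a sample is satisfied if its response time is at most T and tolerating if it is at most 4T. The function and parameter names are illustrative and are not part of SkyWalking.

```python
from typing import Iterable


def apdex(response_times: Iterable[float], t: float) -> float:
    """Return the Apdex score (0 to 1) for the samples and threshold T."""
    satisfied = tolerating = total = 0
    for rt in response_times:
        total += 1
        if rt <= t:
            satisfied += 1      # at most T
        elif rt <= 4 * t:
            tolerating += 1     # more than T, at most 4T
        # anything slower counts as frustrated and adds nothing
    if total == 0:
        raise ValueError("no samples")
    return (satisfied + tolerating / 2) / total
```

For response times of 0.1, 0.2, 0.6, 1.0, and 3.0 seconds with T = 0.5, two samples are satisfied, two are tolerating, and one is frustrated, so the score is (2 + 1) / 5 = 0.6.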
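Service: a set or group of workloads that provide the same behavior for incoming requests (the service name). You can define the service name when using an agent or SDK; if you do not, SkyWalking uses the name defined on your platform.
Service instance: each individual workload in that group (a node on which the service runs). A service instance can be a Kubernetes pod, a virtual machine, or even a physical machine.
Endpoint: a request path received by a particular service, such as an HTTP URI path or the class plus method signature of an RPC service, for example /api/v1/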
mkdir /apps
## Download apache-tomcat-8.5.73.tar.gz, apache-skywalking-java-agent-8.12.0.tgz, and jenkins.war (2.319.2)
tar xf apache-tomcat-8.5.73.tar.gz
ln -sf /apps/apache-tomcat-8.5.73 /apps/tomcat
mv jenkins.war tomcat/webapps/
tar xf apache-skywalking-java-agent-8.12.0.tgz
The SkyWalking agent again needs the following three settings
skywalking-agent/config/agent.config
agent.service_name=${SW_AGENT_NAME:Jenkins}
agent.namespace=${SW_AGENT_NAMESPACE:Qiu}
collector.backend_service=${SW_AGENT_COLLECTOR_BACKEND_SERVICES:192.168.31.232:11800}
Configure the Tomcat startup script
vi /apps/tomcat/bin/catalina.sh
Insert the following line at line 125 to load the SkyWalking agent
CATALINA_OPTS="$CATALINA_OPTS -javaagent:/apps/skywalking-agent/skywalking-agent.jar"
[root@centos-18 apps]# /apps/tomcat/bin/startup.sh
Using CATALINA_BASE: /apps/tomcat
Using CATALINA_HOME: /apps/tomcat
Using CATALINA_TMPDIR: /apps/tomcat/temp
Using JRE_HOME: /usr
Using CLASSPATH: /apps/tomcat/bin/bootstrap.jar:/apps/tomcat/bin/tomcat-juli.jar
Using CATALINA_OPTS: -javaagent:/apps/skywalking-agent/skywalking-agent.jar
Tomcat started.
[root@centos-18 apps]# ss -ntl|grep 8080
LISTEN 0 100 *:8080 *:*
Now open the SkyWalking UI
Upload dubbo-demo-provider-2.1.5-assembly.tar.gz
tar xf dubbo-demo-provider-2.1.5-assembly.tar.gz
vi /apps/dubbo-demo-provider-2.1.5/conf/dubbo.properties
# Configure the ZooKeeper connection
## Single ZooKeeper node
## dubbo.registry.address=zookeeper://192.168.31.122:2181
## ZooKeeper cluster
dubbo.registry.address=zookeeper://192.168.31.122:2181?backup=192.168.31.121:2181,192.168.31.123:2181
Configure the SkyWalking agent
vi skywalking-agent/config/agent.config
agent.service_name=${SW_AGENT_NAME:dubbo-provider}
agent.namespace=${SW_AGENT_NAMESPACE:Qiu}
collector.backend_service=${SW_AGENT_COLLECTOR_BACKEND_SERVICES:192.168.31.232:11800}
Run the provider (the classpath is the conf directory plus every JAR under lib, written here with Java's classpath wildcard)
java -javaagent:/apps/skywalking-agent/skywalking-agent.jar -Djava.awt.headless=true -Djava.net.preferIPv4Stack=true -server -Xms1024m -Xmx1024m -XX:PermSize=128m -XX:SurvivorRatio=2 -XX:+UseParallelGC -classpath "/apps/dubbo-demo-provider-2.1.5/conf:/apps/dubbo-demo-provider-2.1.5/lib/*" com.alibaba.dubbo.container.Main
Started successfully
[3.057s][warning][exceptions] Class com.alibaba.dubbo.common.URL in throws clause of method com.alibaba.dubbo.remoting.Client com.alibaba.dubbo.remoting.Transporter_Adpative.connect(com.alibaba.dubbo.common.URL, com.alibaba.dubbo.remoting.ChannelHandler) is not a subtype of class java.lang.Throwable
[2022-09-09 23:49:58] Dubbo service server started!
Upload dubbo-demo-consumer-2.1.5-assembly.tar.gz
tar xf dubbo-demo-consumer-2.1.5-assembly.tar.gz
vi dubbo-demo-consumer-2.1.5/conf/dubbo.properties
## Configure the ZooKeeper address
dubbo.registry.address=zookeeper://192.168.31.122:2181?backup=192.168.31.121:2181,192.168.31.123:2181
Configure the SkyWalking agent
vi skywalking-agent/config/agent.config
agent.service_name=${SW_AGENT_NAME:dubbo-consumer}
agent.namespace=${SW_AGENT_NAMESPACE:Qiu}
collector.backend_service=${SW_AGENT_COLLECTOR_BACKEND_SERVICES:192.168.31.232:11800}
Start the consumer (again using Java's classpath wildcard for the JARs under lib)
java -javaagent:/apps/skywalking-agent/skywalking-agent.jar -Djava.awt.headless=true -Djava.net.preferIPv4Stack=true -server -Xms1024m -Xmx1024m -XX:PermSize=128m -XX:SurvivorRatio=2 -XX:+UseParallelGC -classpath "/apps/dubbo-demo-consumer-2.1.5/conf:/apps/dubbo-demo-consumer-2.1.5/lib/*" com.alibaba.dubbo.container.Main
Startup complete
[16:02:12] Hello world0, response form provider: 192.168.31.18:20880
[16:02:14] Hello world1, response form provider: 192.168.31.18:20880
[16:02:16] Hello world2, response form provider: 192.168.31.18:20880
[16:02:18] Hello world3, response form provider: 192.168.31.18:20880
# Install Python 3 and pip
apt install python3 python3-pip -y
# Install the latest version, using the default gRPC protocol to report data to OAP
pip install "apache-skywalking"
Install the Django dependencies
# cat requirements.txt
apache-skywalking==0.7.0
asgiref==3.4.1
backports.zoneinfo==0.2.1
Django==4.0.1
grpcio==1.43.0
grpcio-tools==1.43.0
packaging==21.3
protobuf==3.19.3
PyMySQL==1.0.2
pyparsing==3.0.6
six==1.16.0
sqlparse==0.4.2
wrapt==1.13.3
# pip3 install -r requirements.txt
Successfully installed Django-4.0.1 PyMySQL-1.0.2 apache-skywalking-0.7.0 asgiref-3.4.1 backports.zoneinfo-0.2.1 grpcio-1.43.0 grpcio-tools-1.43.0 protobuf-3.19.3 pyparsing-3.0.6 six-1.16.0 sqlparse-0.4.2 wrapt-1.13.3
Create the Django project
# django-admin startproject mysite
# cd mysite
# python3 manage.py startapp myapp
# python3 manage.py migrate
## Create a superuser
# python3 manage.py createsuperuser
Username (leave blank to use 'root'): root
Email address: [email protected]
Password:
Password (again):
The password is too similar to the username.
This password is too short. It must contain at least 8 characters.
Bypass password validation and create user anyway? [y/N]: y
Superuser created successfully.
## Set the environment variables
# export SW_AGENT_NAME='python-app1'
# export SW_AGENT_NAMESPACE='python-app1'
# export SW_AGENT_COLLECTOR_BACKEND_SERVICES='192.168.31.232:11800'
# Edit the settings file
# vi mysite/settings.py
ALLOWED_HOSTS = ['192.168.31.231']
# Start the service
# sw-python -d run python3 manage.py runserver 192.168.31.231:80
The site is now reachable at 192.168.31.231:80
SkyWalking now shows data for it