Summary: Original source: https://zijiancode.cn/archives/skywalking%E5%AE%9E%E6%88%98md Reposts are welcome; please keep this summary. Thanks!
For an introduction to SkyWalking, see the Chinese documentation.
Environment:
linux: Ubuntu 18 LTS, arm64
elasticsearch: 7.11.0
skywalking: 8.4.0
See ELK Best Practices.
Go to the download page and pick the latest version. The version downloaded for these notes is:
https://www.apache.org/dyn/closer.cgi/skywalking/8.4.0/apache-skywalking-apm-es7-8.4.0.tar.gz
tar -xf apache-skywalking-apm-es7-8.4.0.tar.gz
# Move it to /opt/server/ and rename it to skywalking
mv apache-skywalking-apm-bin-es7 /opt/server/skywalking
vim /opt/server/skywalking/config/application.yml
storage:
  selector: ${SW_STORAGE:elasticsearch7}
  elasticsearch7:
    nameSpace: ${SW_NAMESPACE:"elasticsearch7"}
    clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:192.168.1.13:9200,192.168.1.14:9200,192.168.1.15:9200}
    protocol: ${SW_STORAGE_ES_HTTP_PROTOCOL:"http"}
    #trustStorePath: ${SW_STORAGE_ES_SSL_JKS_PATH:""}
    #trustStorePass: ${SW_STORAGE_ES_SSL_JKS_PASS:""}
    dayStep: ${SW_STORAGE_DAY_STEP:1} # Represent the number of days in the one minute/hour/day index.
    indexShardsNumber: ${SW_STORAGE_ES_INDEX_SHARDS_NUMBER:1} # Shard number of new indexes
    indexReplicasNumber: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:1} # Replicas number of new indexes
    # Super data set has been defined in the codes, such as trace segments.The following 3 config would be improve es performance when storage super size data in es.
    superDatasetDayStep: ${SW_SUPERDATASET_STORAGE_DAY_STEP:-1} # Represent the number of days in the super size dataset record index, the default value is the same as dayStep when the value is less than 0
    superDatasetIndexShardsFactor: ${SW_STORAGE_ES_SUPER_DATASET_INDEX_SHARDS_FACTOR:5} # This factor provides more shards for the super data set, shards number = indexShardsNumber * superDatasetIndexShardsFactor. Also, this factor effects Zipkin and Jaeger traces.
    superDatasetIndexReplicasNumber: ${SW_STORAGE_ES_SUPER_DATASET_INDEX_REPLICAS_NUMBER:0} # Represent the replicas number in the super size dataset record index, the default value is 0.
    user: ${SW_ES_USER:"elastic"}
    password: ${SW_ES_PASSWORD:"elastic"}
    secretsManagementFile: ${SW_ES_SECRETS_MANAGEMENT_FILE:""} # Secrets management file in the properties format includes the username, password, which are managed by 3rd party tool.
    bulkActions: ${SW_STORAGE_ES_BULK_ACTIONS:1000} # Execute the async bulk record data every ${SW_STORAGE_ES_BULK_ACTIONS} requests
    syncBulkActions: ${SW_STORAGE_ES_SYNC_BULK_ACTIONS:50000} # Execute the sync bulk metrics data every ${SW_STORAGE_ES_SYNC_BULK_ACTIONS} requests
    flushInterval: ${SW_STORAGE_ES_FLUSH_INTERVAL:10} # flush the bulk every 10 seconds whatever the number of requests
    concurrentRequests: ${SW_STORAGE_ES_CONCURRENT_REQUESTS:2} # the number of concurrent requests
    resultWindowMaxSize: ${SW_STORAGE_ES_QUERY_MAX_WINDOW_SIZE:10000}
    metadataQueryMaxSize: ${SW_STORAGE_ES_QUERY_MAX_SIZE:5000}
    segmentQueryMaxSize: ${SW_STORAGE_ES_QUERY_SEGMENT_SIZE:200}
    profileTaskQueryMaxSize: ${SW_STORAGE_ES_QUERY_PROFILE_TASK_SIZE:200}
    oapAnalyzer: ${SW_STORAGE_ES_OAP_ANALYZER:"{\"analyzer\":{\"oap_analyzer\":{\"type\":\"stop\"}}}"} # the oap analyzer.
    oapLogAnalyzer: ${SW_STORAGE_ES_OAP_LOG_ANALYZER:"{\"analyzer\":{\"oap_log_analyzer\":{\"type\":\"standard\"}}}"} # the oap log analyzer. It could be customized by the ES analyzer configuration to support more language log formats, such as Chinese log, Japanese log and etc.
    advanced: ${SW_STORAGE_ES_ADVANCED:""}
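Only the storage section needs to change:
storage.selector: which storage backend to use; here we choose elasticsearch7.
Then change the following settings under elasticsearch7:
nameSpace: the namespace
clusterNodes: the Elasticsearch cluster nodes
user: the Elasticsearch username
password: the Elasticsearch password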
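Every value in application.yml is written as ${ENV_NAME:default}, so the same four settings can probably also be supplied as environment variables before the OAP is started, without touching the file. A minimal sketch, assuming the OAP process inherits these variables from the shell that launches it; the values are simply the ones used in this post:

```shell
#!/bin/sh
# Sketch: supply the storage settings through the environment variables named
# in the ${NAME:default} placeholders of application.yml instead of editing it.
# The values are the ones used in this post; adjust them to your own cluster.
export SW_STORAGE=elasticsearch7
export SW_NAMESPACE=elasticsearch7
export SW_STORAGE_ES_CLUSTER_NODES=192.168.1.13:9200,192.168.1.14:9200,192.168.1.15:9200
export SW_ES_USER=elastic
export SW_ES_PASSWORD=elastic

# Print what the OAP process will inherit (the password is left out on purpose).
env | grep -E '^SW_(STORAGE|NAMESPACE|ES_USER)' | sort
```

This keeps the shipped application.yml untouched, which makes upgrades easier to diff, at the cost of having to set the variables wherever the service is started.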
/opt/server/skywalking/bin/oapService.sh
# View the log
tail -100f /opt/server/skywalking/logs/skywalking-oap-server.log
The first startup takes a while because the environment has to be initialized.
/opt/server/skywalking/bin/webappService.sh
To change the UI configuration, edit webapp/webapp.yml.
http://localhost:8080
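Since the first start can be slow, it may be handy to poll the UI address until it answers instead of refreshing the browser. The helper below is a hypothetical sketch (the name wait_for_http and its arguments are not part of SkyWalking); it only checks that something responds over HTTP, not that the OAP behind it is healthy:

```shell
#!/bin/sh
# wait_for_http URL [MAX_TRIES] [DELAY_SECONDS]
# Polls URL with curl until any HTTP response arrives or MAX_TRIES is used up.
# Returns 0 once the server answers, 1 if it never does.
wait_for_http() {
  wfh_url=$1
  wfh_tries=${2:-30}
  wfh_delay=${3:-2}
  wfh_i=0
  while [ "$wfh_i" -lt "$wfh_tries" ]; do
    # Without -f, curl exits 0 on any HTTP status, so this only checks reachability.
    if curl -s -o /dev/null "$wfh_url"; then
      return 0
    fi
    wfh_i=$((wfh_i + 1))
    sleep "$wfh_delay"
  done
  return 1
}
```

For example, `wait_for_http http://localhost:8080 60 5 && echo "UI is up"` would retry for up to five minutes.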
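SkyWalking is now up and running, so let's integrate it into our services.
Let's assume you already know that services use skywalking-agent to collect data (if not, see the documentation linked at the top). The agent files are in the /opt/server/skywalking/agent directory.
Java startup script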
export SW_AGENT_NAME=demo
export SW_AGENT_SPAN_LIMIT=2000
export SW_AGENT_COLLECTOR_BACKEND_SERVICES=122.9.35.11:21800
JAVA_AGENT="-javaagent:/opt/server/skywalking/agent/skywalking-agent.jar"
java ${JAVA_AGENT} -jar demo.jar
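As an alternative to exporting SW_* variables, the agent settings can, as far as I understand the agent setup documentation, also be passed as system properties whose names are the agent.config keys prefixed with skywalking. The sketch below only prints such a command; build_java_cmd is a hypothetical helper, and the two property names are my assumption about the keys behind SW_AGENT_NAME and SW_AGENT_COLLECTOR_BACKEND_SERVICES, so check them against your agent.config:

```shell
#!/bin/sh
# Sketch: build the launch command with the agent settings passed as
# -Dskywalking.* system properties instead of exported SW_* variables.
# build_java_cmd is a hypothetical helper; it only prints the command.
build_java_cmd() {
  bjc_service=$1   # service name shown in the UI
  bjc_backend=$2   # OAP address, host:port
  bjc_jar=$3       # application jar
  printf 'java -javaagent:%s -Dskywalking.agent.service_name=%s -Dskywalking.collector.backend_service=%s -jar %s\n' \
    "/opt/server/skywalking/agent/skywalking-agent.jar" \
    "$bjc_service" "$bjc_backend" "$bjc_jar"
}

build_java_cmd demo 122.9.35.11:21800 demo.jar
```

Note that -javaagent and the -D options come before -jar; anything placed after the jar name is passed to the application as an argument rather than to the JVM.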
Docker
FROM openjdk:8-jdk-alpine3.8
ENV SW_AGENT_NAME=demo \
    SW_AGENT_SPAN_LIMIT=2000 \
    SW_AGENT_COLLECTOR_BACKEND_SERVICES=122.9.35.11:21800 \
    JAVA_AGENT=-javaagent:/app/agent/skywalking-agent.jar
ENTRYPOINT ["sh","-c","java ${JAVA_AGENT} -jar /app/app.jar"]
In docker-compose, add the volume mount for the agent directory:
volumes:
- /opt/server/skywalking/agent:/app/agent
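SW_AGENT_NAME: the service name
SW_AGENT_SPAN_LIMIT: the maximum number of spans recorded per trace segment
SW_AGENT_COLLECTOR_BACKEND_SERVICES: the address of skywalking-oap
All of these settings are defined in agent/config/agent.config.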
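To see which agent.config key each of these variables feeds, you can simply search the file for them. A small sketch; show_agent_keys is a hypothetical helper name, and the path in the usage note assumes the install location used in this post:

```shell
#!/bin/sh
# show_agent_keys FILE: list the agent.config lines that reference the three
# SW_* variables used in this post, so you can see which key each one feeds.
show_agent_keys() {
  grep -E 'SW_AGENT_(NAME|SPAN_LIMIT|COLLECTOR_BACKEND_SERVICES)' "$1"
}
```

Call it as `show_agent_keys /opt/server/skywalking/agent/config/agent.config`. Lines that start with # are commented out in the file, which means the agent falls back to its built-in default unless the variable is set.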
I have already written an endpoint for this: /oauth/login
A request to it passes through apiserver (gateway) -> auth (auth center) -> user (user service) -> mysql | redis
Send a request:
curl http://localhost:9001/oauth/login
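A single request yields a single trace. To have a bit more data to look at in the UI, it can help to send a handful of requests in a row. A minimal sketch; send_requests is a hypothetical helper, and the URL in the usage note is the one from this post:

```shell
#!/bin/sh
# send_requests URL [COUNT]
# Calls URL COUNT times (default 10) and prints one HTTP status code per line.
send_requests() {
  sr_url=$1
  sr_count=${2:-10}
  sr_i=0
  while [ "$sr_i" -lt "$sr_count" ]; do
    curl -s -o /dev/null -w '%{http_code}\n' "$sr_url"
    sr_i=$((sr_i + 1))
  done
}
```

For example, `send_requests http://localhost:9001/oauth/login 20` sends twenty requests one after another.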
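View the UI
View the topology map
View the trace
There is also a performance profiling tab. How is it used?
The endpoint name it asks for can be found in the trace view.
Click Analyze and the thread stack appears, along with the duration of each method call.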
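Alarm rules
Default alarm rules
For convenience, the SkyWalking distribution ships a default alarm-settings.yml file that includes the following rules:
1. Service average response time exceeded 1 second in the last 3 minutes.
2. Service success rate was below 80% in the last 2 minutes.
3. Service response time percentiles exceeded 1000 ms in the last 3 minutes.
4. Service instance average response time exceeded 1 second in the last 2 minutes.
5. Endpoint average response time exceeded 1 second in the last 2 minutes.
6. Database access average response time exceeded 1 second in the last 2 minutes.
7. Endpoint relation average response time exceeded 1 second in the last 2 minutes.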
Custom alarms have to be defined yourself; see the official documentation for how to do that.
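Rules of the shape "metric above a threshold for N minutes" can be pictured as counting how many per-minute samples cross the threshold. The sketch below is only an illustration of that idea under my own simplifying assumptions; it is not SkyWalking's alarm engine, and rule_fires and the sample values are made up for the example:

```shell
#!/bin/sh
# rule_fires THRESHOLD_MS COUNT VALUE...
# VALUE... are per-minute average response times in ms, oldest first.
# Succeeds (exit 0) when at least COUNT of them exceed THRESHOLD_MS.
# This is a simplified illustration, not SkyWalking's actual alarm engine.
rule_fires() {
  rf_threshold=$1
  rf_count=$2
  shift 2
  rf_hits=0
  for rf_v in "$@"; do
    if [ "$rf_v" -gt "$rf_threshold" ]; then
      rf_hits=$((rf_hits + 1))
    fi
  done
  [ "$rf_hits" -ge "$rf_count" ]
}

# Rule 1 style check: above 1000 ms in at least 3 of the sampled minutes.
if rule_fires 1000 3 800 1200 1500 1100; then
  echo "alarm"
else
  echo "ok"
fi
```

Here three of the four samples are above 1000 ms, so the check reports an alarm; with only two samples above the threshold it would report ok.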
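Notice that every trace carries a trace id. This trace id is shared across the whole call chain, so with it we can find the entire chain. If we put this trace id into the logs and then ship the logs to ELK, well, you can see where this is going.
Add the dependency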
<dependency>
    <groupId>org.apache.skywalking</groupId>
    <artifactId>apm-toolkit-logback-1.x</artifactId>
    <version>8.4.0</version>
</dependency>
Modify logback-spring.xml
<configuration scan="true" scanPeriod="60 seconds" debug="false">
    <include resource="org/springframework/boot/logging/logback/defaults.xml"/>
    <springProperty name="applicationName" scope="context" source="spring.application.name" />
    <property name="LOG_FILE_NAME_PATTERN" value="logs/${applicationName}/log.out"/>
    <property name="CONSOLE_LOG_PATTERN"
              value="%clr(%d{${LOG_DATEFORMAT_PATTERN:-yyyy-MM-dd HH:mm:ss.SSS}}){faint} %clr(${LOG_LEVEL_PATTERN:-%5p}) %clr(${PID:- }){magenta} %clr(---){faint} %clr([%15.15t]){faint} %clr(%c){cyan} %clr(:){faint} %m%n${LOG_EXCEPTION_CONVERSION_WORD:-%wEx}"/>
    <property name="FILE_LOG_PATTERN"
              value="%d{${LOG_DATEFORMAT_PATTERN:-yyyy-MM-dd HH:mm:ss.SSS}} ${applicationName} [%tid] ${LOG_LEVEL_PATTERN:-%5p} ${PID:- } --- [%t] %c : %m%n${LOG_EXCEPTION_CONVERSION_WORD:-%wEx}"/>
    <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>${CONSOLE_LOG_PATTERN}</pattern>
        </encoder>
    </appender>
    <appender name="file" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>${LOG_FILE_NAME_PATTERN}</file>
        <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
            <fileNamePattern>${LOG_FILE_NAME_PATTERN}.%d{yyyy-MM-dd}.%i.gz</fileNamePattern>
            <maxHistory>7</maxHistory>
            <maxFileSize>10MB</maxFileSize>
        </rollingPolicy>
        <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
            <layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.TraceIdPatternLogbackLayout">
                <pattern>${FILE_LOG_PATTERN}</pattern>
            </layout>
        </encoder>
    </appender>
    <springProfile name="local">
        <root level="info">
            <appender-ref ref="console"/>
        </root>
    </springProfile>
    <springProfile name="dev">
        <root level="info">
            <appender-ref ref="file"/>
        </root>
    </springProfile>
    <springProfile name="staging">
        <root level="info">
            <appender-ref ref="file"/>
        </root>
    </springProfile>
    <springProfile name="online">
        <root level="info">
            <appender-ref ref="console"/>
            <appender-ref ref="file"/>
        </root>
    </springProfile>
</configuration>
Key changes:
Add [%tid] to FILE_LOG_PATTERN.
Change the class of the layout inside the encoder to TraceIdPatternLogbackLayout (this is required!).
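Once the trace id is in the log file, a trace id copied from the SkyWalking UI is enough to pull out every related log line, even before ELK is involved. A minimal sketch, assuming the [%tid] pattern is written to the file as [TID:<trace id>]; find_by_trace_id is a hypothetical helper name:

```shell
#!/bin/sh
# find_by_trace_id LOG_FILE TRACE_ID
# Prints every log line that carries the given trace id.
# Assumes the [%tid] pattern renders as [TID:<trace id>] in the log file.
find_by_trace_id() {
  # -F treats the search string literally, so the dots in the id are not wildcards.
  grep -F "[TID:$2]" "$1"
}
```

With the file pattern above, the call would look like `find_by_trace_id logs/demo/log.out "<trace id from the UI>"`, where demo stands for the value of spring.application.name.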
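For how to view these logs in ELK, see ELK Best Practices: Java Project Practice Summary.
Finally, for everything else, refer to the official documentation; it is the best and fastest way to learn.
For more Java articles, visit my blog: https://zijiancode.cn/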