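As microservice architectures have become popular, some of their problems have become more prominent. A single request may involve several services, and each of those services may depend on others, so the request path forms a mesh-like call chain. Once any node in that chain fails, the stability of the whole chain suffers. It is a strong reminder that there is no "silver bullet": every architecture has its strengths and weaknesses.
As business logic grows more complex, enterprise applications have moved into a distributed, service-oriented stage. As the number of modules keeps growing, a single request may need a dozen or even dozens of services to cooperate. Locating production failures and performance bottlenecks quickly and accurately has become a hard problem we cannot avoid, and traditional approaches such as log monitoring cannot adequately trace calls or support troubleshooting. Guided by Google's paper "Dapper, a Large-Scale Distributed Systems Tracing Infrastructure", many excellent APM tools emerged.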
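Distributed tracing systems have developed quickly and come in many varieties, which is a great convenience. However, collecting data sometimes requires intrusive changes to user code, and the APIs of different systems are not compatible, so switching tracing systems often means substantial changes. The OpenTracing specification was created to solve this API incompatibility. OpenTracing is a lightweight standardization layer that sits between applications/class libraries and tracing or log-analysis programs.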
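SkyWalking is an APM (application performance monitor) for distributed systems, especially suited to microservice, cloud native, and container-based (Docker, Kubernetes, Mesos) architectures. At its core it is a distributed tracing system. It provides a way to instrument applications automatically, with no changes to the target application's source code, along with a collector that has an efficient streaming module.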
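SkyWalking is an open-source project created by the Chinese developer Wu Sheng (吴晟) and built on OpenTracing. It is hosted on Gitee and GitHub.
On December 8, 2017, the Apache Software Foundation's Incubator Project Management Committee (ASF IPMC) announced that "SkyWalking passed the vote unanimously and entered the Apache Incubator".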
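Good performance: for an application handling 5000 TPS on a single instance, collecting every trace adds only 10% CPU overhead. See "skywalking agent performance test" for the detailed evaluation.
Probes (agents) for multiple languages.
Both automatic and manual probes are supported. Automatic probes: see the list of middleware, frameworks, and libraries supported for Java. Manual probes: the OpenTracing API, the @Trace annotation, and integrating the traceId into logs.
Because it uses probe technology, it requires zero code changes and is non-intrusive; it collects traces across the distributed system and monitors the running system automatically.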
I. Environment Overview
Software | Version | Number of machines |
---|---|---|
OS | CentOS 7.4 | |
JDK | 1.8 | |
elasticsearch | 6.5.2 | 3 |
skywalking-UI | 6.0 | 1 |
skywalking-collector | 6.0 | 3 |
II. Downloading the Software
apache-skywalking:
Project Git repository: https://github.com/OpenSkywalking/skywalking-netcore
Package download: http://www.apache.org/dyn/closer.cgi/incubator/skywalking/6.0.0-GA/apache-skywalking-apm-incubating-6.0.0-GA.tar.gz
The package includes the agent, as shown below:
$ ls -al apache-skywalking-apm-incubating/agent
drwxrwxr-x 2 1001 1002 271 Mar 20 17:22 activations
drwxrwxr-x 2 1001 1002 26 Mar 20 17:22 config
drwxrwxr-x 2 1001 1002 6 Jan 21 12:01 logs
drwxrwxr-x 2 1001 1002 139 Mar 20 17:22 optional-plugins
drwxrwxr-x 2 1001 1002 4096 Mar 20 17:22 plugins
-rw-rw-r-- 1 1001 1002 17805401 Jan 21 12:01 skywalking-agent.jar
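Before wiring the agent into an application, it can help to confirm that the extracted directory contains the pieces shown in the listing above. The following is a minimal sketch; `check_agent_dir` is a hypothetical helper, not part of SkyWalking, and the entries it checks are taken only from that listing.

```shell
# Hypothetical helper: verify that an extracted agent directory looks complete.
# It checks only the entries used later in this article (jar, config/, plugins/).
check_agent_dir() {
  dir=$1
  [ -f "$dir/skywalking-agent.jar" ] || { echo "missing skywalking-agent.jar" >&2; return 1; }
  [ -d "$dir/config" ] || { echo "missing config/" >&2; return 1; }
  [ -d "$dir/plugins" ] || { echo "missing plugins/" >&2; return 1; }
  echo "agent directory looks complete: $dir"
}

# Example, using the relative path from the listing above:
# check_agent_dir apache-skywalking-apm-incubating/agent
```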
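III. Installation and Deployment
1. Install the Elasticsearch 6.5.2 cluster. See "Elasticsearch 6.5.2 cluster installation + head plugin" for details.
2. Install apache-skywalking (configure the collector). The configuration is identical on all three machines.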
$ wget http://www.apache.org/dyn/closer.cgi/incubator/skywalking/6.0.0-GA/apache-skywalking-apm-incubating-6.0.0-GA.tar.gz -P /usr/local/src
$ tar xf /usr/local/src/apache-skywalking-apm-incubating-6.0.0-GA.tar.gz -C /usr/local
$ cd /usr/local/apache-skywalking-apm-incubating/config
$ cat application.yml
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
cluster:
  #standalone:
  # Please check your ZooKeeper is 3.5+, However, it is also compatible with ZooKeeper 3.4.x. Replace the ZooKeeper 3.5+
  # library the oap-libs folder with your ZooKeeper 3.4.x library.
  zookeeper:
    nameSpace: ${SW_NAMESPACE:"skywalking"}
    # hostPort: ${SW_CLUSTER_ZK_HOST_PORT:172.16.163.60:2181,172.16.163.61:2181,172.16.163.62:2181}
    hostPort: ${SW_CLUSTER_ZK_HOST_PORT:10.100.11.37:2181}
    # #Retry Policy
    baseSleepTimeMs: ${SW_CLUSTER_ZK_SLEEP_TIME:1000} # initial amount of time to wait between retries
    maxRetries: ${SW_CLUSTER_ZK_MAX_RETRIES:3} # max number of times to retry
  # kubernetes:
  #   watchTimeoutSeconds: ${SW_CLUSTER_K8S_WATCH_TIMEOUT:60}
  #   namespace: ${SW_CLUSTER_K8S_NAMESPACE:default}
  #   labelSelector: ${SW_CLUSTER_K8S_LABEL:app=collector,release=skywalking}
  #   uidEnvName: ${SW_CLUSTER_K8S_UID:SKYWALKING_COLLECTOR_UID}
  # consul:
  #   serviceName: ${SW_SERVICE_NAME:"SkyWalking_OAP_Cluster"}
  #   Consul cluster nodes, example: 10.0.0.1:8500,10.0.0.2:8500,10.0.0.3:8500
  #   hostPort: ${SW_CLUSTER_CONSUL_HOST_PORT:localhost:8500}
core:
  default:
    restHost: ${SW_CORE_REST_HOST:0.0.0.0}
    restPort: ${SW_CORE_REST_PORT:12800}
    restContextPath: ${SW_CORE_REST_CONTEXT_PATH:/}
    gRPCHost: ${SW_CORE_GRPC_HOST:0.0.0.0}
    gRPCPort: ${SW_CORE_GRPC_PORT:11800}
    downsampling:
      - Hour
      - Day
      - Month
    # Set a timeout on metric data. After the timeout has expired, the metric data will automatically be deleted.
    recordDataTTL: ${SW_CORE_RECORD_DATA_TTL:90} # Unit is minute
    minuteMetricsDataTTL: ${SW_CORE_MINUTE_METRIC_DATA_TTL:90} # Unit is minute
    hourMetricsDataTTL: ${SW_CORE_HOUR_METRIC_DATA_TTL:36} # Unit is hour
    dayMetricsDataTTL: ${SW_CORE_DAY_METRIC_DATA_TTL:45} # Unit is day
    monthMetricsDataTTL: ${SW_CORE_MONTH_METRIC_DATA_TTL:18} # Unit is month
storage:
  #h2:
  #  driver: ${SW_STORAGE_H2_DRIVER:org.h2.jdbcx.JdbcDataSource}
  #  url: ${SW_STORAGE_H2_URL:jdbc:h2:mem:skywalking-oap-db}
  #  user: ${SW_STORAGE_H2_USER:sa}
  elasticsearch:
    nameSpace: ${SW_NAMESPACE:"skywalking"}
    clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:10.100.xx.xx:9200,10.100.xx.xx:9200,10.100.xx.xx:9200}
    indexShardsNumber: ${SW_STORAGE_ES_INDEX_SHARDS_NUMBER:2}
    indexReplicasNumber: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:0}
    # Batch process setting, refer to https://www.elastic.co/guide/en/elasticsearch/client/java-api/5.5/java-docs-bulk-processor.html
    bulkActions: ${SW_STORAGE_ES_BULK_ACTIONS:2000} # Execute the bulk every 2000 requests
    bulkSize: ${SW_STORAGE_ES_BULK_SIZE:20} # flush the bulk every 20mb
    flushInterval: ${SW_STORAGE_ES_FLUSH_INTERVAL:10} # flush the bulk every 10 seconds whatever the number of requests
    concurrentRequests: ${SW_STORAGE_ES_CONCURRENT_REQUESTS:2} # the number of concurrent requests
  # mysql:
receiver-register:
  default:
receiver-trace:
  default:
    bufferPath: ${SW_RECEIVER_BUFFER_PATH:../trace-buffer/} # Path to trace buffer files, suggest to use absolute path
    bufferOffsetMaxFileSize: ${SW_RECEIVER_BUFFER_OFFSET_MAX_FILE_SIZE:100} # Unit is MB
    bufferDataMaxFileSize: ${SW_RECEIVER_BUFFER_DATA_MAX_FILE_SIZE:500} # Unit is MB
    bufferFileCleanWhenRestart: ${SW_RECEIVER_BUFFER_FILE_CLEAN_WHEN_RESTART:false}
    sampleRate: ${SW_TRACE_SAMPLE_RATE:10000} # The sample rate precision is 1/10000. 10000 means 100% sample in default.
receiver-jvm:
  default:
#service-mesh:
#  default:
#    bufferPath: ${SW_SERVICE_MESH_BUFFER_PATH:../mesh-buffer/} # Path to trace buffer files, suggest to use absolute path
#    bufferOffsetMaxFileSize: ${SW_SERVICE_MESH_OFFSET_MAX_FILE_SIZE:100} # Unit is MB
#    bufferDataMaxFileSize: ${SW_SERVICE_MESH_BUFFER_DATA_MAX_FILE_SIZE:500} # Unit is MB
#    bufferFileCleanWhenRestart: ${SW_SERVICE_MESH_BUFFER_FILE_CLEAN_WHEN_RESTART:false}
#istio-telemetry:
#  default:
#receiver_zipkin:
#  default:
#    host: ${SW_RECEIVER_ZIPKIN_HOST:0.0.0.0}
#    port: ${SW_RECEIVER_ZIPKIN_PORT:9411}
#    contextPath: ${SW_RECEIVER_ZIPKIN_CONTEXT_PATH:/}
query:
  graphql:
    path: ${SW_QUERY_GRAPHQL_PATH:/graphql}
alarm:
  default:
telemetry:
  none:
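The `sampleRate` setting under `receiver-trace` is expressed in units of 1/10000, so a percentage has to be multiplied by 100 before it goes into the file. A small sketch of that arithmetic follows; `percent_to_sample_rate` is a hypothetical helper name used only for illustration.

```shell
# Convert a sampling percentage (whole number, 0-100) into a sampleRate value.
# sampleRate has a precision of 1/10000, so 1% corresponds to 100.
percent_to_sample_rate() {
  percent=$1
  case $percent in
    ''|*[!0-9]*) echo "percent must be a whole number" >&2; return 1 ;;
  esac
  [ "$percent" -le 100 ] || { echo "percent must be <= 100" >&2; return 1; }
  echo $(( percent * 100 ))
}

percent_to_sample_rate 100   # prints 10000 (the default: keep every trace)
percent_to_sample_rate 25    # prints 2500
```

Given the `${SW_TRACE_SAMPLE_RATE:10000}` placeholder, exporting `SW_TRACE_SAMPLE_RATE=2500` before starting the collector should have the same effect as editing the file, assuming the collector resolves these placeholders from the environment.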
# Start
# Starting SkyWalking involves two parts: the SkyWalking Collector and the SkyWalking UI
$ ../bin/startup.sh   # starts both the UI and the collector
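After startup, one quick sanity check is whether the collector is listening on the gRPC and REST ports configured in `application.yml` (11800 and 12800 in this setup). The sketch below assumes the `ss` tool is installed on the host; `ports_listening` is a hypothetical helper that only inspects text passed to it on stdin.

```shell
# Hypothetical helper: read `ss -lnt` style output on stdin and succeed only if
# every port given as an argument shows up as a local listening port.
ports_listening() {
  listing=$(cat)
  for port in "$@"; do
    if ! printf '%s\n' "$listing" | grep -Eq ":${port}[[:space:]]"; then
      echo "nothing is listening on port $port" >&2
      return 1
    fi
  done
}

# On a collector host (assumes the ss tool is installed):
# ss -lnt | ports_listening 11800 12800 && echo "collector ports are up"
```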
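3. Deploy the agent
Package the agent so it can be placed in your project:
cd /usr/local/apache-skywalking-apm-incubating
tar zcf agent.tar.gz agent
Copy the agent into your project.
There are then two ways to deploy the project.
The first is the probe configuration for a jar deployment: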
java -javaagent:/path/to/skywalking-agent.jar -jar your_name.jar
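For a jar deployment the service name and collector addresses can also be passed on the command line. The sketch below assembles those options; `agent_java_opts` is a hypothetical helper, the two `-D` names are the ones this article uses in its Tomcat configuration, and the values in the example are placeholders.

```shell
# Hypothetical helper: print the JVM options that attach the agent.
# The two -D names are the ones used in this article's Tomcat configuration.
agent_java_opts() {
  agent_jar=$1
  service=$2
  backends=$3
  printf '%s %s %s\n' \
    "-javaagent:${agent_jar}" \
    "-DSW_AGENT_NAME=${service}" \
    "-DSW_AGENT_COLLECTOR_BACKEND_SERVICES=${backends}"
}

# Example with placeholder values; word splitting of the unquoted $(...) is intended.
# java $(agent_java_opts /path/to/skywalking-agent.jar demo-service 10.100.xx.xx:11800) -jar your_name.jar
```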
The second is a Tomcat deployment.
Configuring the probe for Tomcat
## linux
CATALINA_OPTS="-javaagent:/usr/local/tomcat-pof/bin/skywalking/Agent/skywalking-agent.jar -DSW_AGENT_NAMESPACE=default-namespace -DSW_AGENT_COLLECTOR_BACKEND_SERVICES=10.100.xx.xx:11800,10.100.xx.xx:11800,10.100.xx.xx:11800 -DSW_AGENT_NAME=xxx_www_pof"; export CATALINA_OPTS
Note: 11800 is the collector port.
## windows
set "CATALINA_OPTS=... -javaagent:E:\apache-tomcat-8.5.20\skywalking-agent\skywalking-agent.jar"
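Advanced SkyWalking features
Plugins all live in the plugins directory. A new plugin only needs to be placed in that directory before startup to take effect automatically; removing it disables it.
Besides the /config/agent.config file, settings can also be provided through environment variables and VM options (-D).
Parameter key = skywalking. + the key in agent.config
Priority: system environment variables > VM options (-D) > settings in /config/agent.config
Logs are written to files by default, in the /logs directory.
In other words, the parameters passed on the command line can instead be written into config/agent.config.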
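The key rule above ("skywalking." plus the key from agent.config) can be illustrated with a short sketch. `to_vm_option` is a hypothetical helper, and `demo-service` and `INFO` are example values rather than settings from this deployment.

```shell
# Hypothetical helper: turn an agent.config key and value into a -D option,
# following the "skywalking." + key rule described above.
to_vm_option() {
  printf '%s%s=%s\n' '-Dskywalking.' "$1" "$2"
}

to_vm_option agent.service_name demo-service   # prints -Dskywalking.agent.service_name=demo-service
to_vm_option logging.level INFO                # prints -Dskywalking.logging.level=INFO
```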
The agent.config used here is shown below:
cat Agent/config/agent.config
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# The agent namespace
agent.namespace=${SW_AGENT_NAMESPACE:default-namespace}
# The service name in UI
agent.service_name=${SW_AGENT_NAME:xxx_www_pef}
# The number of sampled traces per 3 seconds
# Negative number means sample traces as many as possible, most likely 100%
# agent.sample_n_per_3_secs=${SW_AGENT_SAMPLE:-1}
# Authentication active is based on backend setting, see application.yml for more details.
# agent.authentication = ${SW_AGENT_AUTHENTICATION:xxxx}
# The max amount of spans in a single segment.
# Through this config item, skywalking keep your application memory cost estimated.
# agent.span_limit_per_segment=${SW_AGENT_SPAN_LIMIT:300}
# Ignore the segments if their operation names start with these suffix.
# agent.ignore_suffix=${SW_AGENT_IGNORE_SUFFIX:.jpg,.jpeg,.js,.css,.png,.bmp,.gif,.ico,.mp3,.mp4,.html,.svg}
# If true, skywalking agent will save all instrumented classes files in `/debugging` folder.
# Skywalking team may ask for these files in order to resolve compatible problem.
# agent.is_open_debugging_class = ${SW_AGENT_OPEN_DEBUG:true}
# Backend service addresses.
collector.backend_service=${SW_AGENT_COLLECTOR_BACKEND_SERVICES:10.100.xx.xx:11800,10.100.xx.xx:11800,10.100.xx.xx:11800}
# Logging level
logging.level=${SW_LOGGING_LEVEL:DEBUG}
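Each value in agent.config uses the form `${NAME:default}`: the value of `NAME` is used when it is set, otherwise the text after the first colon. The following is a rough shell analogue of that lookup, not the agent's actual implementation; `resolve_placeholder` is a hypothetical helper, and this sketch consults environment variables only.

```shell
# Hypothetical helper: resolve one ${NAME:default} placeholder against the environment.
resolve_placeholder() {
  body=${1#\$\{}          # strip the leading ${
  body=${body%\}}         # strip the trailing }
  name=${body%%:*}        # text before the first colon
  default=${body#*:}      # everything after the first colon (may itself contain colons)
  case $name in
    ''|[0-9]*|*[!A-Za-z0-9_]*) printf '%s\n' "$default"; return 0 ;;
  esac
  eval "value=\${$name-}" # safe only because name was validated above
  printf '%s\n' "${value:-$default}"
}

unset SW_LOGGING_LEVEL
resolve_placeholder '${SW_LOGGING_LEVEL:DEBUG}'    # prints DEBUG
SW_LOGGING_LEVEL=INFO
export SW_LOGGING_LEVEL
resolve_placeholder '${SW_LOGGING_LEVEL:DEBUG}'    # prints INFO
```

Splitting on the first colon matters for values such as the collector address list, whose default itself contains colons.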
If the settings are written in config/agent.config, the probe configuration in the project becomes:
vi bin/catalina.sh
## Add the following
CATALINA_OPTS="-javaagent:/usr/local/tomcat-htmall/bin/skywalking/Agent/skywalking-agent.jar" ;export CATALINA_OPTS