400ToJava: 125 - Introduction to SkyWalking Distributed Call-Chain Monitoring

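1. Introduction

SkyWalking is an APM (Application Performance Management) tool that uses distributed tracing to monitor the performance of distributed applications. This article briefly introduces distributed tracing and SkyWalking.

1.1 Technical Overview

1.1.1 Distributed Tracing

In a microservice architecture, services are split along business boundaries, so a single externally exposed API may need many services working together to complete one request. If any service on the call chain fails, or a network call times out, the whole API call fails. As the business grows, the calls between services become more and more complex, and when one part of a distributed system goes wrong, locating it becomes very difficult. We then need a tool that helps us understand system behavior and analyze performance problems, so that faults can be located and resolved quickly when they occur. This is what APM (Application Performance Management) provides.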
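
To make the idea concrete, the following is a minimal sketch of the kind of record a tracing tool keeps. It is not SkyWalking's data model: the class TraceSketch, the Span fields, and the service names (gateway, order-service, stock-service) are all hypothetical. Each hop of a request becomes a span, every span of the same request shares one trace ID, and finding the broken hop reduces to filtering the spans of that trace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class TraceSketch {

    /** One unit of work done by one service while handling a request (hypothetical model). */
    static final class Span {
        final String traceId;   // shared by every span that belongs to the same request
        final String service;   // name of the service that did the work
        final long durationMs;  // time spent in this service
        final boolean error;    // true if this hop failed

        Span(String traceId, String service, long durationMs, boolean error) {
            this.traceId = traceId;
            this.service = service;
            this.durationMs = durationMs;
            this.error = error;
        }
    }

    /** Returns the first failed span of the given trace, if there is one. */
    static Optional<Span> firstError(List<Span> spans, String traceId) {
        return spans.stream()
                .filter(s -> s.traceId.equals(traceId) && s.error)
                .findFirst();
    }

    public static void main(String[] args) {
        List<Span> spans = new ArrayList<>();
        spans.add(new Span("t-1", "gateway", 120, false));
        spans.add(new Span("t-1", "order-service", 95, false));
        spans.add(new Span("t-1", "stock-service", 80, true));

        // Prints "Failed hop: stock-service"
        firstError(spans, "t-1")
                .ifPresent(s -> System.out.println("Failed hop: " + s.service));
    }
}
```

A real APM tool collects these records automatically through agents instead of having the application build them by hand.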

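1.1.2 About SkyWalking

SkyWalking is an observability analysis platform and application performance management system. It provides an integrated solution for distributed tracing, service mesh telemetry analysis, metric aggregation, and visualization. Its main features are:

  • Multiple monitoring approaches: language agents and service mesh
  • Automatic agents for several languages: Java, .NET Core, and Node.js
  • Lightweight and efficient; no big data platform is required
  • Modular: pluggable options for the UI, storage, and cluster management
  • Alerting support
  • Excellent visualization
  • Open-sourced by Chinese developers, an Apache top-level project, with good Chinese documentation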

1.2 Project Links

Official website (Chinese version available): http://skywalking.apache.org/
GitHub: https://github.com/apache/incubator-skywalking

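2. Environment Setup

2.1 Installation

This example uses version 6.1.0. Download the release archive from the download page.

Figure 1

After extracting the archive, SkyWalking can be started directly.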

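2.2 Preparation

By default SkyWalking uses the h2 database, which is inefficient for storing and querying data. This example stores data in ElasticSearch instead. Start ElasticSearch (see the ElasticSearch introduction for startup and configuration), then visit http://localhost:9200; if it responds, ElasticSearch has started successfully.

Figure 2

2.3 Configuration

2.3.1 Setting ElasticSearch as the Storage Backend

Open application.yml in the config directory and modify the storage settings: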

storage:
  selector: ${SW_STORAGE:elasticsearch}
  elasticsearch:
    nameSpace: ${SW_NAMESPACE:""}
    clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:localhost:9200}
    protocol: ${SW_STORAGE_ES_HTTP_PROTOCOL:"http"}
    trustStorePath: ${SW_SW_STORAGE_ES_SSL_JKS_PATH:"../es_keystore.jks"}
    trustStorePass: ${SW_SW_STORAGE_ES_SSL_JKS_PASS:""}
    user: ${SW_ES_USER:""}
    password: ${SW_ES_PASSWORD:""}
    secretsManagementFile: ${SW_ES_SECRETS_MANAGEMENT_FILE:""} # Secrets management file in the properties format includes the username, password, which are managed by 3rd party tool.
    enablePackedDownsampling: ${SW_STORAGE_ENABLE_PACKED_DOWNSAMPLING:true} # Hour and Day metrics will be merged into minute index.
    dayStep: ${SW_STORAGE_DAY_STEP:1} # Represent the number of days in the one minute/hour/day index.
    indexShardsNumber: ${SW_STORAGE_ES_INDEX_SHARDS_NUMBER:2}
    indexReplicasNumber: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:0}
    # Those data TTL settings will override the same settings in core module.
    recordDataTTL: ${SW_STORAGE_ES_RECORD_DATA_TTL:7} # Unit is day
    otherMetricsDataTTL: ${SW_STORAGE_ES_OTHER_METRIC_DATA_TTL:45} # Unit is day
    monthMetricsDataTTL: ${SW_STORAGE_ES_MONTH_METRIC_DATA_TTL:18} # Unit is month
    # Batch process setting, refer to https://www.elastic.co/guide/en/elasticsearch/client/java-api/5.5/java-docs-bulk-processor.html
    bulkActions: ${SW_STORAGE_ES_BULK_ACTIONS:1000} # Execute the bulk every 1000 requests
    flushInterval: ${SW_STORAGE_ES_FLUSH_INTERVAL:10} # flush the bulk every 10 seconds whatever the number of requests
    concurrentRequests: ${SW_STORAGE_ES_CONCURRENT_REQUESTS:2} # the number of concurrent requests
    resultWindowMaxSize: ${SW_STORAGE_ES_QUERY_MAX_WINDOW_SIZE:10000}
    metadataQueryMaxSize: ${SW_STORAGE_ES_QUERY_MAX_SIZE:5000}
    segmentQueryMaxSize: ${SW_STORAGE_ES_QUERY_SEGMENT_SIZE:200}
    profileTaskQueryMaxSize: ${SW_STORAGE_ES_QUERY_PROFILE_TASK_SIZE:200}
    advanced: ${SW_STORAGE_ES_ADVANCED:""}

Here, selector: ${SW_STORAGE:elasticsearch} chooses the storage backend (the default is h2), and clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:localhost:9200} can be changed to point at your ES address.
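
Each value in this file has the form ${NAME:default}: the environment variable NAME is used if it is set, otherwise the text after the first colon applies. The sketch below illustrates that rule only; it is not SkyWalking's own loader, and the class PlaceholderSketch and the host es-host are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderSketch {

    // Matches ${NAME:default}. The default may be empty and may itself contain
    // colons, as in ${SW_STORAGE_ES_CLUSTER_NODES:localhost:9200}.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z0-9_]+):([^}]*)\\}");

    /** Replaces each placeholder with the env value if present, else with its default. */
    static String resolve(String value, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String fromEnv = env.get(m.group(1));
            String chosen = (fromEnv != null) ? fromEnv : m.group(2);
            // quoteReplacement keeps '$' and '\' in the value from being treated specially
            m.appendReplacement(out, Matcher.quoteReplacement(chosen));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Resolved against the real environment of this process
        System.out.println(resolve("${SW_STORAGE:elasticsearch}", System.getenv()));

        // Resolved against a hand-made environment
        Map<String, String> env = new HashMap<>();
        env.put("SW_STORAGE_ES_CLUSTER_NODES", "es-host:9200");
        // Prints "es-host:9200"
        System.out.println(resolve("${SW_STORAGE_ES_CLUSTER_NODES:localhost:9200}", env));
    }
}
```

In practice this means the storage backend can be switched either by editing the default in application.yml, as done here, or by exporting SW_STORAGE before running the startup script.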

2.3.2 Configuring the UI

Open webapp.yml in the webapp directory. The default port is 8080; change it to the port you want to use. This example uses port 9099.

server:
  port: 9099

collector:
  path: /graphql
  ribbon:
    ReadTimeout: 10000
    # Point to all backend's restHost:restPort, split by ,
    listOfServers: 127.0.0.1:12800

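2.4 Startup

Run startup.bat in the bin directory (on Linux, run startup.sh). It starts both the backend service and the UI automatically.

Figure 3

The startup logs can be viewed in the logs directory.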

2020-08-31 10:02:25.645  WARN 20356 --- [main] o.s.j.e.a.AnnotationMBeanExporter        : Bean with key 'routesEndpoint' has been registered as an MBean but has no exposed attributes or operations
2020-08-31 10:02:25.645  INFO 20356 --- [main] o.s.j.e.a.AnnotationMBeanExporter        : Located managed bean 'routesMvcEndpoint': registering with JMX server as MBean [org.springframework.cloud.netflix.zuul:name=routesMvcEndpoint,type=RoutesMvcEndpoint]
2020-08-31 10:02:25.659  INFO 20356 --- [main] o.s.j.e.a.AnnotationMBeanExporter        : Located managed bean 'filtersEndpoint': registering with JMX server as MBean [org.springframework.cloud.netflix.zuul:name=filtersEndpoint,type=FiltersEndpoint]
2020-08-31 10:02:25.663  WARN 20356 --- [main] o.s.j.e.a.AnnotationMBeanExporter        : Bean with key 'filtersEndpoint' has been registered as an MBean but has no exposed attributes or operations
2020-08-31 10:02:25.663  INFO 20356 --- [main] o.s.j.e.a.AnnotationMBeanExporter        : Located managed bean 'refreshScope': registering with JMX server as MBean [org.springframework.cloud.context.scope.refresh:name=refreshScope,type=RefreshScope]
2020-08-31 10:02:25.687  INFO 20356 --- [main] o.s.j.e.a.AnnotationMBeanExporter        : Located managed bean 'configurationPropertiesRebinder': registering with JMX server as MBean [org.springframework.cloud.context.properties:name=configurationPropertiesRebinder,context=4566e5bd,type=ConfigurationPropertiesRebinder]
2020-08-31 10:02:25.695  INFO 20356 --- [main] o.s.j.e.a.AnnotationMBeanExporter        : Located managed bean 'refreshEndpoint': registering with JMX server as MBean [org.springframework.cloud.endpoint:name=refreshEndpoint,type=RefreshEndpoint]
2020-08-31 10:02:25.745  INFO 20356 --- [main] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase 0
2020-08-31 10:02:25.848  INFO 20356 --- [main] o.s.c.support.DefaultLifecycleProcessor  : Starting beans in phase 2147483647
2020-08-31 10:02:25.862  INFO 20356 --- [main] ration$HystrixMetricsPollerConfiguration : Starting poller
2020-08-31 10:02:25.986  INFO 20356 --- [main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 9099 (http)

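Open the UI at http://localhost:9099; if the page loads, startup succeeded.

Figure 4

3. Demo Walkthrough

This demo gives a brief introduction to SkyWalking's monitoring service.

3.1 Demo Overview

3.1.1 Demo Functionality

The demo mainly shows what SkyWalking can do and is deliberately simple: two GET endpoints each return a string, but the second performs a division by zero before returning, so that request fails. The controller code of the service is shown below.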

package cn.toj.agenttest.controller;

import cn.toj.agenttest.dto.ResponseResult;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

/**
 * Demo controller with one healthy endpoint and one that fails on purpose.
 *
 * @author Carlos
 * @Date 2020/8/24
 */

@RestController
public class AgentTestController {

    @Value("${server.port}")
    private String port;

    // Healthy endpoint: reports the port this instance is running on.
    @GetMapping("/getConfig")
    public ResponseResult getConfig() {
        return new ResponseResult<>(Integer.valueOf(HttpStatus.OK.value()), HttpStatus.OK.toString(), "Hello, I'm in port " + port + ".");
    }

    // Failing endpoint: the integer division by zero throws ArithmeticException,
    // so the return statement below is never reached.
    @GetMapping("/exception")
    public ResponseResult getException() {
        int i = 1 / 0;
        return new ResponseResult<>(Integer.valueOf(HttpStatus.OK.value()), HttpStatus.OK.toString(), "Hello, I'm in port " + port + ".");
    }

}
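
The second endpoint fails because integer division by zero compiles without complaint but throws ArithmeticException at runtime. The standalone sketch below reproduces just that behavior outside Spring; the class DivideByZeroSketch and its describe method are hypothetical names used for illustration.

```java
public class DivideByZeroSketch {

    /** Performs integer division and reports either the result or the exception raised. */
    static String describe(int dividend, int divisor) {
        try {
            return "result=" + (dividend / divisor);
        } catch (ArithmeticException e) {
            // Integer division by zero always ends up here
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(1, 1)); // prints "result=1"
        System.out.println(describe(1, 0)); // prints "ArithmeticException: / by zero"
    }
}
```

In the controller nothing catches the exception, so it propagates out of the handler method and the request ends in an error, which is exactly the kind of failure the trace view is meant to surface.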

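3.1.2 Adding the Agent

Copy the agent folder from the SkyWalking root directory to a location of your choice, open agent.config in its config directory, and set the agent.service_name=${SW_AGENT_NAME:<service name>} property. This example uses agent-test as the service name.
The agent path must be supplied when the service starts. When running from an IDE such as IDEA, add it as a startup parameter; IDEA is used as the example below.
Open the run configuration dialog:

Figure 5

In the Environment section, add the path of the agent jar file as a -javaagent VM option so that SkyWalking can monitor the service.
Figure 6

Save the configuration and start the service.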
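
Outside the IDE, the same setup amounts to a java command line with a -javaagent option placed before -jar. The sketch below only assembles such a command as a list of arguments. The directory /opt/skywalking/agent and the jar name agent-test.jar are hypothetical, and overriding the service name through -Dskywalking.agent.service_name is shown as an assumed alternative to editing agent.config; check it against the agent documentation for your version.

```java
import java.util.ArrayList;
import java.util.List;

public class AgentLaunchSketch {

    /** Builds a java command that attaches the SkyWalking agent to an application jar. */
    static List<String> buildCommand(String agentDir, String serviceName, String appJar) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        // JVM options must come before -jar, otherwise they are passed to the application
        cmd.add("-javaagent:" + agentDir + "/skywalking-agent.jar");
        // Assumed override of agent.service_name via a system property
        cmd.add("-Dskywalking.agent.service_name=" + serviceName);
        cmd.add("-jar");
        cmd.add(appJar);
        return cmd;
    }

    public static void main(String[] args) {
        // Hypothetical paths; replace them with the real agent folder and application jar
        List<String> cmd = buildCommand("/opt/skywalking/agent", "agent-test", "agent-test.jar");
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list can be handed to ProcessBuilder, or the printed line can be pasted into a startup script.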

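3.2 Call Chain Demonstration

Visit http://localhost:9011/getConfig and http://localhost:9011/exception in a browser. The first returns the expected data and the second returns an error. In the SkyWalking UI you can see both requests to the service.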
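
The two requests can also be sent from code rather than a browser. The following is a minimal sketch using only the JDK; the class DemoRequestSketch is hypothetical, the port 9011 comes from the URLs above, and the paths are taken from the @GetMapping annotations in the controller shown in 3.1.1.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class DemoRequestSketch {

    /** Builds the address of a demo endpoint on localhost. */
    static URI endpoint(int port, String path) {
        return URI.create("http://localhost:" + port + path);
    }

    /** Sends a GET request and returns the HTTP status code, including error codes. */
    static int status(URI uri) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(2000);
        conn.setReadTimeout(2000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        for (String path : new String[] {"/getConfig", "/exception"}) {
            URI uri = endpoint(9011, path);
            try {
                System.out.println(uri + " -> HTTP " + status(uri));
            } catch (IOException e) {
                // Reached when the demo service is not running
                System.out.println(uri + " -> not reachable: " + e.getMessage());
            }
        }
    }
}
```

Running it a few times is a quick way to generate traffic, so that the UI has both successful and failed requests to display.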

Figure 7

Click the Trace tab to see that the first request succeeded and the second failed.
Figure 8

3.3 Demo Download

  • GitHub project:
    https://github.com/diyzhang/42j125-swdemo
  • Command to download the project with Git:
git clone https://github.com/diyzhang/42j125-swdemo.git
