SkyWalking Built-in Parameters and Methods

Parameters

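Global metrics

Metric Description
all_p99 p99 of the response time across all services
all_p95 p95 of the response time across all services
all_p90 p90 of the response time across all services
all_p75 p75 of the response time across all services
all_p50 p50 of the response time across all services
all_heatmap Heat map of the response time across all services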

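Service metrics

Metric Description
service_resp_time Average response time of the service
service_sla Success rate of the service
service_p99 p99 of the service response time
service_p95 p95 of the service response time
service_p90 p90 of the service response time
service_p75 p75 of the service response time
service_p50 p50 of the service response time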
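
To make the average response time, success rate, and calls-per-minute metrics concrete, here is a minimal Python sketch that computes them from a handful of sample requests. The `Request` class and the three helper functions are hypothetical names chosen for illustration; they are not part of SkyWalking, and the real backend aggregates these values in its own streaming pipeline.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One observed request (hypothetical record, not a SkyWalking type)."""
    latency_ms: int  # response time in milliseconds
    status: bool     # True if the request succeeded


def service_resp_time(requests):
    """Average latency in milliseconds, truncated to an integer."""
    return sum(r.latency_ms for r in requests) // len(requests)


def service_sla(requests):
    """Success rate as a percentage of all requests."""
    ok = sum(1 for r in requests if r.status)
    return 100.0 * ok / len(requests)


def service_cpm(requests, minutes):
    """Calls per minute over a window that is `minutes` long."""
    return len(requests) / minutes


sample = [
    Request(120, True),
    Request(80, True),
    Request(400, False),
    Request(200, True),
]

print(service_resp_time(sample))  # 200
print(service_sla(sample))        # 75.0
print(service_cpm(sample, 2))     # 2.0
```

The same three calculations apply to the instance and endpoint metrics in the following tables; only the scope of the requests being aggregated changes.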

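Service instance metrics

Metric Description
service_instance_sla Success rate of the service instance
service_instance_resp_time Average response time of the service instance
service_instance_cpm Calls per minute of the service instance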

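Endpoint metrics

Metric Description
endpoint_cpm Calls per minute of the endpoint
endpoint_avg Average response time of the endpoint
endpoint_sla Success rate of the endpoint
endpoint_p99 p99 of the endpoint response time
endpoint_p95 p95 of the endpoint response time
endpoint_p90 p90 of the endpoint response time
endpoint_p75 p75 of the endpoint response time
endpoint_p50 p50 of the endpoint response time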

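JVM metrics

Metric Description
instance_jvm_cpu JVM CPU usage of the service instance
instance_jvm_memory_heap Heap memory used by the JVM
instance_jvm_memory_noheap Non-heap memory used by the JVM
instance_jvm_memory_heap_max Maximum heap memory of the JVM
instance_jvm_memory_noheap_max Maximum non-heap memory of the JVM
instance_jvm_young_gc_time Time spent in young-generation GC
instance_jvm_old_gc_time Time spent in old-generation GC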

Service relation metrics

Metric Description
service_relation_client_cpm Calls per minute detected on the client side
service_relation_server_cpm Calls per minute detected on the server side
service_relation_client_call_sla Success rate detected on the client side
service_relation_server_call_sla Success rate detected on the server side
service_relation_client_resp_time Average response time detected on the client side
service_relation_server_resp_time Average response time detected on the server side

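Endpoint relation metrics

Metric Description
endpoint_relation_cpm Calls per minute between endpoints, detected on the server side
endpoint_relation_resp_time Average response time between endpoints, detected on the server side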

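Other key terms

Term Description
CPM Number of request calls per minute
SLA Availability of the website or service, computed mainly from the counts of successful and failed requests. The more nines in the figure, the longer the service is available over a year, the more reliable it is, and the shorter its downtime.
CLR The Common Language Runtime manages program execution at run time, mainly memory management, code safety verification, code execution, and garbage collection. One CLR service is the GC (garbage collector), which manages memory automatically: it removes objects the program no longer accesses, so programmers no longer have to handle tasks such as freeing memory and checking for memory leaks.
Percentiles SkyWalking reports statistics such as P50, P90, and P95. These are percentiles: Pxx is the response time within which xx% of requests complete.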
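
As a rough illustration of the percentile and SLA "nines" ideas, the sketch below computes nearest-rank percentiles over a small list of latencies and converts an availability percentage into yearly downtime. The function names `percentile` and `downtime_minutes_per_year` are assumptions made for this example. SkyWalking's own percentile function works on bucketed values, so treat this as the textbook definition rather than the backend's implementation.

```python
import math


def percentile(latencies, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not latencies:
        raise ValueError("latencies must not be empty")
    ordered = sorted(latencies)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def downtime_minutes_per_year(sla_percent):
    """Minutes of downtime per 365-day year allowed by an availability figure."""
    minutes_per_year = 365 * 24 * 60
    return (100 - sla_percent) / 100 * minutes_per_year


latencies_ms = [10, 20, 30, 40, 50, 60, 70, 80, 90, 1000]
for p in (50, 75, 90, 95, 99):
    print(f"p{p} = {percentile(latencies_ms, p)} ms")

for sla in (99.0, 99.9, 99.99):
    print(f"{sla}% -> {round(downtime_minutes_per_year(sla), 2)} minutes/year")
```

With this sample, p50 is 50 ms and p90 is 90 ms, while p95 and p99 both land on the single 1000 ms outlier. That is why high percentiles expose slow requests that an average would hide.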

Built-in methods and parameters

Everything below is taken from the official SkyWalking Git repository.

service_resp_time = from(Service.latency).longAvg();
service_sla = from(Service.*).percent(status == true);
service_cpm = from(Service.*).cpm();
service_percentile = from(Service.latency).percentile(10); // Multiple values including p50, p75, p90, p95, p99
service_apdex = from(Service.latency).apdex(name, status);
service_mq_consume_count = from(Service.*).filter(type == RequestType.MQ).count();
service_mq_consume_latency = from((str->long)Service.tag["transmission.latency"]).filter(type == RequestType.MQ).filter(tag["transmission.latency"] != null).longAvg();

// Service relation scope metrics for topology
service_relation_client_cpm = from(ServiceRelation.*).filter(detectPoint == DetectPoint.CLIENT).cpm();
service_relation_server_cpm = from(ServiceRelation.*).filter(detectPoint == DetectPoint.SERVER).cpm();
service_relation_client_call_sla = from(ServiceRelation.*).filter(detectPoint == DetectPoint.CLIENT).percent(status == true);
service_relation_server_call_sla = from(ServiceRelation.*).filter(detectPoint == DetectPoint.SERVER).percent(status == true);
service_relation_client_resp_time = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).longAvg();
service_relation_server_resp_time = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.SERVER).longAvg();
service_relation_client_percentile = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).percentile(10); // Multiple values including p50, p75, p90, p95, p99
service_relation_server_percentile = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.SERVER).percentile(10); // Multiple values including p50, p75, p90, p95, p99

// Service Instance relation scope metrics for topology
service_instance_relation_client_cpm = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.CLIENT).cpm();
service_instance_relation_server_cpm = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.SERVER).cpm();
service_instance_relation_client_call_sla = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.CLIENT).percent(status == true);
service_instance_relation_server_call_sla = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.SERVER).percent(status == true);
service_instance_relation_client_resp_time = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).longAvg();
service_instance_relation_server_resp_time = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.SERVER).longAvg();
service_instance_relation_client_percentile = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).percentile(10); // Multiple values including p50, p75, p90, p95, p99
service_instance_relation_server_percentile = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.SERVER).percentile(10); // Multiple values including p50, p75, p90, p95, p99

// Service Instance Scope metrics
service_instance_sla = from(ServiceInstance.*).percent(status == true);
service_instance_resp_time = from(ServiceInstance.latency).longAvg();
service_instance_cpm = from(ServiceInstance.*).cpm();

// Endpoint scope metrics
endpoint_cpm = from(Endpoint.*).cpm();
endpoint_resp_time = from(Endpoint.latency).longAvg();
endpoint_sla = from(Endpoint.*).percent(status == true);
endpoint_percentile = from(Endpoint.latency).percentile(10); // Multiple values including p50, p75, p90, p95, p99
endpoint_mq_consume_latency = from((str->long)Endpoint.tag["transmission.latency"]).filter(type == RequestType.MQ).filter(tag["transmission.latency"] != null).longAvg();

// Endpoint relation scope metrics
endpoint_relation_cpm = from(EndpointRelation.*).filter(detectPoint == DetectPoint.SERVER).cpm();
endpoint_relation_resp_time = from(EndpointRelation.rpcLatency).filter(detectPoint == DetectPoint.SERVER).longAvg();
endpoint_relation_sla = from(EndpointRelation.*).filter(detectPoint == DetectPoint.SERVER).percent(status == true);
endpoint_relation_percentile = from(EndpointRelation.rpcLatency).filter(detectPoint == DetectPoint.SERVER).percentile(10); // Multiple values including p50, p75, p90, p95, p99

database_access_resp_time = from(DatabaseAccess.latency).longAvg();
database_access_sla = from(DatabaseAccess.*).percent(status == true);
database_access_cpm = from(DatabaseAccess.*).cpm();
database_access_percentile = from(DatabaseAccess.latency).percentile(10);

cache_read_resp_time = from(CacheAccess.latency).filter(operation == VirtualCacheOperation.Read).longAvg();
cache_read_sla = from(CacheAccess.*).filter(operation == VirtualCacheOperation.Read).percent(status == true);
cache_read_cpm = from(CacheAccess.*).filter(operation == VirtualCacheOperation.Read).cpm();
cache_read_percentile = from(CacheAccess.latency).filter(operation == VirtualCacheOperation.Read).percentile(10);

cache_write_resp_time = from(CacheAccess.latency).filter(operation == VirtualCacheOperation.Write).longAvg();
cache_write_sla = from(CacheAccess.*).filter(operation == VirtualCacheOperation.Write).percent(status == true);
cache_write_cpm = from(CacheAccess.*).filter(operation == VirtualCacheOperation.Write).cpm();
cache_write_percentile = from(CacheAccess.latency).filter(operation == VirtualCacheOperation.Write).percentile(10);

cache_access_resp_time = from(CacheAccess.latency).longAvg();
cache_access_sla = from(CacheAccess.*).percent(status == true);
cache_access_cpm = from(CacheAccess.*).cpm();
cache_access_percentile = from(CacheAccess.latency).percentile(10);

mq_service_consume_cpm = from(MQAccess.*).filter(operation == MQOperation.Consume).cpm();
mq_service_consume_sla = from(MQAccess.*).filter(operation == MQOperation.Consume).percent(status == true);
mq_service_consume_latency = from(MQAccess.transmissionLatency).filter(operation == MQOperation.Consume).longAvg();
mq_service_consume_percentile = from(MQAccess.transmissionLatency).filter(operation == MQOperation.Consume).percentile(10);
mq_service_produce_cpm = from(MQAccess.*).filter(operation == MQOperation.Produce).cpm();
mq_service_produce_sla = from(MQAccess.*).filter(operation == MQOperation.Produce).percent(status == true);

mq_endpoint_consume_cpm = from(MQEndpointAccess.*).filter(operation == MQOperation.Consume).cpm();
mq_endpoint_consume_latency = from(MQEndpointAccess.transmissionLatency).filter(operation == MQOperation.Consume).longAvg();
mq_endpoint_consume_percentile = from(MQEndpointAccess.transmissionLatency).filter(operation == MQOperation.Consume).percentile(10);
mq_endpoint_consume_sla = from(MQEndpointAccess.*).filter(operation == MQOperation.Consume).percent(status == true);
mq_endpoint_produce_cpm = from(MQEndpointAccess.*).filter(operation == MQOperation.Produce).cpm();
mq_endpoint_produce_sla = from(MQEndpointAccess.*).filter(operation == MQOperation.Produce).percent(status == true);
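
The statements above all follow the same shape: pick a source with `from(...)`, optionally narrow it with `filter(...)`, then apply an aggregation such as `longAvg()`, `percent(...)`, or `cpm()`. The Python sketch below mimics that shape on in-memory data to show how the pieces compose. The `Source` class, `from_` helper, and `ServiceRelation` record are hypothetical stand-ins written for this example; they are not SkyWalking APIs and do not reflect how the OAL engine is implemented.

```python
from dataclasses import dataclass
from enum import Enum


class DetectPoint(Enum):
    CLIENT = "client"
    SERVER = "server"


@dataclass
class ServiceRelation:
    """Hypothetical record for one call between two services."""
    latency: int
    status: bool
    detect_point: DetectPoint


class Source:
    """Hypothetical stand-in for an OAL `from(...)` stream."""

    def __init__(self, rows, field=None):
        self.rows = list(rows)
        self.field = field  # attribute to aggregate, like `.latency` in OAL

    def filter(self, predicate):
        """Keep only the rows that match, like OAL `.filter(...)`."""
        return Source((r for r in self.rows if predicate(r)), self.field)

    def long_avg(self):
        """Integer average of the selected field, like OAL `.longAvg()`."""
        values = [getattr(r, self.field) for r in self.rows]
        return sum(values) // len(values) if values else 0

    def percent(self, predicate):
        """Share of matching rows as a percentage, like OAL `.percent(...)`."""
        if not self.rows:
            return 0.0
        return 100.0 * sum(1 for r in self.rows if predicate(r)) / len(self.rows)

    def count(self):
        """Number of rows, like OAL `.count()`."""
        return len(self.rows)


def from_(rows, field=None):
    """Entry point mirroring OAL `from(Scope.field)`."""
    return Source(rows, field)


rows = [
    ServiceRelation(100, True, DetectPoint.CLIENT),
    ServiceRelation(300, False, DetectPoint.CLIENT),
    ServiceRelation(90, True, DetectPoint.SERVER),
    ServiceRelation(110, True, DetectPoint.SERVER),
]


def is_client(r):
    return r.detect_point is DetectPoint.CLIENT


def is_server(r):
    return r.detect_point is DetectPoint.SERVER


client_resp_time = from_(rows, "latency").filter(is_client).long_avg()
server_resp_time = from_(rows, "latency").filter(is_server).long_avg()
client_call_sla = from_(rows).filter(is_client).percent(lambda r: r.status)
server_call_sla = from_(rows).filter(is_server).percent(lambda r: r.status)

print(client_resp_time, server_resp_time)  # 200 100
print(client_call_sla, server_call_sla)    # 50.0 100.0
```

Splitting the same calls by detect point, as the service relation metrics do, lets the client-side and server-side views of one dependency be compared directly.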

titles

{
  // General Service
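  general_service: "General Service",
  general_service_desc: "Observe services and their relatively direct dependencies through telemetry data collected from SkyWalking agents.",
  general_service_services: "Services",
  general_service_services_desc: "Observe services through telemetry data collected by the SkyWalking Agent.",
  general_service_virtual_database: "Virtual Database",
  general_service_virtual_database_desc: "Observe the virtual databases that language agents infer through various plugins.",
  general_service_virtual_cache: "Virtual Cache",
  general_service_virtual_cache_desc: "Observe the virtual cache servers that language agents infer through various plugins.",
  general_service_virtual_mq: "Virtual Message Queue",
  general_service_virtual_mq_desc: "Observe the virtual message queue servers that language agents infer through various plugins.",
  // Service Mesh
  service_mesh: "Service Mesh",
  service_mesh_desc: "A service mesh (Istio) addresses the challenges that developers and operators face with distributed or microservice architectures.",
  service_mesh_service: "Services",
  service_mesh_service_desc: "Observe the service mesh through telemetry data collected from the Envoy Access Log Service (ALS).",
  service_mesh_control_plane: "Control Plane",
  service_mesh_control_plane_desc: "Monitor Istio's behavior through its self-monitoring metrics.",
  service_mesh_data_plane: "Data Plane",
  service_mesh_data_plane_desc: "Observe Envoy Proxy through the Envoy Metrics Service.",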
  // Functions
  functions: "Functions",
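  functions_desc:
    "FaaS (Function as a Service) is a cloud computing service that lets you execute code in response to events without the complex infrastructure typically associated with building and launching microservice applications.",
  functions_openfunction: "OpenFunction",
  functions_openfunction_desc: "OpenFunction, as a FaaS platform, provides out-of-the-box observability through its SkyWalking integration.",
  // Kubernetes
  kubernetes: "Kubernetes",
  kubernetes_desc: "Kubernetes is an open-source container orchestration system for automating software deployment, scaling, and management.",
  kubernetes_cluster: "Cluster",
  kubernetes_cluster_desc: "Monitor the status and resources of a K8S cluster.",
  kubernetes_service: "Service",
  kubernetes_service_desc: "Observe service status and resources from Kubernetes.",
  // Infrastructure
  infrastructure: "Infrastructure",
  infrastructure_desc: "The operating system is the infrastructure underlying the whole IT system. Its observability provides the foundation for operating all distributed and modern complex systems.",
  infrastructure_linux: "Linux",
  infrastructure_linux_desc: "Provides Linux operating system (OS) monitoring.",
  infrastructure_windows: "Windows",
  infrastructure_windows_desc: "Provides Windows operating system (OS) monitoring.",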
  // AWS Cloud
  aws_cloud: "AWS Cloud",
  aws_cloud_desc: "Amazon Web Services (AWS) provides reliable, scalable, and inexpensive cloud computing services.",
  aws_cloud_eks: "EKS",
  aws_cloud_eks_desc: "Provides AWS Cloud EKS monitoring through the AWS Container Insights Receiver.",
  aws_cloud_s3: "S3",
  aws_cloud_s3_desc: "Provides AWS Cloud S3 monitoring through the AWS FireHose Receiver.",
  aws_cloud_dynamodb: "DynamoDB",
  aws_cloud_dynamodb_desc: "Provides DynamoDB monitoring through the AWS FireHose Receiver.",
  aws_cloud_api_gateway: "API Gateway",
  aws_cloud_api_gateway_desc: "Provides AWS Cloud API Gateway monitoring through the AWS FireHose Receiver.",
  // Browser
  browser: "Browser",
  browser_desc: "Provides browser-side monitoring of web applications, versions, and pages through Apache SkyWalking Client JS.",
  // Gateway
  gateway: "Gateway",
  gateway_desc: "An API gateway is an API management tool that sits between clients and a collection of backend services.",
  gateway_apisix: "APISIX",
  gateway_apisix_desc: "Provides APISIX monitoring through the OpenTelemetry Prometheus receiver.",
  gateway_aws_api_gateway: "AWS API Gateway",
  gateway_aws_api_gateway_desc: "Provides AWS Cloud API Gateway monitoring through the AWS FireHose Receiver.",
  // Database
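  database: "Database",
  database_desc: "A database is an organized collection of structured information or data, typically stored electronically in a computer system.",
  database_mysql_mariadb: "MySQL/MariaDB",
  database_mysql_mariadb_desc: "Provides MySQL and MariaDB server monitoring through the OpenTelemetry Prometheus receiver.",
  database_postgresql: "PostgreSQL",
  database_postgresql_desc: "Provides PostgreSQL monitoring through the OpenTelemetry Prometheus receiver.",
  database_dynamodb: "DynamoDB",
  database_dynamodb_desc: "Provides DynamoDB monitoring through the AWS FireHose Receiver.",
  database_redis: "Redis",
  database_redis_desc: "Provides Redis monitoring through the OpenTelemetry Prometheus receiver.",
  database_elasticsearch: "Elasticsearch",
  database_elasticsearch_desc: "Provides Elasticsearch server monitoring through the OpenTelemetry Prometheus receiver.",
  database_mongodb: "MongoDB",
  database_mongodb_desc: "Provides MongoDB monitoring through the OpenTelemetry Prometheus receiver.",
  // Message Queue
  mq: "Message Queue",
  mq_desc: "A message queue is a form of asynchronous service-to-service communication used in serverless and microservice architectures.",
  mq_rabbitmq: "RabbitMQ",
  mq_rabbitmq_desc: "Provides RabbitMQ monitoring through the OpenTelemetry Prometheus receiver.",
  // self observability
  self_observability: "Self Observability",
  self_observability_desc: "Self observability provides observability for the components and servers running in the SkyWalking ecosystem.",
  self_observability_oap: "SkyWalking Server",
  self_observability_oap_desc: "The OAP backend cluster is itself a distributed stream-processing system; this monitors the OAP backend itself.",
  self_observability_satellite: "Satellite",
  self_observability_satellite_desc:
    "Satellite: an open-source agent designed for cloud-native infrastructure that provides a low-cost, efficient, and more secure way to collect telemetry data. It is the recommended load balancer for telemetry collection.",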
}
