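Parameters
Global metrics
Metric | Description |
all_p99 | p99 response time across all services |
all_p95 | p95 response time across all services |
all_p90 | p90 response time across all services |
all_p75 | p75 response time across all services |
all_p50 | p50 response time across all services |
all_heatmap | Heatmap of response times across all services |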
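To make the percentile columns concrete, here is a minimal sketch that computes p50 through p99 from a list of response times using the nearest-rank method. The sample latencies and the `percentile` helper are illustrative assumptions; this is not how SkyWalking computes the values internally.

```python
def percentile(latencies_ms, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not latencies_ms:
        raise ValueError("at least one sample is required")
    ordered = sorted(latencies_ms)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = max(1, -(-p * len(ordered) // 100))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds.
samples = [12, 15, 20, 22, 30, 35, 40, 80, 120, 900]
summary = {f"p{p}": percentile(samples, p) for p in (50, 75, 90, 95, 99)}
print(summary)
# {'p50': 30, 'p75': 80, 'p90': 120, 'p95': 900, 'p99': 900}
```

A single slow request (900 ms) dominates p95 and p99 while leaving p50 untouched, which is why the high percentiles are usually watched alongside the average.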
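Service metrics
Metric | Description |
service_resp_time | Average response time of the service |
service_sla | Success rate of the service |
service_p99 | p99 response time of the service |
service_p95 | p95 response time of the service |
service_p90 | p90 response time of the service |
service_p75 | p75 response time of the service |
service_p50 | p50 response time of the service |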
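Service instance metrics
Metric | Description |
service_instance_sla | Success rate of the service instance |
service_instance_resp_time | Average response time of the service instance |
service_instance_cpm | Calls per minute to the service instance |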
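The three instance-level metrics above can be illustrated with a small aggregation over raw request records. This is a sketch under stated assumptions: the `Request` record, the `aggregate` helper, and the choice of integer averaging (suggested only by the name `longAvg` in the OAL listing) are hypothetical and do not reproduce SkyWalking's implementation.

```python
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: int
    success: bool


def aggregate(requests, window_minutes):
    """Return calls per minute, success rate (percent), and average latency."""
    if not requests or window_minutes <= 0:
        raise ValueError("need at least one request and a positive window")
    total = len(requests)
    successes = sum(1 for r in requests if r.success)
    return {
        "cpm": total / window_minutes,
        "sla": 100.0 * successes / total,
        # Integer average, an assumption based on the name longAvg.
        "resp_time": sum(r.latency_ms for r in requests) // total,
    }


# Hypothetical traffic observed over a two-minute window.
requests = [
    Request(100, True),
    Request(200, True),
    Request(300, False),
    Request(400, True),
]
result = aggregate(requests, window_minutes=2)
print(result)  # {'cpm': 2.0, 'sla': 75.0, 'resp_time': 250}
```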
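Endpoint metrics
Metric | Description |
endpoint_cpm | Calls per minute to the endpoint |
endpoint_avg | Average response time of the endpoint |
endpoint_sla | Success rate of the endpoint |
endpoint_p99 | p99 response time of the endpoint |
endpoint_p95 | p95 response time of the endpoint |
endpoint_p90 | p90 response time of the endpoint |
endpoint_p75 | p75 response time of the endpoint |
endpoint_p50 | p50 response time of the endpoint |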
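JVM metrics
Metric | Description |
instance_jvm_cpu | JVM CPU usage of the service instance |
instance_jvm_memory_heap | Heap memory used by the JVM |
instance_jvm_memory_noheap | Non-heap memory used by the JVM |
instance_jvm_memory_heap_max | Maximum heap memory of the JVM |
instance_jvm_memory_noheap_max | Maximum non-heap memory of the JVM |
instance_jvm_young_gc_time | Time spent in young-generation GC |
instance_jvm_old_gc_time | Time spent in old-generation GC |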
Service relation metrics
Metric | Description |
service_relation_client_cpm | Calls per minute detected on the client side |
service_relation_server_cpm | Calls per minute detected on the server side |
service_relation_client_call_sla | Success rate detected on the client side |
service_relation_server_call_sla | Success rate detected on the server side |
service_relation_client_resp_time | Average response time detected on the client side |
service_relation_server_resp_time | Average response time detected on the server side |
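The client/server split in the table above amounts to filtering the same call relation by where it was observed. A minimal sketch follows, assuming a hypothetical `RelationCall` record and `relation_cpm` helper; the `DetectPoint` enum only mirrors the names used in the OAL listing and is not SkyWalking's own class.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class DetectPoint(Enum):
    CLIENT = "client"
    SERVER = "server"


@dataclass(frozen=True)
class RelationCall:
    source: str
    dest: str
    detect_point: DetectPoint


def relation_cpm(calls, detect_point, window_minutes):
    """Calls per minute for each (source, dest) pair seen at one detect point."""
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    counts = Counter(
        (c.source, c.dest) for c in calls if c.detect_point is detect_point
    )
    return {pair: n / window_minutes for pair, n in counts.items()}


# Hypothetical data: the client side reports 4 calls, the server side only 2.
calls = (
    [RelationCall("web", "order", DetectPoint.CLIENT)] * 4
    + [RelationCall("web", "order", DetectPoint.SERVER)] * 2
)
client_cpm = relation_cpm(calls, DetectPoint.CLIENT, window_minutes=2)
server_cpm = relation_cpm(calls, DetectPoint.SERVER, window_minutes=2)
print(client_cpm)  # {('web', 'order'): 2.0}
print(server_cpm)  # {('web', 'order'): 1.0}
```

Keeping the two sides separate makes it visible when they disagree, which can happen, for example, when only one side of a call is instrumented.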
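Endpoint relation metrics
Metric | Description |
endpoint_relation_cpm | Calls per minute between endpoints, detected on the server side |
endpoint_relation_resp_time | Average response time between endpoints, detected on the server side |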
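Other key metrics
Metric | Description |
CPM | Number of request calls per minute |
SLA | Service availability of the site, calculated mainly from the counts of successful and failed requests. The more nines in the value, the longer the service is available over a year, the more reliable it is, and the shorter its downtime |
CLR | Common Language Runtime. It manages program execution at run time, mainly memory management, code safety verification, code execution, and garbage collection. One CLR service is the GC (Garbage Collector), which manages memory automatically: it removes objects the program no longer accesses, so programmers no longer need to handle tasks they previously had to do by hand, such as freeing memory and checking for memory leaks |
Percentile | SkyWalking reports statistics such as P50, P90, and P95. These are percentiles: Pn is the value at or below which n% of the samples fall |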
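The link between the number of nines in the SLA and yearly downtime is simple arithmetic. The sketch below assumes a 365-day year; the helper name `yearly_downtime_minutes` is made up for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def yearly_downtime_minutes(availability_percent):
    """Minutes of downtime per year allowed by a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (100 - availability_percent) / 100 * MINUTES_PER_YEAR


for sla in (99.0, 99.9, 99.99):
    print(f"{sla}% -> {yearly_downtime_minutes(sla):.2f} minutes per year")
```

Each extra nine cuts the allowed downtime by a factor of ten: roughly 5,256 minutes at 99%, 525.6 minutes at 99.9%, and 52.56 minutes at 99.99%.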
Built-in method parameters
Everything below is taken from the official SkyWalking Git repository.
service_resp_time = from(Service.latency).longAvg();
service_sla = from(Service.*).percent(status == true);
service_cpm = from(Service.*).cpm();
service_percentile = from(Service.latency).percentile(10); // Multiple values including p50, p75, p90, p95, p99
service_apdex = from(Service.latency).apdex(name, status);
service_mq_consume_count = from(Service.*).filter(type == RequestType.MQ).count();
service_mq_consume_latency = from((str->long)Service.tag["transmission.latency"]).filter(type == RequestType.MQ).filter(tag["transmission.latency"] != null).longAvg();
// Service relation scope metrics for topology
service_relation_client_cpm = from(ServiceRelation.*).filter(detectPoint == DetectPoint.CLIENT).cpm();
service_relation_server_cpm = from(ServiceRelation.*).filter(detectPoint == DetectPoint.SERVER).cpm();
service_relation_client_call_sla = from(ServiceRelation.*).filter(detectPoint == DetectPoint.CLIENT).percent(status == true);
service_relation_server_call_sla = from(ServiceRelation.*).filter(detectPoint == DetectPoint.SERVER).percent(status == true);
service_relation_client_resp_time = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).longAvg();
service_relation_server_resp_time = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.SERVER).longAvg();
service_relation_client_percentile = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).percentile(10); // Multiple values including p50, p75, p90, p95, p99
service_relation_server_percentile = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.SERVER).percentile(10); // Multiple values including p50, p75, p90, p95, p99
// Service Instance relation scope metrics for topology
service_instance_relation_client_cpm = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.CLIENT).cpm();
service_instance_relation_server_cpm = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.SERVER).cpm();
service_instance_relation_client_call_sla = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.CLIENT).percent(status == true);
service_instance_relation_server_call_sla = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.SERVER).percent(status == true);
service_instance_relation_client_resp_time = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).longAvg();
service_instance_relation_server_resp_time = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.SERVER).longAvg();
service_instance_relation_client_percentile = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).percentile(10); // Multiple values including p50, p75, p90, p95, p99
service_instance_relation_server_percentile = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.SERVER).percentile(10); // Multiple values including p50, p75, p90, p95, p99
// Service Instance Scope metrics
service_instance_sla = from(ServiceInstance.*).percent(status == true);
service_instance_resp_time = from(ServiceInstance.latency).longAvg();
service_instance_cpm = from(ServiceInstance.*).cpm();
// Endpoint scope metrics
endpoint_cpm = from(Endpoint.*).cpm();
endpoint_resp_time = from(Endpoint.latency).longAvg();
endpoint_sla = from(Endpoint.*).percent(status == true);
endpoint_percentile = from(Endpoint.latency).percentile(10); // Multiple values including p50, p75, p90, p95, p99
endpoint_mq_consume_latency = from((str->long)Endpoint.tag["transmission.latency"]).filter(type == RequestType.MQ).filter(tag["transmission.latency"] != null).longAvg();
// Endpoint relation scope metrics
endpoint_relation_cpm = from(EndpointRelation.*).filter(detectPoint == DetectPoint.SERVER).cpm();
endpoint_relation_resp_time = from(EndpointRelation.rpcLatency).filter(detectPoint == DetectPoint.SERVER).longAvg();
endpoint_relation_sla = from(EndpointRelation.*).filter(detectPoint == DetectPoint.SERVER).percent(status == true);
endpoint_relation_percentile = from(EndpointRelation.rpcLatency).filter(detectPoint == DetectPoint.SERVER).percentile(10); // Multiple values including p50, p75, p90, p95, p99
database_access_resp_time = from(DatabaseAccess.latency).longAvg();
database_access_sla = from(DatabaseAccess.*).percent(status == true);
database_access_cpm = from(DatabaseAccess.*).cpm();
database_access_percentile = from(DatabaseAccess.latency).percentile(10);
cache_read_resp_time = from(CacheAccess.latency).filter(operation == VirtualCacheOperation.Read).longAvg();
cache_read_sla = from(CacheAccess.*).filter(operation == VirtualCacheOperation.Read).percent(status == true);
cache_read_cpm = from(CacheAccess.*).filter(operation == VirtualCacheOperation.Read).cpm();
cache_read_percentile = from(CacheAccess.latency).filter(operation == VirtualCacheOperation.Read).percentile(10);
cache_write_resp_time = from(CacheAccess.latency).filter(operation == VirtualCacheOperation.Write).longAvg();
cache_write_sla = from(CacheAccess.*).filter(operation == VirtualCacheOperation.Write).percent(status == true);
cache_write_cpm = from(CacheAccess.*).filter(operation == VirtualCacheOperation.Write).cpm();
cache_write_percentile = from(CacheAccess.latency).filter(operation == VirtualCacheOperation.Write).percentile(10);
cache_access_resp_time = from(CacheAccess.latency).longAvg();
cache_access_sla = from(CacheAccess.*).percent(status == true);
cache_access_cpm = from(CacheAccess.*).cpm();
cache_access_percentile = from(CacheAccess.latency).percentile(10);
mq_service_consume_cpm = from(MQAccess.*).filter(operation == MQOperation.Consume).cpm();
mq_service_consume_sla = from(MQAccess.*).filter(operation == MQOperation.Consume).percent(status == true);
mq_service_consume_latency = from(MQAccess.transmissionLatency).filter(operation == MQOperation.Consume).longAvg();
mq_service_consume_percentile = from(MQAccess.transmissionLatency).filter(operation == MQOperation.Consume).percentile(10);
mq_service_produce_cpm = from(MQAccess.*).filter(operation == MQOperation.Produce).cpm();
mq_service_produce_sla = from(MQAccess.*).filter(operation == MQOperation.Produce).percent(status == true);
mq_endpoint_consume_cpm = from(MQEndpointAccess.*).filter(operation == MQOperation.Consume).cpm();
mq_endpoint_consume_latency = from(MQEndpointAccess.transmissionLatency).filter(operation == MQOperation.Consume).longAvg();
mq_endpoint_consume_percentile = from(MQEndpointAccess.transmissionLatency).filter(operation == MQOperation.Consume).percentile(10);
mq_endpoint_consume_sla = from(MQEndpointAccess.*).filter(operation == MQOperation.Consume).percent(status == true);
mq_endpoint_produce_cpm = from(MQEndpointAccess.*).filter(operation == MQOperation.Produce).cpm();
mq_endpoint_produce_sla = from(MQEndpointAccess.*).filter(operation == MQOperation.Produce).percent(status == true);
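Most of the functions in the listing above (`longAvg`, `percent`, `cpm`, `percentile`) are self-explanatory, but `apdex` is less obvious. The sketch below applies the standard Apdex formula, (satisfied + tolerating / 2) / total, with a threshold T. The 500 ms threshold, the sample data, and the rule that failed requests count as frustrated are assumptions made for illustration, not a statement of SkyWalking's exact behavior.

```python
def apdex(samples, threshold_ms):
    """Standard Apdex score for (latency_ms, success) samples.

    satisfied:  success and latency <= T
    tolerating: success and T < latency <= 4T
    frustrated: everything else (assumed to include failed requests)
    """
    if not samples:
        raise ValueError("at least one sample is required")
    satisfied = tolerating = 0
    for latency_ms, success in samples:
        if not success:
            continue  # counted as frustrated
        if latency_ms <= threshold_ms:
            satisfied += 1
        elif latency_ms <= 4 * threshold_ms:
            tolerating += 1
    return (satisfied + tolerating / 2) / len(samples)


# Hypothetical samples: (latency in ms, request succeeded).
samples = [
    (100, True),   # satisfied
    (400, True),   # satisfied
    (900, True),   # tolerating
    (1500, True),  # tolerating
    (2500, True),  # frustrated (slower than 4T)
    (200, False),  # frustrated (failed)
]
score = apdex(samples, threshold_ms=500)
print(score)  # 0.5
```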
titles
{
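general_service: "General Service",
general_service_desc: "Observe services and their relatively direct dependencies through telemetry data collected from SkyWalking Agents.",
general_service_services: "Services",
general_service_services_desc: "Observe services through telemetry data collected from the SkyWalking Agent.",
general_service_virtual_database: "Virtual Database",
general_service_virtual_database_desc: "Observe the virtual databases inferred by language agents through various plugins.",
general_service_virtual_cache: "Virtual Cache",
general_service_virtual_cache_desc: "Observe the virtual cache servers inferred by language agents through various plugins.",
general_service_virtual_mq: "Virtual Message Queue",
general_service_virtual_mq_desc: "Observe the virtual message queue servers inferred by language agents through various plugins.",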
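service_mesh: "Service Mesh",
service_mesh_desc: "Service Mesh (Istio) addresses the challenges developers and operators face with a distributed or microservices architecture.",
service_mesh_service: "Services",
service_mesh_service_desc: "Observe the service mesh through telemetry data collected from the Envoy Access Log Service (ALS).",
service_mesh_control_plane: "Control Plane",
service_mesh_control_plane_desc: "Monitor the behavior of Istio through its self-monitoring metrics.",
service_mesh_data_plane: "Data Plane",
service_mesh_data_plane_desc: "Observe Envoy Proxy through the Envoy Metrics Service.",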
functions: "Functions",
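functions_desc:
"FaaS (Function-as-a-Service) is a type of cloud computing service that lets you execute code in response to events without the complex infrastructure typically associated with building and launching microservices applications.",
functions_openfunction: "OpenFunction",
functions_openfunction_desc: "OpenFunction, as a FaaS platform, provides out-of-the-box observability through SkyWalking integration.",
kubernetes: "Kubernetes",
kubernetes_desc: "Kubernetes is an open-source container orchestration system for automating software deployment, scaling, and management.",
kubernetes_cluster: "Cluster",
kubernetes_cluster_desc: "Monitor the status and resources of the K8S cluster.",
kubernetes_service: "Services",
kubernetes_service_desc: "Observe service status and resources from Kubernetes.",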
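infrastructure: "Infrastructure",
infrastructure_desc: "The operating system is the infrastructure of the whole IT system. Its observability provides the foundation for running all distributed and modern complex systems.",
infrastructure_linux: "Linux",
infrastructure_linux_desc: "Provide Linux operating system (OS) monitoring.",
infrastructure_windows: "Windows",
infrastructure_windows_desc: "Provide Windows operating system (OS) monitoring.",
aws_cloud: "AWS Cloud",
aws_cloud_desc: "Amazon Web Services (AWS) offers reliable, scalable, and inexpensive cloud computing services.",
aws_cloud_eks: "EKS",
aws_cloud_eks_desc: "Provide AWS Cloud EKS monitoring through the AWS Container Insights Receiver.",
aws_cloud_s3: "S3",
aws_cloud_s3_desc: "Provide AWS Cloud S3 monitoring through the AWS FireHose Receiver.",
aws_cloud_dynamodb: "DynamoDB",
aws_cloud_dynamodb_desc: "Provide DynamoDB monitoring through the AWS FireHose Receiver.",
aws_cloud_api_gateway: "API Gateway",
aws_cloud_api_gateway_desc: "Provide AWS Cloud API Gateway monitoring through the AWS FireHose Receiver.",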
browser: "Browser",
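browser_desc: "Provide browser-side monitoring of web applications, versions, and pages through Apache SkyWalking Client JS.",
gateway: "Gateway",
gateway_desc: "An API gateway is an API management tool that sits between clients and a collection of backend services.",
gateway_apisix: "APISIX",
gateway_apisix_desc: "Provide APISIX monitoring through OpenTelemetry's Prometheus receiver.",
gateway_aws_api_gateway: "AWS API Gateway",
gateway_aws_api_gateway_desc: "Provide AWS Cloud API Gateway monitoring through the AWS FireHose Receiver.",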
database: "Database",
database_desc: "A database is an organized collection of structured information, or data, typically stored electronically in a computer system.",
database_mysql_mariadb: "MySQL/MariaDB",
database_mysql_mariadb_desc: "Provide MySQL and MariaDB server monitoring through OpenTelemetry's Prometheus receiver.",
database_postgresql: "PostgreSQL",
database_postgresql_desc: "Provide PostgreSQL monitoring through OpenTelemetry's Prometheus receiver.",
database_dynamodb: "DynamoDB",
database_dynamodb_desc: "Provide DynamoDB monitoring through the AWS FireHose Receiver.",
database_redis: "Redis",
database_redis_desc: "Provide Redis monitoring through OpenTelemetry's Prometheus receiver.",
database_elasticsearch: "Elasticsearch",
database_elasticsearch_desc: "Provide Elasticsearch server monitoring through OpenTelemetry's Prometheus receiver.",
database_mongodb: "MongoDB",
database_mongodb_desc: "Provide MongoDB monitoring through OpenTelemetry's Prometheus receiver.",
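mq: "Message Queue",
mq_desc: "A message queue is a form of asynchronous service-to-service communication used in serverless and microservices architectures.",
mq_rabbitmq: "RabbitMQ",
mq_rabbitmq_desc: "Provide RabbitMQ monitoring through OpenTelemetry's Prometheus receiver.",
self_observability: "Self Observability",
self_observability_desc: "Self observability provides observability for the components and servers running in the SkyWalking ecosystem.",
self_observability_oap: "SkyWalking Server",
self_observability_oap_desc: "The OAP backend cluster is itself a distributed stream-processing system. This is the monitoring of the OAP backend itself.",
self_observability_satellite: "Satellite",
self_observability_satellite_desc:
"Satellite: an open-source agent designed for cloud-native infrastructure, providing a low-cost, efficient, and more secure way to collect telemetry data. It is the recommended load balancer for telemetry collection.",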
}