Kafka uses Yammer Metrics to record its JMX data.
<dependency>
    <groupId>com.yammer.metrics</groupId>
    <artifactId>metrics-core</artifactId>
    <version>2.2.0</version>
</dependency>
https://www.cnblogs.com/caizhenghui/p/9132414.html
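In theory this approach is not specific to Kafka. It applies to any Java process that exposes a JMX port and can be started with a Java agent.
We need JMX Exporter running as a Java agent to get more complete internal data. Accordingly, the startup command has to be changed to the following form.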
java -javaagent:./jmx_prometheus_javaagent-0.3.1.jar=8080:config.yaml -jar yourJar.jar
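The agent argument in that command has three parts: the agent jar, the port the exporter listens on, and the YAML config file. Here is a minimal sketch that assembles it. The function name `agent_arg` and the `/opt/jmx/...` paths are made up for illustration. The Kafka variant assumes the broker is launched through the stock start scripts, which read extra JVM flags from the `KAFKA_OPTS` environment variable, and it uses port 9990, the exporter port used later in this post.

```python
def agent_arg(agent_jar, port, config_path):
    """Build the -javaagent argument in the form <jar>=<port>:<config>."""
    return f"-javaagent:{agent_jar}={port}:{config_path}"

# The generic command from the text above.
generic = [
    "java",
    agent_arg("./jmx_prometheus_javaagent-0.3.1.jar", 8080, "config.yaml"),
    "-jar",
    "yourJar.jar",
]
print(" ".join(generic))

# Hypothetical Kafka variant: pass the same argument through KAFKA_OPTS.
# The paths are placeholders, not taken from the original setup.
kafka_env = {
    "KAFKA_OPTS": agent_arg(
        "/opt/jmx/jmx_prometheus_javaagent-0.3.1.jar", 9990, "/opt/jmx/kafka.yaml"
    )
}
print(kafka_env["KAFKA_OPTS"])
```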
An example configuration provided by the official project is
---
startDelaySeconds: 0
hostPort: 127.0.0.1:1234
username:
password:
jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:1234/jmxrmi
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
whitelistObjectNames: ["org.apache.cassandra.metrics:*"]
blacklistObjectNames: ["org.apache.cassandra.metrics:type=ColumnFamily,*"]
rules:
  - pattern: 'org.apache.cassandra.metrics<type=(\w+), name=(\w+)><>Value: (\d+)'
    name: cassandra_$1_$2
    value: $3
    valueFactor: 0.001
    labels: {}
    help: "Cassandra metric $1 $2"
    type: GAUGE
    attrNameSnakeCase: false
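To make the rule above more concrete, here is a rough sketch of what a rule does to one MBean attribute. It assumes the pattern captures the `type` and `name` bean properties as `$1` and `$2` and the attribute value as `$3`, and that the exporter matches against a string of the form `domain<key=value, ...><>attribute: value`. The sample input line is invented, and this only mimics the behaviour; it is not the exporter's actual code.

```python
import re

# Rule fields copied from the example config. Assumption: the two bean
# properties "type" and "name" are the first two capture groups.
PATTERN = re.compile(
    r"org.apache.cassandra.metrics<type=(\w+), name=(\w+)><>Value: (\d+)"
)
NAME_TEMPLATE = "cassandra_$1_$2"
VALUE_TEMPLATE = "$3"
VALUE_FACTOR = 0.001


def substitute(template, match):
    """Replace $1, $2, ... in the template with the matching capture groups."""
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), template)


def apply_rule(line):
    """Return (metric_name, value) if the rule matches the line, else None."""
    match = PATTERN.match(line)
    if match is None:
        return None
    name = substitute(NAME_TEMPLATE, match)
    value = float(substitute(VALUE_TEMPLATE, match)) * VALUE_FACTOR
    return name, value


# Invented sample input, for illustration only.
sample = "org.apache.cassandra.metrics<type=Cache, name=Hits><>Value: 1500"
print(apply_rule(sample))
```

With `valueFactor: 0.001`, a raw value of 1500 (say, milliseconds) is exported as 1.5 (seconds).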
The configuration file we actually use to monitor Kafka is much simpler
hostPort: 127.0.0.1:9095
lowercaseOutputName: true
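If whitelistObjectNames is not configured, all MBeans are exported by default. If blacklistObjectNames is not set, it defaults to empty.
We can get every available metric by querying the port that JMX Exporter exposes. Here 9095 is the Kafka process's own JMX port, and JMX Exporter republishes the monitoring data on port 9990. Query port 9990 directly: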
curl http://localhost:9990/metrics
The output is as follows
# HELP jmx_config_reload_failure_total Number of times configuration have failed to be reloaded.
# TYPE jmx_config_reload_failure_total counter
jmx_config_reload_failure_total 0.0
# HELP jvm_buffer_pool_used_bytes Used bytes of a given JVM buffer pool.
# TYPE jvm_buffer_pool_used_bytes gauge
jvm_buffer_pool_used_bytes{pool="direct",} 5319596.0
jvm_buffer_pool_used_bytes{pool="mapped",} 2.39075298E9
# HELP jvm_buffer_pool_capacity_bytes Bytes capacity of a given JVM buffer pool.
# TYPE jvm_buffer_pool_capacity_bytes gauge
jvm_buffer_pool_capacity_bytes{pool="direct",} 5319596.0
jvm_buffer_pool_capacity_bytes{pool="mapped",} 2.39075298E9
# HELP jvm_buffer_pool_used_buffers Used buffers of a given JVM buffer pool.
# TYPE jvm_buffer_pool_used_buffers gauge
jvm_buffer_pool_used_buffers{pool="direct",} 26.0
jvm_buffer_pool_used_buffers{pool="mapped",} 241.0
# HELP jvm_info JVM version info
# TYPE jvm_info gauge
jvm_info{version="1.8.0_102-b14",vendor="Oracle Corporation",runtime="Java(TM) SE Runtime Environment",} 1.0
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 43.65
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.547018059742E9
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 338.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 65535.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 6.16871936E9
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 9.24340224E8
# HELP jvm_gc_collection_seconds Time spent in a given JVM garbage collector in seconds.
# TYPE jvm_gc_collection_seconds summary
jvm_gc_collection_seconds_count{gc="G1 Young Generation",} 35.0
jvm_gc_collection_seconds_sum{gc="G1 Young Generation",} 0.57
jvm_gc_collection_seconds_count{gc="G1 Old Generation",} 0.0
jvm_gc_collection_seconds_sum{gc="G1 Old Generation",} 0.0
# HELP jvm_threads_current Current thread count of a JVM
# TYPE jvm_threads_current gauge
jvm_threads_current 69.0
# HELP jvm_threads_daemon Daemon thread count of a JVM
# TYPE jvm_threads_daemon gauge
jvm_threads_daemon 48.0
# HELP jvm_threads_peak Peak thread count of a JVM
# TYPE jvm_threads_peak gauge
jvm_threads_peak 70.0
# HELP jvm_threads_started_total Started thread count of a JVM
# TYPE jvm_threads_started_total counter
jvm_threads_started_total 84.0
# HELP jvm_threads_deadlocked Cycles of JVM-threads that are in deadlock waiting to acquire object monitors or ownable synchronizers
# TYPE jvm_threads_deadlocked gauge
jvm_threads_deadlocked 0.0
# HELP jvm_threads_deadlocked_monitor Cycles of JVM-threads that are in deadlock waiting to acquire object monitors
# TYPE jvm_threads_deadlocked_monitor gauge
jvm_threads_deadlocked_monitor 0.0
# HELP jvm_classes_loaded The number of classes that are currently loaded in the JVM
# TYPE jvm_classes_loaded gauge
jvm_classes_loaded 5240.0
# HELP jvm_classes_loaded_total The total number of classes that have been loaded since the JVM has started execution
# TYPE jvm_classes_loaded_total counter
jvm_classes_loaded_total 5240.0
# HELP jvm_classes_unloaded_total The total number of classes that have been unloaded since the JVM has started execution
# TYPE jvm_classes_unloaded_total counter
jvm_classes_unloaded_total 0.0
# HELP kafka_server_replicamanager_value Attribute exposed for management (kafka.server<>Value)
# TYPE kafka_server_replicamanager_value untyped
kafka_server_replicamanager_value{name="UnderReplicatedPartitions",} 0.0
# HELP kafka_server_brokertopicmetrics_count Attribute exposed for management (kafka.server<>Count)
# TYPE kafka_server_brokertopicmetrics_count untyped
kafka_server_brokertopicmetrics_count{name="BytesInPerSec",} 502099.0
kafka_server_brokertopicmetrics_count{name="MessagesInPerSec",} 4206.0
# HELP kafka_server_brokertopicmetrics_oneminuterate Attribute exposed for management (kafka.server<>OneMinuteRate)
# TYPE kafka_server_brokertopicmetrics_oneminuterate untyped
kafka_server_brokertopicmetrics_oneminuterate{name="BytesInPerSec",} 5438.520538936058
kafka_server_brokertopicmetrics_oneminuterate{name="MessagesInPerSec",} 45.54372300762534
# HELP kafka_server_brokertopicmetrics_meanrate Attribute exposed for management (kafka.server<>MeanRate)
# TYPE kafka_server_brokertopicmetrics_meanrate untyped
kafka_server_brokertopicmetrics_meanrate{name="BytesInPerSec",} 8391.850324711393
kafka_server_brokertopicmetrics_meanrate{name="MessagesInPerSec",} 70.28899468899631
# HELP kafka_controller_kafkacontroller_value Attribute exposed for management (kafka.controller<>Value)
# TYPE kafka_controller_kafkacontroller_value untyped
kafka_controller_kafkacontroller_value{name="OfflinePartitionsCount",} 0.0
kafka_controller_kafkacontroller_value{name="ActiveControllerCount",} 0.0
# HELP kafka_server_replicafetchermanager_value Attribute exposed for management (kafka.server<>Value)
# TYPE kafka_server_replicafetchermanager_value untyped
kafka_server_replicafetchermanager_value{name="MaxLag",clientId="Replica",} 0.0
# HELP kafka_server_brokertopicmetrics_fifteenminuterate Attribute exposed for management (kafka.server<>FifteenMinuteRate)
# TYPE kafka_server_brokertopicmetrics_fifteenminuterate untyped
kafka_server_brokertopicmetrics_fifteenminuterate{name="BytesInPerSec",} 488.174726685667
kafka_server_brokertopicmetrics_fifteenminuterate{name="MessagesInPerSec",} 4.0869490140718785
# HELP kafka_server_brokertopicmetrics_fiveminuterate Attribute exposed for management (kafka.server<>FiveMinuteRate)
# TYPE kafka_server_brokertopicmetrics_fiveminuterate untyped
kafka_server_brokertopicmetrics_fiveminuterate{name="BytesInPerSec",} 1400.7117556545422
kafka_server_brokertopicmetrics_fiveminuterate{name="MessagesInPerSec",} 11.727110166511496
# HELP jmx_scrape_duration_seconds Time this JMX scrape took, in seconds.
# TYPE jmx_scrape_duration_seconds gauge
jmx_scrape_duration_seconds 0.018793424
# HELP jmx_scrape_error Non-zero if this scrape failed.
# TYPE jmx_scrape_error gauge
jmx_scrape_error 0.0
# HELP jvm_memory_bytes_used Used bytes of a given JVM memory area.
# TYPE jvm_memory_bytes_used gauge
jvm_memory_bytes_used{area="heap",} 6.28849256E8
jvm_memory_bytes_used{area="nonheap",} 5.4556888E7
# HELP jvm_memory_bytes_committed Committed (bytes) of a given JVM memory area.
# TYPE jvm_memory_bytes_committed gauge
jvm_memory_bytes_committed{area="heap",} 1.073741824E9
jvm_memory_bytes_committed{area="nonheap",} 5.57056E7
# HELP jvm_memory_bytes_max Max (bytes) of a given JVM memory area.
# TYPE jvm_memory_bytes_max gauge
jvm_memory_bytes_max{area="heap",} 1.073741824E9
jvm_memory_bytes_max{area="nonheap",} -1.0
# HELP jvm_memory_bytes_init Initial bytes of a given JVM memory area.
# TYPE jvm_memory_bytes_init gauge
jvm_memory_bytes_init{area="heap",} 1.073741824E9
jvm_memory_bytes_init{area="nonheap",} 2555904.0
# HELP jvm_memory_pool_bytes_used Used bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_used gauge
jvm_memory_pool_bytes_used{pool="Code Cache",} 1.7908864E7
jvm_memory_pool_bytes_used{pool="Metaspace",} 3.2436808E7
jvm_memory_pool_bytes_used{pool="Compressed Class Space",} 4211216.0
jvm_memory_pool_bytes_used{pool="G1 Eden Space",} 4.64519168E8
jvm_memory_pool_bytes_used{pool="G1 Survivor Space",} 3145728.0
jvm_memory_pool_bytes_used{pool="G1 Old Gen",} 1.6118436E8
# HELP jvm_memory_pool_bytes_committed Committed bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_committed gauge
jvm_memory_pool_bytes_committed{pool="Code Cache",} 1.8481152E7
jvm_memory_pool_bytes_committed{pool="Metaspace",} 3.2899072E7
jvm_memory_pool_bytes_committed{pool="Compressed Class Space",} 4325376.0
jvm_memory_pool_bytes_committed{pool="G1 Eden Space",} 6.73185792E8
jvm_memory_pool_bytes_committed{pool="G1 Survivor Space",} 3145728.0
jvm_memory_pool_bytes_committed{pool="G1 Old Gen",} 3.97410304E8
# HELP jvm_memory_pool_bytes_max Max bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_max gauge
jvm_memory_pool_bytes_max{pool="Code Cache",} 2.5165824E8
jvm_memory_pool_bytes_max{pool="Metaspace",} -1.0
jvm_memory_pool_bytes_max{pool="Compressed Class Space",} 1.073741824E9
jvm_memory_pool_bytes_max{pool="G1 Eden Space",} -1.0
jvm_memory_pool_bytes_max{pool="G1 Survivor Space",} -1.0
jvm_memory_pool_bytes_max{pool="G1 Old Gen",} 1.073741824E9
# HELP jvm_memory_pool_bytes_init Initial bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_init gauge
jvm_memory_pool_bytes_init{pool="Code Cache",} 2555904.0
jvm_memory_pool_bytes_init{pool="Metaspace",} 0.0
jvm_memory_pool_bytes_init{pool="Compressed Class Space",} 0.0
jvm_memory_pool_bytes_init{pool="G1 Eden Space",} 5.6623104E7
jvm_memory_pool_bytes_init{pool="G1 Survivor Space",} 0.0
jvm_memory_pool_bytes_init{pool="G1 Old Gen",} 1.01711872E9
# HELP jmx_config_reload_success_total Number of times configuration have successfully been reloaded.
# TYPE jmx_config_reload_success_total counter
jmx_config_reload_success_total 0.0
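The output above is in the Prometheus text exposition format: `# HELP` and `# TYPE` comment lines, followed by sample lines of the form `name{labels} value`. If you want to check a few values by hand before wiring up Prometheus, a small parser like the sketch below is enough. It is a simplification: it does not unescape label values and it skips lines that carry an optional timestamp. The function name `parse_metrics` is my own.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
# One label pair inside the braces, e.g. name="BytesInPerSec".
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Yield (name, labels, value) for every sample line, skipping comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        m = SAMPLE_RE.match(line)
        if m is None:
            continue  # not a plain "name{labels} value" line
        name, raw_labels, value = m.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        yield name, labels, float(value)


# A few lines copied from the output above.
text = """\
# TYPE kafka_server_replicamanager_value untyped
kafka_server_replicamanager_value{name="UnderReplicatedPartitions",} 0.0
kafka_server_brokertopicmetrics_count{name="BytesInPerSec",} 502099.0
jvm_threads_current 69.0
jvm_memory_pool_bytes_max{pool="G1 Eden Space",} -1.0
"""

for name, labels, value in parse_metrics(text):
    print(name, labels, value)
```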
https://prometheus.io/
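See the site above for download and installation. Prometheus is essentially a data store built specifically for monitoring: it actively pulls data, stores it, and can trigger alerts from it.
The central concept in Prometheus is the metric, roughly comparable to a table in Oracle. A single metric may contain multiple time series. Timestamps are in GMT (UTC) by default.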
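One way to picture the relationship between a metric and its time series: a series is identified by the metric name plus its full set of labels, so one metric name with two different label values yields two series. The sketch below is only a toy model of that idea, not how Prometheus stores data. The timestamps and the third sample value are made up; the other two values come from the exporter output earlier in this post.

```python
from collections import defaultdict


def series_key(name, labels):
    """A time series is identified by the metric name plus its label set."""
    return (name, tuple(sorted(labels.items())))


# series key -> list of (timestamp, value) samples
storage = defaultdict(list)


def record(name, labels, timestamp, value):
    storage[series_key(name, labels)].append((timestamp, value))


metric = "kafka_server_brokertopicmetrics_count"
record(metric, {"name": "BytesInPerSec"}, 1547018060, 502099.0)
record(metric, {"name": "MessagesInPerSec"}, 1547018060, 4206.0)
# Hypothetical second scrape 15 seconds later (value invented).
record(metric, {"name": "BytesInPerSec"}, 1547018075, 583676.0)

series_for_metric = [key for key in storage if key[0] == metric]
print(len(series_for_metric))  # one metric, two time series
```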
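The startup command is as follows. The default port is 9090; the --web.listen-address flag here overrides it to 8080.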
./prometheus --config.file=prometheus.yml --web.listen-address=:8080
My configuration file is as follows
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['10.237.78.22:9990','10.237.78.21:9990','10.237.78.10:9990']
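With this config, each target is combined with the default scheme and metrics path into a scrape URL, and every scraped series gets a `job` label plus an `instance` label that defaults to the target address. A small sketch of that mapping, using the three targets from my config; the function name `scrape_plan` is made up, and the sketch only builds the URLs rather than fetching them.

```python
def scrape_plan(job_name, targets, scheme="http", metrics_path="/metrics"):
    """Return one entry per target: the URL to scrape and the labels attached."""
    plan = []
    for target in targets:
        plan.append({
            "url": f"{scheme}://{target}{metrics_path}",
            # instance defaults to the target's host:port
            "labels": {"job": job_name, "instance": target},
        })
    return plan


targets = ["10.237.78.22:9990", "10.237.78.21:9990", "10.237.78.10:9990"]
plan = scrape_plan("prometheus", targets)
for entry in plan:
    print(entry["url"], entry["labels"])
```

Each URL is the same endpoint that the earlier curl command queried on localhost, just addressed per broker.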
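Once it has started, we can open it in the web UI.
My Prometheus instance was set up a long time ago and contained some unrelated data, and I did not know how to delete specific metrics. So I simply deleted everything under the data directory and restarted.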
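For what it's worth, Prometheus 2.x has a TSDB admin API that can delete selected series without wiping the data directory. I did not use it here, so treat the following as an untested sketch. It assumes Prometheus was started with `--web.enable-admin-api` and is listening on port 8080 as in my startup command; the selector `{job="prometheus"}` is only an example. The code builds the request but does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def delete_series_request(base_url, matchers):
    """Build a POST request for the delete_series admin endpoint."""
    # Each series selector is passed as a repeated match[] parameter.
    query = urlencode([("match[]", m) for m in matchers])
    url = f"{base_url}/api/v1/admin/tsdb/delete_series?{query}"
    return Request(url, method="POST")


req = delete_series_request("http://localhost:8080", ['{job="prometheus"}'])
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Deleted series are only marked with tombstones at first; the `clean_tombstones` endpoint of the same admin API frees the disk space.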
Grafana seems to start only with root privileges. As for its default port, the only way I know to change it is by modifying the firewall...