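Setting Up System Resource and Kafka Monitoring with Prometheus

What is Prometheus?

Prometheus is an open-source monitoring and alerting system with a built-in time series database (TSDB), originally developed at SoundCloud. It is written in Go and was inspired by Google's Borgmon monitoring system.
In 2016 Prometheus joined the Cloud Native Computing Foundation (CNCF), a Linux Foundation project, as its second hosted project after Kubernetes.
Prometheus is currently very active in the open-source community.
Compared with Heapster (a Kubernetes subproject used to collect cluster performance data), Prometheus is more complete and covers more ground. Its performance is also sufficient for clusters of more than ten thousand machines.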

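Features of Prometheus

  • A multi-dimensional data model.
  • A flexible query language (PromQL).
  • No reliance on distributed storage; each server node is autonomous.
  • Time series are collected with a pull model over HTTP.
  • Time series can also be pushed through an intermediary gateway.
  • Targets are found through service discovery or static configuration.
  • Many kinds of graphing and dashboarding are supported, for example Grafana.

Official site: https://prometheus.io/

Architecture diagram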

image
image

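Basic principle

Prometheus works by periodically scraping the state of monitored components over HTTP. Any component that exposes a suitable HTTP endpoint can be monitored, with no SDK or other integration work required. This makes it well suited to monitoring virtualized environments such as VMs, Docker, and Kubernetes. The program that exposes a component's metrics over such an HTTP endpoint is called an exporter. Ready-made exporters exist for most components commonly used at internet companies, such as Varnish, HAProxy, Nginx, MySQL, and Linux system information (disk, memory, CPU, network, and so on).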
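
To make the idea of an exporter concrete, here is a minimal sketch in Python using only the standard library. It is not one of the exporters used later in this tutorial: the metric name demo_scrapes_total and the helper names are made up for illustration. It only shows that an exporter is, at its core, an HTTP endpoint (conventionally /metrics) that returns plain text in the Prometheus exposition format.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical metric name, used only for this illustration.
METRIC_NAME = "demo_scrapes_total"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves one counter in the Prometheus text exposition format."""

    scrape_count = 0  # shared across requests; incremented on every scrape

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        MetricsHandler.scrape_count += 1
        body = (
            f"# HELP {METRIC_NAME} Number of scrapes served.\n"
            f"# TYPE {METRIC_NAME} counter\n"
            f"{METRIC_NAME} {MetricsHandler.scrape_count}\n"
        ).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


def start_exporter(port=0):
    """Starts the exporter on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def scrape(server):
    """Fetches /metrics once, roughly what a Prometheus scrape does."""
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        conn.request("GET", "/metrics")
        return conn.getresponse().read().decode("utf-8")
    finally:
        conn.close()
```

Calling start_exporter() and then scrape(server) returns the three-line metrics text; pointing a Prometheus scrape job at the same host and port would collect the counter on every scrape interval.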

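How the components work together

  • The Prometheus daemon periodically scrapes metrics from its targets. Each target has to expose an HTTP endpoint for Prometheus to scrape. Targets can be specified through the configuration file, text files, ZooKeeper, Consul, DNS SRV lookup, and other mechanisms. Prometheus monitors by pulling: the server either pulls metrics directly from a target or pulls them indirectly from an intermediary gateway to which the data has been pushed.
  • Prometheus stores all scraped data locally and runs rules over it to clean up and aggregate the data, storing the results as new time series.
  • Prometheus exposes the collected data through PromQL and its other APIs so that it can be visualized. Several visualization options exist, for example Grafana, the now-deprecated PromDash, and the console template engine that ships with Prometheus. The HTTP API can also be queried directly to produce custom output.
  • PushGateway lets clients actively push metrics to it, while Prometheus simply scrapes the gateway on its regular schedule.
  • Alertmanager is a component separate from the Prometheus server. Alerting rules are written in PromQL and evaluated by Prometheus; the alerts they fire are sent to Alertmanager, which groups, deduplicates, and routes them, giving very flexible notification options.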
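
As a rough illustration of the HTTP API mentioned above, the sketch below builds the URL for an instant query against /api/v1/query and parses the JSON that such a query returns. The function names and the sample response are assumptions made for this example; the sample only mirrors the documented response shape and was not captured from the server in this tutorial.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Returns the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(payload):
    """Turns an instant-vector response into (labels, value) pairs."""
    doc = json.loads(payload)
    if doc.get("status") != "success":
        raise ValueError(f"query failed: {doc.get('error', 'unknown error')}")
    data = doc["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample's value is [unix_timestamp, "value as string"].
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]


# Hypothetical response for the query `up`, shaped like the real API output.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "up", "instance": "localhost:9090", "job": "prometheus"},
                "value": [1605000000.0, "1"],
            }
        ],
    },
})
```

In a live setup the URL from build_query_url would be fetched with any HTTP client and the body passed to parse_instant_vector; the fetch is left out here so the sketch stays runnable without a server.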

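What this tutorial covers

  • 1. Installing the Prometheus server
  • 2. Installing and using node_exporter and a Kafka exporter (kafka_exporter or jmx_prometheus_javaagent) to monitor Linux system resources and Kafka metrics
  • 3. Using Grafana

I. Installing Prometheus

  • Download prometheus-2.18.1.linux-amd64.tar.gz from the official site and extract it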
[root@master prometheus]# ll
total 148584
drwxr-xr-x  2 3434 3434     4096 May  8  2020 console_libraries
drwxr-xr-x  2 3434 3434     4096 May  8  2020 consoles
drwxr-xr-x 16 root root     4096 Nov 10 13:00 data
drwxr-xr-x  2 root root     4096 Nov  4 16:28 lib
-rw-r--r--  1 3434 3434    11357 May  8  2020 LICENSE
-rw-r--r--  1 3434 3434     3184 May  8  2020 NOTICE
-rwxr-xr-x  1 3434 3434 87173843 May  8  2020 prometheus
-rw-r--r--  1 3434 3434     1209 Nov  4 16:41 prometheus.yml
-rwxr-xr-x  1 3434 3434 49973547 May  8  2020 promtool
-rwxr-xr-x  1 3434 3434 14957614 May  8  2020 tsdb
  • Edit the configuration file prometheus.yml

The configuration is as follows:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['localhost:9090']
  • Run Prometheus
./prometheus

Open http://10.4.4.16:9090/graph. If the page shown below appears, the installation succeeded.

image.png

II. Installing and using exporters

  1. Install and run node_exporter
  • Download node_exporter-1.0.1.linux-amd64.tar.gz and extract it
[root@master node_exporter]# ll
total 19220
-rw-r--r-- 1 3434 3434    11357 Jun 16 21:19 LICENSE
-rwxr-xr-x 1 3434 3434 19657731 Jun 16 20:44 node_exporter
-rw------- 1 root root       47 Sep 30 03:02 nohup.out
-rw-r--r-- 1 3434 3434      463 Jun 16 21:19 NOTICE
  • Run it
./node_exporter
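
Once node_exporter is running, its /metrics endpoint (port 9100 by default, the port used for the linux job later in this tutorial) returns plain text. The sketch below parses a small, hand-written sample of that text and derives the fraction of memory in use. The sample values are invented; the two metric names are the ones node_exporter exposes on Linux, but it is worth checking them against your own /metrics output.

```python
def parse_unlabeled_samples(text):
    """Parses 'name value' lines, skipping comments and labeled samples."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) >= 2:
            samples[parts[0]] = float(parts[1])
    return samples


def memory_used_fraction(samples):
    """Share of memory that is not available, between 0 and 1."""
    total = samples["node_memory_MemTotal_bytes"]
    available = samples["node_memory_MemAvailable_bytes"]
    return 1.0 - available / total


# Hand-written sample in the exposition format; the values are made up.
SAMPLE_NODE_METRICS = """\
# HELP node_memory_MemTotal_bytes Memory information field MemTotal_bytes.
# TYPE node_memory_MemTotal_bytes gauge
node_memory_MemTotal_bytes 1.6e+10
# HELP node_memory_MemAvailable_bytes Memory information field MemAvailable_bytes.
# TYPE node_memory_MemAvailable_bytes gauge
node_memory_MemAvailable_bytes 4e+09
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.6
"""
```

With the sample above, memory_used_fraction(parse_unlabeled_samples(SAMPLE_NODE_METRICS)) evaluates to 0.75. In Prometheus itself the same figure would normally be computed with a PromQL expression over these two series rather than in client code.
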
  2. Install and run a Kafka exporter

Option 1: non-intrusive Kafka monitoring (recommended)

  • Download kafka_exporter-1.2.0.linux-amd64.tar.gz and extract it
[root@master kafka_exporter]# ll
total 13276
-rwxr-xr-x 1 2000 2000 13578776 Jul  7  2018 kafka_exporter
-rw-rw-r-- 1 2000 2000    11357 Jul  7  2018 LICENSE
  • Run it
./kafka_exporter --kafka.server=127.0.0.1:9092
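
kafka_exporter serves its metrics on port 9308 by default, which is the port scraped later in this tutorial. As a sketch of what can be done with that output, the code below sums consumer lag per consumer group from labeled samples. The metric name kafka_consumergroup_lag and its consumergroup label are taken from the kafka_exporter README and should be confirmed against your version's /metrics output; the group and topic names in the sample are invented.

```python
import re
from collections import defaultdict

# name{labels} value, with an optional trailing timestamp.
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)(?:\s+\d+)?$'
)
# key="value" pairs; escaped characters inside the value are tolerated.
LABEL_PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def lag_by_group(text, metric="kafka_consumergroup_lag"):
    """Sums the given metric over topics and partitions per consumer group."""
    totals = defaultdict(float)
    for line in text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if not match or match.group("name") != metric:
            continue
        labels = dict(LABEL_PAIR.findall(match.group("labels")))
        totals[labels.get("consumergroup", "")] += float(match.group("value"))
    return dict(totals)


# Hand-written sample; group and topic names are hypothetical.
SAMPLE_KAFKA_METRICS = """\
# TYPE kafka_consumergroup_lag gauge
kafka_consumergroup_lag{consumergroup="demo-group",partition="0",topic="demo-topic"} 5
kafka_consumergroup_lag{consumergroup="demo-group",partition="1",topic="demo-topic"} 7
kafka_consumergroup_lag{consumergroup="other-group",partition="0",topic="demo-topic"} 0
kafka_brokers 1
"""
```

For the sample, lag_by_group returns a total of 12 for demo-group and 0 for other-group. Inside Prometheus the equivalent aggregation would usually be written in PromQL, summing the lag metric by consumergroup.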

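Option 2: intrusive Kafka monitoring with a JMX exporter (more involved; can be skipped)

Install and run jmx_prometheus_javaagent

  • Download the jar file jmx_prometheus_javaagent-0.14.0.jar and the configuration file kafka-0-8-2.yml. This tutorial places them in the /opt/prometheus/lib directory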
[root@master prometheus]# ll
total 148584
drwxr-xr-x  2 3434 3434     4096 May  8  2020 console_libraries
drwxr-xr-x  2 3434 3434     4096 May  8  2020 consoles
drwxr-xr-x 17 root root     4096 Nov 10 15:00 data
drwxr-xr-x  2 root root     4096 Nov  4 16:28 lib
-rw-r--r--  1 3434 3434    11357 May  8  2020 LICENSE
-rw-r--r--  1 3434 3434     3184 May  8  2020 NOTICE
-rwxr-xr-x  1 3434 3434 87173843 May  8  2020 prometheus
-rw-r--r--  1 3434 3434     1209 Nov  4 16:41 prometheus.yml
-rwxr-xr-x  1 3434 3434 49973547 May  8  2020 promtool
-rwxr-xr-x  1 3434 3434 14957614 May  8  2020 tsdb
[root@master prometheus]# ll lib
total 412
-rw-r--r-- 1 root root 413862 Nov  4 16:25 jmx_prometheus_javaagent-0.14.0.jar
-rw-r--r-- 1 root root   2820 Nov  4 16:28 kafka-0-8-2.yml
  • Edit the Kafka startup script and add the following line
export KAFKA_OPTS="$KAFKA_OPTS -javaagent:/opt/prometheus/lib/jmx_prometheus_javaagent-0.14.0.jar=7071:/opt/prometheus/lib/kafka-0-8-2.yml"
  • Start Kafka and check that the -javaagent option appears in the process command line
[root@master prometheus]# ps axu|grep kafka
root     16400  0.0  0.0 103320   896 pts/2    S+   15:08   0:00 grep kafka
root     26565  3.9 34.4 13512408 5680568 pts/3 Sl  Nov04 341:41 /opt/local/jdk/bin/java -Xmx6G -Xms2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC -Djava.awt.headless=true -Xloggc:/data/logs/kafka/kafkaServer-gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dkafka.logs.dir=/data/logs/kafka -Dlog4j.configuration=file:bin/../config/log4j.properties -cp :/opt/local/kafka/bin/../libs/aopalliance-repackaged-2.5.0-b05.jar:/opt/local/kafka/bin/../libs/argparse4j-0.7.0.jar:/opt/local/kafka/bin/../libs/connect-api-0.10.2.1.jar:/opt/local/kafka/bin/../libs/connect-file-0.10.2.1.jar:/opt/local/kafka/bin/../libs/connect-json-0.10.2.1.jar:/opt/local/kafka/bin/../libs/connect-runtime-0.10.2.1.jar:/opt/local/kafka/bin/../libs/connect-transforms-0.10.2.1.jar:/opt/local/kafka/bin/../libs/guava-18.0.jar:/opt/local/kafka/bin/../libs/hk2-api-2.5.0-b05.jar:/opt/local/kafka/bin/../libs/hk2-locator-2.5.0-b05.jar:/opt/local/kafka/bin/../libs/hk2-utils-2.5.0-b05.jar:/opt/local/kafka/bin/../libs/jackson-annotations-2.8.0.jar:/opt/local/kafka/bin/../libs/jackson-annotations-2.8.5.jar:/opt/local/kafka/bin/../libs/jackson-core-2.8.5.jar:/opt/local/kafka/bin/../libs/jackson-databind-2.8.5.jar:/opt/local/kafka/bin/../libs/jackson-jaxrs-base-2.8.5.jar:/opt/local/kafka/bin/../libs/jackson-jaxrs-json-provider-2.8.5.jar:/opt/local/kafka/bin/../libs/jackson-module-jaxb-annotations-2.8.5.jar:/opt/local/kafka/bin/../libs/javassist-3.20.0-GA.jar:/opt/local/kafka/bin/../libs/javax.annotation-api-1.2.jar:/opt/local/kafka/bin/../libs/javax.inject-1.jar:/opt/local/kafka/bin/../libs/javax.inject-2.5.0-b05.jar:/opt/local/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/opt/local/kafka/bin/../libs/javax.ws.rs-api-2.0.1.jar:/opt/local/kafka/bin/../libs/jersey-client-2.24.jar:/opt/local/kafka/bin/.
./libs/jersey-common-2.24.jar:/opt/local/kafka/bin/../libs/jersey-container-servlet-2.24.jar:/opt/local/kafka/bin/../libs/jersey-container-servlet-core-2.24.jar:/opt/local/kafka/bin/../libs/jersey-guava-2.24.jar:/opt/local/kafka/bin/../libs/jersey-media-jaxb-2.24.jar:/opt/local/kafka/bin/../libs/jersey-server-2.24.jar:/opt/local/kafka/bin/../libs/jetty-continuation-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-http-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-io-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-security-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-server-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jetty-util-9.2.15.v20160210.jar:/opt/local/kafka/bin/../libs/jopt-simple-5.0.3.jar:/opt/local/kafka/bin/../libs/kafka_2.11-0.10.2.1.jar:/opt/local/kafka/bin/../libs/kafka_2.11-0.10.2.1-sources.jar:/opt/local/kafka/bin/../libs/kafka_2.11-0.10.2.1-test-sources.jar:/opt/local/kafka/bin/../libs/kafka-clients-0.10.2.1.jar:/opt/local/kafka/bin/../libs/kafka-log4j-appender-0.10.2.1.jar:/opt/local/kafka/bin/../libs/kafka-streams-0.10.2.1.jar:/opt/local/kafka/bin/../libs/kafka-streams-examples-0.10.2.1.jar:/opt/local/kafka/bin/../libs/kafka-tools-0.10.2.1.jar:/opt/local/kafka/bin/../libs/log4j-1.2.17.jar:/opt/local/kafka/bin/../libs/lz4-1.3.0.jar:/opt/local/kafka/bin/../libs/metrics-core-2.2.0.jar:/opt/local/kafka/bin/../libs/osgi-resource-locator-1.0.1.jar:/opt/local/kafka/bin/../libs/reflections-0.9.10.jar:/opt/local/kafka/bin/../libs/rocksdbjni-5.0.1.jar:/opt/local/kafka/bin/../libs/scala-library-2.11.8.jar:/opt/local/kafka/bin/../libs/scala-parser-combinators_2.11-1.0.4.jar:/opt/local/kafka/bin/../libs/slf4j-api-1.7.21.jar:/opt/local/kafka/bin/../libs/slf4j-log4j12-1.7.21.jar:/opt/local/kafka/bin/../libs/snappy-java-1.1.2.6.jar:/opt/local/kafka/bin/../libs/validation-api-1.1.0.Final.jar:/opt
/local/kafka/bin/../libs/zkclient-0.10.jar:/opt/local/kafka/bin/../libs/zookeeper-3.4.9.jar -javaagent:/opt/prometheus/lib/jmx_prometheus_javaagent-0.14.0.jar=7071:/opt/prometheus/lib/kafka-0-8-2.yml kafka.Kafka /opt/local/kafka/config/server.properties
  3. Edit prometheus.yml to add the Linux resource and Kafka monitoring jobs
  - job_name: linux
    static_configs:
      - targets: ['10.4.4.16:9100']
        labels:
          instance: node

  - job_name: 'kafka-exporter'   # Option 1, non-intrusive (recommended)
    static_configs:
    - targets: ['10.4.4.16:9308']

  - job_name: 'kafka'   # Option 2, intrusive
    static_configs:
    - targets: ['10.4.4.16:7071']
  4. Restart Prometheus

  5. Open http://10.4.4.16:9090/targets
    The Linux and Kafka targets now appear, as marked by the red box in the screenshot below:

    image.png
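
The information on the targets page is also available from the HTTP API at /api/v1/targets, which is handy for a scripted check after a restart. The sketch below parses such a response and lists targets that are not up. The function names and the sample response are assumptions for illustration; the sample follows the documented response shape, reuses the job names from the configuration above, and was not captured from this server.

```python
import json


def target_health(payload):
    """Maps (job, scrape URL) to the health reported by Prometheus."""
    doc = json.loads(payload)
    if doc.get("status") != "success":
        raise ValueError(f"request failed: {doc.get('error', 'unknown error')}")
    return {
        (t["labels"].get("job", ""), t["scrapeUrl"]): t["health"]
        for t in doc["data"]["activeTargets"]
    }


def unhealthy_targets(payload):
    """Returns the sorted (job, scrape URL) pairs whose health is not 'up'."""
    return sorted(key for key, health in target_health(payload).items() if health != "up")


# Hypothetical response; only the fields used above are included.
SAMPLE_TARGETS = json.dumps({
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"instance": "node", "job": "linux"},
             "scrapeUrl": "http://10.4.4.16:9100/metrics", "health": "up"},
            {"labels": {"instance": "10.4.4.16:9308", "job": "kafka-exporter"},
             "scrapeUrl": "http://10.4.4.16:9308/metrics", "health": "up"},
            {"labels": {"instance": "10.4.4.16:7071", "job": "kafka"},
             "scrapeUrl": "http://10.4.4.16:7071/metrics", "health": "down"},
        ],
        "droppedTargets": [],
    },
})
```

For the sample, unhealthy_targets reports only the kafka job on port 7071, which is what one would expect to see if option 2 had been skipped while its scrape job was left in prometheus.yml.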

III. Installing and using Grafana

  • Download grafana-7.2.0.linux-amd64.tar.gz and extract it
[root@master grafana]# ll
total 48
drwxr-xr-x  2 root root  4096 Sep 23 20:19 bin
drwxr-xr-x  3 root root  4096 Sep 23 20:19 conf
drwxr-xr-x  5 root root  4096 Nov 10 15:16 data
-rw-r--r--  1 root root 11343 Sep 23 20:16 LICENSE
-rw-r--r--  1 root root   108 Sep 23 20:16 NOTICE.md
drwxr-xr-x  4 root root  4096 Sep 23 20:19 plugins-bundled
drwxr-xr-x 12 root root  4096 Sep 23 20:19 public
-rw-r--r--  1 root root  2799 Sep 23 20:16 README.md
drwxr-xr-x  2 root root  4096 Sep 23 20:19 scripts
-rw-r--r--  1 root root     5 Sep 23 20:19 VERSION
  • Run Grafana (the process listing below shows it was started as ./grafana-server)
[root@master grafana]# ps axu|grep graf
root     18047  0.0  0.0 103320   900 pts/2    S+   15:20   0:00 grep graf
root     24002  0.0  0.2 1695092 38780 ?       Sl   Sep30  45:16 ./grafana-server

Open http://10.4.4.16:3000/login. If the login page shown below appears, the installation succeeded. The default username/password is admin/admin.

image.png

  • Add a data source of type Prometheus. For the URL, enter the server address followed by port 9090


    image.png
  • Download a dashboard JSON file from the Grafana website and import it


    image.png
  • The node exporter dashboard looks like this:


    image.png
  • The Kafka exporter dashboards look like this:

Result for option 1:

image.png

Result for option 2:

image.png
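
The data source step in this section can also be scripted through Grafana's HTTP API instead of the web UI. The sketch below only builds the request for POST /api/datasources with basic authentication; it does not send it. The function name, the data source name "Prometheus", and the use of the default admin/admin credentials are assumptions for this example, so adjust them to your installation.

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url, prometheus_url, user="admin", password="admin"):
    """Builds (but does not send) the request that registers Prometheus in Grafana."""
    payload = {
        "name": "Prometheus",   # display name, chosen for this example
        "type": "prometheus",
        "url": prometheus_url,  # server address plus port 9090, as in the UI step
        "access": "proxy",      # Grafana's backend talks to Prometheus
        "isDefault": True,
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{grafana_url.rstrip('/')}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

A request built with build_datasource_request("http://10.4.4.16:3000", "http://10.4.4.16:9090") could then be sent with urllib.request.urlopen. Sending is left out here because it changes a live Grafana instance.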
