Deploying Prometheus Monitoring with Grafana Visualization

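Contents

  • 1. Introduction to Prometheus
  • 2. Time Series Data
    • 2.1 What Is Time Series Data
    • 2.2 Characteristics
  • 3. Main Features of Prometheus
  • 4. Prometheus Architecture Diagram
  • 5. Use Cases
    • 5.1 When Does It Fit?
    • 5.2 When Does It Not Fit?
  • 6. Deploying Prometheus + Grafana, Monitoring Linux Hosts, Monitoring HAProxy
    • 6.1 Logging In via the Web Interface
    • 6.2 Monitoring Other Hosts
    • 6.3 Monitoring HAProxy
  • 7. Grafana Visualization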

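1. Introduction to Prometheus

Prometheus, written in Go (golang), is an open-source suite that combines monitoring, alerting, and a time series database. It is well suited to monitoring Docker containers, and the popularity of Kubernetes (commonly called k8s) has driven its adoption.
Prometheus official website: https://prometheus.io/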

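2. Time Series Data

2.1 What Is Time Series Data

Time series data records, in chronological order, how the state of a system or device changes over time.

Typical use cases:

  • The longitude, latitude, speed, heading, and distance to nearby objects that a self-driving vehicle records while running. The data is recorded continuously so that it can be analyzed.
  • Trajectory data for the vehicles in a given region
  • Real-time trading data in the traditional securities industry
  • Real-time operations monitoring data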

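2.2 Characteristics

  1. Good performance
    Relational databases perform poorly on large-scale data. NoSQL stores handle large-scale data better, but for this kind of workload they still fall short of a time series database.
  2. Low storage cost
    Efficient compression saves storage space and reduces I/O.
    Prometheus stores time series very efficiently: each sample takes only about 3.5 bytes. With over a million time series, a 30-second interval, and 60 days of retention, storage comes to a little over 200 GB (figures attributed to the official documentation).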
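
As a rough check on the figures above, disk usage can be estimated as samples per second times retention time times bytes per sample. The sketch below assumes exactly one million series as a round number; the function name `estimate_storage_gb` is made up for illustration. At 3.5 bytes per sample the result is about 605 GB, while the roughly 200 GB quoted above corresponds to about 1.2 bytes per sample, which is closer to the 1 to 2 bytes per sample that the current Prometheus storage documentation cites. Treat all of these numbers as order-of-magnitude estimates.

```shell
# Estimate on-disk size as: (series / interval) * retention * bytes per sample.
# Arguments: series count, scrape interval (s), retention (days), bytes per sample.
estimate_storage_gb() {
    awk -v series="$1" -v interval="$2" -v days="$3" -v bps="$4" 'BEGIN {
        samples_per_second = series / interval
        retention_seconds  = days * 86400
        bytes = samples_per_second * retention_seconds * bps
        printf "%.1f\n", bytes / 1e9   # decimal gigabytes
    }'
}

estimate_storage_gb 1000000 30 60 3.5   # 3.5 bytes per sample, as stated in the text
estimate_storage_gb 1000000 30 60 1.2   # per-sample size implied by a ~200 GB total
```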

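3. Main Features of Prometheus

  • A multi-dimensional data model
  • A flexible query language
  • No reliance on distributed storage; each server node is autonomous
  • Time series are collected over HTTP using a pull model
  • Pushing time series is also supported through an intermediary gateway
  • Targets are discovered through service discovery or static configuration
  • Support for a variety of graphs and dashboards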

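4. Prometheus Architecture Diagram

[Figure: Prometheus architecture diagram]
Prometheus scrapes metrics from instrumented jobs, either directly or, for short-lived jobs, through an intermediary push gateway. It stores all scraped samples locally and runs rules over this data to aggregate and record new time series from existing data or to generate alerts. Grafana or other API consumers can be used to visualize the collected data.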

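5. Use Cases

5.1 When Does It Fit?

Prometheus works well for recording any purely numeric time series. It fits both machine-centric monitoring and the monitoring of highly dynamic service-oriented architectures. In a world of microservices, its support for multi-dimensional data collection and querying is a particular strength.

Prometheus is designed for reliability, to be the system you turn to during an outage so that you can diagnose problems quickly. Each Prometheus server is standalone and does not depend on network storage or other remote services. You can rely on it when other parts of your infrastructure are broken, and you do not need to set up extensive infrastructure to use it.

5.2 When Does It Not Fit?

Prometheus values reliability. You can always view the statistics available about your system, even under failure conditions. If you need 100% accuracy, for example for per-request billing, Prometheus is not a good choice, because the collected data may not be detailed and complete enough. In that case it is better to use another system to collect and analyze the billing data, and to use Prometheus for the rest of your monitoring.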

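6. Deploying Prometheus + Grafana, Monitoring Linux Hosts, Monitoring HAProxy

Environment requirements

System    Hostname  IP               Required services
CentOS 8  server    192.168.249.141  prometheus-2.28.0, grafana
CentOS 8  agent     192.168.249.145  node_exporter-1.1.2
CentOS 8  haproxy   192.168.249.146  haproxy_exporter-0.12.0

The packages can be downloaded from the official Prometheus and Grafana websites.

Deploy Prometheus on the server host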

// Download the installation package
[root@server ~]# ls
anaconda-ks.cfg  prometheus-2.28.0.linux-amd64.tar.gz
// Extract it
[root@server ~]# tar xf prometheus-2.28.0.linux-amd64.tar.gz 
[root@server ~]# ls
anaconda-ks.cfg  prometheus-2.28.0.linux-amd64  prometheus-2.28.0.linux-amd64.tar.gz
[root@server ~]# mv prometheus-2.28.0.linux-amd64 /usr/local/prometheus
[root@server ~]# useradd -r -M -s /sbin/nologin prometheus
[root@server ~]# ls /usr/local/
bin  etc  games  include  lib  lib64  libexec  prometheus  sbin  share  src
[root@server ~]# chown -R prometheus:prometheus /usr/local/prometheus/

// Check the main program's help text to see how to start it
[root@server ~]# cd /usr/local/prometheus/
[root@server prometheus]# ls
console_libraries  consoles  LICENSE  NOTICE  prometheus  prometheus.yml  promtool
[root@server prometheus]# ./prometheus --help
usage: prometheus [<flags>]

The Prometheus monitoring server

Flags:
  -h, --help                     Show context-sensitive help (also try --help-long and --help-man).
      --version                  Show application version.
      --config.file="prometheus.yml"    # this flag selects the config file used at startup
                                 Prometheus configuration file path.
      --web.listen-address="0.0.0.0:9090"  
                                 Address to listen on for UI, API, and telemetry.
      --web.config.file=""       [EXPERIMENTAL] Path to configuration file that can enable TLS or
                                 authentication.
      --web.read-timeout=5m      Maximum duration before timing out read of the request, and closing idle
                                 connections.
      --web.max-connections=512  Maximum number of simultaneous connections.
      --web.external-url=<URL>   The URL under which Prometheus is externally reachable (for example, if
                                 Prometheus is served via a reverse proxy). Used for generating relative and
                                 absolute links back to Prometheus itself. If the URL has a path portion, it
                                 will be used to prefix all HTTP endpoints served by Prometheus. If omitted,
                                 relevant URL components will be derived automatically.
      --web.route-prefix=<path>  Prefix for the internal routes of web endpoints. Defaults to path of
                                 --web.external-url.
      --web.user-assets=<path>   Path to static asset directory, available at /user.
      --web.enable-lifecycle     Enable shutdown and reload via HTTP request.
      --web.enable-admin-api     Enable API endpoints for admin control actions.
      --web.console.templates="consoles"  
                                 Path to the console template directory, available at /consoles.
      --web.console.libraries="console_libraries"  
                                 Path to the console library directory.
      --web.page-title="Prometheus Time Series Collection and Processing Server"  
                                 Document title of Prometheus instance.
      --web.cors.origin=".*"     Regex for CORS origin. It is fully anchored. Example:
                                 'https?://(domain1|domain2)\.com'
      --storage.tsdb.path="data/"  
                                 Base path for metrics storage.
      --storage.tsdb.retention=STORAGE.TSDB.RETENTION  
                                 [DEPRECATED] How long to retain samples in storage. This flag has been
                                 deprecated, use "storage.tsdb.retention.time" instead.
      --storage.tsdb.retention.time=STORAGE.TSDB.RETENTION.TIME  
                                 How long to retain samples in storage. When this flag is set it overrides
                                 "storage.tsdb.retention". If neither this flag nor "storage.tsdb.retention" nor
                                 "storage.tsdb.retention.size" is set, the retention time defaults to 15d. Units
                                 Supported: y, w, d, h, m, s, ms.
      --storage.tsdb.retention.size=STORAGE.TSDB.RETENTION.SIZE  
                                 [EXPERIMENTAL] Maximum number of bytes that can be stored for blocks. A unit is
                                 required, supported units: B, KB, MB, GB, TB, PB, EB. Ex: "512MB". This flag is
                                 experimental and can be changed in future releases.
      --storage.tsdb.no-lockfile  
                                 Do not create lockfile in data directory.
      --storage.tsdb.allow-overlapping-blocks  
                                 [EXPERIMENTAL] Allow overlapping blocks, which in turn enables vertical
                                 compaction and vertical query merge.
      --storage.tsdb.wal-compression  
                                 Compress the tsdb WAL.
      --storage.remote.flush-deadline=<duration>  
                                 How long to wait flushing sample on shutdown or config reload.
      --storage.remote.read-sample-limit=5e7  
                                 Maximum overall number of samples to return via the remote read interface, in a
                                 single query. 0 means no limit. This limit is ignored for streamed response
                                 types.
      --storage.remote.read-concurrent-limit=10  
                                 Maximum number of concurrent remote read calls. 0 means no limit.
      --storage.remote.read-max-bytes-in-frame=1048576  
                                 Maximum number of bytes in a single frame for streaming remote read response
                                 types before marshalling. Note that client might have limit on frame size as
                                 well. 1MB as recommended by protobuf by default.
      --storage.exemplars.exemplars-limit=100000  
                                 [EXPERIMENTAL] Maximum number of exemplars to store in in-memory exemplar
                                 storage total. 0 disables the exemplar storage. This flag is effective only
                                 with --enable-feature=exemplar-storage.
      --rules.alert.for-outage-tolerance=1h  
                                 Max time to tolerate prometheus outage for restoring "for" state of alert.
      --rules.alert.for-grace-period=10m  
                                 Minimum duration between alert and restored "for" state. This is maintained
                                 only for alerts with configured "for" time greater than grace period.
      --rules.alert.resend-delay=1m  
                                 Minimum amount of time to wait before resending an alert to Alertmanager.
      --alertmanager.notification-queue-capacity=10000  
                                 The capacity of the queue for pending Alertmanager notifications.
      --query.lookback-delta=5m  The maximum lookback duration for retrieving metrics during expression
                                 evaluations and federation.
      --query.timeout=2m         Maximum time a query may take before being aborted.
      --query.max-concurrency=20  
                                 Maximum number of queries executed concurrently.
      --query.max-samples=50000000  
                                 Maximum number of samples a single query can load into memory. Note that
                                 queries will fail if they try to load more samples than this into memory, so
                                 this also limits the number of samples a query can return.
      --enable-feature= ...      Comma separated feature names to enable. Valid options: promql-at-modifier,
                                 promql-negative-offset, remote-write-receiver, exemplar-storage,
                                 expand-external-labels. See
                                 https://prometheus.io/docs/prometheus/latest/disabled_features/ for more
                                 details.
      --log.level=info           Only log messages with the given severity or above. One of: [debug, info, warn,
                                 error]
      --log.format=logfmt        Output format of log messages. One of: [logfmt, json]
// Start it directly (in the background, so the same shell can be used to check the port)
[root@server prometheus]# ./prometheus --config.file="prometheus.yml" &
[root@server prometheus]# ss -antl
State        Recv-Q       Send-Q             Local Address:Port             Peer Address:Port      Process      
LISTEN       0            128                      0.0.0.0:22                    0.0.0.0:*                      
LISTEN       0            128                            *:9090                        *:*                      
LISTEN       0            128                         [::]:22                       [::]:* 

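// Starting it by hand is tedious; write a service file so that systemd manages it.
// Stop the manually started instance first. It ran as root, so return ownership of the new data/ directory to the prometheus user.
[root@server prometheus]# pkill prometheus
[root@server prometheus]# chown -R prometheus:prometheus /usr/local/prometheus/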
[root@server ~]# vim /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network.target

[Service]
User=prometheus
Restart=on-failure
WorkingDirectory=/usr/local/prometheus/
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml

[Install]
WantedBy=multi-user.target

// It can now be started with systemctl and enabled at boot
[root@server ~]# systemctl daemon-reload
[root@server ~]# systemctl enable --now prometheus
Created symlink /etc/systemd/system/multi-user.target.wants/prometheus.service → /etc/systemd/system/prometheus.service.
[root@server ~]# ss -antl
State        Recv-Q       Send-Q             Local Address:Port             Peer Address:Port      Process      
LISTEN       0            128                      0.0.0.0:22                    0.0.0.0:*                      
LISTEN       0            128                            *:9090                        *:*                      
LISTEN       0            128                         [::]:22                       [::]:*                      
LISTEN       0            128                            *:3000                        *:*            
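
Besides checking the listening port, the service can be probed over HTTP. The sketch below assumes the default listen address shown in the help output (port 9090) and uses the `/-/healthy` management endpoint of Prometheus; the function name `prometheus_healthy` is made up for illustration.

```shell
# Return success when the Prometheus health endpoint answers with HTTP 200.
# $1: base URL of the Prometheus server (defaults to the local instance).
prometheus_healthy() {
    local base_url="${1:-http://localhost:9090}"
    local code
    # -w '%{http_code}' prints only the status code; the body is discarded.
    code=$(curl -s -o /dev/null -w '%{http_code}' "${base_url}/-/healthy") || return 1
    [ "$code" = "200" ]
}

# Usage:
#   prometheus_healthy && echo "prometheus is up"
```

A probe like this is handy in scripts because it fails both when the port is closed and when the server answers with an error status.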
                     

Disable the firewall and SELinux

[root@server ~]# systemctl stop firewalld
[root@server ~]# systemctl disable firewalld
Removed /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@server ~]# setenforce 0
[root@server ~]# vim /etc/selinux/config 
SELINUX=disabled
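
Turning firewalld off is the simplest choice on a lab machine. As an alternative that this walkthrough does not use, the firewall can stay enabled with only the required ports opened. The sketch below is a minimal example under that assumption; the function name `open_tcp_ports` and the `FIREWALL_CMD` override variable are made up for illustration, and the port numbers come from the `ss` output shown in this article.

```shell
# Open TCP ports in firewalld permanently, then reload so the rules take effect.
# Arguments: one or more port numbers.
# FIREWALL_CMD can be overridden (for example with "echo") for a dry run.
open_tcp_ports() {
    local port
    for port in "$@"; do
        "${FIREWALL_CMD:-firewall-cmd}" --permanent --add-port="${port}/tcp" || return 1
    done
    "${FIREWALL_CMD:-firewall-cmd}" --reload
}

# Usage on the server host (Prometheus on 9090, Grafana on 3000):
#   open_tcp_ports 9090 3000
# On the monitored hosts the exporters listen on 9100 (node_exporter) and 9101 (haproxy_exporter).
```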

6.1 Logging In via the Web Interface

Open http://192.168.249.141:9090 in a browser.
[Screenshot: Prometheus web interface]
Check the target list. At this point only the server itself is listed.
[Screenshots: monitoring target list]

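6.2 Monitoring Other Hosts

Install the node_exporter component (node_exporter-1.1.2.linux-amd64.tar.gz) on each host to be monitored. It can be downloaded from the official website.

// Create the service user and extract the package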
[root@agent ~]# useradd -r -M -s /sbin/nologin prometheus
[root@agent ~]# ls
anaconda-ks.cfg  node_exporter-1.1.2.linux-amd64.tar.gz
[root@agent ~]# tar xf node_exporter-1.1.2.linux-amd64.tar.gz 
[root@agent ~]# ls
anaconda-ks.cfg  node_exporter-1.1.2.linux-amd64  node_exporter-1.1.2.linux-amd64.tar.gz
[root@agent ~]# mv node_exporter-1.1.2.linux-amd64 /usr/local/node_exporter
[root@agent ~]# chown -R prometheus:prometheus /usr/local/node_exporter/

// Start it
[root@agent ~]# cd /usr/local/node_exporter/
[root@agent node_exporter]# nohup /usr/local/node_exporter/node_exporter &
[1] 10337
[root@agent node_exporter]# nohup: ignoring input and appending output to 'nohup.out'

[root@agent node_exporter]# ss -antl
State         Recv-Q        Send-Q               Local Address:Port               Peer Address:Port       Process       
LISTEN        0             128                        0.0.0.0:22                      0.0.0.0:*                        
LISTEN        0             128                              *:9100                          *:*                        
LISTEN        0             128                           [::]:22                         [::]:*         

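// Alternatively, write a service file so that it starts at boot (stop the nohup instance first, for example with pkill node_exporter, so that port 9100 is free)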
[root@agent ~]# vim /etc/systemd/system/node_exporter.service

[Unit]
Description=node_exporter Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network.target

[Service]
User=prometheus
Restart=on-failure
WorkingDirectory=/usr/local/node_exporter/
ExecStart=/usr/local/node_exporter/node_exporter

[Install]
WantedBy=multi-user.target
[root@agent ~]# systemctl daemon-reload

// Start it and enable it at boot
[root@agent ~]# systemctl start node_exporter
[root@agent ~]# systemctl enable node_exporter

View the exported metrics
[Screenshot: node_exporter metrics page]
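
The same metrics can be read from the command line, since node_exporter serves them as plain text at `/metrics` on port 9100. The sketch below assumes the agent address used in this article; the helper name `metric_value` is made up for illustration, and it only handles metrics without labels, such as `node_load1`.

```shell
# Print the value of an unlabeled metric from Prometheus text-format input on stdin.
# $1: metric name, for example node_load1.
metric_value() {
    # Comment lines start with "#", so their first field never equals the metric name.
    awk -v name="$1" '$1 == name { print $2 }'
}

# Usage:
#   curl -s http://192.168.249.145:9100/metrics | metric_value node_load1
```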
Edit the prometheus.yml file on the server

[root@server ~]# cd /usr/local/prometheus/
[root@server prometheus]# ls
console_libraries  consoles  data  LICENSE  NOTICE  prometheus  prometheus.yml  promtool
[root@server prometheus]# vim prometheus.yml

    static_configs:
      - targets: ['localhost:9090','192.168.249.145:9100']     # add the host to be monitored
// Restart
[root@server prometheus]# systemctl restart prometheus
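
The edit above appends the new host to the existing job. A common alternative is to give node_exporter its own job so that its targets carry a separate `job` label. The sketch below writes such a configuration to a scratch file and validates it with `promtool`, which ships in the Prometheus tarball. The job name `node`, the 15-second interval, and the scratch file location are assumptions made for illustration.

```shell
# Write a candidate config to a scratch file so the live prometheus.yml stays untouched.
cfg="$(mktemp /tmp/prometheus-example.XXXXXX)"
cat > "$cfg" <<'EOF'
global:
  scrape_interval: 15s

scrape_configs:
  # Prometheus scraping itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # node_exporter on the agent host; the job name "node" is an arbitrary label
  - job_name: 'node'
    static_configs:
      - targets: ['192.168.249.145:9100']
EOF

# Validate the file with promtool when it is present at the path used in this article.
promtool=/usr/local/prometheus/promtool
if [ -x "$promtool" ]; then
    "$promtool" check config "$cfg"
fi
```

Checking a file before copying it over prometheus.yml avoids restarting the server with a configuration that fails to parse.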

Check the web interface again. The second host is now being monitored.
[Screenshot: target list showing the agent host]
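
The target state can also be checked without a browser by running an instant query against the Prometheus HTTP API. The sketch below assumes the server address from the environment table; the function name `promql_query` is made up for illustration. The built-in `up` metric is 1 for each target whose last scrape succeeded and 0 otherwise.

```shell
# Run an instant PromQL query against the Prometheus HTTP API and print the raw JSON reply.
# $1: PromQL expression; $2: Prometheus base URL (defaults to the server in this article).
promql_query() {
    local expr="$1"
    local base_url="${2:-http://192.168.249.141:9090}"
    # -G sends the url-encoded data as query-string parameters of a GET request.
    curl -s -G "${base_url}/api/v1/query" --data-urlencode "query=${expr}"
}

# Usage:
#   promql_query 'up'
```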

6.3 Monitoring HAProxy

// Download the exporter and extract it
[root@haproxy ~]# wget https://github.com/prometheus/haproxy_exporter/releases/download/v0.12.0/haproxy_exporter-0.12.0.linux-amd64.tar.gz
[root@haproxy ~]# ls
anaconda-ks.cfg  haproxy_exporter-0.12.0.linux-amd64.tar.gz
[root@haproxy ~]# tar xf haproxy_exporter-0.12.0.linux-amd64.tar.gz 
[root@haproxy ~]# ls
anaconda-ks.cfg  haproxy_exporter-0.12.0.linux-amd64  haproxy_exporter-0.12.0.linux-amd64.tar.gz
[root@haproxy ~]# mv haproxy_exporter-0.12.0.linux-amd64 /usr/local/haproxy_exporter
[root@haproxy ~]# useradd -r -M -s /sbin/nologin prometheus
[root@haproxy ~]# chown -R prometheus:prometheus /usr/local/haproxy_exporter/

// Write the service file
[root@haproxy ~]# vim /etc/systemd/system/haproxy_exporter.service
[Unit]
Description=haproxy_exporter Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network.target

[Service]
User=prometheus
Restart=on-failure
WorkingDirectory=/usr/local/haproxy_exporter/
ExecStart=/usr/local/haproxy_exporter/haproxy_exporter

[Install]
WantedBy=multi-user.target

// Start it and enable it at boot
[root@haproxy ~]# systemctl daemon-reload
[root@haproxy ~]# systemctl enable --now haproxy_exporter
Created symlink /etc/systemd/system/multi-user.target.wants/haproxy_exporter.service → /etc/systemd/system/haproxy_exporter.service.
[root@haproxy ~]# ss -antl
State         Recv-Q        Send-Q               Local Address:Port               Peer Address:Port       Process       
LISTEN        0             128                        0.0.0.0:22                      0.0.0.0:*                        
LISTEN        0             128                           [::]:22                         [::]:*                        
LISTEN        0             128                              *:9101                          *:*                        
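
The unit file above starts haproxy_exporter without any flags, so it looks for HAProxy statistics at its built-in default scrape URI. If the HAProxy stats page lives elsewhere, the exporter accepts a `--haproxy.scrape-uri` flag that points at the CSV form of that page. The sketch below only prints a matching `ExecStart` line; the function name `exporter_exec_start` and the stats address `http://localhost:8404/stats` are assumptions, not values taken from this setup.

```shell
# Build the ExecStart line for haproxy_exporter.service from a stats URI.
# $1: URI of the HAProxy stats page in CSV form (note the trailing ";csv").
exporter_exec_start() {
    printf 'ExecStart=/usr/local/haproxy_exporter/haproxy_exporter --haproxy.scrape-uri="%s"\n' "$1"
}

# Hypothetical stats endpoint; replace it with the one configured in your haproxy.cfg.
exporter_exec_start 'http://localhost:8404/stats;csv'
```

After changing the unit file, run `systemctl daemon-reload` and restart the exporter so that the new flag takes effect.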

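The exporter's web page shows the collected metrics
[Screenshot: haproxy_exporter metrics page]
After 192.168.249.146:9101 is added to the targets in prometheus.yml and Prometheus is restarted, the Prometheus web interface shows the new host
[Screenshot: target list showing the haproxy host]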

7. Grafana Visualization

// Download the package from the official website
[root@server ~]# ls
anaconda-ks.cfg  grafana-7.5.6-1.x86_64.rpm  prometheus-2.28.0.linux-amd64.tar.gz
[root@server ~]# dnf -y install grafana-7.5.6-1.x86_64.rpm
// Start it and enable it at boot
[root@server ~]# systemctl start grafana-server
[root@server ~]# systemctl enable grafana-server
[root@server ~]# ss -antl
State        Recv-Q       Send-Q             Local Address:Port             Peer Address:Port      Process      
LISTEN       0            128                      0.0.0.0:22                    0.0.0.0:*                      
LISTEN       0            128                         [::]:22                       [::]:*                      
LISTEN       0            128                            *:3000                        *:*                      
LISTEN       0            128                            *:9090                        *:*                      

Open the Grafana web interface at http://192.168.249.141:3000. On first login the default username is admin and the password is admin; you are then asked to set a new password.

Add Prometheus as a data source
[Screenshots: adding the Prometheus data source in Grafana]
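
The data source can also be defined in a file instead of through the web interface, using Grafana's provisioning mechanism. The sketch below assumes the default provisioning directory of the RPM package, `/etc/grafana/provisioning/datasources`, and that Prometheus runs on the same host as Grafana, as it does in this article. The function name `write_prometheus_datasource` and the file name `prometheus.yaml` are made up for illustration.

```shell
# Write a Grafana data source provisioning file.
# $1: target directory; $2: Prometheus URL as seen from the Grafana host.
write_prometheus_datasource() {
    local dir="$1" url="$2"
    mkdir -p "$dir"
    cat > "${dir}/prometheus.yaml" <<EOF
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: ${url}
    isDefault: true
EOF
}

# Usage (as root on the Grafana host; restart grafana-server so the file is read):
#   write_prometheus_datasource /etc/grafana/provisioning/datasources http://localhost:9090
#   systemctl restart grafana-server
```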
Monitoring HAProxy
Import the haproxy_rev7.json dashboard file,
which can be downloaded from the official Grafana website.
[Screenshots: importing the HAProxy dashboard]
