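Prometheus is an open-source monitoring and alerting toolkit originally developed at SoundCloud. It records metrics as time series and provides a query language for analyzing and visualizing them. Its data model is multi-dimensional: each time series is identified by a metric name plus a set of key/value labels.
Prometheus works by scraping metrics from exporters over HTTP, storing them in a local time-series database, and exposing them for querying and analysis through PromQL. A separate component, Alertmanager, manages alerts and notifications.
Environment: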
Role | Hostname | IP | OS | Deployed applications |
---|---|---|---|---|
Monitoring host | master | 192.168.37.130 | CentOS 8 | prometheus + grafana + node_exporter |
Monitored host | node1 | 192.168.37.140 | CentOS 8 | node_exporter |
[root@master opt]# systemctl disable --now firewalld // run the firewalld and SELinux steps on both hosts, then reboot
Removed /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@master opt]# sed -i 's/enforcing/disabled/' /etc/selinux/config
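The sed expression above rewrites the word enforcing on every line where it occurs, which in the stock /etc/selinux/config also includes an explanatory comment. A slightly narrower sketch anchors the match to the SELINUX= setting itself. The function name disable_selinux_in is made up for illustration, and GNU sed (as shipped with CentOS) is assumed for the -i flag.

```shell
# Hypothetical helper: switch SELINUX=enforcing to SELINUX=disabled in the
# file given as $1, leaving comments and other settings untouched.
disable_selinux_in() {
  sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' "$1"
}
```

It would be called as `disable_selinux_in /etc/selinux/config`; as with the original command, the change only takes effect after a reboot.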
[root@master ~]# cd /opt // the packages were downloaded beforehand and uploaded to this host over FTP
[root@master opt]# ls
grafana-enterprise-10.2.1-1.x86_64.rpm prometheus-2.48.0.linux-amd64.tar.gz
node_exporter-1.7.0.linux-amd64.tar.gz
node_exporter download link
prometheus download link
grafana download link
// Extract the archive, then run the prometheus binary
[root@master opt]# tar xf prometheus-2.48.0.linux-amd64.tar.gz
[root@master opt]# mv prometheus-2.48.0.linux-amd64 prometheus
[root@master opt]# cd prometheus
[root@master prometheus]# ls
console_libraries consoles LICENSE NOTICE prometheus prometheus.yml promtool
[root@master prometheus]# ./prometheus
ts=2023-11-20T08:38:26.012Z caller=main.go:539 level=info msg="No time or size retention was set so using the default time retention" duration=15d
ts=2023-11-20T08:38:26.012Z caller=main.go:583 level=info msg="Starting Prometheus Server" mode=server version="(version=2.48.0, branch=HEAD, revision=6d80b30990bc297d95b5c844e118c4011fad8054)"
ts=2023-11-20T08:38:26.013Z caller=main.go:588 level=info build_context="(go=go1.21.4, platform=linux/amd64, user=root@26117804242c, date=20231116-04:35:21, tags=netgo,builtinassets,stringlabels)"
ts=2023-11-20T08:38:26.013Z caller=main.go:589 level=info host_details="(Linux 4.18.0-499.el8.x86_64 #1 SMP Thu Jun 22 12:08:06 UTC 2023 x86_64 master (none))"
....
Once startup completes, the web UI is available at http://192.168.37.130:9090.
Create a systemd service for Prometheus and enable it to start at boot:
[root@master ~]# cd /usr/lib/systemd/system
[root@master system]# vim prometheus.service
[root@master system]# systemctl daemon-reload
[root@master system]# systemctl enable prometheus
Created symlink /etc/systemd/system/multi-user.target.wants/prometheus.service → /usr/lib/systemd/system/prometheus.service.
[root@master system]# cat prometheus.service
[Unit]
Description=prometheus
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=root
Group=root
ExecStart=/opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml
Restart=on-failure
[Install]
WantedBy=multi-user.target
[root@master system]# reboot
After the reboot, check that the service is running:
[root@master ~]# systemctl status prometheus
● prometheus.service - prometheus
Loaded: loaded (/usr/lib/systemd/system/prometheus.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2023-11-20 16:46:41 CST; 22s ago
Docs: https://prometheus.io/
Main PID: 858 (prometheus)
Tasks: 9 (limit: 24648)
Memory: 76.5M
CGroup: /system.slice/prometheus.service
└─858 /opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml
Nov 20 16:46:43 master prometheus[858]: ts=2023-11-20T08:46:43.103Z caller=tls_config.go:274 level=info compon>
Nov 20 16:46:43 master prometheus[858]: ts=2023-11-20T08:46:43.104Z caller=tls_config.go:277 level=info compon>
Nov 20 16:46:43 master prometheus[858]: ts=2023-11-20T08:46:43.105Z caller=head.go:761 level=info component=ts>
Nov 20 16:46:43 master prometheus[858]: ts=2023-11-20T08:46:43.105Z caller=head.go:798 level=info component=ts>
Nov 20 16:46:43 master prometheus[858]: ts=2023-11-20T08:46:43.108Z caller=main.go:1045 level=info fs_type=XFS>
Nov 20 16:46:43 master prometheus[858]: ts=2023-11-20T08:46:43.108Z caller=main.go:1048 level=info msg="TSDB s>
Nov 20 16:46:43 master prometheus[858]: ts=2023-11-20T08:46:43.108Z caller=main.go:1229 level=info msg="Loadin>
Nov 20 16:46:43 master prometheus[858]: ts=2023-11-20T08:46:43.148Z caller=main.go:1266 level=info msg="Comple>
Nov 20 16:46:43 master prometheus[858]: ts=2023-11-20T08:46:43.149Z caller=main.go:1009 level=info msg="Server>
Nov 20 16:46:43 master prometheus[858]: ts=2023-11-20T08:46:43.149Z caller=manager.go:1012 level=info componen
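The unit file shown earlier was created interactively in vim. For a repeatable setup, the same content could be written from a script with a here-document. This is only a sketch: the function name write_prometheus_unit is hypothetical, and the unit content is copied from the listing above.

```shell
# Hypothetical helper: write the Prometheus unit file to the path in $1.
# The quoted 'EOF' keeps the shell from expanding anything inside the body.
write_prometheus_unit() {
  cat > "$1" <<'EOF'
[Unit]
Description=prometheus
Documentation=https://prometheus.io/
After=network.target

[Service]
Type=simple
User=root
Group=root
ExecStart=/opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}
```

A possible invocation is `write_prometheus_unit /usr/lib/systemd/system/prometheus.service && systemctl daemon-reload && systemctl enable --now prometheus`, where `--now` starts the service immediately instead of waiting for a reboot.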
// Install Grafana directly with yum
[root@master opt]# yum -y install grafana-enterprise-10.2.1-1.x86_64.rpm
Running scriptlet: grafana-enterprise-10.2.1-1.x86_64 5/5
### NOT starting on installation, please execute the following statements to configure grafana to start automatically using systemd
sudo /bin/systemctl daemon-reload
sudo /bin/systemctl enable grafana-server.service
### You can start grafana-server by executing
sudo /bin/systemctl start grafana-server.service
// Once installation finishes, the output lists the commands to run next
[root@master opt]# /bin/systemctl daemon-reload
[root@master opt]# /bin/systemctl enable grafana-server.service
Synchronizing state of grafana-server.service with SysV service script with /usr/lib/systemd/systemd-sysv-install.
Executing: /usr/lib/systemd/systemd-sysv-install enable grafana-server
Created symlink /etc/systemd/system/multi-user.target.wants/grafana-server.service → /usr/lib/systemd/system/grafana-server.service.
[root@master opt]# /bin/systemctl start grafana-server.service
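// These commands are copied straight from the installer output
Open Grafana at the following address and configure the data source:
http://192.168.37.130:3000/login
The initial username and password are both admin.
After logging in, click "Add your first data source" to open the data source page.
Then click Prometheus to open its settings page.
Under HTTP, set URL to the address of the Prometheus server. The monitoring host's IP here is 192.168.37.130, so enter: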
http://192.168.37.130:9090
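Besides the web form, Grafana can load data sources from provisioning files when it starts. The sketch below writes such a file for the Prometheus URL above. It assumes the RPM's default provisioning directory /etc/grafana/provisioning/datasources; the function name write_grafana_datasource and the file name prometheus.yaml are made up for illustration.

```shell
# Hypothetical helper: write a Grafana data source provisioning file.
#   $1 = output file, $2 = Prometheus base URL
write_grafana_datasource() {
  cat > "$1" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $2
    isDefault: true
EOF
}
```

It might be used as `write_grafana_datasource /etc/grafana/provisioning/datasources/prometheus.yaml http://192.168.37.130:9090`, followed by `systemctl restart grafana-server` so that Grafana reads the file.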
// Extract the node_exporter archive and move it to the install directory
[root@master opt]# ls
grafana-enterprise-10.2.1-1.x86_64.rpm prometheus
node_exporter-1.7.0.linux-amd64.tar.gz prometheus-2.48.0.linux-amd64.tar.gz
[root@master opt]# tar xf node_exporter-1.7.0.linux-amd64.tar.gz
[root@master opt]# mv node_exporter-1.7.0.linux-amd64 /usr/local/
[root@master opt]# cd /usr/local/
[root@master local]# ls
bin etc games include lib lib64 libexec node_exporter-1.7.0.linux-amd64 sbin share src
// Copy the node_exporter executable from the package into /usr/local/bin
[root@master local]# cd node_exporter-1.7.0.linux-amd64/
[root@master node_exporter-1.7.0.linux-amd64]# ls
LICENSE node_exporter NOTICE
[root@master node_exporter-1.7.0.linux-amd64]# cp node_exporter /usr/local/bin/
// Create a systemd service for node_exporter
[root@master ~]# vim /usr/lib/systemd/system/node_exporter.service
[root@master ~]# systemctl daemon-reload
[root@master ~]# systemctl enable node_exporter
Created symlink /etc/systemd/system/multi-user.target.wants/node_exporter.service → /usr/lib/systemd/system/node_exporter.service.
[root@master ~]# systemctl start node_exporter
[root@master ~]# cat /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=root
Group=root
ExecStart=/usr/local/bin/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
Check the service status:
[root@master ~]# systemctl status node_exporter
● node_exporter.service - node_exporter
Loaded: loaded (/usr/lib/systemd/system/node_exporter.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2023-11-20 17:03:49 CST; 32s ago
Docs: https://prometheus.io/
Main PID: 4209 (node_exporter)
Tasks: 5 (limit: 24648)
Memory: 3.2M
CGroup: /system.slice/node_exporter.service
└─4209 /usr/local/bin/node_exporter
Nov 20 17:03:49 master node_exporter[4209]: ts=2023-11-20T09:03:49.180Z caller=node_exporter.go:117 level=info>
Nov 20 17:03:49 master node_exporter[4209]: ts=2023-11-20T09:03:49.180Z caller=node_exporter.go:117 level=info>
Nov 20 17:03:49 master node_exporter[4209]: ts=2023-11-20T09:03:49.180Z caller=node_exporter.go:117 level=info>
Nov 20 17:03:49 master node_exporter[4209]: ts=2023-11-20T09:03:49.180Z caller=node_exporter.go:117 level=info>
Nov 20 17:03:49 master node_exporter[4209]: ts=2023-11-20T09:03:49.180Z caller=node_exporter.go:117 level=info>
Nov 20 17:03:49 master node_exporter[4209]: ts=2023-11-20T09:03:49.180Z caller=node_exporter.go:117 level=info>
Nov 20 17:03:49 master node_exporter[4209]: ts=2023-11-20T09:03:49.180Z caller=node_exporter.go:117 level=info>
Nov 20 17:03:49 master node_exporter[4209]: ts=2023-11-20T09:03:49.180Z caller=node_exporter.go:117 level=info>
Nov 20 17:03:49 master node_exporter[4209]: ts=2023-11-20T09:03:49.180Z caller=tls_config.go:274 level=info ms>
Nov 20 17:03:49 master node_exporter[4209]: ts=2023-11-20T09:03:49.180Z caller=tls_config.go:277 level=info ms
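With the service running, node_exporter serves its metrics as plain text on port 9100 under /metrics, one sample per line in the form name{labels} value. As a rough sketch of how the metric name and labels fit together, the filter below keeps only the samples of one metric. The function name metric_samples is hypothetical, and the parsing is deliberately simple (it assumes no spaces inside label values).

```shell
# Hypothetical helper: read Prometheus text-format metrics on stdin and
# print only the samples whose metric name equals $1.
metric_samples() {
  awk -v name="$1" '
    /^#/ { next }                    # skip "# HELP" and "# TYPE" comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)       # drop the {label="value",...} part
      if (metric == name) print
    }'
}
```

For example, `curl -s http://192.168.37.130:9100/metrics | metric_samples node_load1` should print the 1-minute load average sample if the exporter is reachable.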
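Configure the scrape targets on the monitoring host.
The settings live in /opt/prometheus/prometheus.yml, under the Prometheus install directory.
Add the following under the scrape_configs section of that file:
// append this block at the end of the file
[root@master ~]# vim /opt/prometheus/prometheus.yml
  - job_name: 'master_prometheus'
    static_configs:
      - targets: ['192.168.37.130:9100']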
The complete file then looks like this:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: 'master_prometheus'
    static_configs:
      - targets: ['192.168.37.130:9100']
Check that the configuration is valid:
[root@master ~]# /opt/prometheus/promtool check config /opt/prometheus/prometheus.yml
Checking /opt/prometheus/prometheus.yml
SUCCESS: /opt/prometheus/prometheus.yml is valid prometheus config file syntax
Restart the Prometheus service to test the change:
[root@master ~]# systemctl restart prometheus
Open the following address:
http://192.168.37.130:9090/targets
The Targets page should now list the master_prometheus job, which scrapes node_exporter on the monitoring host.
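The same information is available without a browser from the Prometheus HTTP API at /api/v1/targets, which returns JSON including a health field per target. As a rough sketch, the filter below counts how many targets report "up". The function name count_up_targets is hypothetical, and the pattern assumes the compact JSON (no spaces around the colon) that the API normally returns.

```shell
# Hypothetical helper: read /api/v1/targets JSON on stdin and print the
# number of targets whose health is "up".
count_up_targets() {
  grep -o '"health":"up"' | wc -l | tr -d ' '
}
```

A possible check after the restart is `curl -s http://192.168.37.130:9090/api/v1/targets | count_up_targets`; with the configuration above it should eventually print one count per healthy target, including Prometheus itself.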
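node_exporter is installed on node1 exactly as shown above for master, so the steps are not repeated here.
// On the monitoring host, edit /opt/prometheus/prometheus.yml and add this block
  - job_name: "node1-prometheus"
    static_configs:
      - targets: ['192.168.37.140:9100']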
[root@master ~]# /opt/prometheus/promtool check config /opt/prometheus/prometheus.yml
Checking /opt/prometheus/prometheus.yml
SUCCESS: /opt/prometheus/prometheus.yml is valid prometheus config file syntax
// the check passes
[root@master ~]# systemctl restart prometheus
// restart the service
Once the service is back up, open the following address to verify:
http://192.168.37.130:9090/targets
If everything is working, the Targets page now also lists the monitored host node1.
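The two node_exporter jobs above differ only in their target address. Since static_configs accepts a list of targets, one job could cover both hosts instead. The sketch below prints such a job for any number of host:port arguments; the function name node_scrape_job and the job name "node" are assumptions made for illustration, not part of the original setup.

```shell
# Hypothetical helper: print one scrape job (indented for placement under
# scrape_configs) that lists every host:port argument as a target.
node_scrape_job() {
  printf '  - job_name: "node"\n'
  printf '    static_configs:\n'
  printf '      - targets:\n'
  for target in "$@"; do
    printf '          - "%s"\n' "$target"
  done
}
```

Running `node_scrape_job 192.168.37.130:9100 192.168.37.140:9100` prints a block that could replace the two separate jobs in prometheus.yml; checking the result with promtool before restarting remains a good idea.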
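In Grafana, click the + icon and choose Import dashboard.
Enter the dashboard ID 8919 and click Load.
After a short wait, Grafana fetches dashboard 8919 from the official Grafana website.
Select the Prometheus data source and click Import; the dashboard then shows the metrics for master and node1.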