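My company runs a lot of servers, and recently every one of them was required to have monitoring installed, so I ended up playing ops for a while. There were plenty of pitfalls: some customer machines don't give us root access and others can't open ports, so we had to find workarounds ourselves. That said, this stack is genuinely useful: it monitors server resources such as CPU, memory, and disk in real time. A good tool is worth putting to use.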
Reference: https://www.cnblogs.com/fatyao/p/11007357.html
Reference: https://devopscube.com/monitor-linux-servers-prometheus-node-exporter/
wget https://github.com/prometheus/prometheus/releases/download/v2.8.1/prometheus-2.8.1.linux-amd64.tar.gz
tar -xvf prometheus-2.8.1.linux-amd64.tar.gz
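If you need a different version, the download URL can be derived from the version number. This is only a small sketch: the function name `prometheus_url` is my own, and it assumes the release URL pattern above also holds for other versions.

```shell
#!/bin/sh
# Hypothetical helper: build the release tarball URL for a given version.
# Assumes the GitHub release URL pattern used for v2.8.1 holds for other versions.
prometheus_url() {
    version="$1"
    printf 'https://github.com/prometheus/prometheus/releases/download/v%s/prometheus-%s.linux-amd64.tar.gz\n' \
        "$version" "$version"
}

# Print the URL for the version used in this post.
prometheus_url 2.8.1
```

It could then be used as `wget "$(prometheus_url 2.8.1)"`.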
groupadd prometheus
useradd -g prometheus -s /sbin/nologin prometheus
chown -R prometheus:prometheus /develop/server/prometheus-2.8.1.linux-amd64/
mkdir -p /develop/software/prometheus-data
chown -R prometheus:prometheus /develop/software/prometheus-data/
vim /usr/lib/systemd/system/prometheus.service
[Unit]
Description=Prometheus
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=prometheus
# --storage.tsdb.path is optional; by default the data goes to ./data under the working directory
ExecStart=/develop/server/prometheus-2.8.1.linux-amd64/prometheus --config.file=/develop/server/prometheus-2.8.1.linux-amd64/prometheus.yml --storage.tsdb.path=/develop/software/prometheus-data
Restart=on-failure
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl enable prometheus
systemctl start prometheus
systemctl status prometheus
firewall-cmd --add-port=9090/tcp --permanent
firewall-cmd --reload
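To confirm the port really was opened, the output of `firewall-cmd --list-ports` can be checked. A rough sketch: `port_is_listed` is a name I made up, and it assumes the command prints a space-separated list such as `9090/tcp 3000/tcp`.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given port/protocol appears in the
# space-separated list read from stdin (the shape `firewall-cmd --list-ports` prints).
port_is_listed() {
    tr ' ' '\n' | grep -qxF -- "$1"
}

# Example with a sample list instead of a live firewalld.
if echo '9090/tcp 3000/tcp' | port_is_listed 9090/tcp; then
    echo "9090/tcp is open"
fi
```

On a real machine the call would be `firewall-cmd --list-ports | port_is_listed 9090/tcp`.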
http://192.168.157.133:9090/
Status -> Configuration shows the loaded prometheus.yml configuration
Status -> Targets shows the configured scrape targets
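On machines without a browser, the same check can be done from the command line. A minimal sketch, assuming curl is installed; `/-/healthy` is Prometheus's built-in health endpoint, while the function name and the 5-second timeout are my own choices.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if Prometheus answers on its health endpoint.
# $1 is the base URL, e.g. http://192.168.157.133:9090 (the address used in this post).
prometheus_healthy() {
    curl -fsS --max-time 5 "$1/-/healthy" >/dev/null 2>&1
}

if prometheus_healthy "http://192.168.157.133:9090"; then
    echo "Prometheus is up"
else
    echo "Prometheus is not reachable"
fi
```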
Option 1: requires internet access
wget https://dl.grafana.com/oss/release/grafana-6.1.3-1.x86_64.rpm
yum -y localinstall grafana-6.1.3-1.x86_64.rpm
systemctl enable grafana-server
systemctl start grafana-server
Option 2: no internet access required
tar -xvf grafana-6.7.1.linux-amd64.tar.gz
vim /usr/lib/systemd/system/grafana-server.service
[Unit]
Description=Grafana
After=network.target
[Service]
Type=notify
ExecStart=/develop/server/grafana-6.7.1/bin/grafana-server -homepath /develop/server/grafana-6.7.1 -config=/develop/server/grafana-6.7.1/conf/defaults.ini
Restart=on-failure
[Install]
WantedBy=multi-user.target
systemctl enable grafana-server
systemctl start grafana-server
Steps required for both options
firewall-cmd --add-port=3000/tcp --permanent
firewall-cmd --reload
http://192.168.157.133:3000/login
Add data source -> Prometheus -> enter the Prometheus URL
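The data source can also be set up without clicking through the UI, using Grafana's file-based provisioning. A sketch only: the helper name and the file name `prometheus.yaml` are my own, and it assumes the default provisioning directory (`conf/provisioning/datasources` under the Grafana home path).

```shell
#!/bin/sh
# Hypothetical helper: write a data source provisioning file into a directory.
# $1 = target directory, $2 = Prometheus base URL.
write_datasource() {
    dir="$1"
    prometheus_url="$2"
    mkdir -p "$dir"
    cat > "$dir/prometheus.yaml" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $prometheus_url
    isDefault: true
EOF
}

# Write to a scratch directory first so the result can be inspected.
out_dir="$(mktemp -d)"
write_datasource "$out_dir" "http://192.168.157.133:9090"
cat "$out_dir/prometheus.yaml"
```

Once the file looks right, it would go into `/develop/server/grafana-6.7.1/conf/provisioning/datasources/`, followed by a restart of grafana-server.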
Go to https://grafana.com/grafana/dashboards, pick a dashboard you like, and download its JSON file.
Plus icon (+) -> Import -> Upload .json file -> select the Prometheus data source -> Import (Overwrite)
https://grafana.com/tutorials/run-grafana-behind-a-proxy/
# vim
location /grafana/ {
    proxy_pass http://127.0.0.1:3000/;
}
# vim /develop/server/grafana-6.7.1/conf/defaults.ini
# the path at the end of root_url must match the nginx location, which is /grafana/ here
root_url = %(protocol)s://%(domain)s:%(http_port)s/grafana/
serve_from_sub_path = true
How to fix the "If you're seeing this Grafana has failed to load its application files" error:
the browser console showed 206 (Partial Content) errors, and the nginx configuration needs to be changed.
proxy_buffer_size 128k;
proxy_buffers 32 128k;
proxy_busy_buffers_size 128k;
Download the latest release
https://github.com/prometheus/node_exporter/releases
Extract it
tar -xvf node_exporter-1.0.0-rc.1.linux-amd64.tar.gz
mv node_exporter-1.0.0-rc.1.linux-amd64 /develop/server/
sudo useradd -rs /bin/false node_exporter
sudo vi /etc/systemd/system/node_exporter.service
[Unit]
Description=Node Exporter
After=network.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/develop/server/node_exporter-1.0.0-rc.1.linux-amd64/node_exporter
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
sudo systemctl status node_exporter
firewall-cmd --add-port=9100/tcp --permanent
firewall-cmd --reload
http://192.168.157.134:9100/metrics
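The /metrics page is plain text, so a single value can be pulled out with awk. A small sketch: `metric_value` is a made-up name, it only handles metrics without labels, and the sample numbers below are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the value of an unlabelled metric from
# Prometheus text-format input on stdin (lines look like "name value").
metric_value() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Sample input in the same shape as the /metrics output; the numbers are made up.
printf '%s\n' \
    '# HELP node_load1 1m load average.' \
    '# TYPE node_load1 gauge' \
    'node_load1 0.25' \
    'node_load5 0.31' | metric_value node_load1
```

Against a live exporter this would be `curl -s http://192.168.157.134:9100/metrics | metric_value node_load1`.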
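For Windows hosts, download the msi or exe build that fits your system from https://github.com/martinlindhe/wmi_exporter. The msi runs in the background, while the exe runs in a console window. The metrics are served at http://127.0.0.1:9182/metrics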
vim /develop/server/prometheus-2.8.1.linux-amd64/prometheus.yml
  - job_name: 'cbl-local02'
    scrape_interval: 10s
    static_configs:
      - targets: ['192.168.157.134:9100']
        labels:
          instance: '192.168.157.134'
systemctl restart prometheus
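With many servers, typing one job per host by hand gets tedious and YAML indentation is easy to break. Here is a sketch of generating the entry instead: `scrape_job` is a hypothetical helper, and the default port 9100 is simply the node_exporter port used above.

```shell
#!/bin/sh
# Hypothetical helper: print a scrape job entry for one node_exporter host.
# The output is meant to be pasted under scrape_configs: in prometheus.yml.
# $1 = job name, $2 = host, $3 = port (optional, defaults to 9100).
scrape_job() {
    job="$1"
    host="$2"
    port="${3:-9100}"
    cat <<EOF
  - job_name: '$job'
    scrape_interval: 10s
    static_configs:
      - targets: ['$host:$port']
        labels:
          instance: '$host'
EOF
}

# Reproduce the entry used in this post.
scrape_job cbl-local02 192.168.157.134
```

The Prometheus tarball also ships promtool, so running `./promtool check config prometheus.yml` before the restart should catch indentation mistakes.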
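I'm always torn about whether installation walkthroughs like this are worth writing down, but in the end I wrote it so I'd have something to refer to later. While writing, I kept switching between the company environment and my local one, so the post mixes a company-specific version with a public-facing one, which easily causes confusion. I hope to do better next time.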