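Prometheus has a reputation for being very good at monitoring, so I set up an instance myself to try monitoring a running MySQL server.
1. Download and install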
$ wget https://github.com/prometheus/prometheus/releases/download/v1.7.1/prometheus-1.7.1.linux-amd64.tar.gz
$ mkdir app
$ tar zxvf prometheus-1.7.1.linux-amd64.tar.gz -C app
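One way to keep the download and extraction steps pointing at the same archive is to derive both names from a single version and platform value. This is only a sketch: it assumes the release naming scheme visible in the URL above, and the helper name tarball_name and the variable names are my own.

```shell
#!/bin/sh
# Build the release file name from a version and a platform string so that
# wget and tar always refer to the same archive.
# tarball_name is a hypothetical helper, not part of Prometheus.
tarball_name() {
    version="$1"
    platform="$2"
    printf 'prometheus-%s.%s.tar.gz\n' "$version" "$platform"
}

VERSION=1.7.1
PLATFORM=linux-amd64
TARBALL=$(tarball_name "$VERSION" "$PLATFORM")
URL="https://github.com/prometheus/prometheus/releases/download/v${VERSION}/${TARBALL}"

# Print the URL that would be downloaded.
echo "$URL"

# To actually fetch and unpack:
#   wget "$URL" && mkdir -p app && tar zxvf "$TARBALL" -C app
```

Changing PLATFORM in one place then updates both the download and the extraction command.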
2. Edit the configuration file
$ cd app/prometheus-1.7.1.linux-amd64
$ vim prometheus.yml
The initial configuration file looks something like this:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']
There is one default job, which is Prometheus monitoring its own state.
Add one new job below it, under scrape_configs:
  - job_name: 'linux'
    static_configs:
      - targets: ['127.0.0.1:9100']
        labels:
          instance: db1
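Here 127.0.0.1 is the IP of the server I want to monitor (the local machine in this case), and 9100 is the port Prometheus scrapes, i.e. the exporter's port.
Note that YAML files must not contain tab characters; indent with spaces only.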
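Since a stray tab is easy to miss in an editor, a quick check before starting Prometheus can help. The following is a minimal sketch, assuming a POSIX shell with grep; has_tabs is a hypothetical helper name, and the demo uses a throwaway file rather than the real prometheus.yml.

```shell
#!/bin/sh
# has_tabs FILE: print the lines of FILE that contain a tab character
# (with line numbers) and exit 0; exit 1 if there are none.
# has_tabs is a hypothetical helper name.
has_tabs() {
    tab=$(printf '\t')
    grep -n "$tab" "$1"
}

# Demo on a temporary file whose second line is indented with a tab.
demo=$(mktemp)
printf 'global:\n\tscrape_interval: 15s\n' > "$demo"
if has_tabs "$demo" >/dev/null; then
    echo "tabs found: replace them with spaces"
fi
rm -f "$demo"
```

Against the real file this would be run as has_tabs prometheus.yml. The release tarball also ships a promtool binary that can validate the configuration; check its help output for the exact subcommand in your version.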
2.1 Write a systemd unit file
The process will end up being managed by systemd anyway, so here is the unit file.
# /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
[Service]
User=bot
ExecStart=/home/bot/app/prometheus/prometheus \
-config.file=/home/bot/app/prometheus/prometheus.yml \
-storage.local.path=/home/bot/data/prometheus
[Install]
WantedBy=multi-user.target
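The directory given by storage.local.path above has to be created beforehand, with permissions that let the service user (bot in this unit) write to it.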
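Preparing that directory could look roughly like the sketch below. The helper name prepare_data_dir and the mode 750 are my own assumptions; the path and the user bot are taken from the unit file above.

```shell
#!/bin/sh
# prepare_data_dir DIR OWNER: create DIR (including parents), hand it to
# OWNER, and restrict access to that user and its group.
# prepare_data_dir is a hypothetical helper name; mode 750 is an assumption.
prepare_data_dir() {
    dir="$1"
    owner="$2"
    mkdir -p "$dir" &&
    chown "$owner" "$dir" &&
    chmod 750 "$dir"
}

# Intended use on the Prometheus host (run as root):
#   prepare_data_dir /home/bot/data/prometheus bot
#   systemctl daemon-reload
#   systemctl enable prometheus
#   systemctl start prometheus
```

The systemctl lines are left as comments because they only make sense on the target host, after the unit file has been saved under /etc/systemd/system/.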
3. Start
$ ./prometheus -config.file=./prometheus.yml
INFO[0000] Starting prometheus (version=1.7.1, branch=master, revision=3afb3fffa3a29c3de865e1172fb740442e9d0133) source="main.go:88"
INFO[0000] Build context (go=go1.8.3, user=root@0aa1b7fc430d, date=20170612-11:44:05) source="main.go:89"
INFO[0000] Host details (Linux 3.10.0-514.2.2.el7.x86_64 #1 SMP Tue Dec 6 23:06:41 UTC 2016 x86_64 l-test (none)) source="main.go:90"
INFO[0000] Loading configuration file ./prometheus.yml source="main.go:252"
INFO[0000] Loading series map and head chunks... source="storage.go:428"
INFO[0000] 0 series loaded. source="storage.go:439"
INFO[0000] Starting target manager... source="targetmanager.go:63"
INFO[0000] Listening on :9090 source="web.go:259"
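4. Access
Next, open http://prometheus_host:9090
in a browser to reach the Prometheus web UI.
Under Status -> Targets you can see the linux job that was just added. Nothing is listening on that job's port yet, so it shows as down.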
5. Monitor the server
To monitor a server we need an exporter. The exporters provided by Prometheus are listed at https://prometheus.io/download/. The one for machine-level metrics is node_exporter.
# First, download the exporter
$ wget https://github.com/prometheus/node_exporter/releases/download/v0.14.0/node_exporter-0.14.0.linux-amd64.tar.gz
$ tar zxvf node_exporter-0.14.0.linux-amd64.tar.gz
$ cd node_exporter-0.14.0.linux-amd64/
$ ls
LICENSE node_exporter NOTICE
$ ./node_exporter
INFO[0000] Starting node_exporter (version=0.14.0, branch=master, revision=840ba5dcc71a084a3bc63cb6063003c1f94435a6) source="node_exporter.go:140"
INFO[0000] Build context (go=go1.7.5, user=root@bb6d0678e7f3, date=20170321-12:12:54) source="node_exporter.go:141"
INFO[0000] No directory specified, see --collector.textfile.directory source="textfile.go:57"
INFO[0000] Enabled collectors: source="node_exporter.go:160"
INFO[0000] - infiniband source="node_exporter.go:162"
INFO[0000] - edac source="node_exporter.go:162"
INFO[0000] - entropy source="node_exporter.go:162"
INFO[0000] - loadavg source="node_exporter.go:162"
INFO[0000] - mdadm source="node_exporter.go:162"
INFO[0000] - meminfo source="node_exporter.go:162"
INFO[0000] - netstat source="node_exporter.go:162"
INFO[0000] - textfile source="node_exporter.go:162"
INFO[0000] - vmstat source="node_exporter.go:162"
INFO[0000] - diskstats source="node_exporter.go:162"
INFO[0000] - zfs source="node_exporter.go:162"
INFO[0000] - filefd source="node_exporter.go:162"
INFO[0000] - filesystem source="node_exporter.go:162"
INFO[0000] - hwmon source="node_exporter.go:162"
INFO[0000] - netdev source="node_exporter.go:162"
INFO[0000] - stat source="node_exporter.go:162"
INFO[0000] - uname source="node_exporter.go:162"
INFO[0000] - wifi source="node_exporter.go:162"
INFO[0000] - conntrack source="node_exporter.go:162"
INFO[0000] - time source="node_exporter.go:162"
INFO[0000] - sockstat source="node_exporter.go:162"
INFO[0000] Listening on :9100 source="node_exporter.go:186"
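node exporter listens on port 9100, which is why 9100 was used as the target port in the Prometheus configuration earlier.
Going back to Status -> Targets now, the job that was down before has changed to up.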
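The exporter can also be checked from the command line by reading its /metrics endpoint directly. The sketch below pulls a single value out of the text output; metric_value is a hypothetical helper name, it only handles samples without labels, and the demo runs on canned input rather than a live exporter.

```shell
#!/bin/sh
# metric_value NAME: read Prometheus text-format metrics on stdin and
# print the value of the unlabelled sample called NAME.
# Comment lines (# HELP, # TYPE) are skipped because their first field is "#".
# metric_value is a hypothetical helper name.
metric_value() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Demo on canned input. Against a live exporter this would be:
#   curl -s http://127.0.0.1:9100/metrics | metric_value node_load1
printf '# TYPE node_load1 gauge\nnode_load1 0.21\n' | metric_value node_load1
```

node_load1 comes from the loadavg collector shown in the startup log; any other unlabelled metric name works the same way.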
For completeness, here is the systemd unit file for node exporter as well.
[Unit]
Description=Prometheus node exporter
After=local-fs.target network-online.target network.target
Wants=local-fs.target network-online.target network.target
[Service]
ExecStart=/home/bot/app/prometheus_exporter/node_exporter/node_exporter
Type=simple
[Install]
WantedBy=multi-user.target