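1. Overview
node_exporter is one of the most commonly used exporters. Think of it as an agent: it has to be installed on the operating system before it can collect system data. The exporter that collects from Linux-like systems is called node_exporter; if it is run as a sidecar, it can also collect pod information.
In the earlier chapter on metric collection we built a demo with node_exporter, and it already exposed far more metrics than Zabbix. That is only the tip of the iceberg: we started the exporter with its defaults, and it has many more switches that only take effect when passed as flags. It can also be told which classes of metrics not to collect, which saves resources.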
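2. node_exporter in detail
node_exporter is the agent officially provided by the Prometheus project and is hosted under the prometheus account on GitHub. The latest release can be downloaded from the official links. The GitHub repository already has detailed usage instructions; here we walk through the main points.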
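2.1. Collectors enabled by default
The list below comes from the official documentation. These are the collectors that produced the metrics in our earlier experiment, where the exporter ran with default settings.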
Name | Description | OS |
---|---|---|
arp | Exposes ARP statistics from /proc/net/arp. | Linux |
bcache | Exposes bcache statistics from /sys/fs/bcache/. | Linux |
bonding | Exposes the number of configured and active slaves of Linux bonding interfaces. | Linux |
boottime | Exposes system boot time derived from the kern.boottime sysctl. | Darwin, Dragonfly, FreeBSD, NetBSD, OpenBSD, Solaris |
conntrack | Shows conntrack statistics (does nothing if no /proc/sys/net/netfilter/ present). | Linux |
cpu | Exposes CPU statistics. | Darwin, Dragonfly, FreeBSD, Linux, Solaris |
cpufreq | Exposes CPU frequency statistics. | Linux, Solaris |
diskstats | Exposes disk I/O statistics. | Darwin, Linux, OpenBSD |
edac | Exposes error detection and correction statistics. | Linux |
entropy | Exposes available entropy. | Linux |
exec | Exposes execution statistics. | Dragonfly, FreeBSD |
filefd | Exposes file descriptor statistics from /proc/sys/fs/file-nr. | Linux |
filesystem | Exposes filesystem statistics, such as disk space used. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
hwmon | Exposes hardware monitoring and sensor data from /sys/class/hwmon/. | Linux |
infiniband | Exposes network statistics specific to InfiniBand and Intel OmniPath configurations. | Linux |
ipvs | Exposes IPVS status from /proc/net/ip_vs and stats from /proc/net/ip_vs_stats. | Linux |
loadavg | Exposes load average. | Darwin, Dragonfly, FreeBSD, Linux, NetBSD, OpenBSD, Solaris |
mdadm | Exposes statistics about devices in /proc/mdstat (does nothing if no /proc/mdstat present). | Linux |
meminfo | Exposes memory statistics. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
netclass | Exposes network interface info from /sys/class/net/. | Linux |
netdev | Exposes network interface statistics such as bytes transferred. | Darwin, Dragonfly, FreeBSD, Linux, OpenBSD |
netstat | Exposes network statistics from /proc/net/netstat. This is the same information as netstat -s. | Linux |
nfs | Exposes NFS client statistics from /proc/net/rpc/nfs. This is the same information as nfsstat -c. | Linux |
nfsd | Exposes NFS kernel server statistics from /proc/net/rpc/nfsd. This is the same information as nfsstat -s. | Linux |
pressure | Exposes pressure stall statistics from /proc/pressure/. | Linux (kernel 4.20+ and/or CONFIG_PSI) |
rapl | Exposes various statistics from /sys/class/powercap. | Linux |
schedstat | Exposes task scheduler statistics from /proc/schedstat. | Linux |
sockstat | Exposes various statistics from /proc/net/sockstat. | Linux |
softnet | Exposes statistics from /proc/net/softnet_stat. | Linux |
stat | Exposes various statistics from /proc/stat. This includes boot time, forks and interrupts. | Linux |
textfile | Exposes statistics read from local disk. The --collector.textfile.directory flag must be set. | any |
thermal_zone | Exposes thermal zone & cooling device statistics from /sys/class/thermal. | Linux |
time | Exposes the current system time. | any |
timex | Exposes selected adjtimex(2) system call stats. | Linux |
udp_queues | Exposes UDP total lengths of the rx_queue and tx_queue from /proc/net/udp and /proc/net/udp6. | Linux |
uname | Exposes system information as provided by the uname system call. | Darwin, FreeBSD, Linux, OpenBSD |
vmstat | Exposes statistics from /proc/vmstat. | Linux |
xfs | Exposes XFS runtime statistics. | Linux (kernel 4.4+) |
zfs | Exposes ZFS performance statistics. | Linux, Solaris |
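As the table shows, each collector is tied to specific operating systems; not every collector applies to every system. To stop collecting a particular class of metrics, pass the --no-collector.<name> flag, for example: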
./node_exporter --no-collector.time
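2.2. Collectors disabled by default
Collectors that are disabled by default are turned on with the --collector.<name> flag. The official list of collectors that are off by default is: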
Name | Description | OS |
---|---|---|
buddyinfo | Exposes statistics of memory fragments as reported by /proc/buddyinfo. | Linux |
devstat | Exposes device statistics. | Dragonfly, FreeBSD |
drbd | Exposes Distributed Replicated Block Device statistics (to version 8.4). | Linux |
interrupts | Exposes detailed interrupts statistics. | Linux, OpenBSD |
ksmd | Exposes kernel and system statistics from /sys/kernel/mm/ksm. | Linux |
logind | Exposes session counts from logind. | Linux |
meminfo_numa | Exposes memory statistics from /proc/meminfo_numa. | Linux |
mountstats | Exposes filesystem statistics from /proc/self/mountstats. Exposes detailed NFS client statistics. | Linux |
ntp | Exposes local NTP daemon health to check time. | any |
processes | Exposes aggregate process statistics from /proc. | Linux |
qdisc | Exposes queuing discipline statistics. | Linux |
runit | Exposes service status from runit. | any |
supervisord | Exposes service status from supervisord. | any |
systemd | Exposes service and system status from systemd. | Linux |
tcpstat | Exposes TCP connection status information from /proc/net/tcp and /proc/net/tcp6. (Warning: the current version has potential performance issues in high load situations.) | Linux |
wifi | Exposes WiFi device and station statistics. | Linux |
perf | Exposes perf based metrics (Warning: Metrics are dependent on kernel configuration and settings). | Linux |
If you run a CentOS-family or Ubuntu-family distribution, we recommend enabling these four collectors: ntp, mountstats, systemd, and tcpstat.
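The --collector.<name> and --no-collector.<name> flags can be combined on one command line. The sketch below assembles such a command in Python. The helper name build_node_exporter_args and the ./node_exporter path are illustrative assumptions, not part of node_exporter itself.

```python
def build_node_exporter_args(binary, enable=(), disable=()):
    """Return an argv list for node_exporter.

    enable  -> collectors that are off by default (--collector.<name>)
    disable -> collectors that are on by default (--no-collector.<name>)
    """
    overlap = set(enable) & set(disable)
    if overlap:
        raise ValueError(f"collectors both enabled and disabled: {sorted(overlap)}")
    args = [binary]
    # dict.fromkeys drops duplicate names while keeping the original order.
    args += [f"--collector.{name}" for name in dict.fromkeys(enable)]
    args += [f"--no-collector.{name}" for name in dict.fromkeys(disable)]
    return args


if __name__ == "__main__":
    argv = build_node_exporter_args(
        "./node_exporter",  # placeholder path to the binary
        enable=["ntp", "mountstats", "systemd", "tcpstat"],
        disable=["time"],
    )
    print(" ".join(argv))
```

Dropping duplicates in the helper guards against listing the same collector twice, which is an easy slip when the flag list is maintained by hand.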
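2.3. Textfile collector
The --collector.textfile.directory option makes the exporter read metrics from text files in the given directory and expose them alongside its own metrics. It is not used very often, so it is enough to know that it exists.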
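As an illustration of how a job might feed the textfile collector, here is a minimal Python sketch that writes gauge metrics in the Prometheus text format to a *.prom file. The function name, the metric name in the usage note, and the directory are assumptions chosen for the example. The file is written to a temporary name and then renamed, so the exporter never reads a half-written file.

```python
import os
import tempfile


def write_prom_file(directory, filename, metrics):
    """Atomically write gauge metrics in the Prometheus text format.

    metrics maps a metric name to a (help_text, value) pair.
    """
    if not filename.endswith(".prom"):
        raise ValueError("the textfile collector only reads *.prom files")
    lines = []
    for name, (help_text, value) in metrics.items():
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    body = "\n".join(lines) + "\n"

    # Write to a temp file in the same directory, then rename it into place.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(body)
    # mkstemp creates the file as 0600; relax it so the exporter's user can read it.
    os.chmod(tmp_path, 0o644)
    final_path = os.path.join(directory, filename)
    os.replace(tmp_path, final_path)
    return final_path
```

A cron job could call, for example, write_prom_file("/path/to/textfile_dir", "backup.prom", {"backup_last_success_timestamp_seconds": ("Unix time of the last successful backup.", 1600000000)}), with the same directory passed to --collector.textfile.directory.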
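2.4. Filtering collectors in Prometheus
We can also filter which collectors are scraped from the Prometheus side. In the Prometheus configuration file, add the following to the scrape config of the exporter's job: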
params:
  collect[]:
    - foo
    - bar
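This reduces the amount of data Prometheus has to ingest, but it does not change which collectors are enabled on node_exporter itself. In our view it is of limited use: if a class of metrics is never needed, it is simpler to disable that collector on the exporter.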
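To see what this setting does on the wire, the sketch below builds the URL that a scrape with collect[] parameters would request. It is an illustration only: the function name is hypothetical, and localhost:9100 and the cpu and meminfo collectors in the usage note are assumed values.

```python
from urllib.parse import urlencode


def scrape_url(target, collectors, scheme="http", path="/metrics"):
    """Build the URL that a scrape with collect[] params would request."""
    # doseq=True repeats the key once per list element: collect[]=a&collect[]=b
    query = urlencode({"collect[]": list(collectors)}, doseq=True)
    base = f"{scheme}://{target}{path}"
    return f"{base}?{query}" if query else base


if __name__ == "__main__":
    print(scrape_url("localhost:9100", ["cpu", "meminfo"]))
```

Requesting the printed URL with curl or a browser is a quick way to check which metrics a filtered scrape returns before changing the Prometheus configuration.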
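2.5. TLS configuration
By default the metrics are served over plain HTTP with no authentication, so everything travels in cleartext, which is a security risk. To encrypt the traffic, start the exporter with ./node_exporter --web.config=web-config.yml (newer releases name this flag --web.config.file; check ./node_exporter --help for your version). The file has the following format: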
tls_server_config:
  # Certificate and key files for server to use to authenticate to client.
  cert_file: <filename>
  key_file: <filename>
  # Server policy for client authentication. Maps to ClientAuth Policies.
  # For more detail on clientAuth options:
  # https://golang.org/pkg/crypto/tls/#ClientAuthType
  [ client_auth_type: <string> | default = "NoClientCert" ]
  # CA certificate for client certificate authentication to the server.
  [ client_ca_file: <filename> ]
  # Minimum TLS version that is acceptable.
  [ min_version: <string> | default = "TLS12" ]
  # Maximum TLS version that is acceptable.
  [ max_version: <string> | default = "TLS13" ]
  # List of supported cipher suites for TLS versions up to TLS 1.2. If empty,
  # Go default cipher suites are used. Available cipher suites are documented
  # in the go documentation:
  # https://golang.org/pkg/crypto/tls/#pkg-constants
  [ cipher_suites:
    [ - <string> ] ]
  # prefer_server_cipher_suites controls whether the server selects the
  # client's most preferred ciphersuite, or the server's most preferred
  # ciphersuite. If true then the server's preference, as expressed in
  # the order of elements in cipher_suites, is used.
  [ prefer_server_cipher_suites: <bool> | default = true ]
  # Elliptic curves that will be used in an ECDHE handshake, in preference
  # order. Available curves are documented in the go documentation:
  # https://golang.org/pkg/crypto/tls/#CurveID
  [ curve_preferences:
    [ - <string> ] ]
http_server_config:
  # Enable HTTP/2 support. Note that HTTP/2 is only supported with TLS.
  # This can not be changed on the fly.
  [ http2: <bool> | default = true ]
# Usernames and hashed passwords that have full access to the web
# server via basic authentication. If empty, no basic authentication is
# required. Passwords are hashed with bcrypt.
basic_auth_users:
  [ <string>: <secret> ... ]
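The passwords in this file are stored as bcrypt hashes, not in plaintext. The usual tool for producing them is htpasswd. For example, to hash the password Passw0rd: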
htpasswd -nBC 10 "" | tr -d ':\n'
New password:
Re-type new password:
$2y$10$6oDcZssS/R6yPy8ixVx7ue8LiPX7CHpNtXxdVGlYkNgzW3CT48TfC
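Create a password file non-interactively (-c creates the file, -b takes the password from the command line; add -B if you need bcrypt hashes, which is what the exporter expects):
htpasswd -bc /application/prometheus/conf/htpasswd admin Passw0rd
Add a user to an existing file, or update that user's password if it is already there:
htpasswd -b /application/prometheus/conf/htpasswd admin Passw0rd
Print the username and hashed password to the screen without touching any password file: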
htpasswd -nb admin Passw0rd
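Once TLS and basic authentication are enabled, a client has to send credentials and trust the server certificate. The sketch below shows one way to do that with the Python standard library. The function names, the URL, and the ca_file parameter are assumptions for illustration; the client sends the plaintext password, while the server only stores its bcrypt hash.

```python
import base64
import ssl
import urllib.request


def build_metrics_request(url, username, password):
    """Return a urllib Request carrying an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def fetch_metrics(url, username, password, ca_file=None):
    """Fetch the metrics page over HTTPS.

    ca_file is the CA certificate that signed the exporter's cert_file;
    when it is None, the system trust store is used instead.
    """
    context = ssl.create_default_context(cafile=ca_file)
    request = build_metrics_request(url, username, password)
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return response.read().decode("utf-8")
```

For example, fetch_metrics("https://localhost:9100/metrics", "admin", "Passw0rd", ca_file="ca.crt") would return the exposition text if the exporter is listening on that address with a certificate signed by ca.crt.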
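2.6. systemd example
To make the service easier to manage, we run it under systemd. The heredoc delimiter is quoted ('EOF') so that the shell writes $MAINPID and the line continuations into the unit file literally instead of expanding them:
cat > /etc/systemd/system/node_exporter.service << 'EOF'
[Unit]
Description=node_exporter
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=prometheus
# systemd does not start the service in your shell's directory, so point
# --web.config at the absolute path of web-config.yml in practice.
ExecStart=/usr/local/bin/node_exporter \
  --web.config=web-config.yml \
  --collector.ntp \
  --collector.mountstats \
  --collector.systemd \
  --collector.tcpstat
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
Restart=always
[Install]
WantedBy=multi-user.target
EOF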
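After the service starts, it is worth confirming that the extra collectors are actually running. node_exporter reports a node_scrape_collector_success series per collector, so one rough check is to parse the metrics page and look for the expected names. The Python sketch below does this; the function names and the sample text in the usage block are illustrative assumptions.

```python
import re

# Matches lines such as: node_scrape_collector_success{collector="ntp"} 1
_SUCCESS_LINE = re.compile(
    r'^node_scrape_collector_success\{collector="([^"]+)"\}\s+(\S+)$'
)


def collector_status(metrics_text):
    """Map collector name -> True/False from node_scrape_collector_success lines."""
    status = {}
    for line in metrics_text.splitlines():
        match = _SUCCESS_LINE.match(line.strip())
        if match:
            status[match.group(1)] = float(match.group(2)) == 1.0
    return status


def missing_collectors(metrics_text, expected):
    """Return expected collectors that are absent or reported a failed scrape."""
    status = collector_status(metrics_text)
    return [name for name in expected if not status.get(name, False)]


if __name__ == "__main__":
    # Hand-written sample, not real exporter output.
    sample = (
        'node_scrape_collector_success{collector="ntp"} 1\n'
        'node_scrape_collector_success{collector="systemd"} 0\n'
    )
    print(missing_collectors(sample, ["ntp", "mountstats", "systemd", "tcpstat"]))
```

An empty result means every expected collector is present and its last scrape succeeded; any name in the list points at a missing flag or a collector that failed on this host.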