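Prometheus is an open-source monitoring and alerting system and time series database (TSDB) originally developed at SoundCloud. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community. It is now a standalone open-source project, maintained independently of any company. Prometheus joined the CNCF (Cloud Native Computing Foundation) in 2016 as the second hosted project after Kubernetes.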
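Compared with other monitoring systems, Prometheus has a number of distinguishing features.
The Prometheus ecosystem consists of multiple components, many of which are optional.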
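Grafana is an excellent dashboard tool that provides a powerful and elegant way to create, share, and explore data. A dashboard displays data from your different metric data sources.
Grafana is one of the most popular tools for visualizing time series data in infrastructure and application analytics, and it is also widely used in industrial control, automation monitoring, and process management.
Grafana has pluggable panels and extensible data sources, and it currently supports most of the commonly used time series databases.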
Deploying Prometheus in a test environment is very simple. First, download the binary package from the Prometheus GitHub releases page: https://github.com/prometheus/prometheus/releases
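After downloading, extract Prometheus into the /opt/prometheus directory (create the directory first, because tar -C requires it to exist):
mkdir -p /opt/prometheus
tar -zxvf prometheus-1.5.2.linux-amd64.tar.gz -C /opt/prometheus --strip-components=1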
Then create the Prometheus configuration file, prometheus.yml, with the following content:
global:
  scrape_interval: 15s
  external_labels:
    monitor: 'codelab-monitor'
scrape_configs:
  - job_name: 'kubernetes-nodes-cadvisor'
    kubernetes_sd_configs:
      - api_server: 'http://172.16.7.1:8080'
        role: node
    relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - source_labels: [__meta_kubernetes_role]
        action: replace
        target_label: kubernetes_role
      # Rewrite the default port 10250 to port 10255
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:10255'
        target_label: __address__
  - job_name: 'kubernetes_node'
    kubernetes_sd_configs:
      - role: node
        api_server: 'http://172.16.7.1:8080'
    relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
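The address rewrite performed by the relabel_configs above can be illustrated outside Prometheus. The following is a minimal Python sketch, not Prometheus's own code: the helper name rewrite_port and the node address 172.16.7.11 are made up for illustration, and the sketch only mimics two documented behaviours, namely that the relabel regex is anchored at both ends and that a value which does not match is left unchanged.

```python
import re


def rewrite_port(address: str, old_port: str, new_port: str) -> str:
    """Mimic a relabel 'replace' rule such as
    regex: '(.*):10250' with replacement: '${1}:10255'.

    Hypothetical helper for illustration only. Prometheus anchors the
    regex at both ends, which re.fullmatch reproduces here.
    """
    match = re.fullmatch(r"(.*):" + re.escape(old_port), address)
    if match is None:
        # No match: the label keeps its original value.
        return address
    return f"{match.group(1)}:{new_port}"


# kubernetes-nodes-cadvisor job: kubelet port 10250 -> port 10255
print(rewrite_port("172.16.7.11:10250", "10250", "10255"))
# kubernetes_node job: kubelet port 10250 -> node_exporter port 9100
print(rewrite_port("172.16.7.11:10250", "10250", "9100"))
```

Both jobs discover the same nodes; only the port they end up scraping differs.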
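Explanation of the configuration file:
The configuration above defines two jobs. kubernetes-nodes-cadvisor scrapes port 10255 on each Kubernetes node (cAdvisor data). The other job, kubernetes_node, monitors the Linux system on each node; this requires the node_exporter monitoring plugin (https://github.com/prometheus/node_exporter) to be installed on the node. The simplest way to install it is as a container: once Docker is running, execute the following command: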
docker run -d \
--net="host" \
--pid="host" \
quay.io/prometheus/node-exporter
Once the configuration is in place and the monitoring plugin is installed, run the following command to start Prometheus:
nohup ./prometheus --config.file=prometheus.yml &
Once it is running, open http://<deployment-IP>:9090/graph to access Prometheus. The scrape status of each target can be seen under Status > Targets in the menu.
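Besides the web UI, scrape health can also be checked from a script by running the expression up against the Prometheus HTTP query API. The following is a rough Python sketch under stated assumptions: build_query_url and targets_down are hypothetical helper names, the host localhost:9090 is a placeholder, and SAMPLE is a hand-written payload in the shape of an instant-query response, not real output from a server.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url: str, expr: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def targets_down(response_body: str) -> list:
    """Return the instance labels whose 'up' sample is not 1."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("query did not succeed")
    down = []
    for sample in payload["data"]["result"]:
        # Each sample value is a [timestamp, "value-as-string"] pair.
        if float(sample["value"][1]) != 1.0:
            down.append(sample["metric"].get("instance", "unknown"))
    return down


# Hand-written sample mimicking an instant-query response (not real output).
SAMPLE = json.dumps({
    "status": "success",
    "data": {"resultType": "vector", "result": [
        {"metric": {"__name__": "up", "instance": "node-a:9100"}, "value": [0, "1"]},
        {"metric": {"__name__": "up", "instance": "node-b:9100"}, "value": [0, "0"]},
    ]},
})

print(build_query_url("http://localhost:9090", "up"))
print(targets_down(SAMPLE))
```

Against a live server, the response body could be fetched with urllib.request.urlopen on the URL returned by build_query_url and passed to targets_down unchanged.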
Commands for installing Grafana from the RPM package:
wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-4.0.1-1480694114.x86_64.rpm
yum localinstall grafana-4.0.1-1480694114.x86_64.rpm
service grafana-server start
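Installation is complete.
Open http://<deployment-IP>:3000 in a browser and log in with the default username and password (admin/admin) to enter Grafana.
Then configure the data source:
Configuration > Data Sources > Add data source
Add a data source of type Prometheus and enter its address and port.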
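The same data source can also be registered without the UI through Grafana's HTTP API (POST /api/datasources). The following is a minimal Python sketch that only builds the request and does not send it. The helper name build_datasource_request, the display name "prometheus", and the localhost URLs are assumptions made for this example; the credentials are the admin/admin defaults mentioned above.

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url, prometheus_url,
                             user="admin", password="admin"):
    """Build (but do not send) a request that adds a Prometheus data source.

    Hypothetical helper for illustration; adjust names and URLs as needed.
    """
    body = json.dumps({
        "name": "prometheus",   # display name, chosen for this example
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",      # the Grafana backend proxies the queries
    }).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{grafana_url.rstrip('/')}/api/datasources",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


req = build_datasource_request("http://localhost:3000", "http://localhost:9090")
print(req.get_method(), req.full_url)
# To actually create the data source: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the payload before touching a running Grafana instance.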
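Once the data source is added, you can build your own panels or import JSON dashboards. JSON dashboards can be downloaded from https://grafana.com/dashboards, and there are other open-source dashboards as well, such as Percona's MySQL dashboards: https://github.com/percona/grafana-dashboards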
Edit the Grafana configuration file:
vi /etc/grafana/grafana.ini
[dashboards.json]
enabled = true
path = /var/lib/grafana/dashboards
Then restart Grafana.
Method 1: import the .json file via Create > Import in the menu.
Method 2: place the JSON file in the /var/lib/grafana/dashboards directory and restart Grafana.
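For method 2, it can help to check that a downloaded file is valid JSON before copying it into the dashboards directory. The following is a small Python sketch: install_dashboard is a hypothetical helper, and the check for a top-level "title" key is only a simple sanity test assumed for this example, not a full validation of Grafana's dashboard format.

```python
import json
import shutil
from pathlib import Path


def install_dashboard(json_file, dashboards_dir="/var/lib/grafana/dashboards"):
    """Validate a dashboard JSON file and copy it into the dashboards directory.

    Hypothetical helper for illustration. Returns the path of the copied file.
    """
    source = Path(json_file)
    with source.open(encoding="utf-8") as handle:
        dashboard = json.load(handle)  # raises ValueError on malformed JSON
    # Assumed sanity check: a dashboard is a JSON object with a title.
    if not isinstance(dashboard, dict) or "title" not in dashboard:
        raise ValueError(f"{source} does not look like a dashboard")
    target_dir = Path(dashboards_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(source, target_dir / source.name))


# Example call (Grafana must be restarted afterwards, as described above):
# install_dashboard("my-dashboard.json")
```

The file name my-dashboard.json in the commented call is a placeholder.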
global:
  scrape_interval: 30s
  scrape_timeout: 30s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'kubernetes-cluster'
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - api_servers:
          - 'https://kubernetes.default.svc'
        in_cluster: true
        role: apiserver
  - job_name: 'kubernetes-nodes'
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - api_servers:
          - 'https://kubernetes.default.svc'
        in_cluster: true
        role: node
    relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
  - job_name: 'kubernetes-service-endpoints'
    scheme: https
    kubernetes_sd_configs:
      - api_servers:
          - 'https://kubernetes.default.svc'
        in_cluster: true
        role: endpoint
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: (.+)(?::\d+);(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_service_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
  - job_name: 'kubernetes-services'
    scheme: https
    metrics_path: /probe
    params:
      module: [http_2xx]
    kubernetes_sd_configs:
      - api_servers:
          - 'https://kubernetes.default.svc'
        in_cluster: true
        role: service
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
        action: keep
        regex: true
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __address__
        replacement: blackbox
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_service_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        target_label: kubernetes_name
  - job_name: 'kubernetes-pods'
    scheme: https
    kubernetes_sd_configs:
      - api_servers:
          - 'https://kubernetes.default.svc'
        in_cluster: true
        role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: (.+):(?:\d+);(\d+)
        replacement: ${1}:${2}
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_pod_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
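The annotation-driven rules in the kubernetes-pods job are easier to follow when traced on concrete label values. The following Python sketch imitates only two of those steps, the keep filter and the address rewrite. It is not Prometheus's implementation: relabel_pod is a hypothetical helper and the pod addresses used with it are invented. It relies on the documented behaviour that the values of multiple source_labels are joined with ";" before the anchored regex is applied.

```python
import re

SCRAPE = "__meta_kubernetes_pod_annotation_prometheus_io_scrape"
PORT = "__meta_kubernetes_pod_annotation_prometheus_io_port"


def relabel_pod(labels: dict):
    """Apply the 'keep' and address-rewrite steps of the kubernetes-pods job.

    Hypothetical helper for illustration. Returns None when the target
    would be dropped, otherwise a new label dict.
    """
    # keep: drop the target unless the scrape annotation is exactly "true".
    if re.fullmatch(r"true", labels.get(SCRAPE, "")) is None:
        return None
    result = dict(labels)
    # replace: source label values are joined with ';' before matching.
    joined = ";".join([labels.get("__address__", ""), labels.get(PORT, "")])
    match = re.fullmatch(r"(.+):(?:\d+);(\d+)", joined)
    if match is not None:
        # Equivalent of replacement: ${1}:${2}
        result["__address__"] = f"{match.group(1)}:{match.group(2)}"
    return result


annotated = {"__address__": "10.0.0.5:8080", SCRAPE: "true", PORT: "9102"}
print(relabel_pod(annotated)["__address__"])
print(relabel_pod({"__address__": "10.0.0.6:8080"}))
```

A pod that sets the scrape annotation but no port annotation passes the keep step and keeps its discovered address, because the joined value then fails to match the regex.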