Prometheus: Monitoring Kubernetes Nodes

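  The NodeExporter project from the Prometheus community monitors key host metrics. A Kubernetes DaemonSet can deploy exactly one NodeExporter instance on each host node to collect host performance metrics. Because of container isolation, however, a containerized NodeExporter cannot correctly read the host's disk information, so this course deploys NodeExporter directly on the host.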

node_exporter: a collector written in Go for monitoring *NIX systems

  • Documentation: https://prometheus.io/docs/guides/node-exporter/
  • GitHub: https://github.com/prometheus/node_exporter
  • Exporter list: https://prometheus.io/docs/instrumenting/exporters/

Official documentation (kube-state-metrics): https://github.com/kubernetes/kube-state-metrics

node-exporter mainly collects the following metrics:

 node_cpu_*
 node_disk_*
 node_entropy_*
 node_filefd_*
 node_filesystem_*
 node_forks_*
 node_intr_total_*
 node_ipvs_*
 node_load_*
 node_memory_*
 node_netstat_*
 node_network_*
 node_nf_conntrack_*
 node_scrape_*
 node_sockstat_*
 node_time_seconds_*
 node_timex_*
 node_xfs_*
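
To see which of these families a host actually exposes, the text that node_exporter serves on /metrics can be grouped by name prefix. The following is a minimal sketch, not part of the original setup: the sample text and the helper names (`metric_name`, `count_by_family`) are illustrative assumptions, and the load family is matched as `node_load` because the real metric names are `node_load1`, `node_load5` and `node_load15`.

```python
import re
from collections import Counter

# Hypothetical excerpt of node_exporter output in the Prometheus text format.
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.21
node_memory_MemAvailable_bytes 3.2e+09
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
node_cpu_seconds_total{cpu="0",mode="user"} 56.7
node_filesystem_avail_bytes{device="/dev/sda1",mountpoint="/"} 1.0e+10
"""

# A few family prefixes taken from the list above.
PREFIXES = ["node_cpu_", "node_disk_", "node_filesystem_",
            "node_load", "node_memory_", "node_network_"]


def metric_name(line):
    """Return the metric name of a sample line, or None for comments and blanks."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = re.match(r"[a-zA-Z_:][a-zA-Z0-9_:]*", line)
    return match.group(0) if match else None


def count_by_family(text, prefixes=PREFIXES):
    """Count samples per family prefix; names matching no prefix go under 'other'."""
    counts = Counter()
    for line in text.splitlines():
        name = metric_name(line)
        if name is None:
            continue
        family = next((p for p in prefixes if name.startswith(p)), "other")
        counts[family] += 1
    return counts


if __name__ == "__main__":
    for family, n in sorted(count_by_family(SAMPLE).items()):
        print(f"{family:20s} {n}")
```

Against a real host, the sample text would be replaced by the body fetched from the exporter, for example port 9100 on one of the node addresses used in the configuration.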

Configuration file

The modified configuration file:

    # Prometheus configuration file: prometheus-configmap.yaml
    # Prometheus configuration format: https://prometheus.io/docs/prometheus/latest/configuration/configuration/
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: prometheus-config
      namespace: kube-system 
      labels:
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: EnsureExists
    data:
      # Holds the Prometheus configuration file
      prometheus.yml: |
        # Scrape targets
        scrape_configs:
        - job_name: prometheus
          static_configs:
          - targets:
            # Scrape Prometheus itself
            - localhost:9090
    
        # Scrape node_exporter running on each host
        - job_name: kubernetes-nodes
          static_configs:
          - targets:
            - 192.168.1.110:9100
            - 192.168.1.111:9100
        
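        # Scrape: API server health metrics
        # The job is named kubernetes-apiservers
        - job_name: kubernetes-apiservers
          # Kubernetes-based service discovery
          kubernetes_sd_configs:
          - role: endpoints
          # Filter targets using the discovered metadata labels
          relabel_configs:
          # Keep only targets whose labels match the regex
          - action: keep
            # Matches namespace;service name;port name of the API server endpoint
            regex: default;kubernetes;https
            source_labels:
            - __meta_kubernetes_namespace
            - __meta_kubernetes_service_name
            - __meta_kubernetes_endpoint_port_name
          # Scrape over https (the default is http)
          scheme: https
          tls_config:
            # CA certificate Prometheus uses when connecting to the API server
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            # Skip TLS certificate verification
            insecure_skip_verify: true
          # Service account token Prometheus uses to authenticate to the API server
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token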
     
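        # Scrape: kubelet health metrics
        - job_name: kubernetes-nodes-kubelet
          kubernetes_sd_configs:
          # Discover every Node in the cluster
          - role: node
          relabel_configs:
          # Copy each Kubernetes node label onto the target, using the regex capture as the label name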
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          scheme: https
          tls_config:
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            insecure_skip_verify: true
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    
        # Scrape: nodes-cadvisor
        - job_name: kubernetes-nodes-cadvisor
          kubernetes_sd_configs:
          - role: node
          relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          # Set the metrics path to the cAdvisor endpoint
          - target_label: __metrics_path__
            replacement: /metrics/cadvisor
          scheme: https
          tls_config:
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            insecure_skip_verify: true
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    
        # Scrape: service-endpoints
        - job_name: kubernetes-service-endpoints
          # Discover targets from Service endpoints
          kubernetes_sd_configs:
          - role: endpoints
          relabel_configs:
          - action: keep
            regex: true
            # Keep only Services annotated with prometheus.io/scrape: "true"
            source_labels:
            - __meta_kubernetes_service_annotation_prometheus_io_scrape
          - action: replace
            regex: (https?)
            source_labels:
            - __meta_kubernetes_service_annotation_prometheus_io_scheme
            # Use the annotated scheme (http or https) for the scrape
            target_label: __scheme__
          - action: replace
            regex: (.+)
            source_labels:
            - __meta_kubernetes_service_annotation_prometheus_io_path
            target_label: __metrics_path__
          - action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            source_labels:
            - __address__
            - __meta_kubernetes_service_annotation_prometheus_io_port
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - action: replace
            source_labels:
            - __meta_kubernetes_namespace
            target_label: kubernetes_namespace
          - action: replace
            source_labels:
            - __meta_kubernetes_service_name
            target_label: kubernetes_name
    
        # Scrape: kubernetes-services service metrics
        - job_name: kubernetes-services
          kubernetes_sd_configs:
          - role: service
          # Black-box probing: check whether the service IP and port are reachable
          metrics_path: /probe
          params:
            module:
            - http_2xx
          relabel_configs:
          - action: keep
            regex: true
            source_labels:
            - __meta_kubernetes_service_annotation_prometheus_io_probe
          - source_labels:
            - __address__
            target_label: __param_target
          # Send the probe through the blackbox exporter
          - replacement: blackbox
            target_label: __address__
          - source_labels:
            - __param_target
            target_label: instance
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels:
            - __meta_kubernetes_namespace
            target_label: kubernetes_namespace
          - source_labels:
            - __meta_kubernetes_service_name
            target_label: kubernetes_name
    
        # Scrape: kubernetes-pods
        - job_name: kubernetes-pods
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
          - action: keep
            regex: true
            source_labels:
            # Keep only Pods annotated with prometheus.io/scrape: "true"
            - __meta_kubernetes_pod_annotation_prometheus_io_scrape
          - action: replace
            regex: (.+)
            source_labels:
            - __meta_kubernetes_pod_annotation_prometheus_io_path
            target_label: __metrics_path__
          - action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            source_labels:
            # Scrape address
            - __address__
            # Scrape port
            - __meta_kubernetes_pod_annotation_prometheus_io_port
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - action: replace
            source_labels:
            - __meta_kubernetes_namespace
            target_label: kubernetes_namespace
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_name
            target_label: kubernetes_pod_name
        alerting:
          # Alertmanager configuration
          alertmanagers:
          - kubernetes_sd_configs:
              # Discover Alertmanager pods dynamically
              - role: pod
            tls_config:
              ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
            relabel_configs:
            - source_labels: [__meta_kubernetes_namespace]
              regex: kube-system 
              action: keep
            - source_labels: [__meta_kubernetes_pod_label_k8s_app]
              regex: alertmanager
              action: keep
            - source_labels: [__meta_kubernetes_pod_container_port_number]
              regex:
              action: drop
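
The relabel rules in this configuration can be hard to read at first. The sketch below imitates three of the actions used above (keep, replace, labelmap) in plain Python to show what they do to a target's labels. It is an illustration under stated assumptions, not Prometheus code: the function names and the sample label values are made up, Python's `re.fullmatch` stands in for Prometheus's fully anchored RE2 matching, and the replacement is written `\1:\2` where Prometheus writes `$1:$2`.

```python
import re

# Regex from the kubernetes-pods and kubernetes-service-endpoints jobs above.
ADDRESS_REGEX = r"([^:]+)(?::\d+)?;(\d+)"


def join_labels(labels, source_labels, separator=";"):
    """Concatenate the source label values; a missing label counts as empty."""
    return separator.join(labels.get(name, "") for name in source_labels)


def keep(labels, source_labels, regex):
    """action: keep. True when the joined value fully matches the regex."""
    return re.fullmatch(regex, join_labels(labels, source_labels)) is not None


def replace(labels, source_labels, regex, target_label, replacement=r"\1"):
    """action: replace. On a match, write the expanded replacement to target_label."""
    match = re.fullmatch(regex, join_labels(labels, source_labels))
    out = dict(labels)
    if match:
        out[target_label] = match.expand(replacement)
    return out


def labelmap(labels, regex, replacement=r"\1"):
    """action: labelmap. Copy every label whose name matches to the expanded name."""
    out = dict(labels)
    for name, value in labels.items():
        match = re.fullmatch(regex, name)
        if match:
            out[match.expand(replacement)] = value
    return out


if __name__ == "__main__":
    # Hypothetical pod target as service discovery might present it.
    pod = {
        "__address__": "10.0.0.5:8080",
        "__meta_kubernetes_pod_annotation_prometheus_io_scrape": "true",
        "__meta_kubernetes_pod_annotation_prometheus_io_port": "9102",
        "__meta_kubernetes_pod_label_app": "demo",
    }
    if keep(pod, ["__meta_kubernetes_pod_annotation_prometheus_io_scrape"], "true"):
        pod = replace(
            pod,
            ["__address__", "__meta_kubernetes_pod_annotation_prometheus_io_port"],
            ADDRESS_REGEX,
            "__address__",
            r"\1:\2",
        )
        pod = labelmap(pod, r"__meta_kubernetes_pod_label_(.+)")
        print(pod["__address__"], pod["app"])
```

In this sketch the discovered address `10.0.0.5:8080` is rewritten to `10.0.0.5:9102`, because the annotated port replaces the discovered one. When the port annotation is missing, the regex does not match and the address is left unchanged.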

Node deployment: node_exporter

1. Apply the configuration file

kubectl apply -f prometheus-configmap.yaml 

2. Check that it took effect

[Image 1]
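
Besides the web UI, the same check can be scripted against the Prometheus HTTP API, whose `/api/v1/targets` endpoint lists the active targets and their health. The following is a minimal sketch: the base URL `http://localhost:9090` is an assumption about where Prometheus is reachable, the sample response is hypothetical and trimmed to the fields used, and the helper names are illustrative.

```python
import json
from urllib.request import urlopen


def fetch_targets(base_url):
    """Fetch the target list from a running Prometheus server."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)


def summarize_targets(payload):
    """Map each job name to a list of (instance, health) pairs."""
    summary = {}
    for target in payload["data"]["activeTargets"]:
        labels = target["labels"]
        summary.setdefault(labels["job"], []).append(
            (labels["instance"], target["health"])
        )
    return summary


# Hypothetical response, trimmed to the fields this sketch reads.
SAMPLE = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "kubernetes-nodes", "instance": "192.168.1.110:9100"},
             "health": "up"},
            {"labels": {"job": "kubernetes-nodes", "instance": "192.168.1.111:9100"},
             "health": "down"},
        ]
    },
}

if __name__ == "__main__":
    # Against a live server (address is an assumption):
    # payload = fetch_targets("http://localhost:9090")
    payload = SAMPLE
    for job, targets in summarize_targets(payload).items():
        for instance, health in targets:
            print(f"{job:20s} {instance:22s} {health}")
```

A node_exporter target reported as `down` usually means that port 9100 on that host is not reachable from the Prometheus pod.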

3. Import the Grafana dashboard template with ID 9276

[Image 2]

4. Select the group

[Image 3]

5. View the node information (if nothing is displayed, adjust the dashboard to fit your environment)

[Image 4]

 
