Prometheus (8): Network Probing and Black-Box Monitoring

Introduction

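Blackbox Exporter is the official black-box monitoring solution from the Prometheus community. It lets users probe the network over HTTP, HTTPS, DNS, TCP, and ICMP. You can fetch the Blackbox Exporter source and build a local executable directly with the go get command (on Go 1.17 and later, go install with an @version suffix replaces this use of go get):

go get github.com/prometheus/blackbox_exporter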

GitHub repository:
https://github.com/prometheus/blackbox_exporter

Deployment

1 Binary installation

1.1 Download and extract

# -L follows the redirect that GitHub release downloads return
curl -L -o blackbox_exporter-0.24.0.linux-amd64.tar.gz https://github.com/prometheus/blackbox_exporter/releases/download/v0.24.0/blackbox_exporter-0.24.0.linux-amd64.tar.gz

tar -xf blackbox_exporter-0.24.0.linux-amd64.tar.gz  -C /usr/local/
mv /usr/local/blackbox_exporter-0.24.0.linux-amd64 /usr/local/blackbox_exporter-0.24.0

1.2 Configure systemd

[Unit]
Description=The blackbox exporter
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/blackbox_exporter-0.24.0/blackbox_exporter --config.file=/usr/local/blackbox_exporter-0.24.0/blackbox.yml

KillSignal=SIGQUIT

Restart=always

RestartPreventExitStatus=1 6 SIGABRT

TimeoutStopSec=5
KillMode=process
PrivateTmp=true
LimitNOFILE=1048576
LimitNPROC=1048576

[Install]
WantedBy=multi-user.target

1.3 Configuration file blackbox.yml

2 Container installation

Docker image:
https://hub.docker.com/r/prom/blackbox-exporter/tags

docker pull prom/blackbox-exporter:v0.23.0

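When running Blackbox Exporter, the user must supply the probe configuration. This may include custom HTTP headers, TLS settings needed for probing, or the probe's own validation behavior. In Blackbox Exporter each probe configuration is called a module and is supplied in a YAML configuration file. Each module mainly consists of the probe type (prober), the probe timeout (timeout), and the settings specific to that probe: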

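  # Probe type: http, tcp, dns, icmp (recent releases also accept grpc).
  prober: <prober_string>

  # How long the probe waits before giving up.
  [ timeout: <duration> ]

  # Probe-specific settings; at most one of these may be specified.
  [ http: <http_probe> ]
  [ tcp: <tcp_probe> ]
  [ dns: <dns_probe> ]
  [ icmp: <icmp_probe> ]
  [ grpc: <grpc_probe> ]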
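The "at most one probe section" rule above can be checked mechanically. The following is a minimal Python sketch, assuming the modules have already been loaded into plain dictionaries; `validate_module` and `KNOWN_PROBERS` are hypothetical names for illustration and are not part of Blackbox Exporter. The set includes grpc only because the fuller blackbox.yml later in this article uses that prober.

```python
# Hypothetical lint for module definitions; not part of Blackbox Exporter.
KNOWN_PROBERS = {"http", "tcp", "dns", "icmp", "grpc"}


def validate_module(name, module):
    """Return a list of human-readable problems found in one module."""
    problems = []
    prober = module.get("prober")
    if prober not in KNOWN_PROBERS:
        problems.append(f"{name}: unknown or missing prober {prober!r}")
    # Collect the probe-specific sections (http:, tcp:, ...) that are present.
    sections = sorted(key for key in module if key in KNOWN_PROBERS)
    if len(sections) > 1:
        problems.append(f"{name}: more than one probe section: {sections}")
    return problems


modules = {
    "http_2xx": {"prober": "http", "timeout": "10s", "http": {"method": "GET"}},
    "broken": {"prober": "http", "http": {}, "tcp": {}},
}
for name, module in modules.items():
    print(name, validate_module(name, module))
```

The exporter performs its own validation when it loads the file; a check like this is only useful as an early lint, for example in a configuration repository.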

Below is a simplified probe configuration file, blackbox.yml, containing two HTTP probe modules:

modules:
  http_2xx:
    prober: http
    timeout: 10s
    http:
      method: GET
      preferred_ip_protocol: "ip4"
  http_post_2xx:
    prober: http
    http:
      method: POST

Start a Blackbox Exporter instance with the following command, specifying the probe configuration file to use:

blackbox_exporter --config.file=/etc/prometheus/blackbox.yml

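After a successful start, you can probe baidu.com by requesting http://127.0.0.1:9115/probe?module=http_2xx&target=baidu.com. The module parameter in the URL selects the probe to use, the target parameter names the probe target, and the probe result is returned as metrics: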

# HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
# TYPE probe_dns_lookup_time_seconds gauge
probe_dns_lookup_time_seconds 0.011633673
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.117332275
# HELP probe_failed_due_to_regex Indicates if probe failed due to regex
# TYPE probe_failed_due_to_regex gauge
probe_failed_due_to_regex 0
# HELP probe_http_content_length Length of http content response
# TYPE probe_http_content_length gauge
probe_http_content_length 81
# HELP probe_http_duration_seconds Duration of http request by phase, summed over all redirects
# TYPE probe_http_duration_seconds gauge
probe_http_duration_seconds{phase="connect"} 0.055551141
probe_http_duration_seconds{phase="processing"} 0.049736019
probe_http_duration_seconds{phase="resolve"} 0.011633673
probe_http_duration_seconds{phase="tls"} 0
probe_http_duration_seconds{phase="transfer"} 3.8919e-05
# HELP probe_http_redirects The number of redirects
# TYPE probe_http_redirects gauge
probe_http_redirects 0
# HELP probe_http_ssl Indicates if SSL was used for the final redirect
# TYPE probe_http_ssl gauge
probe_http_ssl 0
# HELP probe_http_status_code Response HTTP status code
# TYPE probe_http_status_code gauge
probe_http_status_code 200
# HELP probe_http_version Returns the version of HTTP of the probe response
# TYPE probe_http_version gauge
probe_http_version 1.1
# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 4
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1

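From the returned samples you can read the site's DNS lookup time, response time, HTTP status code, and other metrics related to site access quality, which helps administrators detect faults and problems proactively.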
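To show how such samples can be consumed outside Prometheus, here is a minimal Python sketch that turns the text output into a dictionary. It assumes every sample line ends in a single numeric value with no trailing timestamp, as in the output above; `parse_probe_metrics` is a hypothetical helper, and the sample text is a shortened copy of that output.

```python
def parse_probe_metrics(text):
    """Map each sample (metric name plus any label set) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Assumption: the value is the last space-separated field.
        key, _, value = line.rpartition(" ")
        samples[key] = float(value)
    return samples


SAMPLE = """\
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.117332275
probe_http_duration_seconds{phase="connect"} 0.055551141
probe_http_status_code 200
probe_success 1
"""

metrics = parse_probe_metrics(SAMPLE)
healthy = metrics["probe_success"] == 1 and metrics["probe_http_status_code"] == 200
print(healthy, metrics["probe_duration_seconds"])
```

In a real deployment Prometheus does this parsing itself; the sketch is only meant for ad hoc checks against the /probe endpoint.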

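Integrating with Prometheus

Next, all that remains is to configure a scrape job in Prometheus for the Blackbox Exporter instance. The most straightforward configuration is: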

- job_name: baidu_http2xx_probe
  params:
    module:
    - http_2xx
    target:  
    - baidu.com
  metrics_path: /probe
  static_configs:
  - targets:
    - 127.0.0.1:9115
- job_name: prometheus_http2xx_probe
  params:
    module:
    - http_2xx
    target:
    - prometheus.io
  metrics_path: /probe
  static_configs:
  - targets:
    - 127.0.0.1:9115

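If we have N target sites that each need M kinds of probes, Prometheus ends up with N * M scrape jobs, which is clearly unmanageable from a configuration-management standpoint.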
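The growth is easy to see by generating the naive configuration. This is a rough Python sketch, assuming one job per (target, module) pair shaped like the two jobs above; `naive_jobs` and the job naming scheme are hypothetical.

```python
from itertools import product


def naive_jobs(targets, modules, exporter="127.0.0.1:9115"):
    """Build one scrape job per (target, module) pair."""
    jobs = []
    for target, module in product(targets, modules):
        jobs.append({
            "job_name": f"{target}_{module}_probe",  # hypothetical naming
            "metrics_path": "/probe",
            "params": {"module": [module], "target": [target]},
            "static_configs": [{"targets": [exporter]}],
        })
    return jobs


jobs = naive_jobs(["baidu.com", "prometheus.io", "example.com"], ["http_2xx", "icmp"])
print(len(jobs))  # 3 targets x 2 modules
```

Every new site adds M jobs and every new probe type adds N jobs, all pointing at the same exporter address.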

Relabeling lets us simplify this configuration, as follows:

scrape_configs:
  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
        - http://prometheus.io    # Target to probe with http.
        - https://prometheus.io   # Target to probe with https.
        - http://example.com:8080 # Target to probe with http on port 8080.
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115 

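With these rules, each scrape becomes a request of the form http://127.0.0.1:9115/probe?module=http_2xx&target=<target>. The relabeling works in three steps:

  • Step 1: copy each target address from static_configs.targets into the __param_target label. A label of the form __param_<name> adds a URL parameter <name> with that value to the scrape request, just like an entry under params;
  • Step 2: read the value of __param_target and write it to the instance label;
  • Step 3: overwrite the target's __address__ label with the address of the Blackbox Exporter instance.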
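The three steps can be traced with a small simulation. The Python sketch below assumes that Prometheus exposes the configured module parameter as a `__param_module` label and builds the request from `__address__`, `metrics_path`, and the `__param_*` labels; `relabel` and `scrape_url` are hypothetical helpers, not Prometheus code.

```python
from urllib.parse import urlencode

EXPORTER = "127.0.0.1:9115"  # Blackbox Exporter address, as in the config


def relabel(target, module="http_2xx"):
    """Apply the three relabel steps to one static_configs target."""
    labels = {"__address__": target, "__param_module": module}
    labels["__param_target"] = labels["__address__"]  # step 1
    labels["instance"] = labels["__param_target"]     # step 2
    labels["__address__"] = EXPORTER                  # step 3
    return labels


def scrape_url(labels, metrics_path="/probe"):
    """Build the request URL from __address__ and the __param_* labels."""
    params = {
        key[len("__param_"):]: value
        for key, value in labels.items()
        if key.startswith("__param_")
    }
    return f"http://{labels['__address__']}{metrics_path}?{urlencode(params)}"


labels = relabel("https://prometheus.io")
print(labels["instance"])
print(scrape_url(labels))
```

The instance label keeps the probed site, so dashboards and alerts refer to the site rather than to the exporter, while the request itself goes to the exporter.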

blackbox.yml

modules:
  http_2xx:
    prober: http
  http_post_2xx:
    prober: http
    http:
      method: POST
      preferred_ip_protocol: "ip4"
  tcp_connect:
    prober: tcp
  pop3s_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  grpc:
    prober: grpc
    grpc:
      tls: true
      preferred_ip_protocol: "ip4"
  grpc_plain:
    prober: grpc
    grpc:
      tls: false
      service: "service1"
  ssh_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^SSH-2.0-"
      - send: "SSH-2.0-blackbox-ssh-check"
  irc_banner:
    prober: tcp
    tcp:
      query_response:
      - send: "NICK prober"
      - send: "USER prober prober prober :prober"
      - expect: "PING :([^ ]+)"
        send: "PONG ${1}"
      - expect: "^:[^ ]+ 001" 
  icmp:
    prober: icmp
  icmp_ttl5:
    prober: icmp
    timeout: 5s
    icmp:
      ttl: 5

example.yml

modules:
 http_2xx_example:
   prober: http
   timeout: 5s
   http:
     valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
     valid_status_codes: []  # Defaults to 2xx
     method: GET
     headers:
       Host: vhost.example.com
       Accept-Language: en-US
       Origin: example.com
     no_follow_redirects: false
     fail_if_ssl: false
     fail_if_not_ssl: false
     fail_if_body_matches_regexp:
       - "Could not connect to database"
     fail_if_body_not_matches_regexp:
       - "Download the latest version here"
     fail_if_header_matches: # Verifies that no cookies are set
       - header: Set-Cookie
         allow_missing: true
         regexp: '.*'
     fail_if_header_not_matches:
       - header: Access-Control-Allow-Origin
         regexp: '(\*|example\.com)'
     tls_config:
       insecure_skip_verify: false
     preferred_ip_protocol: "ip4" # defaults to "ip6"
     ip_protocol_fallback: false  # no fallback to "ip6"
 http_with_proxy:
   prober: http
   http:
     proxy_url: "http://127.0.0.1:3128"
     skip_resolve_phase_with_proxy: true
 http_with_proxy_and_headers:
   prober: http
   http:
     proxy_url: "http://127.0.0.1:3128"
     proxy_connect_header:
       Proxy-Authorization:
         - Bearer token
 http_post_2xx:
   prober: http
   timeout: 5s
   http:
     method: POST
     headers:
       Content-Type: application/json
     body: '{}'
 http_basic_auth_example:
   prober: http
   timeout: 5s
   http:
     method: POST
     headers:
       Host: "login.example.com"
     basic_auth:
       username: "username"
       password: "mysecret"
 http_custom_ca_example:
   prober: http
   http:
     method: GET
     tls_config:
       ca_file: "/certs/my_cert.crt"
 http_gzip:
   prober: http
   http:
     method: GET
     compression: gzip
 http_gzip_with_accept_encoding:
   prober: http
   http:
     method: GET
     compression: gzip
     headers:
       Accept-Encoding: gzip
 tls_connect:
   prober: tcp
   timeout: 5s
   tcp:
     tls: true
 tcp_connect_example:
   prober: tcp
   timeout: 5s
 imap_starttls:
   prober: tcp
   timeout: 5s
   tcp:
     query_response:
       - expect: "OK.*STARTTLS"
       - send: ". STARTTLS"
       - expect: "OK"
       - starttls: true
       - send: ". capability"
       - expect: "CAPABILITY IMAP4rev1"
 smtp_starttls:
   prober: tcp
   timeout: 5s
   tcp:
     query_response:
       - expect: "^220 ([^ ]+) ESMTP (.+)$"
       - send: "EHLO prober\r"
       - expect: "^250-STARTTLS"
       - send: "STARTTLS\r"
       - expect: "^220"
       - starttls: true
       - send: "EHLO prober\r"
       - expect: "^250-AUTH"
       - send: "QUIT\r"
 irc_banner_example:
   prober: tcp
   timeout: 5s
   tcp:
     query_response:
       - send: "NICK prober"
       - send: "USER prober prober prober :prober"
       - expect: "PING :([^ ]+)"
         send: "PONG ${1}"
       - expect: "^:[^ ]+ 001"
 icmp_example:
   prober: icmp
   timeout: 5s
   icmp:
     preferred_ip_protocol: "ip4"
     source_ip_address: "127.0.0.1"
 dns_udp_example:
   prober: dns
   timeout: 5s
   dns:
     query_name: "www.prometheus.io"
     query_type: "A"
     valid_rcodes:
       - NOERROR
     validate_answer_rrs:
       fail_if_matches_regexp:
         - ".*127.0.0.1"
       fail_if_all_match_regexp:
         - ".*127.0.0.1"
       fail_if_not_matches_regexp:
         - "www.prometheus.io.\t300\tIN\tA\t127.0.0.1"
       fail_if_none_matches_regexp:
         - "127.0.0.1"
     validate_authority_rrs:
       fail_if_matches_regexp:
         - ".*127.0.0.1"
     validate_additional_rrs:
       fail_if_matches_regexp:
         - ".*127.0.0.1"
 dns_soa:
   prober: dns
   dns:
     query_name: "prometheus.io"
     query_type: "SOA"
 dns_tcp_example:
   prober: dns
   dns:
     transport_protocol: "tcp" # defaults to "udp"
     preferred_ip_protocol: "ip4" # defaults to "ip6"
     query_name: "www.prometheus.io"

Grafana
