Prometheus Data Persistence with Docker

https://segmentfault.com/a/1190000015710814

Changing the Prometheus configuration does not require taking it down: edit the yml file and then restart the container with docker restart.
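Two common ways to pick up config changes, sketched below (the container name myprometheus is a placeholder; the /-/reload endpoint only works if Prometheus was started with --web.enable-lifecycle):

```shell
# Option 1: restart the container (brief downtime while Prometheus comes back up)
docker restart myprometheus

# Option 2: hot reload without restarting — send SIGHUP to the process,
# or POST to the lifecycle endpoint if --web.enable-lifecycle is enabled
docker kill --signal=HUP myprometheus
curl -X POST http://localhost:9090/-/reload
```

The hot-reload options avoid the scrape gap that a full container restart causes.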

Prometheus storage options

Prometheus ships with local storage: its embedded TSDB time-series database.
The advantage of local storage is simple operations; the drawbacks are that it cannot persist metrics at scale and that data can be lost. In practice we have hit several cases where the WAL file was corrupted and no further writes were possible.
Prometheus does not implement clustered storage itself. Instead, it exposes remote read/write interfaces so that users can pick a suitable time-series database to scale Prometheus out.
Prometheus integrates with remote storage systems in two ways:
Prometheus writes metrics to remote storage in a standard format.
Prometheus reads metrics back from a remote URL in a standard format.
Reference: https://prometheus.io/docs/prometheus/latest/storage/
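In prometheus.yml these two paths map to the remote_write and remote_read sections; a minimal sketch (the adapter hostname and port below are placeholders):

```yaml
# Sketch: wire Prometheus to a remote storage adapter.
remote_write:
  - url: "http://remote-storage:9201/write"
remote_read:
  - url: "http://remote-storage:9201/read"
```

A full working configuration for Promscale appears in the deployment section below.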

Choosing a persistence backend

AppOptics: write
Chronix: write
Cortex: read and write *
CrateDB: read and write *
Elasticsearch: write
Gnocchi: write
Graphite: write
InfluxDB: read and write *
OpenTSDB: write
PostgreSQL/TimescaleDB: read and write *
SignalFx: write
ClickHouse: read and write *

We chose TimescaleDB, paired with Promscale.

InfluxDB

Concepts

InfluxDB term    Traditional database concept
database         database
measurement      table (within a database)
points           a row of data in a table

Commands

  1. Enter the CLI:
docker exec -it myinfluxdb influx
influx -precision rfc3339
  2. List databases:
show databases
  3. Switch to a database:
use prometheus
  4. List all measurements:
SHOW MEASUREMENTS
  5. Query:
select * from test limit 2 offset 2
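For a quick smoke test inside the influx shell, you can insert a point and read it back. The measurement name, tag, and field below are made up for illustration:

```sql
-- insert one point using line protocol: measurement,tag_set field_set
INSERT test,host=server01 value=0.64
-- read it back
SELECT * FROM test ORDER BY time DESC LIMIT 1
```

Note that INSERT takes line protocol, not SQL VALUES syntax.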

Promscale + PostgreSQL/TimescaleDB

Promscale:https://github.com/timescale/promscale

Deployment

Deploy Promscale, TimescaleDB, and Prometheus with docker-compose

https://github.com/timescale/promscale/blob/master/docs/docker.md

  1. /opt/dcprometheus/prometheus.yml
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    ## MUST
    static_configs:
    - targets: ['192.168.200.137:9090']

  - job_name: 'kong'
    static_configs:
    - targets: ['192.168.200.137:8091']

remote_write:
  - url: "http://192.168.200.137:9201/write"
    write_relabel_configs:
    - action: keep
      source_labels: [__name__]
      regex: kong_http_con_status|kong_con_bandwidth|kong_http_log_status
remote_read:
  - url: "http://192.168.200.137:9201/read"
    read_recent: true
  2. docker-compose.yml
version: '3'

services:
  db:
    image: timescaledev/timescaledb-ha:pg12-latest
    container_name: sigma-dc-tsdb
    networks:
        - prometheus-net
    ports:
      - 5433:5432/tcp
    environment:
      POSTGRES_PASSWORD: password
      POSTGRES_USER: postgres

  prometheus:
    image: prom/prometheus:v2.13.1
    container_name: dc-prometheus
    networks:
        - prometheus-net
    ports:
        - "9090:9090"
    volumes:
        - /opt/dcprometheus/prometheus.yml:/etc/prometheus/prometheus.yml
        - /etc/localtime:/etc/localtime
    restart: always

  promscale:
    image: timescale/promscale:0.2.0
    container_name: dc-promscale
    networks:
        - prometheus-net
    ports:
      - 9201:9201/tcp
    build:
      context: .
    restart: on-failure
    depends_on:
      - db
      - prometheus
    environment:
      PROMSCALE_DB_CONNECT_RETRIES: 10
      PROMSCALE_WEB_TELEMETRY_PATH: /metrics-text
      PROMSCALE_DB_URI: postgres://postgres:password@db:5432/postgres?sslmode=allow

networks:
  prometheus-net:
    driver: bridge
  3. Start with docker-compose up -d
    Note: promscale may restart repeatedly at first. depends_on only orders container startup; it does not wait for the dependency to be ready before creating the dependent container.
    References: https://segmentfault.com/a/1190000021504344
    https://docs.docker.com/compose/startup-order/
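One common workaround is a container healthcheck plus a readiness condition. A hedged sketch reusing the service names above (depends_on with condition is not supported by the v3 file format used here; it needs the v2 format or the newer Compose specification):

```yaml
services:
  db:
    image: timescaledev/timescaledb-ha:pg12-latest
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 10

  promscale:
    image: timescale/promscale:0.2.0
    depends_on:
      db:
        condition: service_healthy   # wait until pg_isready succeeds
```

Alternatively, keep restart: on-failure as above and simply let promscale retry until the database accepts connections (PROMSCALE_DB_CONNECT_RETRIES already helps with this).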

Remote read/write settings in prometheus.yml

https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write

  1. read_recent: true
# Remote read configuration (TSDB).
remote_read:
  - url: "http://ts-xxxxxxxxxxxx.hitsdb.rds.aliyuncs.com:3242/api/prom_read"
    read_recent: true

In the configuration above, 'read_recent: true' means that recent data is also read from remote storage.
Prometheus always reads recent data from local storage anyway; with this flag enabled, it merges the local and remote results. Enabling the flag is a convenient way to verify that remote reads from the TSDB are working.
In production, you can drop 'read_recent: true' as appropriate to improve Prometheus query performance.
Reference: https://help.aliyun.com/document_detail/114508.html

  2. write_relabel_configs
    Note that filtering is mostly done at label granularity.
    References: https://studygolang.com/articles/13522?fr=sidebar
    https://blog.csdn.net/liangkiller/article/details/105758857
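The 'keep' action used in the remote_write section earlier works by joining the values of the source_labels with ';' and matching the result against the fully anchored regex; only matching series are forwarded. A minimal Python sketch of those semantics (not Prometheus's actual code; keep_series is a made-up helper name):

```python
import re

def keep_series(labels, source_labels, regex):
    """Sketch of Prometheus's 'keep' relabel action: join the values of
    source_labels with ';' and keep the series only when the joined value
    matches the regex in full (Prometheus anchors relabel regexes)."""
    value = ";".join(labels.get(name, "") for name in source_labels)
    return re.fullmatch(regex, value) is not None

# The write_relabel_configs rule above keeps only these three metrics:
rule_regex = "kong_http_con_status|kong_con_bandwidth|kong_http_log_status"

print(keep_series({"__name__": "kong_con_bandwidth"}, ["__name__"], rule_regex))  # True
print(keep_series({"__name__": "up"}, ["__name__"], rule_regex))                  # False
```

Anything not matching the regex, such as Prometheus's own internal series, is dropped before it reaches the remote endpoint, which keeps remote-storage volume down.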

Data gaps caused by target restarts

  1. While a target is restarting, should Prometheus show 0 or the last known value?
  2. After the target finishes restarting, its data should match what it was before the restart.

Related issues:

  1. Time-series confusion after replacing Kubernetes nodes
    https://github.com/prometheus/prometheus/issues/7944

PostgreSQL operations

http://www.ruanyifeng.com/blog/2013/12/getting_started_with_postgresql.html
https://www.runoob.com/postgresql/postgresql-create-table.html

  1. Enter the client inside the container:
psql
  2. List all databases:
\l
  3. Connect to a database:
\c postgres
  4. List all tables in the database:
\d
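Once Promscale has ingested data, metrics can also be queried directly from psql. A hedged sketch (the schema layout depends on the Promscale version; prom_metric is the schema this version uses for per-metric views, and kong_http_con_status is one of the metrics kept by the remote_write rule above):

```sql
-- each ingested metric gets a view in the prom_metric schema
SELECT time, value
FROM prom_metric.kong_http_con_status
ORDER BY time DESC
LIMIT 5;
```

This is one of the main attractions of the TimescaleDB + Promscale combination: metrics become queryable with plain SQL alongside PromQL.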
