Reference: https://awesome-prometheus-alerts.grep.to/rules
Hot-reloading alert rules
Add the --web.enable-lifecycle flag to the Prometheus startup arguments, then trigger a reload by sending the following POST request from a terminal: curl -X POST http://IP:port/-/reload
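Before reloading, it is worth validating the rule files with promtool so that a syntax error does not break rule evaluation; a minimal sketch, assuming the file layout used below:
# Validate rule syntax first (paths follow this document's layout)
promtool check rules /data/prometheus/conf/rules/*.yaml
# Then trigger the hot reload (replace IP:port with your Prometheus address)
curl -X POST http://IP:port/-/reload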
cat > /data/prometheus/conf/rules/Prometheus.yaml << 'EOF'
groups:
- name: Prometheus.rules
  rules:
  - alert: PrometheusAllTargetsMissing
    expr: count by (job) (up) == 0
    for: 2m
    labels:
      severity: critical
    annotations:
      title: 'Prometheus all targets missing'
      description: "A Prometheus job no longer has any living targets."
  - alert: PrometheusConfigurationReloadFailure
    expr: prometheus_config_last_reload_successful != 1
    for: 0m
    labels:
      severity: warning
    annotations:
      title: 'Prometheus configuration reload failure'
      description: "Prometheus: 【{{ $labels.instance }}】 configuration reload failed."
  - alert: PrometheusTooManyRestarts
    expr: changes(process_start_time_seconds{job=~"prometheus|pushgateway|alertmanager"}[15m]) > 2
    for: 0m
    labels:
      severity: warning
    annotations:
      title: 'Prometheus too many restarts'
      description: "Prometheus: 【{{ $labels.instance }}】 has restarted more than twice in the last 15 minutes. It might be crash-looping."
  - alert: PrometheusAlertmanagerConfigurationReloadFailure
    expr: alertmanager_config_last_reload_successful != 1
    for: 0m
    labels:
      severity: warning
    annotations:
      title: 'Prometheus AlertManager configuration reload failure'
      description: "AlertManager: 【{{ $labels.instance }}】 configuration reload failed."
  - alert: PrometheusNotificationsBacklog
    expr: min_over_time(prometheus_notifications_queue_length[10m]) > 0
    for: 1m
    labels:
      severity: warning
    annotations:
      title: 'Prometheus notifications backlog'
      description: "Prometheus: 【{{ $labels.instance }}】 notification queue has not been empty for 10 minutes."
  - alert: PrometheusAlertmanagerNotificationFailing
    expr: rate(alertmanager_notifications_failed_total[1m]) > 0
    for: 1m
    labels:
      severity: critical
    annotations:
      title: 'Prometheus AlertManager notification failing'
      description: "AlertManager: 【{{ $labels.instance }}】 is failing to send notifications."
  - alert: PrometheusTsdbCheckpointCreationFailures
    expr: increase(prometheus_tsdb_checkpoint_creations_failed_total[1m]) > 0
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'Prometheus TSDB checkpoint creation failures'
      description: "Prometheus: 【{{ $labels.instance }}】 encountered {{ $value }} checkpoint creation failures."
  - alert: PrometheusTsdbCheckpointDeletionFailures
    expr: increase(prometheus_tsdb_checkpoint_deletions_failed_total[1m]) > 0
    for: 1m
    labels:
      severity: critical
    annotations:
      title: 'Prometheus TSDB checkpoint deletion failures'
      description: "Prometheus: 【{{ $labels.instance }}】 encountered {{ $value }} checkpoint deletion failures."
  - alert: PrometheusTsdbCompactionsFailed
    expr: increase(prometheus_tsdb_compactions_failed_total[1m]) > 0
    for: 1m
    labels:
      severity: critical
    annotations:
      title: 'Prometheus TSDB compactions failed'
      description: "Prometheus: 【{{ $labels.instance }}】 encountered {{ $value }} TSDB compaction failures."
  - alert: PrometheusTsdbHeadTruncationsFailed
    expr: increase(prometheus_tsdb_head_truncations_failed_total[1m]) > 0
    for: 1m
    labels:
      severity: critical
    annotations:
      title: 'Prometheus TSDB head truncations failed'
      description: "Prometheus: 【{{ $labels.instance }}】 encountered {{ $value }} TSDB head truncation failures."
  - alert: PrometheusTsdbReloadFailures
    expr: increase(prometheus_tsdb_reloads_failures_total[1m]) > 0
    for: 1m
    labels:
      severity: critical
    annotations:
      title: 'Prometheus TSDB reload failures'
      description: "Prometheus: 【{{ $labels.instance }}】 encountered {{ $value }} TSDB reload failures."
EOF
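Note that these rule files only take effect if prometheus.yml references them; a minimal fragment, assuming the directory used above:
# prometheus.yml (fragment)
rule_files:
  - /data/prometheus/conf/rules/*.yaml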
cat > /data/prometheus/conf/rules/Hosts.yaml << 'EOF'
groups:
- name: Hosts.rules
  rules:
  ## Custom by wangshui
  - alert: HostDown
    expr: up{job=~"node-exporter|prometheus|grafana|alertmanager"} == 0
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'Instance down'
      description: "Host 【{{ $labels.instance }}】 is down"
  - alert: HostCpuLoadAverage
    expr: sum(node_load5) by (instance) > 10
    for: 1m
    labels:
      severity: warning
    annotations:
      title: "High 5-minute load average"
      description: "Host 【{{ $labels.instance }}】: 5-minute load average above 10 (current value: {{ $value }})"
  - alert: HostCpuUsage
    expr: (1-((sum(increase(node_cpu_seconds_total{mode="idle"}[5m])) by (instance))/ (sum(increase(node_cpu_seconds_total[5m])) by (instance))))*100 > 80
    for: 1m
    labels:
      severity: warning
    annotations:
      title: "High CPU usage"
      description: "Host 【{{ $labels.instance }}】: CPU usage above 80% over the last 5 minutes (current value: {{ $value }}%)"
  - alert: HostMemoryUsage
    expr: (1-((node_memory_Buffers_bytes + node_memory_Cached_bytes + node_memory_MemFree_bytes)/node_memory_MemTotal_bytes))*100 > 80
    for: 1m
    labels:
      severity: warning
    annotations:
      title: "Host memory usage above 80%"
      description: "Host 【{{ $labels.instance }}】: memory usage above 80% (current usage: {{ $value }}%)"
  - alert: HostIOWait
    expr: ((sum(increase(node_cpu_seconds_total{mode="iowait"}[5m])) by (instance))/(sum(increase(node_cpu_seconds_total[5m])) by (instance)))*100 > 10
    for: 1m
    labels:
      severity: warning
    annotations:
      title: "High disk I/O wait"
      description: "Host 【{{ $labels.instance }}】: CPU iowait above 10% over the last 5 minutes (current value: {{ $value }}%)"
  - alert: HostFileSystemUsage
    expr: (1-(node_filesystem_free_bytes{fstype=~"ext4|xfs",mountpoint!~".*tmp|.*boot"}/node_filesystem_size_bytes{fstype=~"ext4|xfs",mountpoint!~".*tmp|.*boot"}))*100 > 70
    for: 1m
    labels:
      severity: warning
    annotations:
      title: "Low free disk space"
      description: "Host 【{{ $labels.instance }}】: usage of partition {{ $labels.mountpoint }} above 70% (current usage: {{ $value }}%)"
  - alert: HostSwapIsFillingUp
    expr: (1 - (node_memory_SwapFree_bytes / node_memory_SwapTotal_bytes)) * 100 > 80
    for: 2m
    labels:
      severity: warning
    annotations:
      title: "Host swap is filling up"
      description: "Host 【{{ $labels.instance }}】: swap usage above 80% (current usage: {{ $value }}%)"
  - alert: HostNetworkConnection-ESTABLISHED
    expr: sum(node_netstat_Tcp_CurrEstab) by (instance) > 1000
    for: 5m
    labels:
      severity: warning
    annotations:
      title: "Too many ESTABLISHED connections"
      description: "Host 【{{ $labels.instance }}】: ESTABLISHED connections above 1000 (current value: {{ $value }})"
  - alert: HostNetworkConnection-TIME_WAIT
    expr: sum(node_sockstat_TCP_tw) by (instance) > 1000
    for: 5m
    labels:
      severity: warning
    annotations:
      title: "Too many TIME_WAIT connections"
      description: "Host 【{{ $labels.instance }}】: TIME_WAIT connections above 1000 (current value: {{ $value }})"
  - alert: HostUnusualNetworkThroughputIn
    expr: sum by (instance, device) (rate(node_network_receive_bytes_total{device=~"ens.*"}[2m])) / 1024 / 1024 > 100
    for: 5m
    labels:
      severity: warning
    annotations:
      title: "High inbound network throughput"
      description: "Host 【{{ $labels.instance }}】, interface {{ $labels.device }}: inbound traffic above 100 MB/s (current value: {{ $value }})"
  - alert: HostUnusualNetworkThroughputOut
    expr: sum by (instance, device) (rate(node_network_transmit_bytes_total{device=~"ens.*"}[2m])) / 1024 / 1024 > 100
    for: 5m
    labels:
      severity: warning
    annotations:
      title: "High outbound network throughput"
      description: "Host 【{{ $labels.instance }}】, interface {{ $labels.device }}: outbound traffic above 100 MB/s (current value: {{ $value }})"
  - alert: HostUnusualDiskReadRate
    expr: sum by (instance, device) (rate(node_disk_read_bytes_total{device=~"sd.*"}[2m])) / 1024 / 1024 > 50
    for: 5m
    labels:
      severity: warning
    annotations:
      title: "High disk read rate"
      description: "Host 【{{ $labels.instance }}】, disk {{ $labels.device }}: read rate above 50 MB/s (current value: {{ $value }})"
  - alert: HostUnusualDiskWriteRate
    expr: sum by (instance, device) (rate(node_disk_written_bytes_total{device=~"sd.*"}[2m])) / 1024 / 1024 > 50
    for: 2m
    labels:
      severity: warning
    annotations:
      title: "High disk write rate"
      description: "Host 【{{ $labels.instance }}】, disk {{ $labels.device }}: write rate above 50 MB/s (current value: {{ $value }})"
  - alert: HostOutOfInodes
    expr: node_filesystem_files_free{fstype=~"ext4|xfs",mountpoint!~".*tmp|.*boot"} / node_filesystem_files{fstype=~"ext4|xfs",mountpoint!~".*tmp|.*boot"} * 100 < 10
    for: 2m
    labels:
      severity: warning
    annotations:
      title: "Host running out of inodes"
      description: "Host 【{{ $labels.instance }}】: partition {{ $labels.mountpoint }} is low on inodes (free inodes below 10%, current: {{ $value }}%)"
  - alert: HostUnusualDiskReadLatency
    expr: rate(node_disk_read_time_seconds_total{device=~"sd.*"}[1m]) / rate(node_disk_reads_completed_total{device=~"sd.*"}[1m]) > 0.1 and rate(node_disk_reads_completed_total{device=~"sd.*"}[1m]) > 0
    for: 2m
    labels:
      severity: warning
    annotations:
      title: "High disk read latency"
      description: "Host 【{{ $labels.instance }}】, disk {{ $labels.device }}: read latency above 100ms (current value: {{ $value }}s)"
  - alert: HostUnusualDiskWriteLatency
    expr: rate(node_disk_write_time_seconds_total{device=~"sd.*"}[1m]) / rate(node_disk_writes_completed_total{device=~"sd.*"}[1m]) > 0.1 and rate(node_disk_writes_completed_total{device=~"sd.*"}[1m]) > 0
    for: 2m
    labels:
      severity: warning
    annotations:
      title: "High disk write latency"
      description: "Host 【{{ $labels.instance }}】, disk {{ $labels.device }}: write latency above 100ms (current value: {{ $value }}s)"
EOF
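Rules such as HostDown can be unit-tested offline with promtool test rules; a minimal sketch, where the instance address and series values are hypothetical:
cat > /tmp/hosts_test.yaml << 'EOF'
rule_files:
  - /data/prometheus/conf/rules/Hosts.yaml
evaluation_interval: 1m
tests:
  - interval: 1m
    input_series:
      # Target healthy for one sample, then down (hypothetical series)
      - series: 'up{job="node-exporter", instance="10.0.0.1:9100"}'
        values: '1 0 0 0'
    alert_rule_test:
      - eval_time: 2m
        alertname: HostDown
        exp_alerts:
          - exp_labels:
              severity: critical
              job: node-exporter
              instance: '10.0.0.1:9100'
            exp_annotations:
              title: 'Instance down'
              description: "Host 【10.0.0.1:9100】 is down"
EOF
promtool test rules /tmp/hosts_test.yaml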
cat > /data/prometheus/conf/rules/Blackbox.yaml << 'EOF'
groups:
- name: Blackbox.rules
  rules:
  - alert: HostConnectionFailure
    expr: probe_success{job="ping-status"} == 0
    for: 0m
    labels:
      severity: critical
    annotations:
      title: Host Connection Failure
      description: "Host 【{{ $labels.instance }}】 cannot be reached"
  - alert: ServiceConnectionFailure
    expr: probe_success{job="port-status"} == 0
    for: 0m
    labels:
      severity: critical
    annotations:
      title: Service Connection Failure
      description: "Service 【{{ $labels.server }}】 on host 【{{ $labels.instance }}】 cannot be reached"
  - alert: BlackboxSlowProbeOnServer
    expr: avg_over_time(probe_duration_seconds{job="port-status"}[1m]) > 1
    for: 1m
    labels:
      severity: warning
    annotations:
      title: Service probe timeout
      description: "Service 【{{ $labels.server }}】 on host 【{{ $labels.instance }}】: Blackbox probe took more than 1s to complete (current value: {{ $value }}s)"
  - alert: BlackboxSlowProbeOnWebsite
    expr: avg_over_time(probe_duration_seconds{job="http-status"}[1m]) > 1
    for: 1m
    labels:
      severity: warning
    annotations:
      title: Website probe timeout
      description: "Website 【{{ $labels.instance }}】: Blackbox probe took more than 1s to complete (current value: {{ $value }}s)"
  - alert: BlackboxProbeHttpFailure
    expr: probe_http_status_code <= 199 or probe_http_status_code >= 400
    for: 0m
    labels:
      severity: critical
      service: web
    annotations:
      title: Blackbox probe HTTP failure
      description: "Website 【{{ $labels.instance }}】: unexpected HTTP status code (current status code: {{ $value }})"
  - alert: BlackboxSslCertificateWillExpireSoonIn30days
    expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 30
    for: 0m
    labels:
      severity: warning
    annotations:
      title: Blackbox SSL certificate will expire soon
      description: "Website 【{{ $labels.instance }}】: SSL certificate expires within 30 days"
  - alert: BlackboxSslCertificateWillExpireSoonIn3days
    expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 3
    for: 0m
    labels:
      severity: critical
    annotations:
      title: Blackbox SSL certificate will expire soon
      description: "Website 【{{ $labels.instance }}】: SSL certificate expires within 3 days"
  - alert: BlackboxSslCertificateExpired
    expr: probe_ssl_earliest_cert_expiry - time() <= 0
    for: 0m
    labels:
      severity: critical
    annotations:
      title: Blackbox SSL certificate expired
      description: "Website 【{{ $labels.instance }}】: SSL certificate has already expired"
  - alert: BlackboxProbeSlowHttp
    expr: avg_over_time(probe_http_duration_seconds[1m]) > 1
    for: 1m
    labels:
      severity: warning
    annotations:
      title: Blackbox probe slow HTTP
      description: "Website 【{{ $labels.instance }}】: HTTP request took more than 1s (current value: {{ $value }}s)"
  - alert: BlackboxProbeSlowPing
    expr: avg_over_time(probe_icmp_duration_seconds[1m]) > 1
    for: 1m
    labels:
      severity: warning
    annotations:
      title: Blackbox probe slow ping
      description: "Host 【{{ $labels.instance }}】: Blackbox ping took more than 1s (current value: {{ $value }}s)"
EOF
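The job names referenced above (ping-status, port-status, http-status) must match blackbox exporter scrape jobs in prometheus.yml. A minimal sketch of one such job, where the module name, target, and exporter address are assumptions:
# prometheus.yml scrape_configs (fragment)
- job_name: 'http-status'
  metrics_path: /probe
  params:
    module: [http_2xx]            # blackbox module name (assumed)
  static_configs:
    - targets: ['https://example.com']
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance      # makes {{ $labels.instance }} the probed URL
    - target_label: __address__
      replacement: 127.0.0.1:9115 # blackbox exporter address (assumed)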
cat > /data/prometheus/conf/rules/Mysql.yaml << 'EOF'
groups:
- name: Mysql.rules
  rules:
  ## MySQL alarm rules
  - alert: MysqlDown
    expr: mysql_up == 0
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'MySQL down'
      description: "MySQL instance 【{{ $labels.instance }}】 is down"
  - alert: MysqlRestarted
    expr: mysql_global_status_uptime < 60
    for: 0m
    labels:
      severity: info
    annotations:
      title: 'MySQL Restarted'
      description: "MySQL instance 【{{ $labels.instance }}】 has just been restarted, less than one minute ago"
  - alert: MysqlTooManyConnections(>80%)
    expr: avg by (instance) (rate(mysql_global_status_threads_connected[1m])) / avg by (instance) (mysql_global_variables_max_connections) * 100 > 80
    for: 2m
    labels:
      severity: warning
    annotations:
      title: 'MySQL too many connections (> 80%)'
      description: "MySQL instance 【{{ $labels.instance }}】: more than 80% of MySQL connections are in use (current value: {{ $value }}%)"
  - alert: MysqlThreadsRunningHigh
    expr: mysql_global_status_threads_running > 40
    for: 2m
    labels:
      severity: warning
    annotations:
      title: 'MySQL Threads_Running High'
      description: "MySQL instance 【{{ $labels.instance }}】: Threads_running above the threshold (40) (current value: {{ $value }})"
  - alert: MysqlQpsHigh
    expr: sum by (instance) (rate(mysql_global_status_queries[2m])) > 500
    for: 2m
    labels:
      severity: warning
    annotations:
      title: 'MySQL QPS High'
      description: "MySQL instance 【{{ $labels.instance }}】: QPS above 500 (current value: {{ $value }})"
  - alert: MysqlSlowQueries
    expr: increase(mysql_global_status_slow_queries[1m]) > 0
    for: 2m
    labels:
      severity: warning
    annotations:
      title: 'MySQL slow queries'
      description: "MySQL instance 【{{ $labels.instance }}】 has new slow queries"
  - alert: MysqlTooManyAbortedConnections
    expr: round(increase(mysql_global_status_aborted_connects[5m])) > 20
    for: 2m
    labels:
      severity: warning
    annotations:
      title: 'MySQL too many aborted connections in 5 minutes'
      description: "MySQL instance 【{{ $labels.instance }}】: {{ $value }} aborted connections within 5 minutes"
  - alert: MysqlTooManyAbortedClients
    expr: round(increase(mysql_global_status_aborted_clients[120m])) > 10
    for: 2m
    labels:
      severity: warning
    annotations:
      title: 'MySQL too many aborted clients in 2 hours'
      description: "MySQL instance 【{{ $labels.instance }}】: {{ $value }} aborted clients within 2 hours"
  - alert: MysqlSlaveIoThreadNotRunning
    expr: mysql_slave_status_master_server_id > 0 and on (instance) mysql_slave_status_slave_io_running == 0
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'MySQL Slave IO thread not running'
      description: "MySQL instance 【{{ $labels.instance }}】: Slave IO thread is not running"
  - alert: MysqlSlaveSqlThreadNotRunning
    expr: mysql_slave_status_master_server_id > 0 and on (instance) mysql_slave_status_slave_sql_running == 0
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'MySQL Slave SQL thread not running'
      description: "MySQL instance 【{{ $labels.instance }}】: Slave SQL thread is not running"
  - alert: MysqlSlaveReplicationLag
    expr: mysql_slave_status_master_server_id > 0 and on (instance) (mysql_slave_status_seconds_behind_master - mysql_slave_status_sql_delay) > 30
    for: 1m
    labels:
      severity: critical
    annotations:
      title: 'MySQL Slave replication lag'
      description: "MySQL instance 【{{ $labels.instance }}】: replication lag above 30s"
  - alert: MysqlInnodbLogWaits
    expr: rate(mysql_global_status_innodb_log_waits[15m]) > 10
    for: 0m
    labels:
      severity: warning
    annotations:
      title: 'MySQL InnoDB log waits'
      description: "MySQL instance 【{{ $labels.instance }}】: InnoDB log writes are stalling"
EOF
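Individual expressions can be spot-checked against the live server through the HTTP API before relying on them; for example, to evaluate the Threads_running rule (replace IP:port as before):
curl -sG 'http://IP:port/api/v1/query' --data-urlencode 'query=mysql_global_status_threads_running > 40'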
cat > /data/prometheus/conf/rules/redis.yaml << 'EOF'
groups:
- name: Redis.rules
  rules:
  ## Redis alarm rules
  - alert: RedisDown
    expr: redis_up == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      title: 'Redis down'
      description: "Redis instance 【{{ $labels.instance }}】 is down"
  - alert: RedisMissingMaster
    expr: count(redis_instance_info{role="master"}) < 1
    for: 2m
    labels:
      severity: critical
    annotations:
      title: 'Redis missing master'
      description: "Redis cluster has no node marked as master."
  - alert: RedisTooManyMasters
    expr: count(redis_instance_info{role="master"}) > 1
    for: 2m
    labels:
      severity: critical
    annotations:
      title: 'Redis too many masters'
      description: "Redis cluster has too many nodes marked as master."
  - alert: RedisDisconnectedSlaves
    expr: count without (instance, job) (redis_connected_slaves) - sum without (instance, job) (redis_connected_slaves) - 1 > 1
    for: 2m
    labels:
      severity: critical
    annotations:
      title: 'Redis disconnected slaves'
      description: "Redis is not replicating to all slaves. Consider reviewing the Redis replication status."
  - alert: RedisReplicationBroken
    expr: delta(redis_connected_slaves[1m]) < 0
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'Redis replication broken'
      description: "Redis instance 【{{ $labels.instance }}】 lost a slave"
  - alert: RedisClusterFlapping
    expr: changes(redis_connected_slaves[1m]) > 1
    for: 2m
    labels:
      severity: critical
    annotations:
      title: 'Redis cluster flapping'
      description: "Redis instance 【{{ $labels.instance }}】: changes detected in Redis replica connections. This can occur when replica nodes lose connection to the master and reconnect (a.k.a. flapping)."
  - alert: RedisMissingBackup
    expr: time() - redis_rdb_last_save_timestamp_seconds > 60 * 60 * 24
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'Redis missing backup'
      description: "Redis instance 【{{ $labels.instance }}】 has not been backed up for 24 hours"
  - alert: RedisOutOfConfiguredMaxmemory
    expr: redis_memory_used_bytes / redis_memory_max_bytes * 100 > 90
    for: 2m
    labels:
      severity: warning
    annotations:
      title: 'Redis out of configured maxmemory'
      description: "Redis instance 【{{ $labels.instance }}】 is running out of configured maxmemory (> 90%) (current value: {{ $value }}%)"
  - alert: RedisTooManyConnections
    expr: redis_connected_clients > 100
    for: 2m
    labels:
      severity: warning
    annotations:
      title: 'Redis too many connections'
      description: "Redis instance 【{{ $labels.instance }}】 has too many connections (current value: {{ $value }})"
  - alert: RedisNotEnoughConnections
    expr: redis_connected_clients < 5
    for: 2m
    labels:
      severity: warning
    annotations:
      title: 'Redis not enough connections'
      description: "Redis instance 【{{ $labels.instance }}】 should have more connections (> 5) (current value: {{ $value }})"
  - alert: RedisRejectedConnections
    expr: increase(redis_rejected_connections_total[1m]) > 0
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'Redis rejected connections'
      description: "Redis instance 【{{ $labels.instance }}】: some connections to Redis have been rejected (current value: {{ $value }})"
EOF
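Thresholds such as RedisNotEnoughConnections (< 5 clients) can be noisy on low-traffic instances; rather than deleting the rule, one option is a temporary silence via amtool, assuming Alertmanager listens on IP:9093:
amtool silence add alertname=RedisNotEnoughConnections --comment="low-traffic instance" --duration=24h --alertmanager.url=http://IP:9093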
cat > /data/prometheus/conf/rules/elasticsearch.yaml << 'EOF'
groups:
- name: Elasticsearch.rules
  rules:
  ## Elasticsearch alarm rules
  - alert: ElasticsearchHeapUsageTooHigh
    expr: (elasticsearch_jvm_memory_used_bytes{area="heap"} / elasticsearch_jvm_memory_max_bytes{area="heap"}) * 100 > 90
    for: 2m
    labels:
      severity: critical
    annotations:
      title: "Elasticsearch Heap Usage Too High"
      description: "Host 【{{ $labels.instance }}】: heap usage above 90% (current value: {{ $value }}%)"
  - alert: ElasticsearchHeapUsageWarning
    expr: (elasticsearch_jvm_memory_used_bytes{area="heap"} / elasticsearch_jvm_memory_max_bytes{area="heap"}) * 100 > 80
    for: 2m
    labels:
      severity: warning
    annotations:
      title: 'Elasticsearch Heap Usage Warning'
      description: "Host 【{{ $labels.instance }}】: heap usage above 80% (current value: {{ $value }}%)"
  - alert: ElasticsearchDiskOutOfSpace
    expr: elasticsearch_filesystem_data_available_bytes / elasticsearch_filesystem_data_size_bytes * 100 < 10
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'Elasticsearch disk out of space'
      description: "Host 【{{ $labels.instance }}】: disk usage above 90% (free space: {{ $value }}%)"
  - alert: ElasticsearchDiskSpaceLow
    expr: elasticsearch_filesystem_data_available_bytes / elasticsearch_filesystem_data_size_bytes * 100 < 20
    for: 2m
    labels:
      severity: warning
    annotations:
      title: 'Elasticsearch disk space low'
      description: "Host 【{{ $labels.instance }}】: disk usage above 80% (free space: {{ $value }}%)"
  - alert: ElasticsearchClusterRed
    expr: elasticsearch_cluster_health_status{color="red"} == 1
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'Elasticsearch Cluster Red'
      description: "Host 【{{ $labels.instance }}】: Elasticsearch cluster status is RED"
  - alert: ElasticsearchClusterYellow
    expr: elasticsearch_cluster_health_status{color="yellow"} == 1
    for: 0m
    labels:
      severity: warning
    annotations:
      title: 'Elasticsearch Cluster Yellow'
      description: "Host 【{{ $labels.instance }}】: Elasticsearch cluster status is YELLOW"
  - alert: ElasticsearchHealthyNodes
    expr: elasticsearch_cluster_health_number_of_nodes < 3
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'Elasticsearch Healthy Nodes'
      description: "Missing node in Elasticsearch cluster (fewer than 3 nodes up)"
  - alert: ElasticsearchHealthyDataNodes
    expr: elasticsearch_cluster_health_number_of_data_nodes < 3
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'Elasticsearch Healthy Data Nodes'
      description: "Missing data node in Elasticsearch cluster (fewer than 3 data nodes up)"
  - alert: ElasticsearchRelocatingShards
    expr: elasticsearch_cluster_health_relocating_shards > 0
    for: 0m
    labels:
      severity: info
    annotations:
      title: 'Elasticsearch relocating shards'
      description: "Host 【{{ $labels.instance }}】: Elasticsearch is relocating shards"
  - alert: ElasticsearchRelocatingShardsTooLong
    expr: elasticsearch_cluster_health_relocating_shards > 0
    for: 15m
    labels:
      severity: warning
    annotations:
      title: 'Elasticsearch relocating shards too long'
      description: "Host 【{{ $labels.instance }}】: Elasticsearch has been relocating shards for 15 minutes"
  - alert: ElasticsearchInitializingShards
    expr: elasticsearch_cluster_health_initializing_shards > 0
    for: 0m
    labels:
      severity: info
    annotations:
      title: 'Elasticsearch initializing shards'
      description: "Host 【{{ $labels.instance }}】: Elasticsearch is initializing shards"
  - alert: ElasticsearchInitializingShardsTooLong
    expr: elasticsearch_cluster_health_initializing_shards > 0
    for: 15m
    labels:
      severity: warning
    annotations:
      title: 'Elasticsearch initializing shards too long'
      description: "Host 【{{ $labels.instance }}】: Elasticsearch has been initializing shards for 15 minutes"
  - alert: ElasticsearchUnassignedShards
    expr: elasticsearch_cluster_health_unassigned_shards > 0
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'Elasticsearch unassigned shards'
      description: "Host 【{{ $labels.instance }}】: Elasticsearch has unassigned shards"
  - alert: ElasticsearchPendingTasks
    expr: elasticsearch_cluster_health_number_of_pending_tasks > 0
    for: 15m
    labels:
      severity: warning
    annotations:
      title: 'Elasticsearch pending tasks'
      description: "Host 【{{ $labels.instance }}】: Elasticsearch has pending tasks; the cluster may be working slowly (current value: {{ $value }})"
  - alert: ElasticsearchNoNewDocuments
    expr: increase(elasticsearch_indices_docs{es_data_node="true"}[10m]) < 1
    for: 0m
    labels:
      severity: warning
    annotations:
      title: 'Elasticsearch no new documents'
      description: "Host 【{{ $labels.instance }}】: no new documents for 10 minutes"
EOF
cat > /data/prometheus/conf/rules/kafka.yaml << 'EOF'
groups:
- name: kafka.rules
  rules:
  ## Kafka alarm rules
  - alert: KafkaTopicsReplicas
    expr: sum(kafka_topic_partition_in_sync_replica) by (topic) < 3
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'Kafka topics replicas less than 3'
      description: "Topic {{ $labels.topic }}: in-sync replicas below 3 (current value: {{ $value }})"
  - alert: KafkaConsumersGroupLag
    expr: sum(kafka_consumergroup_lag) by (consumergroup) > 50
    for: 1m
    labels:
      severity: critical
    annotations:
      title: 'Kafka consumer group lagging'
      description: "Kafka consumer group {{ $labels.consumergroup }} is lagging (Lag > 50, current lag: {{ $value }})"
  - alert: KafkaConsumersTopicLag
    expr: sum(kafka_consumergroup_lag) by (topic) > 50
    for: 1m
    labels:
      severity: critical
    annotations:
      title: 'Kafka topic consumption lagging'
      description: "Kafka topic {{ $labels.topic }} consumption is lagging (Lag > 50, current lag: {{ $value }})"
EOF
cat > /data/prometheus/conf/rules/Docker.yaml << 'EOF'
groups:
- name: Docker.rules
  rules:
  - alert: DockerInstanceDown
    expr: up{job="cAdvisor"} == 0
    for: 0m
    labels:
      severity: critical
    annotations:
      title: 'Docker Instance down'
      description: "Container instance 【{{ $labels.instance }}】 is down"
  - alert: ContainerKilled
    expr: time() - container_last_seen{name!=""} > 60
    for: 1m
    labels:
      severity: critical
    annotations:
      title: "A Container has disappeared"
      description: "Container 【{{ $labels.name }}】 on host 【{{ $labels.instance }}】 has disappeared"
  - alert: ContainerCpuUsage
    expr: (sum by(instance, name) (rate(container_cpu_usage_seconds_total{name!=""}[3m])) * 100) > 80
    for: 2m
    labels:
      severity: warning
    annotations:
      title: "Container CPU usage above 80%"
      description: "Container 【{{ $labels.name }}】 on host 【{{ $labels.instance }}】: CPU usage above 80% (current value: {{ $value }}%)"
  - alert: ContainerMemoryUsage
    expr: (sum by(instance, name) (container_memory_working_set_bytes{name!=""}) / sum by(instance, name) (container_spec_memory_limit_bytes{name!=""} > 0) * 100) > 80
    for: 2m
    labels:
      severity: warning
    annotations:
      title: "Container Memory usage above 80%"
      description: "Container 【{{ $labels.name }}】 on host 【{{ $labels.instance }}】: memory usage above 80% (current value: {{ $value }}%)"
  - alert: ContainerVolumeUsage
    expr: (1 - (sum(container_fs_inodes_free) by (instance) / sum(container_fs_inodes_total) by (instance))) * 100 > 80
    for: 5m
    labels:
      severity: warning
    annotations:
      title: "Container Volume usage above 80%"
      description: "Host 【{{ $labels.instance }}】: container volume usage above 80% (current value: {{ $value }}%)"
EOF
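After all rule files are in place, trigger the reload described at the top of this section and confirm that the groups were loaded through the rules API (replace IP:port as before):
curl -X POST http://IP:port/-/reload
curl -s http://IP:port/api/v1/rules | grep -o '"name":"[^"]*"'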