_三石_

Building a Monitoring System with Prometheus + Grafana (Part 2)

Overview

The previous post covered installation. This post walks through the monitoring configuration files.

Configuration reference

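Parameter	Description	Default	Defined in
scrape_interval	How often metrics are scraped	1 minute	prometheus.yml
evaluation_interval	How often rules are evaluated	1 minute	prometheus.yml
for: <duration>	How long a condition must persist before the alert fires	0	rule file
group_wait	How long to wait after the first alert of a group arrives before sending, so that alerts of the same group go out together	30 seconds	alertmanager.yml
group_interval	How long to wait before notifying about new alerts added to a group that has already been notified	5 minutes	alertmanager.yml
repeat_interval	How long to wait before re-sending a notification for alerts that are still firing with nothing new in the group	4 hours	alertmanager.yml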

# prometheus.yml
global:
  scrape_interval:     20s
  evaluation_interval: 30s

# Rule file
  - alert: kafka_down
    expr: kafka_up_status == 0
    for: 1m
    annotations:
      summary: "Kafka is down"

# alertmanager.yml
route:
  group_by: [alertname]
  group_wait: 60s
  group_interval: 5m
  repeat_interval: 10m

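Event timeline
10:00:05 Kafka goes down
10:00:20 The next scrape returns kafka_up_status=0
10:00:30 The rules are evaluated, the outage is detected, and kafka_down becomes pending
10:00:30~10:01:30 Scraping and rule evaluation continue
10:01:30 kafka_down has held for 1 minute, so it becomes firing and is sent to Alertmanager
10:01:30 Alertmanager receives the alert and starts the group_wait timer
10:02:30 group_wait elapses and the notification is sent
10:12:30 The alert is still unresolved, so the notification is sent again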
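
The arithmetic behind this timeline can be reproduced with a short Python sketch. It is a simplified model, not how Prometheus or Alertmanager are implemented: it assumes scrapes and rule evaluations both tick from 10:00:00 (real schedules are offset per target and per rule group), it treats the repeat as exactly repeat_interval after the first notification, and the date and the function names next_tick and alert_timeline are made up for illustration.

```python
from datetime import datetime, timedelta
import math


def next_tick(t, start, interval):
    """Return the first tick of the schedule start, start+interval, ... at or after t."""
    steps = max(0, math.ceil((t - start) / interval))
    return start + steps * interval


def alert_timeline(failure, start, scrape_interval, evaluation_interval,
                   for_duration, group_wait, repeat_interval):
    """Simplified model of when each stage of the alert pipeline is reached."""
    scraped = next_tick(failure, start, scrape_interval)          # first scrape that sees the failure
    pending = next_tick(scraped, start, evaluation_interval)      # first evaluation after that scrape
    firing = next_tick(pending + for_duration, start, evaluation_interval)
    notified = firing + group_wait                                # Alertmanager waits group_wait
    repeated = notified + repeat_interval                         # simplified repeat notification
    return {"scraped": scraped, "pending": pending, "firing": firing,
            "notified": notified, "repeated": repeated}


start = datetime(2022, 11, 9, 10, 0, 0)  # assumed common start of both schedules
timeline = alert_timeline(
    failure=start + timedelta(seconds=5),
    start=start,
    scrape_interval=timedelta(seconds=20),
    evaluation_interval=timedelta(seconds=30),
    for_duration=timedelta(minutes=1),
    group_wait=timedelta(seconds=60),
    repeat_interval=timedelta(minutes=10),
)
for step, when in timeline.items():
    print(f"{when:%H:%M:%S} {step}")
```

With the settings from the configuration above, the sketch yields 10:00:20, 10:00:30, 10:01:30, 10:02:30 and 10:12:30, matching the timeline.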

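Introduction to relabeling
To make metrics easier to identify and to use later in dashboards and alerts, Prometheus can rewrite the labels of discovered targets: the label set of a target can be rewritten dynamically before it is scraped. Each scrape configuration can contain multiple relabeling steps, which are applied to each target's label set in the order they appear in the configuration file.

In addition to the labels configured for each target, Prometheus adds several labels automatically:

job label: set to the job_name value of the corresponding scrape configuration.
__address__ label: set to the host:port address of the target. After relabeling, if the instance label was not set during relabeling, it defaults to the value of __address__.
__scheme__: the protocol scheme of the target (http or https).
__metrics_path__: the URL path from which metrics are scraped.
__scrape_interval__: the scrape interval of the target.
__scrape_timeout__: the scrape timeout of the target.
Additional labels prefixed with __meta_ may be available during the relabeling phase. They are set by the service discovery mechanism that provided the target and vary between mechanisms.

Labels starting with __ are removed from the label set after target relabeling is completed.

If a relabeling step only needs to store a label value temporarily (as input to a later relabeling step), use the __tmp label name prefix. This prefix is guaranteed never to be used by Prometheus itself.

Relabeling is commonly applied at two stages:

relabel_configs: before the scrape (for example, to rewrite meta labels before data is collected). Use relabel_configs to add labels, or to keep or drop specific targets.

metric_relabel_configs: after the metrics have been scraped. Use metric_relabel_configs for final relabeling and filtering of the scraped samples.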
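
The two rules that apply once target relabeling has finished (instance falls back to the address, and double-underscore labels are dropped) can be sketched in Python. This is only an illustration of the rules described above, not Prometheus code; the function name finalize_target_labels and the sample label values are hypothetical.

```python
def finalize_target_labels(labels):
    """Apply the post-relabeling rules to one target's label set."""
    result = dict(labels)
    # instance falls back to __address__ when relabeling did not set it.
    if not result.get("instance"):
        result["instance"] = result.get("__address__", "")
    # Labels starting with "__" (including __meta_* and __tmp*) are dropped.
    return {name: value for name, value in result.items()
            if not name.startswith("__")}


discovered = {
    "__address__": "192.168.0.11:9100",    # hypothetical target
    "__scheme__": "http",
    "__metrics_path__": "/metrics",
    "__meta_example_zone": "zone-a",       # hypothetical service-discovery label
    "__tmp_note": "scratch value",
    "job": "node",
}
print(finalize_target_labels(discovered))
# {'job': 'node', 'instance': '192.168.0.11:9100'}
```

Only job and instance survive, which is why any value that must be kept on the scraped series has to be copied into a label without the double-underscore prefix during relabeling.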

Configuring monitoring targets

Download page: https://prometheus.io/download/#node_exporter

Host monitoring

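# The * characters stand for the release version and platform; wget does not expand them, so substitute real values in the URL
wget https://github.com/prometheus/node_exporter/releases/download/v*/node_exporter-*.*-amd64.tar.gz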
tar xvfz node_exporter-*.*-amd64.tar.gz
cd node_exporter-*.*-amd64
./node_exporter

Import Grafana dashboard template 1860.

Kafka monitoring

wget https://github.com/danielqsj/kafka_exporter/releases/download/v1.2.0/kafka_exporter-1.2.0.linux-amd64.tar.gz
tar -xvf kafka_exporter-1.2.0.linux-amd64.tar.gz
mv kafka_exporter-1.2.0.linux-amd64 /data/kafka_exporter
cd /data/kafka_exporter
nohup ./kafka_exporter --kafka.server=<kafka-ip-or-hostname>:9092 &

time="2022-11-09T15:17:56+08:00" level=info msg="Starting kafka_exporter (version=1.2.0, branch=HEAD, revision=830660212e6c109e69dcb1cb58f5159fe3b38903)" source="kafka_exporter.go:474"
time="2022-11-09T15:17:56+08:00" level=info msg="Build context (go=go1.10.3, user=root@981cde178ac4, date=20180707-14:34:48)" source="kafka_exporter.go:475"
time="2022-11-09T15:17:56+08:00" level=info msg="Done Init Clients" source="kafka_exporter.go:213"
time="2022-11-09T15:17:56+08:00" level=info msg="Listening on :9308" source="kafka_exporter.go:499"

Import Grafana dashboard template 7589.

Redis monitoring

wget https://github.com/oliver006/redis_exporter/releases/download/v1.3.2/redis_exporter-v1.3.2.linux-amd64.tar.gz
tar -xvf redis_exporter-v1.3.2.linux-amd64.tar.gz
mv redis_exporter-v1.3.2.linux-amd64 /data/redis_exporter
cd /data/redis_exporter
# Note: point -redis.addr at the Redis port, not the Sentinel port
nohup ./redis_exporter -redis.addr 192.168.0.11:7001 -redis.password Redis@2022 &

time="2022-11-09T14:39:10+08:00" level=info msg="Redis Metrics Exporter v1.3.2    build date: 2019-11-06-02:25:20    sha1: 175a69f33e8267e0a0ba47caab488db5e83a592e    Go: go1.13.4    GOOS: linux    GOARCH: amd64"
time="2022-11-09T14:39:10+08:00" level=info msg="Providing metrics at :9121/metrics"
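
Once the exporter reports that it is serving metrics, the output can be checked by reading a metric such as redis_up from the text it returns. The Python sketch below parses that text format in a deliberately simplified way: it assumes one sample per line with no trailing timestamp, and the sample text and the function name metric_samples are made up for illustration.

```python
def metric_samples(exposition_text, metric_name):
    """Return (labels_text, value) pairs for one metric from Prometheus text-format output.

    Simplified: assumes each sample line is "name{labels} value" without a timestamp.
    """
    samples = []
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_and_labels, _, value = line.rpartition(" ")
        name, _, labels = name_and_labels.partition("{")
        if name == metric_name:
            samples.append((labels.rstrip("}"), float(value)))
    return samples


# Hypothetical excerpt of what an exporter might return at /metrics.
sample = """\
# HELP redis_up Information about the Redis instance
# TYPE redis_up gauge
redis_up 1
# TYPE redis_db_keys gauge
redis_db_keys{db="db0"} 42
"""

print(metric_samples(sample, "redis_up"))
print(metric_samples(sample, "redis_db_keys"))
```

Against a live exporter, the text could be fetched with urllib.request.urlopen("http://localhost:9121/metrics").read().decode() and passed to the same function; a redis_up value of 1 indicates the exporter can reach Redis.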

Edit the Prometheus configuration file prometheus.yml:

- job_name: redis
  static_configs:
  - targets: ['172.26.42.229:9121']
    labels:
      instance: redis120

Redis cluster monitoring

  - job_name: 'redis_exporter_targets'
    static_configs:
      - targets:
        - redis://192.168.0.11:7001
        - redis://192.168.0.12:7001
        - redis://192.168.0.13:7001
    metrics_path: /scrape
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.0.11:9121
  - job_name: 'redis'
    metrics_path: /metrics
    static_configs:
    - targets: ['192.168.0.11:9121']
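
The three relabeling steps in the redis_exporter_targets job can be traced with a small Python sketch. It models only the default replace action in a simplified form and is not Prometheus's implementation; the function names relabel_replace and multi_target_labels are hypothetical.

```python
import re


def relabel_replace(labels, target_label, source_labels=(), regex="(.*)",
                    replacement="$1", separator=";"):
    """Simplified model of the default relabel action (replace) on one label set."""
    source_value = separator.join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, source_value)
    if match is None:
        return dict(labels)  # regex did not match: labels stay unchanged
    # Prometheus writes capture groups as $1; Python's expand() expects \g<1>.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    updated = dict(labels)
    updated[target_label] = match.expand(template)
    return updated


def multi_target_labels(target):
    """Apply the three steps from the redis_exporter_targets job to one target."""
    labels = {"__address__": target, "job": "redis_exporter_targets"}
    # 1. Copy the Redis address into the ?target= URL parameter.
    labels = relabel_replace(labels, "__param_target", source_labels=["__address__"])
    # 2. Keep the Redis address as the instance label.
    labels = relabel_replace(labels, "instance", source_labels=["__param_target"])
    # 3. Scrape the exporter itself instead of the Redis node.
    labels = relabel_replace(labels, "__address__", replacement="192.168.0.11:9121")
    return labels


labels = multi_target_labels("redis://192.168.0.12:7001")
for name, value in labels.items():
    print(f"{name} = {value}")
```

The result is that Prometheus connects to the exporter at 192.168.0.11:9121 and requests /scrape with target=redis://192.168.0.12:7001, while the stored series carry the Redis node's address in instance.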

Import Grafana dashboard template 11835.

MySQL monitoring

wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.14.0/mysqld_exporter-0.14.0.linux-amd64.tar.gz

tar xvf mysqld_exporter-0.14.0.linux-amd64.tar.gz
mv mysqld_exporter-0.14.0.linux-amd64 /data/mysqld_exporter
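# Create /data/mysqld_exporter/.my.cnf with the following contents
vim /data/mysqld_exporter/.my.cnf
[client]
user=mysqlexpoter
password=prometheus
host=192.168.xx.xx
port=3306
# Start the exporter from its install directory
cd /data/mysqld_exporter
nohup ./mysqld_exporter --config.my-cnf=/data/mysqld_exporter/.my.cnf &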

ts=2022-11-09T07:25:16.492Z caller=mysqld_exporter.go:303 level=info msg="Listening on address" address=:9104
ts=2022-11-09T07:25:16.492Z caller=tls_config.go:195 level=info msg="TLS is disabled." http2=false

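To collect monitoring metrics from a MySQL database, mysqld_exporter must connect to the database with an account that has the required privileges. This section describes how to set up those privileges.

Create a user for mysqld_exporter in MySQL with a password of your choice, then run the following commands to grant it privileges. The statement below grants REPLICATION CLIENT and PROCESS; read access to the performance_schema.* tables, if needed, requires a separate GRANT SELECT.

mysql> GRANT REPLICATION CLIENT, PROCESS ON *.* TO 'mysqld_exporter'@'localhost' IDENTIFIED BY 'arms_prometheus2022';
mysql> FLUSH PRIVILEGES;
Note: mysqld_exporter and arms_prometheus2022 are a user-defined user name and password. Replace them with your own values and make sure they match the credentials in .my.cnf. GRANT ... IDENTIFIED BY was removed in MySQL 8.0, where the user must first be created with CREATE USER.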

Import Grafana dashboard template 7362.

Configuring alerting rules

Server alerting rules

Edit the Prometheus configuration file prometheus.yml and add the following:

rule_files:
  - /etc/prometheus/rules/*.rules

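Hot-reloading the configuration

In day-to-day Prometheus maintenance the configuration file prometheus.yml is edited again and again. Restarting the Prometheus service is usually enough to load the new configuration.
The configuration in prometheus.yml can also be reloaded at runtime without a restart. There are two common ways to do this:

1. Find the Prometheus process ID and send the process a SIGHUP signal:
kill -HUP pid
2. Send a POST request to /-/reload through the HTTP API:
curl -X POST http://localhost:9090/-/reload
The second method requires Prometheus to be started with --web.enable-lifecycle. Add the flag to the Prometheus systemd unit file created earlier, then reload systemd: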

systemctl daemon-reload
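
The HTTP reload call can also be issued from a script, for example as the last step of a deployment. The Python sketch below uses only the standard library; the function names build_reload_request and reload_prometheus are hypothetical, and the default URL assumes Prometheus listens on localhost:9090 as in the curl command above.

```python
import urllib.request


def build_reload_request(base_url="http://localhost:9090"):
    """Build the POST request for the Prometheus /-/reload endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url="http://localhost:9090", timeout=5.0):
    """Send the reload request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on a non-2xx response and
    urllib.error.URLError when the server cannot be reached.
    """
    with urllib.request.urlopen(build_reload_request(base_url), timeout=timeout) as response:
        return response.status


request = build_reload_request()
print(request.get_method(), request.full_url)
```

Calling reload_prometheus() performs the actual reload; it is left out of the example so that the sketch runs without a Prometheus server.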

Create the alert file hoststats-alert.rules under /etc/prometheus/rules/ with the following contents:

groups:
- name: hostStatsAlert
  rules:
  - alert: hostCpuUsageAlert
    expr: sum(avg without (cpu)(irate(node_cpu_seconds_total{mode!='idle'}[5m]))) by (instance) > 0.85
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} CPU usage high"
      description: "{{ $labels.instance }} CPU usage above 85% (current value: {{ $value }})"
  - alert: hostMemUsageAlert
    expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes)/node_memory_MemTotal_bytes > 0.85
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} MEM usage high"
      description: "{{ $labels.instance }} MEM usage above 85% (current value: {{ $value }})"

After restarting Prometheus, open the Prometheus UI at http://127.0.0.1:9090/rules to see the rule files that are currently loaded.

[root@grafana rules]# cat node_exporter_rules.yml 
# Server resource alerting rules
groups:
- name: Server resource monitoring
  rules:
  - alert: High memory usage
    expr: 100 - (node_memory_Buffers_bytes+node_memory_Cached_bytes+node_memory_MemFree_bytes)/node_memory_MemTotal_bytes*100 > 90
    for: 5m  # how long the condition must hold before the alert is sent to Alertmanager
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.instance }} memory usage is too high, please handle it as soon as possible!"
      description: "{{ $labels.instance }} memory usage is above 90%, current usage {{ $value }}%."

  - alert: Server down
    expr: up == 0
    for: 3m
    labels:
      severity: critical
    annotations:
      summary: "{{$labels.instance}} server is down, please handle it as soon as possible!"
      description: "{{$labels.instance}} has been unreachable for more than 3 minutes, current state {{ $value }}."
 
  - alert: High CPU load
    expr: 100 - (avg by (instance,job)(irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "{{$labels.instance}} CPU usage is too high, please handle it as soon as possible!"
      description: "{{$labels.instance}} CPU usage is above 90%, current usage {{ $value }}%."

  - alert: Disk I/O utilization
    expr: avg(irate(node_disk_io_time_seconds_total[1m])) by(instance,job)* 100 > 90
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "{{$labels.instance}} disk I/O utilization is too high, please handle it as soon as possible!"
      description: "{{$labels.instance}} disk I/O utilization is above 90%, current utilization {{ $value }}%."

  - alert: Inbound network traffic
    expr: ((sum(rate (node_network_receive_bytes_total{device!~'tap.*|veth.*|br.*|docker.*|virbr.*|lo.*'}[5m])) by (instance,job)) / 100) > 102400
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "{{$labels.instance}} inbound network bandwidth is too high, please handle it as soon as possible!"
      description: "{{$labels.instance}} inbound network bandwidth has been above 100M for 5 minutes. RX bandwidth usage {{$value}}."

  - alert: Outbound network traffic
    expr: ((sum(rate (node_network_transmit_bytes_total{device!~'tap.*|veth.*|br.*|docker.*|virbr.*|lo.*'}[5m])) by (instance,job)) / 100) > 102400
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "{{$labels.instance}} outbound network bandwidth is too high, please handle it as soon as possible!"
      description: "{{$labels.instance}} outbound network bandwidth has been above 100M for 5 minutes. TX bandwidth usage {{$value}}."
  
  - alert: TCP connection count
    expr: node_netstat_Tcp_CurrEstab > 10000
    for: 2m
    labels:
      severity: critical
    annotations:
      summary: "TCP_ESTABLISHED count is too high!"
      description: "{{$labels.instance}} has more than 10000 established TCP connections, current value {{ $value }}."

  - alert: Disk capacity
    expr: 100-(node_filesystem_free_bytes{fstype=~"ext4|xfs"}/node_filesystem_size_bytes {fstype=~"ext4|xfs"}*100) > 90
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "{{$labels.mountpoint}} disk partition usage is too high, please handle it as soon as possible!"
      description: "{{$labels.instance}} disk partition usage is above 90%, current usage {{ $value }}%."

MySQL alerting rules

groups:
- name: MySQLStatsAlert
  rules:
  - alert: MySQL is down
    expr: mysql_up == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} MySQL is down"
      description: "MySQL database is down. This requires immediate action!"
  - alert: open files high
    expr: mysql_global_status_innodb_num_open_files > (mysql_global_variables_open_files_limit) * 0.75
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} open files high"
      description: "Open files is high. Please consider increasing open_files_limit."
  - alert: Read buffer size is bigger than max. allowed packet size
    expr: mysql_global_variables_read_buffer_size > mysql_global_variables_slave_max_allowed_packet 
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} Read buffer size is bigger than max. allowed packet size"
      description: "Read buffer size (read_buffer_size) is bigger than max. allowed packet size (max_allowed_packet).This can break your replication."
  - alert: Sort buffer possibly misconfigured
    expr: mysql_global_variables_innodb_sort_buffer_size <256*1024 or mysql_global_variables_read_buffer_size > 4*1024*1024 
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} Sort buffer possibly misconfigured"
      description: "Sort buffer size is either too big or too small. A good value for sort_buffer_size is between 256k and 4M."
  - alert: Thread stack size is too small
    expr: mysql_global_variables_thread_stack <196608
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} Thread stack size is too small"
      description: "Thread stack size is too small. This can cause problems when you use Stored Language constructs, for example. A typical value for thread_stack is 256k."
  - alert: Used more than 80% of max connections limited 
    expr: mysql_global_status_max_used_connections > mysql_global_variables_max_connections * 0.8
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} Used more than 80% of max connections limited"
      description: "Used more than 80% of max connections limited"
  - alert: InnoDB Force Recovery is enabled
    expr: mysql_global_variables_innodb_force_recovery != 0 
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} InnoDB Force Recovery is enabled"
      description: "InnoDB Force Recovery is enabled. This mode should be used for data recovery purposes only. It prohibits writing to the data."
  - alert: InnoDB Log File size is too small
    expr: mysql_global_variables_innodb_log_file_size < 16777216 
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} InnoDB Log File size is too small"
      description: "The InnoDB Log File size is possibly too small. Choosing a small InnoDB Log File size can have significant performance impacts."
  - alert: InnoDB Flush Log at Transaction Commit
    expr: mysql_global_variables_innodb_flush_log_at_trx_commit != 1
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} InnoDB Flush Log at Transaction Commit"
      description: "InnoDB Flush Log at Transaction Commit is set to a value != 1. This can lead to a loss of committed transactions in case of a power failure."
  - alert: Table definition cache too small
    expr: mysql_global_status_open_table_definitions > mysql_global_variables_table_definition_cache
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Table definition cache too small"
      description: "Your Table Definition Cache is possibly too small. If it is much too small this can have significant performance impacts!"
  - alert: Table open cache too small
    expr: mysql_global_status_open_tables >mysql_global_variables_table_open_cache * 99/100
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Table open cache too small"
      description: "Your Table Open Cache is possibly too small (old name Table Cache). If it is much too small this can have significant performance impacts!"
  - alert: Thread stack size is possibly too small
    expr: mysql_global_variables_thread_stack < 262144
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Thread stack size is possibly too small"
      description: "Thread stack size is possibly too small. This can cause problems when you use Stored Language constructs, for example. A typical value for thread_stack is 256k."
  - alert: InnoDB Buffer Pool Instances is too small
    expr: mysql_global_variables_innodb_buffer_pool_instances == 1
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} InnoDB Buffer Pool Instances is too small"
      description: "If you are using MySQL 5.5 and higher you should use several InnoDB Buffer Pool Instances for performance reasons. Some rules are: InnoDB Buffer Pool Instance should be at least 1 Gbyte in size. InnoDB Buffer Pool Instances you can set equal to the number of cores of your machine."
  - alert: InnoDB Plugin is enabled
    expr: mysql_global_variables_ignore_builtin_innodb == 1
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} InnoDB Plugin is enabled"
      description: "InnoDB Plugin is enabled"
  - alert: Binary Log is disabled
    expr: mysql_global_variables_log_bin != 1
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} Binary Log is disabled"
      description: "Binary Log is disabled. This prevents Point in Time Recovery (PiTR)."
  - alert: Binlog Cache size too small
    expr: mysql_global_variables_binlog_cache_size < 1048576
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Binlog Cache size too small"
      description: "Binlog Cache size is possibly too small. A value of 1 Mbyte or higher is OK."
  - alert: Binlog Statement Cache size too small
    expr: mysql_global_variables_binlog_stmt_cache_size <1048576 and mysql_global_variables_binlog_stmt_cache_size > 0
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Binlog Statement Cache size too small"
      description: "Binlog Statement Cache size is possibly too small. A value of 1 Mbyte or higher is typically OK."
  - alert: Binlog Transaction Cache size too small
    expr: mysql_global_variables_binlog_cache_size  <1048576
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Binlog Transaction Cache size too small"
      description: "Binlog Transaction Cache size is possibly too small. A value of 1 Mbyte or higher is typically OK."
  - alert: Sync Binlog is enabled
    expr: mysql_global_variables_sync_binlog == 1
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Sync Binlog is enabled"
      description: "Sync Binlog is enabled. This leads to higher data security but at the cost of write performance."
  - alert: IO thread stopped
    expr: mysql_slave_status_slave_io_running != 1
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} IO thread stopped"
      description: "IO thread has stopped. This is usually because it cannot connect to the Master any more."
  - alert: SQL thread stopped 
    expr: mysql_slave_status_slave_sql_running == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} SQL thread stopped"
      description: "SQL thread has stopped. This is usually because it cannot apply a SQL statement received from the master."
  - alert: Slave lagging behind Master
    expr: mysql_slave_status_seconds_behind_master > 30
    for: 1m
    labels:
      severity: warning 
    annotations:
      summary: "Instance {{ $labels.instance }} Slave lagging behind Master"
      description: "Slave is lagging behind Master. Please check if Slave threads are running and if there are some performance issues!"
  - alert: Slave is NOT read only(Please ignore this warning indicator.)
    expr: mysql_global_variables_read_only == 0
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Slave is NOT read only"
      description: "Slave is NOT set to read only. You can accidentally manipulate data on the slave and get inconsistencies..."

Save the file and hot-reload Prometheus:

curl  -XPOST localhost:9090/-/reload

Configuration tuning

# Binlog Cache size too small: check the binlog cache with show global status like 'bin%';
set global binlog_cache_size = 1048576; # takes effect immediately, lost after a restart
# Table open cache too small: check the number of open tables with show global status like 'open_tables'
# show global variables like 'table_open_cache';
set global table_open_cache = <number of open tables * 1.2>; # takes effect immediately, lost after a restart
# IO thread has stopped

Redis service alerting rules

[root@grafana rules]# cat redis_exporter_rules.yml 
# Redis service monitoring
groups:
- name: Redis monitoring alerts
  rules:
  - alert: Alert! Redis unavailable
    expr: redis_up == 0
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.instance }} Redis is unavailable"
      description: "Redis is unreachable\n  current value = {{ $value }}"

  - alert: Alert! Master node missing
    expr: (count(redis_instance_info{role="master"}) or vector(0)) < 1
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.instance }} Redis master missing"
      description: "The Redis cluster currently has no master node\n  current value = {{ $value }}"

  - alert: Alert! Split brain, too many masters
    expr: count(redis_instance_info{role="master"}) > 1
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.instance }} Redis split brain, too many masters"
      description: "{{ $labels.instance }} too many master nodes\n  current value = {{ $value }}"

  - alert: Alert! Slave unreachable
    expr: count without (instance, job) (redis_connected_slaves) - sum without (instance, job) (redis_connected_slaves) - 1 > 1
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.instance }} Redis slave node missing"
      description: "A Redis slave is unreachable. Check the master-slave replication status\n  current value = {{ $value }}"

  - alert: Alert! Redis replicas inconsistent
    expr: delta(redis_connected_slaves[1m]) < 0
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.instance }}  Redis replicas inconsistent"
      description: "The Redis cluster has lost a slave node\n  current value = {{ $value }}"

  - alert: Alert! Redis cluster flapping
    expr: changes(redis_connected_slaves[1m]) > 1
    for: 2m
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.instance }}  Redis cluster flapping"
      description: "The Redis cluster is flapping, please check.\n  current value = {{ $value }}"

  - alert: Alert! Persistence failed
    expr: (time() - redis_rdb_last_save_timestamp_seconds) / 3600 > 24
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.instance }}  Redis persistence failed"
      description: "Redis persistence failed (> 24 hours)\n  current value = {{ printf \"%.1f\" $value }} hours"

  - alert: Alert! Out of memory
    expr: redis_memory_used_bytes / redis_total_system_memory_bytes * 100 > 90
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }} system memory is running low"
      description: "Redis is using more than 90% of system memory\n  current value = {{ printf \"%.2f\" $value }}%"

  - alert: Alert! Maxmemory nearly exhausted
    expr: redis_memory_used_bytes / redis_memory_max_bytes * 100 > 80 and redis_config_maxmemory != 0
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }} Maxmemory is set too low"
      description: "Memory usage is above 80% of the configured maximum\n  current value = {{ printf \"%.2f\" $value }}%"

  - alert: Alert! Too many connections
    expr: redis_connected_clients > 200
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }} too many live connections"
      description: "Too many connections (> 200)\n  current value = {{ $value }}"

  - alert: Alert! Too few connections
    expr: redis_connected_clients < 1
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}  too few live connections"
      description: "Connection count (< 1)\n  current value = {{ $value }}"

  - alert: Alert! Rejected connections
    expr: increase(redis_rejected_connections_total[1m]) > 0
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.instance }} connections rejected"
      description: "Redis is rejecting connections, check the connection limit settings\n  current value = {{ printf \"%.0f\" $value }}"

  - alert: Alert! More than 1000 commands per second
    expr: rate(redis_commands_processed_total[1m])  > 1000
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.instance }} too many commands executed"
      description: "Redis is executing too many commands\n  current value = {{ printf \"%.0f\" $value }}"

Remediation

# To mitigate split brain, add the following settings to redis.conf
min-slaves-to-write 1
min-slaves-max-lag 10

redis-server redis.conf
redis-sentinel sentinel.conf # Sentinel mode
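
The effect of the two min-slaves settings can be illustrated with a few lines of Python. With the values above, a master refuses writes unless at least 1 replica has reported within the last 10 seconds, so a master cut off from its replicas stops taking writes that would later be lost. The sketch is a model of that rule only, not Redis code, and the function and parameter names are made up for illustration.

```python
def master_accepts_writes(replica_lags_seconds, min_replicas_to_write=1,
                          min_replicas_max_lag=10):
    """Return True if enough replicas reported recently enough for writes to be accepted."""
    healthy = [lag for lag in replica_lags_seconds if lag <= min_replicas_max_lag]
    return len(healthy) >= min_replicas_to_write


print(master_accepts_writes([3]))       # one replica, 3 s behind
print(master_accepts_writes([25]))      # one replica, 25 s behind
print(master_accepts_writes([]))        # isolated master, no replicas
print(master_accepts_writes([25, 2]))   # one stale replica, one healthy
```

Raising the first setting trades availability for safety: the more healthy replicas are required, the sooner a partially isolated master stops accepting writes.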

RabbitMQ service alerting rules

[root@grafana rules]# cat rabbitmq_exporter_rules.yml
# RabbitMQ service monitoring
groups:
- name: RabbitMQ service monitoring
  rules:
  - alert: RabbitMQ service stopped
    expr: rabbitmq_up ==0
    for: 3m
    labels:
      severity: critical
    annotations:
      description: "{{$labels.instance}} RabbitMQ service has stopped, current state {{ $value }}"
      summary:  "RabbitMQ service has been down for 3 minutes, please handle it as soon as possible!"

  - alert: RabbitMQ memory usage above 2G
    expr: rabbitmq_node_mem_used/1024/1024 > 2048
    for: 3m
    labels:
      severity: critical
    annotations:
      description: "{{ $labels.instance }} RabbitMQ memory usage is too high!"
      value: '{{ $value }} MB'
      summary:  "RabbitMQ memory usage is above 2G"

Kafka cluster service alerting rules

[root@grafana rules]# cat kafka_exporter_rules.yml
# Kafka cluster service monitoring
groups:
- name: Kafka service monitoring
  rules:
  - alert: Kafka consumer lag
    expr: sum(kafka_consumergroup_lag{topic!="sop_free_study_fix-student_wechat_detail"}) by (consumergroup, topic, job) > 50000
    for: 3m
    labels:
      severity: critical
    annotations:
      summary: "{{$labels.instance}} Kafka consumer lag ({{$.Labels.consumergroup}})"
      description: "{{$.Labels.topic}} consumer lag has been above 50000 for 3 minutes (current {{$value}})"

  - alert: Kafka cluster broker count decreased
    expr: kafka_brokers < 3   # the Kafka cluster has 3 brokers
    for: 3m
    labels:
      severity: critical
    annotations:
      summary: "Some Kafka cluster brokers have stopped, please handle it as soon as possible!"
      description: "{{$labels.instance}} Kafka cluster broker count decreased"

  - alert: emqx_rule_to_kafka per-second rate over the last five minutes is 0
    expr: sum(rate(kafka_topic_partition_current_offset{topic="emqx_rule_to_kafka"}[5m])) by ( instance,topic,job) ==0
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "{{$labels.instance}} emqx_rule_to_kafka is not receiving messages"
      description: "{{$.Labels.topic}} emqx_rule_to_kafka has received no messages for 5 minutes (current {{$value}})"

Domain SSL certificate expiry alerting rules

[root@grafana rules]# cat ssl_expiry.yml
groups: 
  - name: SSL certificate checks
    rules:
    - alert: Certificate expires within 30 days
      expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 30
      for: 5m
      labels:
        severity: major
      annotations:
        summary: "SSL certificate is about to expire (instance {{ $labels.instance }})"
        description: "SSL certificate expires within 30 days VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

    - alert: Certificate expired
      expr: probe_ssl_earliest_cert_expiry - time()  <= 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "SSL certificate has expired (instance {{ $labels.instance }})"
        description: "SSL certificate has expired\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
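
Both certificate rules compare probe_ssl_earliest_cert_expiry - time(), a number of seconds, against a threshold written as 86400 seconds per day times a number of days. The Python sketch below spells out that arithmetic so the multiplier is easy to check; the function names and the sample timestamps are made up for illustration.

```python
SECONDS_PER_DAY = 86400


def days_until_expiry(not_after_epoch, now_epoch):
    """Remaining certificate lifetime in days (negative once expired)."""
    return (not_after_epoch - now_epoch) / SECONDS_PER_DAY


def expires_within(not_after_epoch, now_epoch, days):
    """Mirror of the rule expression: expiry - now < 86400 * days."""
    return not_after_epoch - now_epoch < SECONDS_PER_DAY * days


now = 1_700_000_000                      # arbitrary example timestamp
in_45_days = now + 45 * SECONDS_PER_DAY  # hypothetical certificate expiry

print(days_until_expiry(in_45_days, now))    # 45.0
print(expires_within(in_45_days, now, 30))   # False: more than 30 days left
print(expires_within(in_45_days, now, 300))  # True: a 300-day window catches it
```

A certificate with 45 days left is outside a 30-day window but inside a 300-day one, so the day multiplier in the expression decides how early the alert starts firing.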

Elasticsearch cluster alerting rules

[root@grafana rules]# cat elasticsearch_exporter_rules.yml
groups:
   - name: ElasticSearch service monitoring
     rules:
     - alert: ES cluster node count decreased
       expr: elasticsearch_cluster_health_number_of_nodes < 3  # the ES cluster has 3 nodes
       for: 5m
       labels:
         severity: critical
       annotations:
         summary: "ES cluster node count decreased: {{$.Labels.job}}"
         description: "ES cluster node count decreased: {{$.Labels.job}} (current: {{$value}})"

     - alert: JVM memory usage alert
       expr: elasticsearch_jvm_memory_used_bytes{area="heap"} / elasticsearch_jvm_memory_max_bytes{area="heap"}*100 > 90
       for: 5m
       labels:
         severity: critical
       annotations:
         summary: "JVM memory usage too high: {{$.Labels.job}}"
         description: "JVM memory usage too high: {{$.Labels.job}} above 90% (current: {{$value}})"

你可能感兴趣的:(监控体系记录,prometheus,grafana,kafka)

前端高级面试题阿芯爱编程面试前端
以下是一些高级前端面试题及答案：一、性能优化如何对大型前端项目进行性能剖析（profiling）？答案：使用ChromeDevTools中的Performance面板。可以记录页面加载和交互过程中的各种性能指标，如脚本执行时间、渲染时间、重绘和回流次数等。利用Lighthouse工具，它可以对网页进行全面的性能评估，包括加载性能、可访问性、最佳实践等方面，并给出优化建议。在代码中手动插入性能测量点
Node.js 调用 DeepSeek API 完整指南 m0_74823490 面试学习路线阿里巴巴 node.js
简介本文将介绍如何使用Node.js调用DeepSeekAPI，实现流式对话并保存对话记录。Node.js版本使用现代异步编程方式实现，支持流式处理和错误处理。1.环境准备1.1系统要求Node.js14.0或更高版本npm包管理器1.2项目结构deepseek-project/├──main.js#主程序├──package.json#项目配置文件└──conversation.txt#对话记录
LabVIEW太阳能制冷监控系统 LabVIEW开发 LabVIEW开发案例 labview
在全球能源需求日益增长的背景下，太阳能作为一种无限再生能源，被广泛应用于各种能源系统中。本基于LabVIEW软件和STM32F105控制器的太阳能制冷监控系统的设计与实现，提供一个高效、经济的太阳能利用方案，以应对能源消耗的挑战。项目背景随着全球人口的增加，能源需求不断攀升，而传统能源的开采与使用伴随着环境污染和资源枯竭的风险。太阳能作为一种清洁的再生能源，具有广阔的开发前景。此太阳能制冷监控系统
第N11周：seq2seq翻译实战-Pytorch复现计算机真好丸 pytorch 人工智能 python
文章目录一、前期准备1.搭建语言类2.文本处理函数3.文件读取函数二、Seq2Seq模型1.编码器（encoder）2.解码器（decoder）三、训练1.数据预处理2.训练函数3.评估四、评估与训练1.Loss图2.可视化注意力五、总结本文为365天深度学习训练营中的学习记录博客原作者：K同学啊一、前期准备from__future__importunicode_literals,print_fu
第N5周：Pytorch文本分类入门计算机真好丸 pytorch 分类人工智能
文章目录一、前期准备1.环境安装2.加载数据3.构建词典4.生成数据批次和迭代器二、准备模型1.定义模型2.定义实例三、训练模型1.拆分数据集并运行模型2.使用测试数据集评估模型本文为365天深度学习训练营中的学习记录博客原作者：K同学啊一、前期准备1.环境安装确保安装了torchtext与portalocker库2.加载数据importtorch#强制使用CPUdevice=torch.devi
第TR5周：Transformer实战：文本分类计算机真好丸 transformer 分类深度学习
文章目录1.准备环境1.1环境安装1.2加载数据2.数据预处理2.1构建词典2.2生成数据批次和迭代器2.3构建数据集3.模型构建3.1定义位置编码函数3.2定义Transformer模型3.3初始化模型3.4定义训练函数3.5定义评估函数4.训练模型4.1模型训练5.总结：本文为365天深度学习训练营中的学习记录博客原作者：K同学啊1.准备环境1.1环境安装这是一个使用PyTorch通过Tran
多体动力学仿真软件：GT-SUITE_（7）.动力学分析 kkchenjj 多体动力学仿真仿真模拟模拟仿真多体动力学
动力学分析在多体动力学仿真软件中，动力学分析是核心功能之一，它可以帮助工程师和研究人员准确地模拟和分析复杂多体系统的运动和受力情况。动力学分析包括多种类型，如静力学分析、运动学分析和动力学分析。本节将详细介绍这些分析的原理和内容，并提供具体的代码示例和数据样例，以帮助读者更好地理解和应用这些技术。静力学分析静力学分析用于研究系统在力和约束作用下的静态平衡状态。在GT-SUITE中，静力学分析主要涉
非线性动力学笔记C2.1-2.2 一维流动中的不动点和稳定性阿北Ben 笔记
提示：文章写完后，目录可以自动生成，如何生成可参考右边的帮助文档文章目录前言C2一维流动（flowonaline)引言2.1几何思考方式2.不动点（fixedpoint)与稳定性（stability)Appendix1前言提示：这里可以添加本文要记录的大概内容：参考书《Nonlineardynamicsandchaos》StevenH.Strogatz本节重点Note第二章内容的引言的1-2小节，
【动态路由】系统Web URL资源整合系列（后端技术实现）【apisix实现】飞火流星02027 URL整合 apisix反向代理 apisix网关 apisix实现web资源整合系统URL资源整合 apisix基于请求参数的路由 apisix基于请求头的路由 APISIXDashboard
需求说明软件功能需求：反向代理功能（描述：apollo、eureka控、apisix、sentinel、普米、kibana、timetask、grafana、hbase、skywalking-ui、pinpoint、cmak界面、kafka-map、nacos、gateway、elasticsearch、oa-portal业务应用等多个web资源等只能通过有限个代理地址访问），不考虑SSO。软件质
2024年最全工控网络安全学习路线_工控网络安全专业，零基础学网络安全开发 2401_84545213 程序员 web安全学习安全
网上学习资料一大堆，但如果学到的知识不成体系，遇到问题时只是浅尝辄止，不再深入研究，那么很难做到真正的技术提升。需要这份系统化资料的朋友，可以点击这里获取一个人可以走的很快，但一群人才能走的更远！不论你是正从事IT行业的老鸟或是对IT行业感兴趣的新人，都欢迎加入我们的的圈子（技术交流、学习资源、职场吐槽、大厂内推、面试辅导），让我们一起学习成长！工业背景对于我国而言，工业控制系统安全所面临的重要问
JProfiler_windows：Java 性能分析与优化心灵宝贝 java 开发语言
JProfiler是一款功能强大的Java性能分析工具，专门用于监控和分析Java应用程序的性能瓶颈、内存泄漏、线程问题等。以下是关于JProfiler_windows-x64_8_0_2的详细介绍：一、JProfiler简介JProfiler是由ej-technologies公司开发的一款Java性能分析工具，支持对本地和远程JVM的监控。它能够帮助开发者深入分析Java应用程序的CPU使用、内
Python的垃圾回收机制，详解Python的GC体系李云龙炮击平安线程 python 系统架构面试跳槽后端架构
什么是垃圾回收？为什么需要垃圾回收？垃圾回收即Garbagecollection简称为GC，是Python，Java等高级语言所使用的内存回收机制，由虚拟机帮助我们管理内存，让它自动把我们去追踪和回收内存中的对象。没有作用的对象就是垃圾，虚拟机就是扫地机器人，在某个时机自动帮我们清除垃圾。区别于C和C++这种让用户自己进行内存管理的方式，由虚拟机代用户管理内存。让用户自己进行内存管理的方式固然自由
Zookeeper Watcher机制原理与代码实例讲解 AI天才研究院 AI大模型企业级应用开发实战 AI大模型应用入门实战与进阶 DeepSeek R1 &大数据AI人工智能大模型计算科学神经计算深度学习神经网络大数据人工智能大型语言模型 AI AGI LLM Java Python 架构设计 Agent RPA
ZookeeperWatcher机制原理与代码实例讲解1.背景介绍Zookeeper是一个分布式协调服务，广泛应用于分布式系统中，用于实现数据的同步、配置管理、命名服务等功能。Zookeeper的核心机制之一是Watcher机制，它允许客户端在Zookeeper节点上注册监听器，以便在节点数据或状态发生变化时接收通知。Watcher机制在分布式系统中具有重要意义，因为它提供了一种高效的方式来监控和
记录一个后台框架stylefeng——guns/roses 程序员琛琛 java面试 java
优秀的框架stylefeng——guns/roseshttps://github.com/stylefeng/Guns
【Linux】【进程】epoll内核实现总结+ET和LT模式内核实现方式钟离墨笺 Linux linux 网络运维
【Linux】【网络】epoll内核实现总结+ET和LT模式内核实现方式1.epoll的工作原理eventpoll结构当某一进程调用epoll_create方法时，Linux内核会创建一个eventpoll结构体，这个结构体中有两个成员与epoll的使用方式密切相关.structeventpoll{..../*红黑树的根节点，这颗树中存储着所有添加到epoll中的需要监控的事件*/structrb
【网络安全】Snort中文查询手册 Walter_Jia Network Security
Snort中文手册摘要snort有三种工作模式：嗅探器、数据包记录器、网络入侵检测系统。嗅探器模式仅仅是从网络上读取数据包并作为连续不断的流显示在终端上。数据包记录器模式把数据包记录到硬盘上。网路入侵检测模式是最复杂的，而且是可配置的。我们可以让snort分析网络数据流以匹配用户定义的一些规则，并根据检测结果采取一定的动作。(2003-12-1116:39:12)Snort用户手册第一章snort
从Android Studio上传项目到Github的步骤教程良辰吉日943 android studio github github android studio git
AndroidStudio上传项目到Github的步骤教程最近要做Android课设，老师说使用Gitee或GitHub等代码托管平台可以加分。所以本着不要白不要的原则试一试，随便记录一下步骤，方便小白上手。过程很简单，所以就不放图片了。1.在电脑上安装Git软件从官网下载Git，一路默认设置完成安装；打开GITBash命令行，手动输入，引号内改成自己的信息：$gitconfig--globalu
面试完整回答：SQL 分页查询中 limit 500000,10和 limit 10 速度一样快吗? 程序员琛琛数据库笔记 java面试面试 sql oracle