PostgreSQL的监控三(zabbix)

3  共通的监控工具

前面介绍的PostgreSQL监控工具都偏向于性能分析,没有告警功能。而且它们只是针对PostgreSQL的监视,有时需要监控整个业务相关的系统,这时候就要考虑通用的监控工具了。Linux下比较适合监控数据库的常用的工具有Nagios和Zabbix。Zabbix更容易使用,现在看上去也更被多数人看好,所以本文只介绍Zabbix监控PostgreSQL的方法。

3.1 Zabbix简介


PostgreSQL的监控三(zabbix)_第1张图片
Zabbix是一个all in one高度集成的企业级监控解决方案。由一个中心的Zabbix Server和若干可能安装有Zabbix Agent被监控设备构成,主要特性可概括为以下几点
数据采集
 支持agent和agent less(SNMP, IPMI, HTTP,FTP...)
 支持基于JMX对java应用的监视
 可灵活定制agent
数据存储
 数据库为PostgreSQL,Mysql,Oracle,SQLite或DB2
 可配置历史和趋势数据的保存时间
 内建旧清理程序防止数据膨胀
报警
 可定制报警阈值
 灵活设置报警方式,邮件,SMS,脚本
 支持报警升级
 报警消息可使用宏变量定制
可视化
 可定制的数据图形
 仪表盘
 地图
 所有配置都通过GUI编辑
大规模部署
 支持模板
 自动发现主机和监控项目
 通过Zabbix Proxy实现分布式部署
其他
 Zabbix API
 认证和访问控制
 IT资产收集

zabbix要想监视PostgreSQL这种应用型的对象,一般使用zabbix agent。zabbix agent有2种工作方式。
方式1:被动代理
由Zabbix Server(或Proxy)主动查询数据(如CPU负载),作为响应Zabbix Agent返回查询结果。这也是最简单最常用的方式。

方式2:主动代理
Zabbix Agent先从Zabbix Server获取需要主动报告的监控项目一览,然后定期发送新值到Zabbix Server。主动代理可以用于处理时间比较长的监控项,比如log 。

也可以使用Zabbix Trapper
方式3:Trapper
由Zabbix Agent主动报告数据。被监控端可调用zabbix_send命令或直接利用Socket发送数据到Zabbix Server。通过Trapper可以只在状态变更时进行报告。

另外还有把PostgreSQL状态通过SNMP代理发布的方案(http://pgsnmpd.projects.pgfoundry.org/),估计用的不多,本文不涉及。

详细参考:
https://www.zabbix.com/documentation/2.4/manual/concepts/agent
https://www.zabbix.com/documentation/2.4/manual/appendix/items/activepassive
https://www.zabbix.com/documentation/2.4/manual/config/items/itemtypes/trapper


Zabbix没有内置对PostgreSQL的监控项,所以如果要监控PostgreSQL需要做一些监控项的配置或定制,下面会介绍几种方法。

3.2 方法1:直接扩充UserParameter

自己修改zabbix_agentd.conf的配置文件,在Zabbix agent上增加PostgreSQL相关的监控项,使用psql发SQL的方式获取PostgreSQL的性能数据。


zabbix_agentd.conf

  1. #Get the PostgreSQL version
  2. UserParameter=psql.version,psql --version|head -n1
  3. #Get the total number of Server Processes that are active
  4. UserParameter=psql.server_processes,psql --"select sum(numbackends) from pg_stat_database"
  5. #Get the total number of commited transactions
  6. UserParameter=psql.tx_commited,psql --"select sum(xact_commit) from pg_stat_database"
  7. #Get the total number of rolled back transactions
  8. UserParameter=psql.tx_rolledback,psql --"select sum(xact_rollback) from pg_stat_database"

参照
https://www.zabbix.com/wiki/howto/monitor/db/postgresql

3.3 方法2:扩充UserParameter调用PostgreSQL监视脚本

和方法1类似,但方法1中定义的UserParameter太多,为方便起见,可以把监控PostgreSQL的SQL命令做成一个单独的脚本。

比如参照下面的例子
https://www.zabbix.com/forum/showthread.php?t=8009

zabbix_agentd.conf:

  1. UserParameter=postgresql[*],/opt/zabbix/bin/zapost $1 $2

zapost:

  1. #
  2. Name: zapost
  3. #
  4. # Checks PostgreSQL activity.
  5. #
  6. # Author: bashman
  7. #
  8. # Version: 1.0
  9. #

  10. zapostver="1.0"
  11. rval=0
  12. sql=""

  13. case $1 in

  14. #'summary')
  15. # sql="select a.datname, pg_size_pretty(pg_database_size(a.datid)) as size, cast(blks_hit/(blks_read+blks_hit+0.000001)*100.0 as numeric(5,2)) as cache, cast(xact_commit/(xact_rollback+xact_commit+0.000001)*100.0 as numeric(5,2)) as success from pg_stat_database a order by a.datname"
  16. ;;

  17. #'size')
  18.         #comprobar aqui los parametros
  19. # shift
  20. # sql="select pg_database_size('$1') as size"
  21. ;;

  22. #'version')
  23. # sql='select version()'
  24. ;;

  25. 'totalsize')
  26.         sql="select sum(pg_database_size(datid)) as total_size from pg_stat_database"
  27.         ;;

  28. 'db_cache')
  29.         # comprueba los parametros
  30.         if [ ! -z $2 ]; then
  31.         shift
  32.             sql="select cast(blks_hit/(blks_read+blks_hit+0.000001)*100.0 as numeric(5,2)) as cache from pg_stat_database where datname = '$1'"
  33.     fi
  34.         ;;

  35. 'db_success')
  36.         # comprueba los parametros
  37.     if [ ! -z $2 ]; then
  38.         shift
  39.                sql="select cast(xact_commit/(xact_rollback+xact_commit+0.000001)*100.0 as numeric(5,2)) as success from pg_stat_database where datname = '$1'"
  40.     fi
  41.     ;;

  42. 'server_processes')
  43.     sql="select sum(numbackends) from pg_stat_database"
  44.     ;;

  45. 'tx_commited')
  46.     sql="select sum(xact_commit) from pg_stat_database"
  47.     ;;

  48. 'tx_rolledback')
  49.     sql="select sum(xact_rollback) from pg_stat_database"
  50.     ;;

  51. 'db_size')
  52.     # comprueba los parametros
  53.         if [ ! -z $2 ]; then
  54.         shift
  55.         sql="select pg_database_size('$1')" #as size"
  56.     fi
  57.     ;;

  58. 'db_connections')
  59.         # comprueba los parametros
  60.         if [ ! -z $2 ]; then
  61.         shift
  62.             sql="select numbackends from pg_stat_database where datname = '$1'"
  63.     fi
  64.     ;;

  65. 'db_returned')
  66.     # comprueba los parametros
  67.         if [ ! -z $2 ]; then
  68.         shift
  69.         sql="select tup_returned from pg_stat_database where datname = '$1'"
  70.     fi
  71.     ;;

  72. 'db_fetched')
  73.         # comprueba los parametros
  74.         if [ ! -z $2 ]; then
  75.         shift
  76.             sql="select tup_fetched from pg_stat_database where datname = '$1'"
  77.     fi
  78.     ;;

  79. 'db_inserted')
  80.     # comprueba los parametros
  81.         if [ ! -z $2 ]; then
  82.             shift
  83.             sql="select tup_inserted from pg_stat_database where datname = '$1'"
  84.     fi
  85.         ;;

  86. 'db_updated')
  87.     # comprueba los parametros
  88.         if [ ! -z $2 ]; then
  89.             shift
  90.             sql="select tup_updated from pg_stat_database where datname = '$1'"
  91.     fi
  92.         ;;

  93. 'db_deleted')
  94.     # comprueba los parametros
  95.         if [ ! -z $2 ]; then
  96.             shift
  97.             sql="select tup_deleted from pg_stat_database where datname = '$1'"
  98.     fi
  99.         ;;

  100. 'db_commited')
  101.     # comprueba los parametros
  102.         if [ ! -z $2 ]; then
  103.             shift
  104.         sql="select xact_commit from pg_stat_database where datname = '$1'"
  105.     fi
  106.     ;;

  107. 'db_rolled')
  108.     # comprueba los parametros
  109.     if [ ! -z $2 ]; then
  110.             shift
  111.         sql="select xact_rollback from pg_stat_database where datname = '$1'"
  112.     fi
  113.     ;;

  114. 'version')
  115.     sql="version"
  116.     ;;

  117. 'zapostver')
  118.         echo "$zapostver"
  119.     exit $rval
  120.         ;;

  121. *)
  122.         echo "zapost version: $zapostver"
  123.         echo "usage:"
  124.     echo " $0 totalsize -- Check the total databases size."
  125.     echo " $0 db_cache <dbname> -- Check the database cache hit ratio (percentage)."
  126.     echo " $0 db_success <dbname> -- Check the database success rate (percentage)."
  127.     echo " $0 server_processes -- Check the total number of Server Processes that are active."
  128.     echo " $0 tx_commited -- Check the total number of commited transactions."
  129.     echo " $0 tx_rolledback -- Check the total number of rolled back transactions."
  130.     echo " $0 db_size <dbname> -- Check the size of a Database (in bytes)."
  131.     echo " $0 db_connections <dbname> -- Check the number of active connections for a specified database."    
  132.     echo " $0 db_returned <dbname> -- Check the number of tuples returned for a specified database."
  133.     echo " $0 db_fetched <dbname> -- Check the number of tuples fetched for a specified database."
  134.     echo " $0 db_inserted <dbname> -- Check the number of tuples inserted for a specified database."
  135.     echo " $0 db_updated <dbname> -- Check the number of tuples updated for a specified database."
  136.     echo " $0 db_deleted <dbname> -- Check the number of tuples deleted for a specified database."
  137.     echo " $0 db_commited <dbname> -- Check the number of commited back transactions for a specified database."
  138.     echo " $0 db_rolled <dbname> -- Check the number of rolled back transactions for a specified database."
  139.     echo " $0 version -- The PostgreSQL version."
  140.     echo " $0 zapostver -- Version of this script."
  141.         exit $rval
  142.         ;;
  143. esac

  144. if [ "$sql" != "" ]; then
  145.     if [ "$sql" == "version" ]; then
  146.         psql --version|head -n1
  147.         rval=$?
  148.     else
  149.         psql -t -c "$sql"
  150.         rval=$?
  151.     fi
  152. fi

  153. if [ "$rval" -ne 0 ]; then
  154.       echo "ZBX_NOTSUPPORTED

3.4 方法3:使用Postbix或DBforBIX插件

和前面提到的2种方法相比,Postbix插件的功能更全面,它包含了PostgreSQL相关的监控项和图形的Zabbix模板。Postbix以一个的后台java deamon运行,这个deamon通过jdbc查询远端被监控数据库的状态然后以trap agent的方式发送到Zabbix Server。
Postbix来自http://www.smartmarmot.com/,除了Postbix该公司还有Orabbix,MySQLBix,这些工具的内部架构和使用方法基本相同,唯一的区别就是支持的被监控数据库不同。所以smartmarmot又推出了整合这几种数据库监控能力的DBforBIX。DBforBIX的内部结构和使用方法和Postbix基本相同,下面介绍一下DBforBIX的简单的使用例子。

1)下载dbforbix
http://www.smartmarmot.com/product/dbforbix/dbforbix-download/

2)在Zabbix Server上安装dbforbix

  1. [root@zabbix ~]# mkdir /opt/dbforbix
  2. [root@zabbix ~]# cd /opt/dbforbix
  3. [root@zabbix dbforbix]# unzip /root/dbforbix-0.6.1.zip
  4. [root@zabbix dbforbix]# cp /opt/dbforbix/init.d/dbforbix /etc/init.d/dbforbix
  5. [root@zabbix dbforbix]# chmod +x /etc/init.d/dbforbix
  6. [root@zabbix dbforbix]# chmod +x /opt/dbforbix/run.sh
  7. [root@zabbix dbforbix]# chkconfig dbforbix on

3)导入dbforix的模板到Zabbix服务器
点击Zabbix GUI画面的"Configuration->Templates->Import"把下面的模板文件导入Zabbix服务器。
/opt/dbforbix/template/template_postgresql.xml
PostgreSQL的监控三(zabbix)_第2张图片

4)在被监控PostgreSQL实例上创建DBforBIX使用的账号并赋予权限

  1.   CREATE USER zabbix WITH PASSWORD 'passw0rd';
  2.   GRANT SELECT ON pg_stat_activity to zabbix;
  3.   GRANT SELECT ON pg_stat_activity to zabbix;
  4.   GRANT SELECT ON pg_database to zabbix;
  5.   GRANT SELECT ON pg_authid to zabbix;
  6.   GRANT SELECT ON pg_stat_bgwriter to zabbix;
  7.   GRANT SELECT ON pg_locks to zabbix;
  8.   GRANT SELECT ON pg_stat_database to zabbix
5)修改config.props
通过拷贝config.props.sample生成config.props, 然后修改config.props设置Zabbix Server的IP和端口号,设置被监控PostgreSQL数据库的访问账号

  1. [root@zabbix dbforbix]# cp /opt/dbforbix/conf/config.props.sample /opt/dbforbix/conf/config.props
  2. [root@zabbix dbforbix]# vi /opt/dbforbix/conf/config.props
  3. ZabbixServerList=ZabbixServer
  4. ZabbixServer.Address=IP_ADDRESS_OF_ZABBIX_SERVER
  5. ZabbixServer.Port=PORT_OF_ZABBIX_SERVER
  6. ...
  7. DBforBIX.PidFile=./logs/dbforbix.pid
  8. ...
  9. DatabaseList=PGSQLDB2
  10. PGSQLDB2.Url=jdbc:postgresql://host:port/database
  11. PGSQLDB2.User=zabbix
  12. PGSQLDB2.Password=passw0rd

注意:/opt/dbforbix/init.d/dbforbix有个Bug,本来应该从config.props中读取pid文件名的,结果写死了是dbforbix.pid,config.props.sample中的默认值又是orabix.pid。

6)修改pgsqlquery.props
通过拷贝pgsqlquery.props.sample生成pgsqlquery.props。pgsqlquery.props中定义了监控项目及对应的查询SQL,可以编辑QueryList对监控项做筛选。

  1. [root@zabbix dbforbix]# cp /opt/dbforbix/conf/pgsqlquery.props.sample /opt/dbforbix/conf/pgsqlquery.props

看一下 pgsqlquery.props.sample包含的内容
pgsqlquery.props.sample:
  1. QueryList=activeconn,tupfetched,tupinserted,tupupdated,tupdeleted,xactcommit,xactrollback,exclusivelock,accessexclusivelock,accesssharelock,rowsharelock,rowexclusivelock,shareupdateexclusivelock,sharerowexclusivelock,checkpoints_timed,checkpoints_req,buffers_checkpoint,buffers_clean,maxwritten_clean,buffers_backend,buffers_alloc    

  2. #statistic of database
  3. activeconn.Query=select sum(numbackends) from pg_stat_database
  4. tupreturned.Query=select sum(tup_returned) from pg_stat_database
  5. tupfetched.Query=select sum(tup_fetched) from pg_stat_database
  6. tupinserted.Query=select sum(tup_inserted) from pg_stat_database
  7. tupupdated.Query=select sum(tup_updated) from pg_stat_database
  8. tupdeleted.Query=select sum(tup_deleted) from pg_stat_database
  9. xactcommit.Query=SELECT sum(xact_commit) FROM pg_stat_database
  10. xactrollback.Query=SELECT sum(xact_rollback) FROM pg_stat_database

  11. #locks
  12. exclusivelock.Query=SELECT count(*) FROM pg_locks where mode='ExclusiveLock'
  13. accessexclusivelock.Query=SELECT count(*) FROM pg_locks where mode='AccessExclusiveLock'
  14. accesssharelock.Query=SELECT count(*) FROM pg_locks where mode='AccessShareLock'
  15. rowsharelock.Query=SELECT count(*) FROM pg_locks where mode='RowShareLock'
  16. rowexclusivelock.Query=SELECT count(*) FROM pg_locks where mode='RowExclusiveLock'
  17. shareupdateexclusivelock.Query=SELECT count(*) FROM pg_locks where mode='ShareUpdateExclusiveLock'
  18. sharerowexclusivelock.Query=SELECT count(*) FROM pg_locks where mode='ShareRowExclusiveLock'

  19. checkpoints_timed.Query=select checkpoints_timed from pg_stat_bgwriter
  20. checkpoints_req.Query=select checkpoints_req from pg_stat_bgwriter
  21. buffers_checkpoint.Query=select buffers_checkpoint from pg_stat_bgwriter
  22. buffers_clean.Query=select buffers_clean from pg_stat_bgwriter
  23. maxwritten_clean.Query=select maxwritten_clean from pg_stat_bgwriter
  24. buffers_backend.Query=select buffers_backend from pg_stat_bgwriter
  25. buffers_alloc.Query=select buffers_alloc from pg_stat_bgwriter

7)启动dbforbix deamon

  1. [root@zabbix dbforbix]# /etc/init.d/dbforbix start

8) 在Zabbix Server上创建Host
点击Zabbix GUI画面的"Configuration->Hosts->Create Host"为被监控数据库创建一个专门的Host。"Host name"设置为“PGSQLDB2”( 必须和config.props的DatabaseList中的名称一致,这里是“PGSQLDB2”)。并且把Host“PGSQLDB2”链接到前面导入的模板"Template_PostgeSQL"。
PostgreSQL的监控三(zabbix)_第3张图片

PostgreSQL的监控三(zabbix)_第4张图片

9) 检查数据是否已被收集
点击Zabbix GUI画面的"Monitoring->Last data"检查数据是否已被收集。
PostgreSQL的监控三(zabbix)_第5张图片

参考
http://www.smartmarmot.com/wiki/index.php/DBforBIX

3.5 方法4:使用pg_monz模板

pg_monz是一套可以监控PostgreSQL的zabbix模板,通过定制的agent U serParameter监控PostgreSQL数据库 ,并且利用Zabbix的发现机制可以自动发现和监视数据库和表。
pg_monz由下面几个文件组成
文件 说明
pg_monz_template.xml
模版定义文件
userparameter_pgsql.conf
提供PostgreSQL监控项目的用户参数定义
find_dbname.sh
PG数据库的自动发现脚本
find_dbname_table.sh
PG数据表的自动发现脚本



  

  

要了解pg_monz支持哪些监控项目,看一下 userparameter_pgsql.conf 就可以了
userparameter_pgsql.conf:

点击(此处)折叠或打开

  1. # PostgreSQL user parameter

  2. #
  3. # Server specific examples
  4. #
  5. # Get the total number of commited transactions
  6. UserParameter=psql.tx_commited[*],psql -h $1 -p $2 -U $3 -d $4 --"select sum(xact_commit) from pg_stat_database"
  7. # Get the total number of rolled back transactions
  8. UserParameter=psql.tx_rolledback[*],psql -h $1 -p $2 -U $3 -d $4 --"select sum(xact_rollback) from pg_stat_database"
  9. # Max Connections
  10. UserParameter=psql.server_maxcon[*],psql -h $1 -p $2 -U $3 -d $4 --"show max_connections"
  11. # PostgreSQL is running
  12. UserParameter=psql.running[*],psql -h $1 -p $2 -U $3 -d $4 --"select 1" > /dev/null 2>&; echo $?

  13. # Added by SRA OSS
  14. # Get number of checkpoint count (by checkpoint_timeout)
  15. UserParameter=psql.checkpoints_timed[*],psql -h $1 -p $2 -U $3 -d $4 --"select checkpoints_timed from pg_stat_bgwriter"
  16. # Get number of checkpoint count (by checkpoint_segments)
  17. UserParameter=psql.checkpoints_req[*],psql -h $1 -p $2 -U $3 -d $4 --"select checkpoints_req from pg_stat_bgwriter"
  18. # Get the total number of connections
  19. UserParameter=psql.server_connections[*],psql -h $1 -p $2 -U $3 -d $4 --"select count(*) from pg_stat_activity;"
  20. # Get the total number of active (on processing SQL) connections
  21. UserParameter=psql.active_connections[*],psql -h $1 -p $2 -U $3 -d $4 --"select count(state) from pg_stat_activity where state = 'active'"
  22. # Get the total number of idle connections
  23. UserParameter=psql.idle_connections[*],psql -h $1 -p $2 -U $3 -d $4 --"select count(state) from pg_stat_activity where state = 'idle'"
  24. # Get the total number of idle in transaction connections
  25. UserParameter=psql.idle_tx_connections[*],psql -h $1 -p $2 -U $3 -d $4 --"select count(state) from pg_stat_activity where state = 'idle in transaction'"
  26. # Get the total number of lock-waiting connections
  27. UserParameter=psql.locks_waiting[*],psql -h $1 -p $2 -U $3 -d $4 --"select count(*) from pg_stat_activity where waiting = 't'"

  28. # Get buffer information
  29. UserParameter=psql.buffers_checkpoint[*],psql -h $1 -p $2 -U $3 -d $4 --"select buffers_checkpoint from pg_stat_bgwriter"
  30. UserParameter=psql.buffers_clean[*],psql -h $1 -p $2 -U $3 -d $4 --"select buffers_clean from pg_stat_bgwriter"
  31. UserParameter=psql.maxwritten_clean[*],psql -h $1 -p $2 -U $3 -d $4 --"select maxwritten_clean from pg_stat_bgwriter"
  32. UserParameter=psql.buffers_backend[*],psql -h $1 -p $2 -U $3 -d $4 --"select buffers_backend from pg_stat_bgwriter"
  33. UserParameter=psql.buffers_backend_fsync[*],psql -h $1 -p $2 -U $3 -d $4 --"select buffers_backend_fsync from pg_stat_bgwriter"
  34. UserParameter=psql.buffers_alloc[*],psql -h $1 -p $2 -U $3 -d $4 --"select buffers_alloc from pg_stat_bgwriter"

  35. # Get number of slow queries
  36. UserParameter=psql.slow_queries[*],psql -h $1 -p $2 -U $3 -d $4 --"select count(*) from pg_stat_activity where state = 'active' and now() - query_start > '$5 sec'::interval"
  37. UserParameter=psql.slow_select_queries[*],psql -h $1 -p $2 -U $3 -d $4 --"select count(*) from pg_stat_activity where state = 'active' and now() - query_start > '$5 sec'::interval and query ilike 'select%'"
  38. UserParameter=psql.slow_dml_queries[*],psql -h $1 -p $2 -U $3 -d $4 --"select count(*) from pg_stat_activity where state = 'active' and now() - query_start > '$5 sec'::interval and query ~* '^(insert|update|delete)'"

  39. #
  40. # Database specific examples
  41. #
  42. # Get the size of a Database (in bytes)
  43. UserParameter=psql.db_size[*],psql -h $1 -p $2 -U $3 -d $4 --"select pg_database_size('$5')"
  44. # Get number of active connections for a specified database
  45. UserParameter=psql.db_connections[*],psql -h $1 -p $2 -U $3 -d $4 --"select numbackends from pg_stat_database where datname = '$5'"
  46. # Get number of tuples returned for a specified database
  47. UserParameter=psql.db_returned[*],psql -h $1 -p $2 -U $3 -d $4 --"select tup_returned from pg_stat_database where datname = '$5'"
  48. # Get number of tuples fetched for a specified database
  49. UserParameter=psql.db_fetched[*],psql -h $1 -p $2 -U $3 -d $4 --"select tup_fetched from pg_stat_database where datname = '$5'"
  50. # Get number of tuples inserted for a specified database
  51. UserParameter=psql.db_inserted[*],psql -h $1 -p $2 -U $3 -d $4 --"select tup_inserted from pg_stat_database where datname = '$5'"
  52. # Get number of tuples updated for a specified database
  53. UserParameter=psql.db_updated[*],psql -h $1 -p $2 -U $3 -d $4 --"select tup_updated from pg_stat_database where datname = '$5'"
  54. # Get number of tuples deleted for a specified database
  55. UserParameter=psql.db_deleted[*],psql -h $1 -p $2 -U $3 -d $4 --"select tup_deleted from pg_stat_database where datname = '$5'"
  56. # Get number of commited/rolled back transactions for a specified database
  57. UserParameter=psql.db_tx_commited[*],psql -h $1 -p $2 -U $3 -d $4 --"select xact_commit from pg_stat_database where datname = '$5'"
  58. UserParameter=psql.db_tx_rolledback[*],psql -h $1 -p $2 -U $3 -d $4 --"select xact_rollback from pg_stat_database where datname = '$5'"

  59. # Cache Hit Ratio
  60. UserParameter=psql.cachehit_ratio[*],psql -h $1 -p $2 -U $3 -d $4 --"SELECT round(blks_hit*100/(blks_hit+blks_read), 2) AS cache_hit_ratio FROM pg_stat_database WHERE datname = '$5' and blks_read > 0 union all select 0.00 AS cache_hit_ratio order by cache_hit_ratio desc limit 1"

  61. # Added by SRA OSS
  62. # Get number of temp files
  63. UserParameter=psql.db_temp_files[*],psql -h $1 -p $2 -U $3 -d $4 --"select temp_files from pg_stat_database where datname = '$5'"
  64. # Get temp file size (in bytes)
  65. UserParameter=psql.db_temp_bytes[*],psql -h $1 -p $2 -U $3 -d $4 --"select temp_bytes from pg_stat_database where datname = '$5'"
  66. # Get percentage of dead tuples of all tables for a specified database
  67. UserParameter=psql.db_dead_tup_ratio[*],psql -h $1 -p $2 -U $3 -d $4 --"select round(sum(n_dead_tup)*100/sum(n_live_tup+n_dead_tup), 2) as dead_tup_ratio from pg_stat_all_tables where n_live_tup > 0"
  68. # Get number of deadlocks for a specified database (9.or later)
  69. UserParameter=psql.db_deadlocks[*],psql -h $1 -p $2 -U $3 -d $4 --"select deadlocks from pg_stat_database where datname = '$5'"


  70. #
  71. # Table specific examples
  72. #
  73. # Get table cache hit ratio of a specific table
  74. UserParameter=psql.table_cachehit_ratio[*],psql -h $1 -p $2 -U $3 -d $4 --"select round(heap_blks_hit*100/(heap_blks_hit+heap_blks_read), 2) as cache_hit_ratio from pg_statio_user_tables where schemaname = '$5' and relname = '$6' and heap_blks_read > 0 union all select 0.00 as cache_hit_ratio order by cache_hit_ratio desc limit 1"
  75. # Get number of sequencial scan of a specific table
  76. UserParameter=psql.table_seq_scan[*],psql -h $1 -p $2 -U $3 -d $4 --"select seq_scan from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
  77. # Get number of index scan of a specific table
  78. UserParameter=psql.table_idx_scan[*],psql -h $1 -p $2 -U $3 -d $4 --"select coalesce(idx_scan,0) from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
  79. # Get number of vacuum count of a specific table
  80. UserParameter=psql.table_vacuum_count[*],psql -h $1 -p $2 -U $3 -d $4 --"select vacuum_count from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
  81. # Get number of analyze count of a specific table
  82. UserParameter=psql.table_analyze_count[*],psql -h $1 -p $2 -U $3 -d $4 --"select analyze_count from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
  83. # Get number of autovacuum count of a specific table
  84. UserParameter=psql.table_autovacuum_count[*],psql -h $1 -p $2 -U $3 -d $4 --"select autovacuum_count from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
  85. # Get number of autoanalyze count of a specific table
  86. UserParameter=psql.table_autoanalyze_count[*],psql -h $1 -p $2 -U $3 -d $4 --"select autoanalyze_count from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

  87. # Get number of tuples of a specific table
  88. UserParameter=psql.table_n_tup_ins[*],psql -h $1 -p $2 -U $3 -d $4 --"select n_tup_ins from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
  89. UserParameter=psql.table_n_tup_upd[*],psql -h $1 -p $2 -U $3 -d $4 --"select n_tup_upd from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
  90. UserParameter=psql.table_n_tup_del[*],psql -h $1 -p $2 -U $3 -d $4 --"select n_tup_del from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
  91. UserParameter=psql.table_seq_tup_read[*],psql -h $1 -p $2 -U $3 -d $4 --"select seq_tup_read from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
  92. UserParameter=psql.table_idx_tup_fetch[*],psql -h $1 -p $2 -U $3 -d $4 --"select coalesce(idx_tup_fetch,0) from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
  93. UserParameter=psql.table_n_tup_hot_upd[*],psql -h $1 -p $2 -U $3 -d $4 --"select n_tup_hot_upd from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
  94. UserParameter=psql.table_n_live_tup[*],psql -h $1 -p $2 -U $3 -d $4 --"select n_live_tup from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
  95. UserParameter=psql.table_n_dead_tup[*],psql -h $1 -p $2 -U $3 -d $4 --"select n_dead_tup from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"


  96. #
  97. # Discovery Rule
  98. #
  99. # Database Discovery
  100. UserParameter=db.list.discovery[*],$5/find_dbname.sh $1 $2 $3 $4
  101. UserParameter=db_table.list.discovery[*],$5/find_dbname_table.sh $1 $2 $3 $4



使用例:
1)下载pg_monz
https://github.com/pg-monz/pg_monz/releases

2)安装pg_monz

  1. [root@zabbix ~]# tar xfz pg_monz-1.0.tar.gz
  2. [root@zabbix ~]# cd pg_monz-1.0/pg_monz
  3. [root@zabbix pg_monz]# cp find_dbname.sh find_dbname_table.sh /usr/local/bin/
  4. [root@zabbix pg_monz]# cp userparameter_pgsql.conf /etc/zabbix/zabbix_agentd.d/
  5. [root@zabbix pg_monz]# chmod +x /usr/local/bin/find_dbname.sh
  6. [root@zabbix pg_monz]# chmod +x /usr/local/bin/find_dbname_table.sh
  7. [root@zabbix pg_monz]# /etc/init.d/zabbix-agent restart

3)导入模板到Zabbix服务器
点击Zabbix GUI画面的"Configuration->Templates->Import"把模板文件pg_monz_template.xml导入Zabbix服务器。
PostgreSQL的监控三(zabbix)_第6张图片
4)设置模板中的宏
点击Zabbix GUI画面的"Configuration->Templates",再点击其中的"PostgreSQL Check"模板,然后点击"Marcos" Tab设置必要宏参数(尤其是连接相关的参数)。
PostgreSQL的监控三(zabbix)_第7张图片

5)在Zabbix Server上创建Host
点击Zabbix GUI画面的"Configuration->Hosts->Create Host"为被监控数据库所在主机创建一个Host,如果该主机的Host已存在也可使用已有Host。这个Host要设置Zabbix Agent,并且把该Host链接到前面导入的模板"PostgeSQL Check"。
PostgreSQL的监控三(zabbix)_第8张图片

PostgreSQL的监控三(zabbix)_第9张图片


6) 检查数据是否已被收集
点击Zabbix GUI画面的"Monitoring->Last data"检查数据是否已被收集。
PostgreSQL的监控三(zabbix)_第10张图片


参考
http://pg-monz.github.io/pg_monz/index-en.html

3.6 小结

以上的方法1和方法2都需要自己再进行定制,而DBforBIXpg_monz已经比较成熟了。pg_monzDBforBIX相比更简单,可监控的PostgreSQL项目也更多,还可以自动发现库和表;DBforBIX的优势则在于支持监控多种常用的数据库以及可以使用jdbc连接池。综合而言如果不需要监控多种数据库个人倾向于pg_monz

转载: http://blog.chinaunix.net/uid-20726500-id-4513716.html

你可能感兴趣的:(数据库-PostgreSQL)