啃别人的骨头
https://www.xiaomastack.com/2014/07/03/zabbixdbtable/
http://www.furion.info/623.html?utm_source=tuicool&utm_medium=referral
1.zabbix表关系
2.hosts
hostid,host(包括模板)
hostid:每台主机唯一id标识;host:主机名;status:主机状态标识(为0则是你要找的主机,3表示的貌似是模板主机)
3.groups
groupid,name
4.hosts_groups
hosts,groups
5.items
hostid,itemid
items表则记录着没台主机所有图形项目(比如一个网卡图,有流入和流出两根线,则这张图就有两个item)。字段说明,itemid是每个绘图项目唯一标识,hostid每个主机的标识,name每个item的名字,delay数据采集间隔,history历史数据保存时间,status标识item的状态(0表示正常显示的item),units保存item的单位
6.graphs_items
gitemid,graphid,itemid
关联着每幅图的item和图形id,简单的说就是告诉我们每台主机有多少幅图及每幅图上有哪些item
7.graphs
graphid
8.history,history_text,history_uint
itemid
9.trends,trends_uint
itemid
分割线
-------------------------------------------------
10.actions
actionid,
actions表记录了当触发器触发时,需要采用的动作
11.alerts
alerts 表保存了历史的告警事件,可以从这个表里面去做一些统计分析,例如某个部门、 某人、某类时间的告警统计,以及更深入的故障发生、恢复时间,看你想怎么用了。
alertid,actionid,eventid,userid
12.functions
itemid,triggerid
function 表时非常重要的一个表了,记录了trigger中使用的表达式,例如max、last、nodata等函数。
13.trigger_discovery
triggerid,parent_triggerid
案例
zabbix如何批量更新错误图表 2014/06
http://www.furion.info/654.html
graphs_sql = " select graphid , name from graphs where name like '%端口队列发包量' "
#获取graphid
items_sql = " select i.hostid ,g.gitemid,i.itemid ,i.description from items as i left join graphs_items g on i.itemid = g.itemid where g.graphid= %s" % graphid
#获取hostid,itemid
new_itemid_sql = "select hostid ,itemid,description from items where hostid=%d and description='%s'" %(hostid, description_new)
#获取正确的itemid
sql_update = "update graphs_items set itemid=%d where graphid = %d and gitemid=%d " %(itemid_new,graphid, gitemid)
#利用更新错的数据,更新正确的itemid
四个sql语句搞定
2.zabbix批量更新主机关联的模板
http://www.furion.info/703.html
get_host_template
sql = 'select ht.templateid from hosts_templates ht, hosts h where h.hostid = ht.hostid and h.hostid=%s' %(hostid)
或者template.get
涉及两个表 hosts_templates和hosts
3.19vs30,其实items都会有,这样的查询是错误的。
select t1.* from hosts_templates t1 where t1.hostid in(
select ii.hostid from items ii where ii.`name` like '%GC%'
and ii.hostid IN (select htt.hostid from hosts_templates htt where htt.templateid='10143') GROUP BY ii.hostid
) GROUP BY t1.hostid
比如你想查找出那些有应用模板但是没有items的主机,那些是有故障的
找出hostid
select htt.hostid from hosts_templates htt where htt.templateid='10143' and htt.hostid not in (
select ii.hostid from items ii where ii.`name` like '%tomcat%'
and ii.hostid IN (select htt.hostid from hosts_templates htt where htt.templateid='10143') GROUP BY ii.hostid
) group by htt.hostid
2.然后与hosts表连接
select * from hosts kk where kk.hostid in (
select htt.hostid from hosts_templates htt where htt.templateid='10143' and htt.hostid not in (
select ii.hostid from items ii where ii.`name` like '%tomcat%'
and ii.hostid IN (select htt.hostid from hosts_templates htt where htt.templateid='10143') GROUP BY ii.hostid
) group by htt.hostid
)
4.查询不支持的项目select status,itemid,hostid,name,key_ from items where status=1;
总之,zabbix表关系设计的很好。多余的字段也会显示是否故障,比如类似status字段。
问题
1.自动发现规则的表是哪张
2.到主机,可以定位哪些主机的值更新存在异常(比unreachable的报警更加准确)
更新
zabbix 表结构 hosts->hostid->templateid hosts表 hostid host 1v1关系 hosts_templates hostid templateid 多v1关系 alerts表是记录已经发送邮件的记录 所以界面上是从哪个表取出来的。 events SELECT * FROM information_schema.`KEY_COLUMN_USAGE` WHERE referenced_table_name='events' 因为zabbix有很多外键,所以说这个设计很不错。尤其要注意子表与主表的关联,比如events 所以分析zabbix表结构,首先从外键看起,谁的外键多,谁就是主表,那么就是源头了。 events acknowledged select * from events where eventid='4516146'; eventid source object objectid clock value acknowledged ns 4516146 0 0 13791 1470815425 1 1 125453205 5 3 0 13477 1465962284 1 0 97450859 hosts.hostid->hosts_groups.hostid hosts_groups.groupid->groups.groupid hosts.hostid->items.hostid items.itemid->functions.itemid functions.triggerid->triggers.triggerid triggers.triggerid->events.objectid functionid itemid triggerid function parameter 10199 10019 10016 diff 0 {functionid}>100 意味着itemid(10019)的values>100 意思就是items的值关联function方式,然后triggers是否为1还是0(正常) sql语法1: select ht.templateid from hosts_templates ht, hosts h where h.hostid = ht.hostid -- and h.hostid = '10084'; 解析 hosts的hostid去匹配hosts_templates,所以记录总数为hosts_templates,有重复. sql语法2: 找出那些有应用模板但是没有items的主机,那些是有故障的 select * from hosts where hostid in ( SELECT htt.hostid FROM hosts_templates htt WHERE htt.templateid = '10143' AND htt.hostid NOT IN ( SELECT ii.hostid FROM items ii WHERE ii.`name` LIKE '%tomcat%' AND ii.hostid IN ( SELECT htt.hostid FROM hosts_templates htt WHERE htt.templateid = '10143' ) GROUP BY ii.hostid ) GROUP BY htt.hostid ) sql语法3:查看items中停用的项目 SELECT b. HOST, a.itemid, a.hostid, a. NAME, a.key_ FROM items a, HOSTS b WHERE b.hostid = a.hostid AND a. STATUS = 1 sql语法4: create table tmp1 as (SELECT `hosts`.`host`, `triggers`.triggerid, `triggers`.description, `triggers`.priority, `events`.`value`, FROM_UNIXTIME(`events`.clock) time FROM `hosts`, `triggers`, `events`, `items`, `functions`, `groups`, `hosts_groups` WHERE `hosts`.hostid = `hosts_groups`.hostid AND `hosts_groups`.groupid = `groups`.groupid AND `triggers`.triggerid = `events`.objectid AND `hosts`.hostid = `items`.hostid AND `items`.itemid = `functions`.itemid AND `functions`.triggerid = `triggers`.triggerid); --告警数据 alter表 select FROM_UNIXTIME(clock),sendto,`subject` from alerts -- where `subject` like '%磁盘%' and where DATE_FORMAT(FROM_UNIXTIME(clock),'%Y-%m-%d') = DATE_FORMAT(NOW(),'%Y-%m-%d'); select FROM_UNIXTIME(clock),hh.* from alerts hh where status != 1 and TO_DAYS(NOW()) - TO_DAYS(FROM_UNIXTIME(clock)) < 365; select FROM_UNIXTIME(clock),hh.* from alerts hh where status != 1 and DATE_FORMAT(NOW(),'%Y-%m-%d') - DATE_FORMAT(FROM_UNIXTIME(clock),'%Y-%m-%d') < 30; events select FROM_UNIXTIME(h.clock),h.subject,FROM_UNIXTIME(f.clock),f.* from events f, (select clock,`subject`,eventid from alerts where DATE_FORMAT(FROM_UNIXTIME(clock),'%Y-%m-%d') = DATE_FORMAT(NOW(),'%Y-%m-%d') ) as h where f.eventid = h.eventid (32723 select * from functions where itemid = '32723' 15497 select * from `triggers` where triggerid = '15497'; select * from `events` select 1504618584 select from_unixtime(1504618584,'%Y%m%d %H:%i:%S')) 得到的时间为20170905 21:36:24 select from_unixtime(tt.clock,'%Y%m%d %H:%i:%S'),tt.* from events tt where eventid in ( select * from alerts order by from_unixtime(clock,'%Y%m%d %H:%i:%S') desc) order by eventid desc limit 1,5000 hosts hostid 是ip和模板 select i.itemid,h.host from items i,hosts h where i.hostid=h.hostid and h.host='xxxx' and i.name in ('regionserver writeRequestsCount','regionserver requests'); 详细步骤 select * from hosts where host='192.1.1.206'; hostid select * from items where hostid='10084'; select * from items where hostid='10084' and name like '%war%'; itemid 总体 hosts,items 联合查询itemid,然后通过itemid到history_unit查询更新记录 报警trigger条件,记录到triggers表中,然后符合条件的过滤到alerts,就会发送出去 主屏幕最近20个问题,表示的是获取到的值一直没有改变。去striggers查询 如果一直没发邮件出去,那么就会挂在web界面上,然后即使你action了,也不会去从数据库执行那套操作。就会一直挂着,除非你改变状态。 问题 1.为什么在主屏幕上显示,而不是发送邮件。 triggers表,alters表没有数据 CREATE TABLE `triggers` ( `triggerid` bigint(20) unsigned NOT NULL, `expression` varchar(2048) NOT NULL DEFAULT '', `description` varchar(255) NOT NULL DEFAULT '', `url` varchar(255) NOT NULL DEFAULT '', `status` int(11) NOT NULL DEFAULT '0', `value` int(11) NOT NULL DEFAULT '0', `priority` int(11) NOT NULL DEFAULT '0', `lastchange` int(11) NOT NULL DEFAULT '0', `comments` text NOT NULL, `error` varchar(128) NOT NULL DEFAULT '', `templateid` bigint(20) unsigned DEFAULT NULL, `type` int(11) NOT NULL DEFAULT '0', `state` int(11) NOT NULL DEFAULT '0', `flags` int(11) NOT NULL DEFAULT '0', PRIMARY KEY (`triggerid`), KEY `triggers_1` (`status`), KEY `triggers_2` (`value`,`lastchange`), KEY `triggers_3` (`templateid`), CONSTRAINT `c_triggers_1` FOREIGN KEY (`templateid`) REFERENCES `triggers` (`triggerid`) ON DELETE CASCADE ) ENGINE=InnoDB DEFAULT CHARSET=utf8;