This guide assumes that you know how to use RRDtool and that you are somewhat familiar with OpenNMS. All the examples presented here were tested on Red Hat Linux 7.3 with version 1.0 of OpenNMS and Net-SNMP 4.2.3 agents running on the managed stations.
I tried to be as accurate as possible, but read this material at your own risk.
Author:
You will need to define the OIDs you want to capture in the file datacollection-config.xml. In our example we want to capture the following Net-SNMP OIDs (memory, system statistics and interface traffic):
<!-- Net-SNMP memory stats, [email protected] -->
<group name="ucd-memory" ifType="ignore">
  <!-- Total Swap Size configured for the host. -->
  <mibObj oid=".1.3.6.1.4.1.2021.4.3" instance="0" alias="memTotalSwap" type="integer" />
  <!-- Available Swap Space on the host. -->
  <mibObj oid=".1.3.6.1.4.1.2021.4.4" instance="0" alias="memAvailSwap" type="integer" />
  <!-- Total Real/Physical Memory Size on the host. -->
  <mibObj oid=".1.3.6.1.4.1.2021.4.5" instance="0" alias="memTotalReal" type="integer" />
  <!-- Available Real/Physical Memory Space on the host. -->
  <mibObj oid=".1.3.6.1.4.1.2021.4.6" instance="0" alias="memAvailReal" type="integer" />
  <!-- Error flag. 1 indicates very little swap space left. -->
  <mibObj oid=".1.3.6.1.4.1.2021.4.100" instance="0" alias="memSwapError" type="integer" />
</group>

<!-- Net-SNMP system stats, [email protected] -->
<group name="ucd-systemStats" ifType="ignore">
  <!-- Amount of memory swapped to disk (kB/s). -->
  <mibObj oid=".1.3.6.1.4.1.2021.11.4" instance="0" alias="ssSwapOut" type="integer" />
  <!-- Percentage of user CPU time. -->
  <mibObj oid=".1.3.6.1.4.1.2021.11.9" instance="0" alias="ssCpuUser" type="integer" />
  <!-- Percentage of system CPU time. -->
  <mibObj oid=".1.3.6.1.4.1.2021.11.10" instance="0" alias="ssCpuSystem" type="integer" />
  <!-- Percentage of idle CPU time. -->
  <mibObj oid=".1.3.6.1.4.1.2021.11.11" instance="0" alias="ssCpuIdle" type="integer" />
</group>

<group name="mib2-interfaces-net-snmp" ifType="all">
  <mibObj oid=".1.3.6.1.2.1.2.2.1.10" instance="ifIndex" alias="ifInOctets" type="counter" />
  <mibObj oid=".1.3.6.1.2.1.2.2.1.11" instance="ifIndex" alias="ifInUcastPkts" type="counter" />
  <mibObj oid=".1.3.6.1.2.1.2.2.1.12" instance="ifIndex" alias="ifInNUcastPkts" type="counter" />
  <mibObj oid=".1.3.6.1.2.1.2.2.1.14" instance="ifIndex" alias="ifInErrors" type="counter" />
  <mibObj oid=".1.3.6.1.2.1.2.2.1.16" instance="ifIndex" alias="ifOutOctets" type="counter" />
  <mibObj oid=".1.3.6.1.2.1.2.2.1.19" instance="ifIndex" alias="ifOutDiscards" type="counter" />
  <mibObj oid=".1.3.6.1.2.1.2.2.1.20" instance="ifIndex" alias="ifOutErrors" type="counter" />
</group>
Then make sure the groups that contain your OIDs are included in the "systemDef" tag:
<systemDef name="Net-SNMP">
  <!-- <sysoidMask>.1.3.6.1.4.1.2021.250.</sysoidMask> -->
  <sysoidMask>.1.3.6.1.4.1.2021.</sysoidMask>
  <collect>
    <includeGroup>mib2-interfaces-net-snmp</includeGroup>
    <includeGroup>mib2-host-resources-storage</includeGroup>
    <includeGroup>mib2-host-resources-system</includeGroup>
    <includeGroup>mib2-host-resources-memory</includeGroup>
    <includeGroup>ucd-loadavg</includeGroup>
    <includeGroup>ucd-systemStats</includeGroup>
    <includeGroup>ucd-memory</includeGroup>
  </collect>
</systemDef>
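A quick way to catch a misspelled group name is to compare the includeGroup entries with the groups that are actually defined. The following is only a sketch in Python: the wrapper element and the trimmed-down sample are hypothetical, and it assumes the file uses no XML namespace (if yours declares one, the tag names would need the namespace prefix).

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down fragment in the shape of datacollection-config.xml.
# The <datacollection> wrapper is an assumption made only to get a single root.
SAMPLE = """
<datacollection>
  <group name="ucd-memory" ifType="ignore">
    <mibObj oid=".1.3.6.1.4.1.2021.4.3" instance="0" alias="memTotalSwap" type="integer"/>
  </group>
  <group name="mib2-interfaces-net-snmp" ifType="all">
    <mibObj oid=".1.3.6.1.2.1.2.2.1.10" instance="ifIndex" alias="ifInOctets" type="counter"/>
  </group>
  <systemDef name="Net-SNMP">
    <sysoidMask>.1.3.6.1.4.1.2021.</sysoidMask>
    <collect>
      <includeGroup>mib2-interfaces-net-snmp</includeGroup>
      <includeGroup>ucd-memory</includeGroup>
      <includeGroup>ucd-systemStats</includeGroup>
    </collect>
  </systemDef>
</datacollection>
"""


def undefined_groups(xml_text):
    """Map each systemDef name to the included groups that have no <group> definition."""
    root = ET.fromstring(xml_text)
    defined = {group.get("name") for group in root.iter("group")}
    missing = {}
    for system in root.iter("systemDef"):
        included = [(inc.text or "").strip() for inc in system.iter("includeGroup")]
        absent = [name for name in included if name not in defined]
        if absent:
            missing[system.get("name")] = absent
    return missing


print(undefined_groups(SAMPLE))  # {'Net-SNMP': ['ucd-systemStats']}
```

In the sample, ucd-systemStats is included but never defined, so it is the one reported.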
Please note that if a given OID cannot be retrieved (for example, because the SNMP agent doesn't support it), then the RRDtool file for it is not created. These files are normally stored in /var/opennms/rrd/snmp/.
Additional notes
Make sure <sysoidMask> matches your system's object ID (sysObjectID), or else the system definition will not be found for your node. For example:
$ snmpget -v2c -On -c <community> <host> sysObjectID.0
.1.3.6.1.2.1.1.2.0 = OID: .1.3.6.0.0.0.0.0.0.0
would give you:
<sysoidMask>.1.3.6.0.0.0.0.0.0.0.</sysoidMask>
You need to define your custom graphs as follows:
NOTE: Beginning with version 1.3.2 the types are interfaceSnmp and nodeSnmp instead of interface and node.
One way to find out which data sources exist in your RRDtool file (they are used in the DEF part of the rrdtool command line) is to get the information from the file itself, like this:
[root@lnxdev0001 16]# rrdtool info hrSystemUptime.rrd
filename = "hrSystemUptime.rrd"
rrd_version = "0001"
step = 300
last_update = 1026942859
ds[hrSystemUptime].type = "GAUGE"
ds[hrSystemUptime].minimal_heartbeat = 600
ds[hrSystemUptime].min = NaN
ds[hrSystemUptime].max = NaN
ds[hrSystemUptime].last_ds = "UNKN"
ds[hrSystemUptime].value = 2.0296406018e+10
ds[hrSystemUptime].unknown_sec = 0
rra[0].cf = "AVERAGE"
rra[0].rows = 8928
rra[0].pdp_per_row = 1
rra[0].xff = 5.0000000000e-01
rra[0].cdp_prep[0].value = NaN
rra[0].cdp_prep[0].unknown_datapoints = 0
rra[1].cf = "AVERAGE"
rra[1].rows = 8784
rra[1].pdp_per_row = 12
rra[1].xff = 5.0000000000e-01
rra[1].cdp_prep[0].value = 7.8203629928e+08
rra[1].cdp_prep[0].unknown_datapoints = 0
rra[2].cf = "MIN"
rra[2].rows = 8784
rra[2].pdp_per_row = 12
rra[2].xff = 5.0000000000e-01
rra[2].cdp_prep[0].value = 7.8068611333e+07
rra[2].cdp_prep[0].unknown_datapoints = 0
rra[3].cf = "MAX"
rra[3].rows = 8784
rra[3].pdp_per_row = 12
rra[3].xff = 5.0000000000e-01
rra[3].cdp_prep[0].value = 7.8338616000e+07
rra[3].cdp_prep[0].unknown_datapoints = 0
As you can see, the data source (ds) is 'hrSystemUptime'.
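If you have many RRDtool files to inspect, the data source names can be pulled out of the `rrdtool info` text with a small script. This is a minimal Python sketch that only parses text in the format shown in this guide; the sample string is a shortened copy of that output and the function name is made up for the illustration.

```python
import re

# A few lines in the format printed by `rrdtool info` in this guide.
SAMPLE_INFO = """\
filename = "hrSystemUptime.rrd"
step = 300
ds[hrSystemUptime].type = "GAUGE"
ds[hrSystemUptime].minimal_heartbeat = 600
rra[0].cf = "AVERAGE"
"""

# Matches lines such as: ds[hrSystemUptime].type = "GAUGE"
DS_TYPE = re.compile(r'^ds\[([^\]]+)\]\.type\s*=\s*"([^"]+)"', re.MULTILINE)


def data_sources(info_text):
    """Return a dict of data source name -> data source type."""
    return dict(DS_TYPE.findall(info_text))


print(data_sources(SAMPLE_INFO))  # {'hrSystemUptime': 'GAUGE'}
```

The names returned are the ones to use as the second field of a DEF, as in DEF:timeticks=hrSystemUptime.rrd:hrSystemUptime:AVERAGE.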
If for some reason one of the rrdtool files that are part of a graphic definition is missing, then the graphic doesn't show up at all in the web console.
Remember that you can always check the contents of an RRDtool file by typing:
rrdtool dump hrSystemUptime.rrd (shows a huge amount of data).
You can always check whether the graphs you want to plot are accurate by running rrdtool from the command line. Here is an example that graphs the host uptime by hand:
rrdtool graph uptime.png --title "Host uptime" --vertical-label Days \
  "DEF:timeticks=hrSystemUptime.rrd:hrSystemUptime:AVERAGE" \
  "CDEF:days=timeticks,8640000,/" AREA:days#FF0000:"Days" \
  GPRINT:days:AVERAGE:"Avg \\: %8.1lf %s" GPRINT:days:MIN:"Min \\: %8.1lf %s" \
  GPRINT:days:MAX:"Max \\: %8.1lf %s"
As you will see in the next lines, you can copy and paste part of this command into the graph definition.
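The constant 8640000 in the CDEF comes from the unit of hrSystemUptime: it is reported in TimeTicks, which are hundredths of a second, so one day is 100 * 86400 ticks. A small Python sketch of the same arithmetic (the function and constant names are only illustrative):

```python
TICKS_PER_SECOND = 100            # SNMP TimeTicks are hundredths of a second
SECONDS_PER_DAY = 24 * 60 * 60    # 86400
TICKS_PER_DAY = TICKS_PER_SECOND * SECONDS_PER_DAY


def timeticks_to_days(timeticks):
    """Same calculation as the CDEF 'days=timeticks,8640000,/'."""
    return timeticks / TICKS_PER_DAY


print(TICKS_PER_DAY)                   # 8640000
print(timeticks_to_days(864_000_000))  # 100.0
```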
Here is the graph configuration to capture the host uptime, memory usage and link accuracy, in the file snmp-graph.properties:
#### Newbreak LLC Custom reports. Jose Vicente Nunez Zuleta ([email protected]) #####
# Be very careful with trailing spaces, otherwise you can get problems with the graphs!!!

# Get the Uptime if the host uses Net-SNMP ([email protected]).
# To draw the graph manually, use the rrdtool graph command shown earlier.
report.netsnmp.uptime.name=Uptime
report.netsnmp.uptime.columns=hrSystemUptime
report.netsnmp.uptime.type=node
report.netsnmp.uptime.command=--title "Host uptime" \
 --vertical-label Days \
 DEF:timeticks={rrd1}:hrSystemUptime:AVERAGE \
 CDEF:days=timeticks,8640000,/ \
 AREA:days#FF0000:"Days" \
 GPRINT:days:AVERAGE:"Avg \\: %8.1lf %s" \
 GPRINT:days:MIN:"Min \\: %8.1lf %s" \
 GPRINT:days:MAX:"Max \\: %8.1lf %s"

# Get the Memory statistics if the host uses Net-SNMP ([email protected]).
report.netsnmp.memory.name=Memory
report.netsnmp.memory.columns=memTotalSwap,memAvailSwap,memTotalReal,memAvailReal
report.netsnmp.memory.type=node
report.netsnmp.memory.command=--title "Host memory usage" \
 --vertical-label bytes \
 DEF:mtotalSwap={rrd1}:memTotalSwap:AVERAGE \
 DEF:mavailSwap={rrd2}:memAvailSwap:AVERAGE \
 DEF:mtotalReal={rrd3}:memTotalReal:AVERAGE \
 DEF:mavailReal={rrd4}:memAvailReal:AVERAGE \
 CDEF:totalSwap=mtotalSwap,1024,* \
 CDEF:availSwap=mavailSwap,1024,* \
 CDEF:totalReal=mtotalReal,1024,* \
 CDEF:availReal=mavailReal,1024,* \
 LINE3:totalSwap#FF0000:"Total swap" \
 LINE1:availSwap#00FF00:"Available swap" \
 LINE3:totalReal#0000FF:"Total real" \
 LINE1:availReal#000000:"Available real" \
 GPRINT:totalSwap:AVERAGE:"Avg \\: %8.1lf %s" \
 GPRINT:totalSwap:MIN:"Min \\: %8.1lf %s" \
 GPRINT:totalSwap:MAX:"Max \\: %8.1lf %s" \
 GPRINT:availSwap:AVERAGE:"Avg \\: %8.1lf %s" \
 GPRINT:availSwap:MIN:"Min \\: %8.1lf %s" \
 GPRINT:availSwap:MAX:"Max \\: %8.1lf %s" \
 GPRINT:totalReal:AVERAGE:"Avg \\: %8.1lf %s" \
 GPRINT:totalReal:MIN:"Min \\: %8.1lf %s" \
 GPRINT:totalReal:MAX:"Max \\: %8.1lf %s" \
 GPRINT:availReal:AVERAGE:"Avg \\: %8.1lf %s" \
 GPRINT:availReal:MIN:"Min \\: %8.1lf %s" \
 GPRINT:availReal:MAX:"Max \\: %8.1lf %s"

# Calculate the network accuracy using SNMP ([email protected]).
# Check the following man pages for more info on rrdtool graph:
# - rrdgraph_graph
# - rrdgraph
# - rrdgraph_examples
# Also if you forgot about the RPN notation (like me :) ) then go to:
# - http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/tutorial/rpntutorial.html
#
# Use CDEF to calculate this expression:
# accuracy = 100 - ( (DifInErr*100) / (DifInUcast + DifInNUcast) )
# In my case, Net-SNMP doesn't show the ifInNUcastPkts OID on Linux (Solaris works fine)
# but the amount of that traffic is very low, so even leaving that value out the estimate is good.
report.netsnmp.accuracy.name=Accuracy
#report.netsnmp.accuracy.columns=ifInErrors,ifInUcastPkts,ifInNUcastPkts
report.netsnmp.accuracy.columns=ifInErrors,ifInUcastPkts
report.netsnmp.accuracy.type=interface
report.netsnmp.accuracy.command=--title "Link accuracy" \
 DEF:error={rrd1}:ifInErrors:AVERAGE \
 DEF:ucast={rrd2}:ifInUcastPkts:AVERAGE \
 CDEF:accuracy=100,error,100,*,ucast,/,- \
 LINE2:accuracy#FF0000:"% Accuracy" \
 GPRINT:accuracy:AVERAGE:"Avg \\: %8.1lf %s" \
 GPRINT:accuracy:MIN:"Min \\: %8.1lf %s" \
 GPRINT:accuracy:MAX:"Max \\: %8.1lf %s"
# If you are sure that all your machines have information about non-unicast traffic, then:
# DEF:nucast={rrd3}:ifInNUcastPkts:AVERAGE \
# CDEF:accuracy=100,error,100,*,ucast,nucast,+,/,- \
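To convince yourself that the RPN expression in the accuracy CDEF really computes 100 - (errors * 100 / packets), you can evaluate it by hand. The sketch below is a tiny Python evaluator written for this illustration; it is not part of rrdtool and only understands the four arithmetic operators, while real CDEF expressions support many more. The sample packet counts are invented.

```python
import operator

# Only the operators used by the CDEFs in this guide.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(expression, variables):
    """Evaluate a comma-separated RPN expression such as '100,error,100,*,ucast,/,-'."""
    stack = []
    for token in expression.split(","):
        if token in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        elif token in variables:
            stack.append(float(variables[token]))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed RPN expression: " + expression)
    return stack[0]


# Invented sample values: 5 input errors out of 1000 unicast packets.
print(eval_rpn("100,error,100,*,ucast,/,-", {"error": 5, "ucast": 1000}))  # 99.5

# Variant that also counts non-unicast packets (900 + 100 = 1000 packets).
print(eval_rpn("100,error,100,*,ucast,nucast,+,/,-",
               {"error": 5, "ucast": 900, "nucast": 100}))  # 99.5
```

Reading the first expression left to right: push 100, push error, push 100, multiply (error*100), push ucast, divide, then subtract the result from the first 100.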
Finally, don't forget to add the reports into the value of the reports property.
#report keys, list ALL prefab reports here!
reports=traffic, octets, errors, discards, avgbusy5, freemem, \
bufferfails, kerneltasks, kernelmem, \
cpuPercentBusy, \
novell.numberOfNLMsLoaded, novell.openFiles, novell.licensedConnections, \
memory, \
novell.codeDataMemory, novell.cacheBuffers, \
novell.diskSpaceSys, novell.diskSpaceVol2, \
winnt2k.diskSpaceC, winnt2k.diskSpaceD, \
checkpoint.pktsAccepted, checkpoint.pktsRejected, \
checkpoint.pktsDropped, checkpoint.pktsLogged, \
loadavg, netsnmp.uptime, netsnmp.memory, netsnmp.accuracy
Our reports are netsnmp.uptime, netsnmp.memory, netsnmp.accuracy on the last line. If you miss this step, you will get web exceptions when attempting to run performance reports. The exceptions will look something like the following:
Missing Parameter
The request you made was incomplete. It was missing the report parameter.
The following parameters are required:
report
node
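Because forgetting a report in the reports list is easy to do, a small check can compare the report.*.name keys with that list. This is a rough Python sketch with a deliberately simplified properties parser (it joins backslash continuations and skips comments, nothing more); the sample text is a shortened, hypothetical snmp-graph.properties.

```python
# Shortened, hypothetical sample; netsnmp.accuracy is defined but not listed.
SAMPLE = r"""
reports=traffic, loadavg, \
 netsnmp.uptime, netsnmp.memory

report.netsnmp.uptime.name=Uptime
report.netsnmp.memory.name=Memory
report.netsnmp.accuracy.name=Accuracy
"""


def parse_properties(text):
    """Very small .properties reader: joins '\\' continuations, skips comments."""
    props = {}
    logical = ""
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and comments are skipped only at the start of a logical line.
        if not logical and (not line or line.startswith("#")):
            continue
        if line.endswith("\\"):
            logical += line[:-1]
            continue
        logical += line
        key, _, value = logical.partition("=")
        props[key.strip()] = value.strip()
        logical = ""
    if logical:  # file ended in the middle of a continuation
        key, _, value = logical.partition("=")
        props[key.strip()] = value.strip()
    return props


def unlisted_reports(props):
    """Return report ids that have a report.<id>.name key but are not in 'reports'."""
    listed = {r.strip() for r in props.get("reports", "").split(",") if r.strip()}
    prefix, suffix = "report.", ".name"
    defined = {key[len(prefix):-len(suffix)] for key in props
               if key.startswith(prefix) and key.endswith(suffix)}
    return sorted(defined - listed)


print(unlisted_reports(parse_properties(SAMPLE)))  # ['netsnmp.accuracy']
```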
Then restart OpenNMS and Tomcat so that the changes take effect:
/etc/init.d/opennms restart
/etc/init.d/tomcat4 restart
You will probably have to wait a couple of minutes before there is useful data to display.
Tutorial on how to use the RPN notation:
http://oss.oetiker.ch/rrdtool/tut/rpntutorial.en.html
Check the following man pages for more info on rrdtool graph:
rrdgraph_graph
rrdgraph
rrdgraph_examples