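Mon is considered by many to be the king of monitoring tools. It is a general-purpose daemon that runs monitors against your services on a schedule and fires alerts when they fail.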
Note: This article obsoletes my previous article on MON.
First of all, install the following Perl modules:-
perl -MCPAN -e "install Time::HiRes"
perl -MCPAN -e "install Time::Period"
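Before moving on, it may be worth confirming that both modules actually load. The following is only a minimal sketch; the helper name check_perl_modules is my own invention and is not part of Mon.

```shell
#!/bin/sh
# Report which of the given Perl modules can or cannot be loaded.
# check_perl_modules is a hypothetical helper name, not part of Mon.
check_perl_modules() {
    missing=0
    for mod in "$@"; do
        # "perl -MModule -e 1" exits non-zero if the module cannot be loaded
        if perl -M"$mod" -e 1 2>/dev/null; then
            echo "ok: $mod"
        else
            echo "missing: $mod"
            missing=1
        fi
    done
    return "$missing"
}
```

For this article you would run it as: check_perl_modules Time::HiRes Time::Period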
Download the Mon and mon-client software:-
cd /root
wget http://kernel.org/pub/software/admin/mon/mon-1.2.0.tar.bz2
wget http://kernel.org/pub/software/admin/mon/mon-client-1.2.0.tar.bz2
cd /usr/local/
[root@www local]# tar xjf /root/mon-1.2.0.tar.bz2
Rename the directory:-
[root@www local]# mv mon-1.2.0 mon
cd /usr/local/mon/etc
cp example.cf mon.cf
Edit mon.cf, change the paths from /usr/lib/mon to /usr/local/mon .
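If you would rather not change the paths by hand, the same substitution can be sketched with sed. This assumes the default paths appear literally as /usr/lib/mon in the file; the function name rewrite_mon_paths is just an illustration.

```shell
#!/bin/sh
# Rewrite the default /usr/lib/mon paths to /usr/local/mon.
# Reads the file named in $1 and writes the result to stdout, so the
# original file stays untouched until you are happy with the result.
# rewrite_mon_paths is a hypothetical helper name.
rewrite_mon_paths() {
    sed 's|/usr/lib/mon|/usr/local/mon|g' "$1"
}
```

One way to use it: rewrite_mon_paths example.cf > mon.cf, and then review mon.cf in your editor as usual.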
[root@www etc]# vi mon.cf
…
…
#
# NOTE:
#
# A "watch" definition (a line which begins with the word "watch" and is
# followed by “service” definitions) is terminated by an
# empty line, or by a subsequent definition. You may not put blank lines
# inside of your watch definitions.
#
#
# global options
#
cfbasedir = /usr/local/mon/etc
alertdir = /usr/local/mon/alert.d
mondir = /usr/local/mon/mon.d
maxprocs = 20
histlength = 100
randstart = 60s
dtlogfile = /var/log/mon-downtim.log
dtlogging = yes
#
# authentication types:
# getpwnam standard Unix passwd, NOT for shadow passwords
# shadow Unix shadow passwords (not implemented)
# userfile "mon" user file
#
authtype = getpwnam
#
# NB: hostgroup and watch entries are terminated with a blank line (or
# end of file). Don’t forget the blank lines between them or you lose.
#
#
# group definitions (hostnames or IP addresses)
#
hostgroup webservers www.example.com

hostgroup mailservers mail.example.com

hostgroup dbservers db.example.com

watch mailservers
service ping
description ping servers
interval 5m
monitor fping.monitor
depend routers:ping
period wd {Mon-Fri} hr {7am-10pm}
alert mail.alert [email protected]
alert page.alert [email protected]
alertevery 1h
period wd {Sat-Sun}
alert mail.alert [email protected]
service fping
period wd {Mon-Fri} hr {7am-10pm}
alert mail.alert [email protected]
alert page.alert [email protected]
alertevery 1h
service smtp
interval 10m
monitor smtp.monitor
period wd {Mon-Fri} hr {7am-10pm}
alertevery 1h
alertafter 2 30m
alert page.alert [email protected]
service imap
interval 10m
monitor imap.monitor
period wd {Mon-Fri} hr {7am-10pm}
alertevery 1h
alertafter 2 30m
alert page.alert [email protected]
service pop
interval 10m
monitor pop3.monitor
period wd {Mon-Fri} hr {7am-10pm}
alertevery 1h
alertafter 2 30m
alert page.alert [email protected]

watch webservers
service fping
period wd {Mon-Fri} hr {7am-10pm}
alert mail.alert [email protected]
alert page.alert [email protected]
alertevery 1h
service ping
interval 2m
monitor fping.monitor
allow_empty_group
period wd {Sun-Sat}
alert qpage.alert mis-pagers
alertevery 45m
service http
interval 4m
monitor http.monitor
allow_empty_group
period wd {Sun-Sat}
alert qpage.alert mis-pagers
upalert mail.alert -S "web server is back up" mis
alertevery 45m
service freespace
interval 15m
monitor freespace.monitor /f330:5000 /f540:5000 ;;
period wd {Sun-Sat}
alert mail.alert [email protected]
# alert delete.snapshot
alertevery 1h
service ftp
interval 5m
monitor ftp.monitor
period wd {Sun-Sat}
alert mail.alert [email protected]
alertevery 1h

watch dbservers
service ping
description ping servers
interval 5m
monitor fping.monitor
depend routers:ping
period wd {Mon-Fri} hr {7am-10pm}
alert mail.alert [email protected]
alert page.alert [email protected]
alertevery 1h
period wd {Sat-Sun}
alert mail.alert [email protected]
service fping
period wd {Mon-Fri} hr {7am-10pm}
alert mail.alert [email protected]
alert page.alert [email protected]
alertevery 1h
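Since every watch entry has to name a hostgroup defined elsewhere in the file, a quick sanity check can save a restart cycle. The sketch below is my own rough check, not a Mon tool: it only compares "watch" lines against "hostgroup" lines and does not look at "depend" entries or any other syntax.

```shell
#!/bin/sh
# List "watch" entries in a mon.cf-style file that have no matching
# "hostgroup" line. check_watch_groups is a hypothetical helper name.
check_watch_groups() {
    awk '
        $1 == "hostgroup" { groups[$2] = 1 }
        $1 == "watch"     { watches[++n] = $2 }
        END {
            bad = 0
            for (i = 1; i <= n; i++) {
                if (!(watches[i] in groups)) {
                    print "watch without hostgroup: " watches[i]
                    bad = 1
                }
            }
            # Non-zero exit status if any watch had no hostgroup
            exit bad
        }
    ' "$1"
}
```

Run it as: check_watch_groups /usr/local/mon/etc/mon.cf
It prints nothing and returns success when every watch has a hostgroup.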
Next, download and install fping:
fping is a ping(1) like program which uses the Internet Control
Message Protocol (ICMP) echo request to determine if a host is
up. fping is different from ping in that you can specify any
number of hosts on the command line, or specify a file containing
the list of hosts to ping. Instead of trying one host until it
times out or replies, fping will send out a ping packet and move
on to the next host in a round-robin fashion. If a host replies,
it is noted and removed from the list of hosts to check. If a host
does not respond within a certain time limit and/or retry limit it
will be considered unreachable.
Checking 2500 hosts (99% of which are unreachable) via ping can take hours.
fping was written to solve the problem of pinging N number of hosts
in an efficient manner. By sending out pings in a round-robin fashion
and checking on responses as they come in at random, a large number of
hosts can be checked at once.
Unlike ping, fping is meant to be used in scripts and its
output is easy to parse.
cd /root
wget http://fping.sourceforge.net/download/fping.tar.gz
tar xzf fping.tar.gz
cd fping-2.4b2_to/
./configure
make
make install
cd /root
As a result of “make install”, fping is installed in /usr/local/sbin, which is normally in root's search path by default. If it is not, specify the absolute path to the fping program in mon.d/fping.monitor by editing the following line (line 53). Note that the init script used later in this article sets a PATH that does not include /usr/local/sbin, so the absolute path may be the safer choice:-
vi /usr/local/mon/mon.d/fping.monitor
…
…
my $CMD = "fping -e -r $RETRIES -t $TIMEOUT";
…
…
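Since fping's output is easy to parse, as noted above, here is a small sketch of how a script might summarise it. It assumes fping's default per-host lines of the form "HOST is alive" and "HOST is unreachable"; the function name summarize_fping is made up for this example.

```shell
#!/bin/sh
# Summarise fping's default per-host output read from stdin.
# Expected lines look like "HOST is alive" or "HOST is unreachable".
# summarize_fping is a hypothetical helper name.
summarize_fping() {
    awk '
        $2 == "is" && $3 == "alive"       { up++ }
        $2 == "is" && $3 == "unreachable" { down++; names = names " " $1 }
        END {
            printf "up=%d down=%d\n", up, down
            if (down > 0) print "unreachable:" names
        }
    '
}
```

Typical use would be something like: fping www.example.com mail.example.com db.example.com | summarize_fping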
Add the following lines to /etc/services:
mon 2583/tcp # MON
mon 2583/udp # MON traps
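If you script this step, it is worth making it safe to run twice so the entries are not duplicated. A minimal sketch, with add_service_entry being a name I made up:

```shell
#!/bin/sh
# Append a services entry only if the name/port pair is not already present.
# Usage: add_service_entry FILE NAME PORT/PROTO COMMENT
# add_service_entry is a hypothetical helper name.
add_service_entry() {
    file=$1 name=$2 portproto=$3 comment=$4
    # Match lines that start with NAME, whitespace, then PORT/PROTO
    if grep -Eq "^${name}[[:space:]]+${portproto}([[:space:]]|\$)" "$file"; then
        return 0
    fi
    printf '%s\t%s\t# %s\n' "$name" "$portproto" "$comment" >> "$file"
}
```

For the two lines above that would be:
add_service_entry /etc/services mon 2583/tcp "MON"
add_service_entry /etc/services mon 2583/udp "MON traps"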
Copy the mon startup script to /etc/init.d/ :-
cp /usr/local/mon/etc/S99mon /etc/init.d/mon
vi /etc/init.d/mon
#!/bin/sh
#
# start/stop the mon server
#
# You probably want to set the path to include
# nothing but local filesystems.
#
# chkconfig: 2345 99 10
# description: mon system monitoring daemon
# processname: mon
# config: /usr/local/mon/etc/mon.cf
# pidfile: /var/run/mon.pid
#
PATH=/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/mon
export PATH
# Source function library.
. /etc/rc.d/init.d/functions
# The following two variables are introduced by Kamran
MONCONFIGFILE=/usr/local/mon/etc/mon.cf
MON=/usr/local/mon/mon
# See how we were called.
case "$1" in
  start)
    echo -n "Starting mon daemon: "
    # The following line is edited by Kamran. Replaced absolute path with variable names.
    daemon $MON -f -l -c $MONCONFIGFILE
    echo
    touch /var/lock/subsys/mon
    ;;
  stop)
    echo -n "Stopping mon daemon: "
    killproc mon
    echo
    rm -f /var/lock/subsys/mon
    ;;
  status)
    status mon
    ;;
  restart)
    killall -HUP mon
    ;;
  *)
    echo "Usage: mon {start|stop|status|restart}"
    exit 1
esac
exit 0
[root@www mon]#
chmod +x /etc/init.d/mon
chkconfig --level 35 mon on
service mon start
[root@www mon-client-1.2.0]# service mon status
mon (pid 20609) is running…
[root@www mon-client-1.2.0]#
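Besides "service mon status", you can check that the daemon is actually listening. The sketch below assumes mon listens on TCP port 2583, the port registered in /etc/services earlier, and it simply scans the output of "netstat -ltn" or "ss -ltn", where the local address is the fourth column in both cases. The name has_listener is my own.

```shell
#!/bin/sh
# Succeeds if the listing on stdin shows a TCP listener on the given port.
# Works on the output of "netstat -ltn" or "ss -ltn".
# has_listener is a hypothetical helper name.
has_listener() {
    awk -v port="$1" '
        # Local address column ends in ":PORT"
        $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}
```

Usage: netstat -ltn | has_listener 2583 && echo "mon is listening"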
Check:-
[root@www mon]# ps aux | grep mon
root 2134 0.0 0.0 3788 284 ? S Jun27 0:00 /usr/sbin/courierlogger -pid=/var/spool/authdaemon/pid -start /usr/libexec/courier-authlib/authdaemond
root 2135 0.0 0.0 52496 436 ? S Jun27 0:00 /usr/libexec/courier-authlib/authdaemond
root 2148 0.0 0.0 54708 728 ? S Jun27 0:01 /usr/libexec/courier-authlib/authdaemond
root 2149 0.0 0.0 54708 732 ? S Jun27 0:01 /usr/libexec/courier-authlib/authdaemond
root 2150 0.0 0.0 54708 728 ? S Jun27 0:01 /usr/libexec/courier-authlib/authdaemond
root 2151 0.0 0.0 54708 732 ? S Jun27 0:01 /usr/libexec/courier-authlib/authdaemond
root 2152 0.0 0.0 54708 728 ? S Jun27 0:01 /usr/libexec/courier-authlib/authdaemond
dbus 2153 0.0 0.0 21256 344 ? Ss Jun27 0:00 dbus-daemon --system
qscand 2432 0.0 0.0 21564 976 ? Ss Jun27 0:16 /usr/bin/freshclam -d -c 24 --quiet -p /var/run/clamav/freshclam.pid --daemon-notify=/etc/clamd.conf
root 20354 0.0 0.8 106076 8984 ? S 21:56 0:00 /usr/bin/perl /usr/local/mon/mon -f -l -c /usr/local/mon/etc/mon.cf
root 20364 0.0 0.0 61148 680 pts/0 S+ 21:57 0:00 grep mon
That was a lot of output. Let’s filter out the word courier.
[root@www mon]# ps aux | grep mon | grep -v courier
dbus 2153 0.0 0.0 21256 344 ? Ss Jun27 0:00 dbus-daemon --system
qscand 2432 0.0 0.0 21564 976 ? Ss Jun27 0:16 /usr/bin/freshclam -d -c 24 --quiet -p /var/run/clamav/freshclam.pid --daemon-notify=/etc/clamd.conf
root 20354 0.0 0.8 106076 9004 ? S 21:56 0:00 /usr/bin/perl /usr/local/mon/mon -f -l -c /usr/local/mon/etc/mon.cf
root 20372 0.0 0.3 85284 3244 ? S 21:57 0:00 /usr/bin/perl /usr/local/mon/mon.d/smtp.monitor mail.example.com
root 20377 0.1 0.3 87352 3304 ? S 21:57 0:00 /usr/bin/perl /usr/local/mon/mon.d/imap.monitor mail.example.com
root 20380 0.0 0.0 61148 680 pts/0 S+ 21:57 0:00 grep mon
[root@www mon]#
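Grepping for the bare word "mon" still catches dbus-daemon, the daemon-notify option and the grep command itself. A tighter filter, sketched below, matches on the install prefix used in this article instead. The [/] bracket keeps grep from matching its own command line. The name filter_mon_procs is hypothetical.

```shell
#!/bin/sh
# Keep only processes running from /usr/local/mon/ in "ps aux"-style
# input on stdin. The pattern needs "mon" followed by "/", which the
# literal text "mon[/]" on grep's own command line does not contain.
# filter_mon_procs is a hypothetical helper name.
filter_mon_procs() {
    grep '/usr/local/mon[/]'
}
```

Usage: ps aux | filter_mon_procs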
Time to copy the client CGI program to its proper location:-
mkdir /var/www/cgi-bin/mon
cp /usr/local/mon/clients/mon.cgi /var/www/cgi-bin/mon/
[root@www mon]# chmod +x /var/www/cgi-bin/mon/mon.cgi
vi /var/www/cgi-bin/mon/mon.cgi
. . .
$organization = "TestSite"; # Organization name.
$monadmin = "kamran\@example.com"; # Your e-mail address. Make sure the backslash is present.
$reload_time = 30; # Seconds for page reload.
. . .
Try accessing this page from the web browser :-
http://www.example.com/cgi-bin/mon/mon.cgi
If you get a blank page, check the Apache error log for your site:-
[Thu Jul 30 22:08:01 2009] [error] [client 76.74.237.16] Can’t locate Mon/Client.pm in @INC (@INC contains: /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.7/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.6/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl/5.8.7 /usr/lib/perl5/site_perl/5.8.6 /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.7/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.6/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl/5.8.7 /usr/lib/perl5/vendor_perl/5.8.6 /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8) at /var/www/cgi-bin/mon/mon.cgi line 138.
[Thu Jul 30 22:08:01 2009] [error] [client 76.74.237.16] BEGIN failed--compilation aborted at /var/www/cgi-bin/mon/mon.cgi line 138.
[Thu Jul 30 22:08:01 2009] [error] [client 76.74.237.16] Premature end of script headers: mon.cgi
An error like this means that the Mon::Client Perl module (Mon/Client.pm) needs to be installed:-
[root@www mon]# perl -MCPAN -e "install Mon::Client"
…
…
Running make install
Prepending /root/.cpan/build/Mon-0.11-x4te9h/blib/arch /root/.cpan/build/Mon-0.11-x4te9h/blib/lib to PERL5LIB for 'install'
Manifying blib/man3/Mon::Protocol.3pm
Manifying blib/man3/Mon::SNMP.3pm
Manifying blib/man3/Mon::Client.3pm
Installing /usr/lib/perl5/site_perl/5.8.8/Mon/SNMP.pm
Installing /usr/lib/perl5/site_perl/5.8.8/Mon/Protocol.pm
Installing /usr/lib/perl5/site_perl/5.8.8/Mon/Client.pm
Installing /usr/share/man/man3/Mon::SNMP.3pm
Installing /usr/share/man/man3/Mon::Protocol.3pm
Installing /usr/share/man/man3/Mon::Client.3pm
Appending installation info to /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/perllocal.pod
TROCKIJ/Mon-0.11.tar.gz
/usr/bin/make install -- OK
[root@www mon]#
If this doesn’t work for you, you can also untar the mon-client package and install the modules from there.
Note: In my experience, installing the Perl modules that come with mon-client was the better option. With the CPAN version I was getting this error:-
[Thu Jul 30 22:18:55 2009] [error] [client 76.74.237.16] Can't locate object method "list_views" via package "Mon::Client" at /var/www/cgi-bin/mon/mon.cgi line 2175, <GEN0> line 1.
[Thu Jul 30 22:18:55 2009] [error] [client 76.74.237.16] Premature end of script headers: mon.cgi
Googling this error did not help.
As you can see below, the mon-client tarball contains the same three Perl modules plus an additional one:-
cd /root
tar xjf mon-client-1.2.0.tar.bz2
cd mon-client-1.2.0
ls
CHANGES COPYING COPYRIGHT Makefile.PL MANIFEST Mon README test.pl VERSION
ls Mon/
Client.pm Config.pm Protocol.pm SNMP.pm
To actually install them, use:
perl Makefile.PL
make
make test
make install
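To confirm that the installed Mon::Client is the one mon.cgi needs, you can ask Perl whether the module provides the list_views method that was missing in the error above. This is only a sketch; perl_module_can is a name I made up.

```shell
#!/bin/sh
# Succeeds if the named Perl module loads and provides the given method.
# Usage: perl_module_can MODULE METHOD
# perl_module_can is a hypothetical helper name.
perl_module_can() {
    # ->can() returns a true value only if the class has the method
    perl -M"$1" -e 'exit($ARGV[0]->can($ARGV[1]) ? 0 : 1)' "$1" "$2" 2>/dev/null
}
```

Usage: perl_module_can Mon::Client list_views && echo "Mon::Client looks fine"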
Try reloading the page : http://www.example.com/cgi-bin/mon/mon.cgi
This time, you should be able to see the page showing some statistics.
Remember to relax your firewall to allow outgoing traffic for the protocols / monitors you are using for different servers.
Similarly the servers you are monitoring should also have a relaxed firewall to allow incoming connections from the monitoring server.
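As a rough illustration of what "relaxing" the firewall on a monitored server might look like, the sketch below prints (but does not apply) iptables rules for one monitoring host. The address 192.0.2.10 and the port list are placeholders of mine, and your existing rule set may need the rules inserted at a different position. The function name print_monitor_rules is hypothetical.

```shell
#!/bin/sh
# Print (not apply) iptables rules that let a monitoring host reach
# the given TCP ports on a monitored server.
# Usage: print_monitor_rules MONITOR_IP PORT...
# print_monitor_rules is a hypothetical helper name.
print_monitor_rules() {
    monitor_ip=$1
    shift
    # ICMP echo requests for the ping/fping checks
    echo "iptables -A INPUT -s $monitor_ip -p icmp --icmp-type echo-request -j ACCEPT"
    for port in "$@"; do
        echo "iptables -A INPUT -s $monitor_ip -p tcp --dport $port -j ACCEPT"
    done
}
```

For a mail server checked over SMTP, IMAP and POP3 on their standard ports: print_monitor_rules 192.0.2.10 25 143 110
Review the output before running any of it as root.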
Securing access to mon.cgi :-
In your Apache config file, add the following code:-
vi /etc/httpd/conf/httpd.conf
…
<Directory "/var/www/cgi-bin/mon">
AllowOverride AuthConfig
Options None
Order allow,deny
Allow from all
</Directory>
…
service httpd reload
Create a .htaccess file in mon’s cgi directory:-
[root@www mon]# vi /var/www/cgi-bin/mon/.htaccess
AuthName "Authorization Required"
AuthType Basic
AuthUserFile /var/www/vhosts/.htpasswd
Require valid-user
[root@www mon]#
Change permissions and ownership of the .htaccess file:-
chown siteftpuser:apache /var/www/cgi-bin/mon/.htaccess
chmod 640 /var/www/cgi-bin/mon/.htaccess
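Create the password file and add a user named monitor (the -c option creates the file, overwriting any existing one):-
[root@www mon]# htpasswd -c /var/www/vhosts/.htpasswd monitor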
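A quick way to see whether the protection works is to request the page with and without credentials and compare the HTTP status codes. The sketch below assumes curl is installed; the helper names and the example password are mine.

```shell
#!/bin/sh
# Fetch only the HTTP status code of a URL, optionally with credentials.
# Usage: http_status URL [USER:PASSWORD]
# http_status and auth_looks_right are hypothetical helper names.
http_status() {
    if [ -n "$2" ]; then
        curl -s -o /dev/null -w '%{http_code}' -u "$2" "$1"
    else
        curl -s -o /dev/null -w '%{http_code}' "$1"
    fi
}

# Interpret the pair of codes: without credentials we expect 401,
# with valid credentials we expect 200.
# Usage: auth_looks_right CODE_WITHOUT_LOGIN CODE_WITH_LOGIN
auth_looks_right() {
    [ "$1" = "401" ] && [ "$2" = "200" ]
}
```

For example, with "secret" standing in for the password you chose:
anon=$(http_status http://www.example.com/cgi-bin/mon/mon.cgi)
auth=$(http_status http://www.example.com/cgi-bin/mon/mon.cgi monitor:secret)
auth_looks_right "$anon" "$auth" && echo "Basic auth is working"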
Read the MON documentation in the doc directory to learn how to write monitors and alerts:-
[root@www mon]# ls doc/
CHANGES.mon.cgi monshow.1 README.msql-mysql.monitor README.software
globals README.alerts README.paging README.syslog.monitor
how-to-write-a-monitor.txt README.cgi-bin README.protocol README.traps
how-to-write-an-alert.txt README.hints README.rpc.monitor README.variables
mon.8 README.mon.cgi README.snmpdiskspace.monitor
moncmd.1 README.monitors README.snmpvar.monitor
The following resources are quite helpful and include sample MON configurations:
- This one is a live site: https://admin.teamnet.de/cgi-bin/mon.cgi?command=query_opstatus
- Shahid Azeez’s personal website : http://shahidz.com/mon/