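Some time ago my company had a small requirement: remote server monitoring, in two parts:
One part monitors whether the server is "alive";
The other monitors whether the programs are running and the server's basic performance, and emails the relevant people when a warning threshold is exceeded.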
wget -O msmtp-1.4.31.tar.bz2 http://sourceforge.net/projects/msmtp/files/msmtp/1.4.31/msmtp-1.4.31.tar.bz2/download
tar -jxvf msmtp-1.4.31.tar.bz2
cd msmtp-1.4.31
./configure --prefix=/usr/local/msmtp
make
make install
yum -y install msmtp #alternative to the source build above, if your yum repositories provide msmtp
cd /usr/local/msmtp/
mkdir etc #the config directory and the config file both have to be created by hand
cd etc
vim msmtprc
The new msmtprc contains the following:
#Set default values for all following accounts.
defaults
logfile ~/msmtp.log
#The SMTP server of the provider.
account test
#SMTP server
host smtp.163.com
#from Email
from xxx@163.com
#auth login
auth login
#login account
user xxx
#login password
password *****
#Set a default account
account default: test
Note: the password is stored in plain text.
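Because the password sits in the file as plain text, it may be worth restricting who can read it. The following is a minimal sketch, assuming the msmtprc path created in the steps above; the helper name lock_down_config is made up for illustration and `stat -c` assumes GNU coreutils.

```shell
# Hypothetical helper: make a file readable and writable by its owner only,
# then print the resulting mode and file name.
lock_down_config() {
    chmod 600 "$1" && stat -c '%a %n' "$1"
}

# Assumed path, taken from the steps above; only touched if it exists.
CONF=/usr/local/msmtp/etc/msmtprc
if [ -f "$CONF" ]; then
    lock_down_config "$CONF"
fi
```

With mode 600 only the owner of the file can send mail through this configuration, so the owner should be the account that runs the monitoring scripts.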
If mutt is not installed, install it with yum
yum -y install mutt
Edit the configuration file
vi /etc/Muttrc
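set from="sender email address"
#path to the msmtp binary; the source build above installs it at /usr/local/msmtp/bin/msmtp
set sendmail="/usr/bin/msmtp"
set use_from=yes
set realname="sender name"
set editor="vi"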
echo "Mail body 123456" | mutt -s "Test mail subject" -a /scripts/test.txt -- ***@xxxx.com
(-s: subject, -a: attachment; the -- separates the attachment list from the recipients)
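Before relying on the alert mail, it can help to confirm that both tools are actually on the PATH. This is a small sketch using `command -v`; the function name require_cmds is an assumption, not something from the setup above.

```shell
# Hypothetical helper: print every listed command that is missing from PATH
# and return non-zero if any of them is missing.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return $missing
}

require_cmds mutt msmtp || echo "install the missing tools before scheduling the monitor scripts"
```

A source build under /usr/local/msmtp is not on the default PATH, so msmtp may be reported as missing even though it is installed; in that case the full path in the sendmail setting is what matters.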
Monitoring server performance: CPU, memory, disk
#!/bin/bash
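MAILLIST="[email protected]" #recipient address list
MEM_CORDON=100 #alert when memory usage (%) reaches this value
SWAP_CORDON=50 #alert when swap usage (%) reaches this value
CPU_CORDON=5 #alert when CPU idle (%) drops to this value or below
DISK_CORDON=85 #alert when disk usage (%) reaches this value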
HOSTNAME=`hostname`
DATA=`date`
send_warning()
{
echo "$MESSAGE" | mutt -s "$TITLE" "$MAILLIST"
}
if [ $# -ne 0 ];then
DISK_DIR=$1
else
DISK_DIR="/dev/sdb1"
fi
#MEM|SWAP check
MEMSTATUS=`free | grep "Mem" | awk '{printf("%d", $3*100/$2)}'`
#report 0 when no swap is configured, to avoid dividing by zero
SWAPSTATUS=`free | grep "Swap" | awk '{if ($2 > 0) printf("%d", $3*100/$2); else printf("0")}'`
if [ $MEMSTATUS -ge $MEM_CORDON ];then
TITLE="[bad_news]:$HOSTNAME mem usage"
MESSAGE="Time:${DATA},Mem_used:${MEMSTATUS}%,Swap_used:${SWAPSTATUS}%"
send_warning
fi
if [ $SWAPSTATUS -ge $SWAP_CORDON ];then
TITLE="[bad_news]:$HOSTNAME Swap usage"
MESSAGE="Time:${DATA},Mem_used:${MEMSTATUS}%,Swap_used:${SWAPSTATUS}%"
send_warning
fi
#cpu: use the second vmstat sample, because the first one reports averages since boot
CPUSTATUS=`vmstat 1 2 | awk '{print $15}' | tail -1`
if [ $CPUSTATUS -le $CPU_CORDON ];then
TITLE="[bad_news]:$HOSTNAME cpu usage"
MESSAGE="Time:${DATA},Cpu_free:${CPUSTATUS}%"
send_warning
fi
#disk use n%
DISKSTATUS=`df -P $DISK_DIR | awk '{print $5}' | tail -1 | tr -d %`
if [ $DISKSTATUS -ge $DISK_CORDON ];then
TITLE="[bad_news]:$HOSTNAME disk usage"
MESSAGE="Time:${DATA},Disk_used:${DISKSTATUS}%"
send_warning
fi
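#apache: count httpd processes, excluding the grep command itself
httpdnum=`ps aux | grep 'httpd' | grep -v 'grep' | wc -l`
if [ $httpdnum -eq 0 ]
then
TITLE="[bad_news]:$HOSTNAME apache state"
MESSAGE="Time:${DATA},apache process has stopped"
send_warning
fi
#tomcat: same check for tomcat processes
tomcatnum=`ps aux | grep 'tomcat' | grep -v 'grep' | wc -l`
if [ $tomcatnum -eq 0 ]
then
TITLE="[bad_news]:$HOSTNAME tomcat state"
MESSAGE="Time:${DATA},tomcat process has stopped"
send_warning
fi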
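The memory and swap checks both compute an integer percentage and compare it with a threshold. As a sketch only, that logic could be factored into two helpers; percent and over_limit are hypothetical names, and percent returns 0 when the total is 0 (for example on a host without swap).

```shell
# Hypothetical helper: integer percentage of used/total, 0 when total is 0.
percent() {
    awk -v used="$1" -v total="$2" 'BEGIN { if (total > 0) printf("%d", used * 100 / total); else printf("0") }'
}

# Hypothetical helper: succeeds when the value has reached the limit.
over_limit() {
    [ "$1" -ge "$2" ]
}

# Example with the same `free` columns as the script above (total = $2, used = $3).
if command -v free >/dev/null 2>&1; then
    MEM_PCT=$(free | awk '/^Mem/ {print $3, $2}' | { read -r used total; percent "$used" "$total"; })
    echo "memory used: ${MEM_PCT}%"
fi
```

A check then reads as `over_limit "$MEM_PCT" "$MEM_CORDON" && send_warning`, with TITLE and MESSAGE set beforehand as in the script.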
Monitoring whether Tomcat is running
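#!/bin/bash
#uses DATA and send_warning from the performance script above: append this check to that script, or copy those definitions here
TomcatID=$(ps -ef |grep tomcat |grep -w 'apache-tomcat-7.0.67'|grep -v 'grep'|grep -v 'catalina.sh'|awk '{print $2}')
if [[ $TomcatID ]]
then
echo "tomcat is running"
else
TITLE="[bad_news]:$HOSTNAME tomcat state"
MESSAGE="Time:${DATA},Tomcat process has stopped"
send_warning
fi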
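The chain of grep calls above can also be written as a single awk filter. This is only a sketch: tomcat_pids is a hypothetical helper that reads `ps -ef` style lines on standard input, and the directory name is the one used above. It matches the name as a plain substring, which is slightly looser than `grep -w`.

```shell
# Hypothetical helper: print the PID column (field 2 of `ps -ef`) for lines that
# mention the given Tomcat directory, skipping grep, awk and catalina.sh lines.
tomcat_pids() {
    awk -v dir="$1" 'index($0, dir) && !/grep|awk|catalina\.sh/ { print $2 }'
}

ps -ef | tomcat_pids 'apache-tomcat-7.0.67'
```

The awk lines are skipped because the filter's own command line contains the directory name and would otherwise show up in the `ps` listing.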
Monitoring the database:
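#!/bin/bash
#uses DATA and send_warning from the performance script above: append this check to that script, or copy those definitions here
#the "MySQL running" string depends on what the status command prints for your database version when it is running normally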
/sbin/service mysql status | grep "MySQL running" > /dev/null
if [ $? -eq 0 ]
then
#status is normal: check whether port 3306 is listening
netstat -lnt | grep ':3306 ' > /dev/null
if [ $? -ne 0 ]
then
/sbin/service mysql restart
sleep 3
/sbin/service mysql status | grep "MySQL running" > /dev/null
if [ $? -ne 0 ]
then
TITLE="[bad_news]:$HOSTNAME mysql state"
MESSAGE="Time:${DATA},mysql service has stopped, automatic restart failed, please start it manually!"
send_warning
fi
fi
else
/sbin/service mysql start
sleep 2;
/sbin/service mysql status | grep "MySQL running" > /dev/null
if [ $? -ne 0 ]
then
TITLE="[bad_news]:$HOSTNAME mysql state"
MESSAGE="Time:${DATA},mysql service has stopped, automatic restart failed, please start it manually!"
send_warning
fi
fi
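The fixed `sleep 3` and `sleep 2` give MySQL a single chance to come back before the alert is sent. One alternative, sketched here under the assumption that polling is acceptable, is to retry the status check a few times. The names wait_until and mysql_running are hypothetical.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts; succeed as soon as the command succeeds.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}

# Same status test as the script above, wrapped so that it can be retried.
mysql_running() {
    /sbin/service mysql status 2>/dev/null | grep "MySQL running" > /dev/null
}
```

After a restart, `wait_until 5 2 mysql_running || send_warning` would poll for up to about eight seconds before alerting, with TITLE and MESSAGE set beforehand as in the script.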
Monitoring whether the server is "alive": check whether it responds to ping
#!/bin/bash
LANG=C
server_all_list=(119.29.119.199:80 119.29.119.255:80)
date=$(date -d "today" +"%Y-%m-%d_%H:%M:%S")
server_all_len=${#server_all_list[*]}
i=0
while [ $i -lt $server_all_len ]
do
server_ip=$(echo ${server_all_list[$i]} | awk -F ':' '{print $1}')
server_port=$(echo ${server_all_list[$i]} | awk -F ':' '{print $2}')
if ping -c 1 $server_ip
then
echo "The server ${server_ip} can be connected!"
else
echo "The server ${server_ip} can not be connected!"
echo "Time: ${date} The server ${server_ip} can not be connected!" | mutt -s "[bad_news]:${server_ip}" 190133124@qq.com [email protected]
fi
let i++
done
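Each list entry has the form host:port. As a sketch, the two parts can be split with shell parameter expansion instead of `echo` and `awk`; entry_host and entry_port are illustrative names. The port is extracted here as in the script above, although the ping check itself only uses the address.

```shell
# Hypothetical helpers for entries of the form host:port.
entry_host() {
    echo "${1%%:*}"   # everything before the first colon
}
entry_port() {
    echo "${1##*:}"   # everything after the last colon
}

for entry in 119.29.119.199:80 119.29.119.255:80; do
    echo "host=$(entry_host "$entry") port=$(entry_port "$entry")"
done
```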
Give the scripts execute permission
chmod +x /mydata/system_backup/monitor/*.sh
Scheduled start: run every minute
vi /etc/crontab
Append the scripts to be run as the last lines of crontab:
* * * * * root /mydata/system_backup/monitor/monitor.sh
* * * * * root /mydata/system_backup/monitor/monitor_ping.sh
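When the entries are added by a script rather than by hand, running that script twice would duplicate them. Below is a sketch of an append-if-missing helper; add_cron_line is a hypothetical name, and the demonstration writes to a scratch file rather than the real /etc/crontab.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
add_cron_line() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a scratch file; the first entry is added twice on purpose.
CRON_FILE=$(mktemp)
add_cron_line "$CRON_FILE" '* * * * * root /mydata/system_backup/monitor/monitor.sh'
add_cron_line "$CRON_FILE" '* * * * * root /mydata/system_backup/monitor/monitor.sh'
add_cron_line "$CRON_FILE" '* * * * * root /mydata/system_backup/monitor/monitor_ping.sh'
cat "$CRON_FILE"
```

The scratch file ends up with each entry exactly once; pointing the helper at /etc/crontab would need root and is best done after checking the output.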
If crontab is not installed, install it with yum
yum install vixie-cron #on CentOS/RHEL 6 and later the package is named cronie
Start, stop, or restart the crontab service
service crond start
service crond stop
service crond restart
Enable it at boot:
chkconfig --level 35 crond on