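A script provided by Mr. Ye.
During the trial run of an application system, the WEB tier exited frequently. To keep the system responsive in an emergency, the WEB tier was deployed as 5 nodes, zjgpwebnode1 to zjgpwebnode5, and a monitoring job was set up as follows:
1. Create the monitoring script file /home/monitor.sh with the following content: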
#!/bin/bash
# cron starts jobs with a minimal environment, so set and export the
# locale and library path that the application needs.
SHELL=/bin/bash
LANG=zh_CN.GB18030
SUPPORTED=zh_CN.GB18030:zh_CN:zh_CN.UTF-8:zh:en_US.UTF-8:en_US:en
LD_LIBRARY_PATH=/lib:/usr/lib:/lib64:/usr/lib64
export LANG SUPPORTED LD_LIBRARY_PATH
# su runs a separate root login shell and appends its output to the log;
# it does not change the variables exported above.
su - root >> /home/resetlog.log
# Count the process-list lines that mention each server name. The grep
# command itself is normally counted too, so a running server gives 2.
WEBLOGIC1=`ps -ef|grep -c Dweblogic.Name=zjgpwebnode1`
WEBLOGIC2=`ps -ef|grep -c Dweblogic.Name=zjgpwebnode2`
WEBLOGIC3=`ps -ef|grep -c Dweblogic.Name=zjgpwebnode3`
WEBLOGIC4=`ps -ef|grep -c Dweblogic.Name=zjgpwebnode4`
WEBLOGIC5=`ps -ef|grep -c Dweblogic.Name=zjgpwebnode5`
WEBLOGIC6=`ps -ef|grep -c Dweblogic.Name=myserver`
if [ $WEBLOGIC1 != "2" ] ; then
date >> /home/resetlog.log
echo "Now Node1 is down!" >>/home/resetlog.log
locale >> /home/resetlog.log
set >> /home/resetlog.log
echo "Restart Node1 start" >>/home/resetlog.log
cd /home/weblogic/bea/user_projects/domains/webcluster/
nohup ./startManagedWebLogic.sh zjgpwebnode1 >1.out &
echo "Restart Node1 end" >>/home/resetlog.log
else
date >> /home/resetlog.log
echo "Node1 is OK!" >>/home/resetlog.log
fi
if [ $WEBLOGIC2 != "2" ] ; then
date >> /home/resetlog.log
echo "Now Node2 is down!" >>/home/resetlog.log
locale >> /home/resetlog.log
set >> /home/resetlog.log
echo "Restart Node2 start" >>/home/resetlog.log
cd /home/weblogic/bea/user_projects/domains/webcluster/
nohup ./startManagedWebLogic.sh zjgpwebnode2 >2.out &
echo "Restart Node2 end" >>/home/resetlog.log
else
date >> /home/resetlog.log
echo "Node2 is OK!" >>/home/resetlog.log
fi
if [ $WEBLOGIC3 != "2" ] ; then
date >> /home/resetlog.log
echo "Now Node3 is down!" >>/home/resetlog.log
locale >> /home/resetlog.log
set >> /home/resetlog.log
echo "Restart Node3 start" >>/home/resetlog.log
cd /home/weblogic/bea/user_projects/domains/webcluster/
nohup ./startManagedWebLogic.sh zjgpwebnode3 >3.out &
echo "Restart Node3 end" >>/home/resetlog.log
else
date >> /home/resetlog.log
echo "Node3 is OK!" >>/home/resetlog.log
fi
if [ $WEBLOGIC4 != "2" ] ; then
date >> /home/resetlog.log
echo "Now Node4 is down!" >>/home/resetlog.log
locale >> /home/resetlog.log
set >> /home/resetlog.log
echo "Restart Node4 start" >>/home/resetlog.log
cd /home/weblogic/bea/user_projects/domains/webcluster/
nohup ./startManagedWebLogic.sh zjgpwebnode4 >4.out &
echo "Restart Node4 end" >>/home/resetlog.log
else
date >> /home/resetlog.log
echo "Node4 is OK!" >>/home/resetlog.log
fi
if [ $WEBLOGIC5 != "2" ] ; then
date >> /home/resetlog.log
echo "Now Node5 is down!" >>/home/resetlog.log
locale >> /home/resetlog.log
echo "Restart Node5 start" >>/home/resetlog.log
cd /home/weblogic/bea/user_projects/domains/webcluster/
nohup ./startManagedWebLogic.sh zjgpwebnode5 >5.out &
echo "Restart Node5 end" >>/home/resetlog.log
else
date >> /home/resetlog.log
echo "Node5 is OK!" >>/home/resetlog.log
fi
if [ $WEBLOGIC6 != "2" ] ; then
date >> /home/resetlog.log
echo "Now TimerNode is down!" >>/home/resetlog.log
echo "Restart Timer Node start" >>/home/resetlog.log
cd /home/weblogic/bea/user_projects/domains/zjgpwebtimer
nohup ./startWebLogic.sh >> /home/logs/timer-web.log &
echo "Restart Timer Node end" >>/home/resetlog.log
else
date >> /home/resetlog.log
echo "Timer Node is OK!" >>/home/resetlog.log
fi
exit
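The five near-identical managed-node blocks above could also be written as a loop. The following is a minimal sketch under assumptions, not the script as it was deployed: the names is_node_up, check_nodes, LOG and DOMAIN are made up for illustration, and the timer node (myserver) is left out. The pattern escapes the dot, so the grep command's own entry in the process list (which contains `\.` rather than `.`) does not match, and the check no longer depends on a count of exactly 2.

```shell
#!/bin/bash
# Illustrative sketch only; function and variable names are hypothetical.

LOG=/home/resetlog.log
DOMAIN=/home/weblogic/bea/user_projects/domains/webcluster

# Read a process listing on stdin and succeed if some line carries
# -Dweblogic.Name=<name> as a complete argument.
is_node_up() {
    local name=$1
    grep -Eq -- "-Dweblogic\.Name=${name}( |\$)"
}

# Check zjgpwebnode1..5 and restart any node that is not running.
check_nodes() {
    local i node
    for i in 1 2 3 4 5; do
        node="zjgpwebnode$i"
        if ps -ef | is_node_up "$node"; then
            echo "$(date) Node$i is OK!" >> "$LOG"
        else
            echo "$(date) Now Node$i is down! Restarting" >> "$LOG"
            # The subshell keeps the cd from affecting later iterations.
            (cd "$DOMAIN" && exec nohup ./startManagedWebLogic.sh "$node" > "$i.out" 2>&1) &
        fi
    done
}
```

The block only defines the functions; a script built on it would call check_nodes as its last line.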
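Note that monitor.sh must be saved with Unix line endings, for example by editing it in UltraEdit and choosing the Unix format when saving. The issue is the carriage return / line feed characters: with Windows line endings the shell reports "syntax error: unexpected end of file".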
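If the file has already been copied to the server, the line endings can also be checked and repaired there. A minimal sketch, assuming bash and GNU sed are available; fix_line_endings is a hypothetical helper name, not part of the original setup:

```shell
# Hypothetical helper: detect carriage returns and convert the file in place.
# Assumes bash (for the $'\r' quoting) and GNU sed (for -i and \r).
fix_line_endings() {
    local file=$1
    if grep -q $'\r' "$file"; then
        sed -i 's/\r$//' "$file"
        echo "converted $file to Unix line endings"
    else
        echo "$file already uses Unix line endings"
    fi
}
```

For example, `fix_line_endings /home/monitor.sh` would strip a trailing carriage return from every line of the script.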
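In practice, when the script runs as a crond scheduled task, its environment variables differ from those of an interactive root login, which produced garbled characters in the system. The environment variables that the business system needs were therefore added to the script.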
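One way to reproduce this kind of problem before scheduling the job is to run a command with an almost empty environment. The sketch below is an approximation only: run_like_cron is a hypothetical helper, and the exact variables that cron sets (assumed here to be HOME, LOGNAME, PATH and SHELL) vary between cron implementations.

```shell
# Hypothetical helper: run a command with roughly the variables cron provides.
# env -i starts from an empty environment, so LANG and LD_LIBRARY_PATH from
# the login session are not inherited.
run_like_cron() {
    env -i HOME="${HOME:-/}" LOGNAME="${LOGNAME:-root}" \
        PATH=/usr/bin:/bin SHELL=/bin/sh "$@"
}
```

Running `run_like_cron /bin/sh -c locale` and comparing the result with `locale` in a root login shell shows whether the locale settings differ.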
2. After creating monitor.sh, add it to the scheduled tasks:
Run crontab -e and add:
*/20 * * * * /home/monitor.sh
Note: this entry runs the monitoring program monitor.sh every 20 minutes.
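The same entry can be added without opening an editor. A minimal sketch, assuming a cron implementation whose crontab command accepts `-` to read the new table from standard input; with_cron_entry is a hypothetical helper name:

```shell
# Hypothetical helper: read an existing crontab on stdin and print it with
# the given entry appended, unless an identical line is already present.
with_cron_entry() {
    local entry=$1 current
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    if ! printf '%s\n' "$current" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Possible usage (not executed here):
#   chmod +x /home/monitor.sh
#   crontab -l 2>/dev/null | with_cron_entry '*/20 * * * * /home/monitor.sh' | crontab -
```

Because the helper skips an entry that already exists, running it twice does not schedule the monitor twice.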
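3. After adding the entry, restart the cron service so that the scheduled task is sure to take effect (crond normally picks up changes made with crontab -e on its own):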
/sbin/service crond restart