Monitoring WebLogic Services with Nagios


Reposted from: http://skymax.blog.51cto.com/365901/101603

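1. Introduction

This article describes how to monitor the running state of WebLogic services with Nagios, covering both WebLogic Server itself and WebLogic JDBC connection pools. The stock Nagios plugins do not include WebLogic monitoring, so we write our own script against the Nagios Plugin API to extend the plugin set. WebLogic run-state information is obtained through JMX. This article draws on the plugin section of the official Nagios 3 documentation and on the JMX and command-line sections of the official WebLogic documentation. The WebLogic version used is 8.14.

2. Nagios Plugin API Overview

A Nagios plugin, whether it is a script (shell, Perl) or a compiled C executable, must do at least two things: 1. exit with a return value; 2. write at least one line of text to standard output (STDOUT). The return values are defined as follows: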

Plugin Return Code    Service State    Host State
0                     OK               UP
1                     WARNING          UP or DOWN/UNREACHABLE*
2                     CRITICAL         DOWN/UNREACHABLE
3                     UNKNOWN          DOWN/UNREACHABLE
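The two requirements above (a return code and one line on STDOUT) can be sketched as a small shell function. This is only an illustration under assumptions: the function name `check_free_percent`, its thresholds, and the message wording are hypothetical and are not part of Nagios or of the script presented later.

```shell
#!/bin/sh
# Nagios plugin return codes, as listed in the table above.
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# check_free_percent FREE WARN CRIT   (hypothetical example check)
# FREE is a percentage of free space; WARN and CRIT are lower bounds.
# Prints exactly one status line and returns the matching code.
check_free_percent() {
  free=$1
  warn=$2
  crit=$3
  case "$free" in
    ''|*[!0-9]*)
      echo "DISK UNKNOWN - invalid value: '$free'"
      return $STATE_UNKNOWN
      ;;
  esac
  if [ "$free" -lt "$crit" ]; then
    echo "DISK CRITICAL - free space: ${free}%"
    return $STATE_CRITICAL
  elif [ "$free" -lt "$warn" ]; then
    echo "DISK WARNING - free space: ${free}%"
    return $STATE_WARNING
  fi
  echo "DISK OK - free space: ${free}%"
  return $STATE_OK
}

# Example run; a real plugin would finish with: exit $?
check_free_percent 56 20 10
```

Keeping the check in a function that returns the code, and calling `exit` only at the very end, makes the logic easy to exercise from an interactive shell before handing it to Nagios.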

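The output must contain at least one line of text that reflects the state of the monitored application or service. For example: DISK OK - free space: / 3326 MB (56%);

3. Monitoring WebLogic: Implementation

WebLogic run-state information is obtained from the command line by invoking WebLogic's weblogic.Admin class. This class is powerful and can also be used to manage and configure WebLogic. Two commonly used commands follow.

1) Getting the server run state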

$ java weblogic.Admin -url ${URL} -username ${USER_NAME} -password ${PASS_WORD} get -pretty \
        -mbean "${DOMAIN_NAME}:Location=${SERVER_NAME},Name=${SERVER_NAME},Type=ServerRuntime"

2) Getting the JDBC pool run state

$ java weblogic.Admin -url ${URL} -username ${USER_NAME} -password ${PASS_WORD} GET -pretty \
        -mbean "${DOMAIN_NAME}:Location=${SERVER_NAME},Name=${POOL_NAME},ServerRuntime=${SERVER_NAME},Type=JDBCConnectionPoolRuntime"

Replace the ${...} variables with the actual values for your environment:

${URL}          WebLogic URL, for example t3://192.168.1.2:7002
${USER_NAME}    user name
${PASS_WORD}    password
${DOMAIN_NAME}  WebLogic domain name, for example mydomain
${SERVER_NAME}  server name
${POOL_NAME}    JDBC pool name
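To make the substitution concrete, here is a minimal sketch that assembles the two MBean names from their parts. The helper names `server_mbean` and `jdbcpool_mbean` are hypothetical; only the name format is taken from the two commands above, and the values mydomain, myserver and mypool are sample values.

```shell
#!/bin/sh
# server_mbean DOMAIN SERVER   (hypothetical helper)
# Builds the ServerRuntime MBean name in the form used by the first command.
server_mbean() {
  printf '%s:Location=%s,Name=%s,Type=ServerRuntime\n' "$1" "$2" "$2"
}

# jdbcpool_mbean DOMAIN SERVER POOL   (hypothetical helper)
# Builds the JDBCConnectionPoolRuntime MBean name used by the second command.
jdbcpool_mbean() {
  printf '%s:Location=%s,Name=%s,ServerRuntime=%s,Type=JDBCConnectionPoolRuntime\n' \
    "$1" "$2" "$3" "$2"
}

# Sample values only.
server_mbean mydomain myserver
jdbcpool_mbean mydomain myserver mypool
```

The resulting string is what would be passed, quoted, as the argument of `-mbean`.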

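Before running the commands above, set JAVA_HOME, add $JAVA_HOME/bin to PATH, and add WebLogic's weblogic81/server/lib/weblogic.jar to CLASSPATH.

4. The Shell Script

With a monitoring method in hand, we write our own shell script following the Nagios Plugin API rules. The script, check_wls.sh, is as follows: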

#!/bin/ksh
# check_wls.sh --jdbcpool url username password domainname servername poolname
# check_wls.sh --server url username password domainname servername
#
# Note: this script relies on ksh running the last command of a pipeline
# in the current shell ("... | read VAR" and "... | while read ..." keep
# their variables afterwards).

PROGNAME=`basename $0`
PROGPATH=`echo $0 | sed -e 's,[\\/][^\\/][^\\/]*$,,'`
REVISION=`echo '$Revision: 1749 $' | sed -e 's/[^0-9.]//g'`

# utils.sh ships with the Nagios plugins and defines STATE_OK etc.
. $PROGPATH/utils.sh

print_usage() {
  echo "Usage:"
  echo "  $PROGNAME --jdbcpool url username password domainname servername poolname"
  echo "  $PROGNAME --server url username password domainname servername"
  echo "  $PROGNAME --help"
  echo "  $PROGNAME --version"
}

print_help() {
  print_revision $PROGNAME $REVISION
  echo ""
  print_usage
  echo ""
  echo "Check Weblogic status"
  echo ""
  echo "--jdbcpool url username password domainname servername poolname"
  echo "   Check Weblogic JDBC Pool"
  echo "--server url username password domainname servername"
  echo "   Check Weblogic Server"
}

if [[ -z "$JAVA_HOME" ]]
then
  echo "Please set JAVA_HOME!"
  exit $STATE_UNKNOWN
fi

if [[ -z "$CLASSPATH" ]]
then
  echo "Please set CLASSPATH!"
  exit $STATE_UNKNOWN
else
  echo $CLASSPATH | grep "weblogic.jar" | wc -l | read N
  if [[ "$N" = "0" ]]
  then
    echo "Please add weblogic.jar to CLASSPATH!"
    exit $STATE_UNKNOWN
  fi
fi

PATH=$JAVA_HOME/bin:$PATH
export PATH

JDBC_TYPE="JDBCConnectionPoolRuntime"
SERVER_TYPE="ServerRuntime"

cmd="$1"

# Information options
case "$cmd" in
--help)
    print_help
    exit $STATE_OK
    ;;
-h)
    print_help
    exit $STATE_OK
    ;;
--version)
    print_revision $PROGNAME $REVISION
    exit $STATE_OK
    ;;
-V)
    print_revision $PROGNAME $REVISION
    exit $STATE_OK
    ;;
esac

case "$cmd" in
--server)
    URL=${2}
    USER_NAME=${3}
    PASS_WORD=${4}
    DOMAIN_NAME=${5}
    SERVER_NAME=${6}
    SERVER_INFO="${DOMAIN_NAME}:${SERVER_NAME}"

    RE=`java weblogic.Admin -url ${URL} -username ${USER_NAME} -password ${PASS_WORD} get -pretty \
        -mbean "${DOMAIN_NAME}:Location=${SERVER_NAME},Name=${SERVER_NAME},Type=${SERVER_TYPE}"`
    # A successful "get -pretty" contains at least one line starting with "-".
    # printf '%s\n' is used so that "%" in the output is not treated as a format.
    printf '%s\n' "${RE}" | grep ^"-" | wc -l | read N

    if [[ "$N" -lt  "1" ]]
    then
      #error
      printf '%s\n' "${RE}" | awk '{ printf "%s", $0 }' | read ERR_INFO
      echo "CRITICAL - ${ERR_INFO}"
      exit $STATE_CRITICAL
    fi

    if [[ "$N" -ge  "1" ]]
    then
      HEALTH_STATE=""
      RUN_STATE=""
      #HealthState State
      printf '%s\n' "${RE}" | while read NAME VALUE
      do
        #echo "NAME:${NAME} VALUE:${VALUE}"
        case "${NAME}" in
        HealthState:)
          HEALTH_STATE=${VALUE}
        ;;
        State:)
          RUN_STATE=${VALUE}
        ;;
        esac
      done

      #echo "HEALTH_STATE:${HEALTH_STATE}"
      #echo "RUN_STATE:${RUN_STATE}"

      # Keep the full value for messages, then extract the state keyword.
      HEALTH_STATE_INFO=${HEALTH_STATE}
      echo ${HEALTH_STATE_INFO} | awk -F, '{ print $1 }' | awk -F: '{ print $2 }' | read HEALTH_STATE

      #echo "HEALTH_STATE:${HEALTH_STATE}"
      #HEALTH_OK HEALTH_WARN HEALTH_CRITICAL HEALTH_FAILED

      if [[ "${RUN_STATE}" != "RUNNING" ]]
      then
        echo "CRITICAL - ${SERVER_INFO} State is ${RUN_STATE}"
        exit $STATE_CRITICAL
      fi

      case "${HEALTH_STATE}" in
      HEALTH_OK)
      ;;
      HEALTH_WARN)
        echo "WARN - ${SERVER_INFO} HealthState is ${HEALTH_STATE_INFO}"
        exit $STATE_WARNING
      ;;
      HEALTH_CRITICAL)
        echo "CRITICAL - ${SERVER_INFO} HealthState is ${HEALTH_STATE_INFO}"
        exit $STATE_CRITICAL
      ;;
      HEALTH_FAILED)
        echo "FAILED - ${SERVER_INFO} HealthState is ${HEALTH_STATE_INFO}"
        exit $STATE_CRITICAL
      ;;
      esac
    fi
    echo "OK - ${SERVER_INFO} State is ${RUN_STATE},HealthState is ${HEALTH_STATE_INFO}"
    exit $STATE_OK
    ;;
--jdbcpool)
    URL=${2}
    USER_NAME=${3}
    PASS_WORD=${4}
    DOMAIN_NAME=${5}
    SERVER_NAME=${6}
    POOL_NAME=${7}
    POOL_INFO="${DOMAIN_NAME}:${SERVER_NAME}:${POOL_NAME}"

    RE=`java weblogic.Admin -url ${URL} -username ${USER_NAME} -password ${PASS_WORD} GET -pretty \
        -mbean "${DOMAIN_NAME}:Location=${SERVER_NAME},Name=${POOL_NAME},ServerRuntime=${SERVER_NAME},Type=${JDBC_TYPE}"`
    printf '%s\n' "${RE}" | grep ^"-" | wc -l | read N

    if [[ "$N" -lt  "1" ]]
    then
      #error
      printf '%s\n' "${RE}" | awk '{ printf "%s", $0 }' | read ERR_INFO
      echo "CRITICAL - ${ERR_INFO}"
      exit $STATE_CRITICAL
    fi

    if [[ "$N" -ge  "1" ]]
    then
      POOL_STATE=""
      WAIT_CNT=""
      RUN_STATE=""
      printf '%s\n' "${RE}" | while read NAME VALUE
      do
        #PoolState WaitingForConnectionCurrentCount State
        #echo "NAME:${NAME} VALUE:${VALUE}"
        case "${NAME}" in
        PoolState:)
          POOL_STATE=${VALUE}
        ;;
        WaitingForConnectionCurrentCount:)
          WAIT_CNT=${VALUE}
        ;;
        State:)
          RUN_STATE=${VALUE}
        ;;
        esac
      done

      #echo "POOL_STATE:${POOL_STATE}"
      #echo "WAIT_CNT:${WAIT_CNT}"
      #echo "RUN_STATE:${RUN_STATE}"

      if [[ "${POOL_STATE}" != "true" ]]
      then
        echo "CRITICAL - ${POOL_INFO} PoolState is ${POOL_STATE}"
        exit $STATE_CRITICAL
      fi

      if [[ "${RUN_STATE}" != "Running" ]]
      then
        echo "CRITICAL - ${POOL_INFO} State is ${RUN_STATE}"
        exit $STATE_CRITICAL
      fi

      # Requests waiting for a connection indicate the pool is exhausted.
      if [[ "${WAIT_CNT}" -gt "0" ]]
      then
        echo "WARNING - ${POOL_INFO} WaitingForConnectionCurrentCount is ${WAIT_CNT}"
        exit $STATE_WARNING
      fi
    else
      #error
      printf '%s\n' "${RE}" | awk '{ printf "%s", $0 }' | read ERR_INFO
      echo "CRITICAL - ${ERR_INFO}"
      exit $STATE_CRITICAL
    fi
    echo "OK - ${POOL_INFO} State is ${RUN_STATE},PoolState is ${POOL_STATE},WaitingForConnectionCurrentCount is ${WAIT_CNT}"
    exit $STATE_OK
    ;;
*)
    print_usage
    exit $STATE_UNKNOWN
    ;;
esac
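The attribute parsing and the HealthState mapping in the script can be tried in isolation, without a WebLogic server. The sketch below is written for plain POSIX sh rather than ksh. The sample text in `SAMPLE` is an assumption inferred from how the script parses its input ("Name: value" lines, a separator line starting with "-"), not captured output, and the helper names `get_attr` and `health_exit_code` are hypothetical. Unlike the script, which treats an unrecognized health state as OK, this sketch returns UNKNOWN for it.

```shell
#!/bin/sh
# get_attr NAME   (hypothetical helper)
# Reads "Name: value" lines on stdin and prints the value of the first NAME.
get_attr() {
  while read -r name value; do
    if [ "$name" = "$1:" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  return 1
}

# health_exit_code VALUE   (hypothetical helper)
# Maps a value such as "State:HEALTH_OK,ReasonCode:[]" to a Nagios return code.
health_exit_code() {
  state=${1%%,*}      # text before the first comma, e.g. "State:HEALTH_OK"
  state=${state#*:}   # text after the first colon,  e.g. "HEALTH_OK"
  case "$state" in
    HEALTH_OK)                     return 0 ;;
    HEALTH_WARN)                   return 1 ;;
    HEALTH_CRITICAL|HEALTH_FAILED) return 2 ;;
    *)                             return 3 ;;
  esac
}

# Assumed sample of "get -pretty" output, for illustration only.
SAMPLE='---------------------------
MBeanName: "mydomain:Location=myserver,Name=myserver,Type=ServerRuntime"
    HealthState: State:HEALTH_OK,ReasonCode:[]
    State: RUNNING'

run_state=$(printf '%s\n' "$SAMPLE" | get_attr State)
health=$(printf '%s\n' "$SAMPLE" | get_attr HealthState)
rc=0
health_exit_code "$health" || rc=$?
echo "State=$run_state HealthState=$health exit=$rc"
```

Because each value is captured with command substitution, this form does not depend on the last stage of a pipeline running in the current shell, which is the ksh behavior the original script relies on.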

5. Configuring WebLogic Monitoring

Upload check_wls.sh to the libexec directory of the Nagios installation and create a symbolic link to it named check_wls:

$ ln -s ./check_wls.sh ./check_wls

Add the corresponding command definitions to the nrpe configuration file. The WebLogic settings used in this example are:

${URL} t3://172.17.1.2:7001
${USER_NAME} weblogic
${PASS_WORD} weblogic
${DOMAIN_NAME} mydomain
${SERVER_NAME} myserver
${POOL_NAME} mypool

Edit nrpe.cfg and add the following:

$ vi ./nrpe.cfg
... .... ... .... ... .... ... .... ... .... ... ....
#check weblogic [check_wls]
command[check_wls_server_myserver]=/usr/local/nagios//libexec/check_wls --server t3://172.2.10.2:7001 weblogic weblogic mydomain myserver
command[check_wls_jdbcpool_mypool]=/usr/local/nagios//libexec/check_wls --jdbcpool t3://172.2.10.2:7001 weblogic weblogic mydomain myserver mypool

Add the environment variables (CLASSPATH and JAVA_HOME) to the nrpe startup script:

... .... ... .... ... .... ... .... ... .... ... ....
JAVA_HOME=/data/bea/bea/jdk142_05
export JAVA_HOME
CLASSPATH=/data/bea/bea/weblogic81/server/lib/weblogic.jar
export CLASSPATH
... .... ... .... ... .... ... .... ... .... ... ....
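Since check_wls.sh refuses to run when weblogic.jar is missing from CLASSPATH, it can help to test the value before restarting nrpe. The following is a minimal sketch; the helper name `classpath_has_jar` is hypothetical, and it is slightly stricter than the `grep "weblogic.jar"` test in the script because it matches whole colon-separated entries only.

```shell
#!/bin/sh
# classpath_has_jar CLASSPATH JARNAME   (hypothetical helper)
# Succeeds if a colon-separated entry of CLASSPATH is JARNAME itself
# or a path ending in /JARNAME.
classpath_has_jar() {
  case ":$1:" in
    *"/$2:"*|*":$2:"*) return 0 ;;
    *)                 return 1 ;;
  esac
}

# Path taken from the startup-script fragment above.
CP=/data/bea/bea/weblogic81/server/lib/weblogic.jar
if classpath_has_jar "$CP" weblogic.jar; then
  echo "weblogic.jar found on CLASSPATH"
else
  echo "Please add weblogic.jar to CLASSPATH!"
fi
```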

Edit nagios.cfg on the monitoring host and add the following:

$ vi ./nagios.cfg
... .... ... .... ... .... ... .... ... .... ... ....
# Define a host for the local machine
define host{
        use                     linux-box            ; Name of host template to use
                                                     ; This host definition will inherit all variables that are defined
                                                     ; in (or inherited by) the linux-server host template definition.
        host_name               sol_172.2.10.2
        alias                   sol_172.2.10.2
        address                 172.2.10.2
        }

#the check_wls_server_myserver on the remote host.
define service{
        use                     generic-service
        host_name               sol_172.2.10.2
        service_description     Weblogic Server myserver
        check_command           check_nrpe!check_wls_server_myserver
        }

#the check_wls_jdbcpool_mypool on the remote host.
define service{
        use                     generic-service
        host_name               sol_172.2.10.2
        service_description     Weblogic JDBCPool mypool
        check_command           check_nrpe!check_wls_jdbcpool_mypool
        }
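Before relying on Nagios to schedule the checks, a plugin command can be run by hand and its return code translated into a service state. The wrapper below is a sketch under assumptions: `state_name`, `run_check` and `fake_plugin` are hypothetical names, `fake_plugin` only stands in for the real plugin so the example runs anywhere, and any return code other than 0, 1 or 2 is reported as UNKNOWN here.

```shell
#!/bin/sh
# state_name CODE: translate a plugin return code into a service state name.
state_name() {
  case "$1" in
    0) echo OK ;;
    1) echo WARNING ;;
    2) echo CRITICAL ;;
    *) echo UNKNOWN ;;
  esac
}

# run_check COMMAND [ARGS...]
# Runs a plugin command and prints its state followed by its first output line.
run_check() {
  rc=0
  output=$("$@" 2>&1) || rc=$?
  first_line=$(printf '%s\n' "$output" | head -n 1)
  echo "[$(state_name "$rc")] $first_line"
}

# Stand-in for a real plugin (hypothetical), so the sketch runs anywhere.
fake_plugin() {
  echo "CRITICAL - mydomain:myserver State is SHUTDOWN"
  return 2
}

run_check fake_plugin
# On the monitored host one might instead run, for example:
#   run_check /usr/local/nagios/libexec/check_wls --server \
#       t3://172.2.10.2:7001 weblogic weblogic mydomain myserver
```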

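Verify that the configuration is correct, then restart the nagios service on the monitoring host and the nrpe service on the remote host. Check the monitoring status in a web browser (IE).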

Figure 5.1

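This completes the configuration.

6. Conclusion

This article described one way to monitor WebLogic applications with Nagios: a shell script written to the Nagios Plugin API rules. It also briefly covered the configuration process and provided the shell source. Corrections are welcome.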

