DataStage Job Monitor problems

Symptom:

    In DataStage Director, the status of the extraction jobs is not displayed; every job shows READY.


Analysis:

    This happens when the DataStage JobMonApp service has not been started, or did not start correctly.


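Solution:

    1. Stop and start the service manually: change to the directory /datastage/pxengine/java/ and run: ./jobmoninit start

       Also check the log file: jobmonapp.log

    2. If that does not solve the problem, another possible cause is that the /etc/hosts file is missing the line: 127.0.0.1   localhost
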

Backup copy of the reference material:


Problem (Abstract)
DataStage job run statistics (for example, rows per second processed) do not update in the DataStage Designer or Director clients.
 
 
Diagnosing the problem
This section contains a quick series of diagnostic steps for those familiar with DataStage and the job monitor. If more detail is needed for any step, refer to the more detailed instructions in the "Resolving the problem" section.
Check and verify the following:
  1. Confirm the JobMonApp process is up and running: 
    ps -ef | grep JobMonApp
  2. Confirm default ports 13400 and 13401 are listening:
    netstat -an | grep 134
  3. Check job monitor log file for errors:
    cat /ibm/InformationServer/Server/PXEngine/java/JobMonApp.log
  4. Confirm the job monitor is set up to use ports 13400 and 13401:
    cat /ibm/InformationServer/Server/PXEngine/etc/jobmon_ports
  5. If the job monitor log shows no errors but the job log reports "Failed to connect to JobMonApp on port 13401", update the jobmon_ports file to use 2 new ports that are not already in use. This requires a restart of DataStage.
  6. If the problem still occurs, confirm that the /etc/hosts file contains the following entry:
    127.0.0.1 localhost
    Without a localhost entry, the job monitor will be unable to use the ports correctly.

 
 
Resolving the problem
DataStage jobs generate statistics (such as rows per second processed) which can be displayed on each link when a job is run via Designer. However, these statistics only update when the job monitor application, JobMonApp, is running.
JobMonApp is started with the jobmoninit script, located in the directory:
.../ibm/InformationServer/Server/PXEngine/java


Verify that JobMonApp is running

On Unix systems, you can enter the following command to confirm that the JobMonApp process is running:
ps -ef | grep JobMonApp
On Windows systems, you can first enter the "ksh" command to obtain a command shell prompt (if installed), from which you can enter the ps command. If the job monitor is running, you should see a process entry with a long description that includes the command string which started it and the 2 active ports requested. You will also see a second entry, which comes from the "grep" command itself, so you should see at least 2 matches if the job monitor is running, for example:
$ ps -ef | grep JobMonApp

  SYSTEM   4964   4044  0 09:22:27 con  0:00 C:/IBM/InformationServer/ASBNode/apps/jre/bin/java -Xrs -classpath C:/IBM/InformationServer/Server/PXEngine/java/JobMonApp.jar;C:/IBM/InformationServer/Server/PXEngine/java/xerces/xmlParserAPIs.jar;C:/IBM/InformationServer/Server/PXEngine/java/xerces/xercesImpl.jar JobMonApp 13400 13401 -debug
  dsadm    7788   4136  0 16:48:50 con  0:00 grep JobMonApp

However, on some platforms such as Solaris, the ps output may be truncated, and since JobMonApp appears at the end of the command string, the above command may not find a match even though JobMonApp is running. In this situation, you can instead look in the .jobmonpid file, located in the same directory as jobmoninit. The .jobmonpid file contains the process id last used for JobMonApp. You can then query that process id to see if it is running. For example, if the .jobmonpid file contains 3174, enter the command:
$ ps -ef | grep 3174

If JobMonApp is not running, then run the "jobmoninit" command script to restart it. 
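
The two checks above can be sketched as small shell helpers. This is only an illustration under stated assumptions: the function names are hypothetical and not part of DataStage, and the path to the .jobmonpid file is passed in as an argument. The bracketed pattern [J]obMonApp is a common grep idiom that keeps the grep process itself out of the match list, so any output means a real match.

```shell
# Illustrative helpers (hypothetical names, not shipped with DataStage).

# List JobMonApp processes. The [J] bracket stops grep from matching its own
# command line, so there is no extra "grep" entry in the output.
jobmon_ps() {
    ps -ef | grep '[J]obMonApp'
}

# Succeed if the pid stored in the given .jobmonpid file belongs to a live
# process. Useful where ps truncates long command lines.
jobmon_pid_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(tr -d '[:space:]' < "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric content
    esac
    # kill -0 only tests for existence, but fails for processes owned by
    # another user, so fall back to ps in that case.
    kill -0 "$pid" 2>/dev/null || ps -p "$pid" > /dev/null 2>&1
}
```

For example, from the directory that contains jobmoninit, "jobmon_pid_alive .jobmonpid && echo running" prints "running" only when the recorded process still exists. A live pid does not by itself prove that the process is JobMonApp, since process ids can be reused after a restart.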


Check for errors in JobMonApp.log

If JobMonApp is running, but your jobs do not update statistics, then the next place to check for an error is the JobMonApp.log file written to the above directory. Historical logs are also saved in the same directory. During a normal startup, JobMonApp requires that its 2 defined ports be available and not used by other programs. One port is used to communicate with the job, while the other port is used to communicate with the DataStage engine. Both ports must be available, so if the startup messages indicate that one port is available and the other has a conflict, the job monitor will not function correctly.

A normal startup log will appear as follows:
WELCOME to the Job Mon Application.
Tue Jun 30 09:22:28 PDT 2009
Using ports: 13400 and 13401

A startup with port conflict will instead contain:
Tue May 19 13:48:19 CDT 2009 
Using ports: 13400 and 13401 
Could not listen on port: 13400 Address already in use

Additionally, if the failing port is used to communicate with the running job, it may cause an additional error to appear in the job log:
Failed to connect to JobMonApp on port 13401
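
As a rough sketch, the port-conflict message shown above can be pulled out of a log file with sed. The function name is hypothetical, and the pattern assumes the message text is exactly as in the sample log; other versions may word it differently.

```shell
# Print each port that the log reports JobMonApp could not listen on,
# one per line. No output means no such message was found.
# Hypothetical helper; the message format is taken from the sample log above.
jobmon_port_conflicts() {
    sed -n 's/.*Could not listen on port: \([0-9][0-9]*\).*/\1/p' "$1"
}
```

Calling "jobmon_port_conflicts JobMonApp.log" on the conflicting startup log above would print the conflicting port number.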


Resolving port conflicts for JobMonApp

To resolve port conflict issues for JobMonApp, use the following command to determine the current usage of each port used by the job monitor, for example:
netstat -a | grep 134
Confirm that the output shows both ports 13400 and 13401.

If the ports are found, they should have a status of "LISTENING" (most Unix systems display this state as "LISTEN"). If the status is CLOSE_WAIT or something else, it could indicate that an older instance of DataStage or JobMonApp did not successfully release the port. While some operating systems have commands to force the release of the port or to kill an application holding the port, in some cases it may take a system restart to free the port.
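
A minimal sketch of the status check, assuming netstat prints the local address as either "addr:port" or "addr.port" followed by whitespace. Column layout differs between platforms, so treat the pattern as a starting point rather than a definitive parser. The function name is hypothetical.

```shell
# Read netstat output on stdin and succeed if the given port appears on a
# line in a listening state. Matches both "LISTEN" and "LISTENING", and both
# "addr:port" and "addr.port" local-address styles.
port_is_listening() {
    grep -E "[.:]$1[[:space:]].*LISTEN" > /dev/null
}
```

For example, "netstat -an | port_is_listening 13400 || echo 'port 13400 is not listening'" reports a port that is missing or stuck in another state such as CLOSE_WAIT.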

If this port conflict continues even after a system restart, then multiple applications may have been set up to use the same port. If you are running multiple DataStage instances on one server, you should check the /etc/services file to confirm your ports have not been allocated to multiple applications. Then look at the following file:
.../ibm/InformationServer/Server/PXEngine/etc/jobmon_ports
This file contains 2 variables that define the ports used by JobMonApp:
APT_JOBMON_PORT1=13400
APT_JOBMON_PORT2=13401
For systems with multiple instances of DataStage running, ensure that each instance is using a separate set of ports for the job monitor application.

If two DataStage instances are using the same job monitor ports, you will need to update this file for one instance. When the above 2 values are changed, both DataStage and JobMonApp will need to be stopped and restarted before the change takes effect.
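
To compare two instances, something like the following could be used. It is a sketch that assumes each jobmon_ports file contains plain APT_JOBMON_PORT1=... and APT_JOBMON_PORT2=... lines as shown above; the function names are hypothetical.

```shell
# Print the job monitor ports defined in a jobmon_ports file, one per line.
jobmon_ports_in() {
    sed -n 's/^[[:space:]]*APT_JOBMON_PORT[12]=\([0-9][0-9]*\).*/\1/p' "$1"
}

# Print every port that appears in both jobmon_ports files.
# No output means the two instances do not share a job monitor port.
jobmon_shared_ports() {
    first=$(jobmon_ports_in "$1")
    for p in $(jobmon_ports_in "$2"); do
        for q in $first; do
            [ "$p" = "$q" ] && echo "$p"
        done
    done
    return 0
}
```

Run it with the jobmon_ports file of each instance as the two arguments. Any port printed needs to be changed in one of the two files.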

Also confirm that your /etc/hosts file on DataStage server machine contains the following entry:
127.0.0.1 localhost
Without a localhost definition, the job monitor may not be able to communicate correctly on the above ports.
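
The hosts file check can also be scripted. The sketch below uses awk and a hypothetical function name. It only looks for an uncommented 127.0.0.1 line that names localhost, and does not validate anything else in the file.

```shell
# Succeed if the hosts file (default: /etc/hosts) maps 127.0.0.1 to the
# name "localhost" on a line that is not commented out.
has_localhost_entry() {
    awk '$1 == "127.0.0.1" {
             for (i = 2; i <= NF; i++) {
                 if ($i ~ /^#/) break          # rest of the line is a comment
                 if ($i == "localhost") found = 1
             }
         }
         END { exit !found }' "${1:-/etc/hosts}"
}
```

For example, "has_localhost_entry || echo 'localhost entry missing from /etc/hosts'" prints a warning only when the entry is absent.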

If no port conflict exists, and no port errors are found in the JobMonApp.log file, but the log does contain other errors, it may be necessary to contact Information Server technical support if the error message does not give a clear cause of the problem. 


Running JobMonApp with debug output

If no errors appear in log file, or if more detailed error messages are needed, you can run JobMonApp in debug mode. To enable debug output you will modify jobmoninit, so first create a backup copy of jobmoninit. Next, edit jobmoninit and find the section of script for your current operating system, and then locate a line similar to:
nohup $APT_ORCHHOME/java/jre/bin/java -classpath $CLASSPATH JobMonApp $jobmon_port1 $jobmon_port2 > $logfile 2>&1 &
Add the option "-debug" after the second port, for example:
nohup $APT_ORCHHOME/java/jre/bin/java -classpath $CLASSPATH JobMonApp $jobmon_port1 $jobmon_port2 -debug > $logfile 2>&1 &
Some platforms will have 2 commands listed, one with -Xrs option and one without. In that situation you can update both lines. After making this change, stop and restart the job monitor application. Additional debug output should now appear in the log file which may provide more insight into cause of JobMonApp problems.
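
The edit can be sketched with sed, assuming the launch line in your jobmoninit matches the example above (JobMonApp, the two port variables, then the output redirection). If the line in your copy differs, edit it by hand. The function name is hypothetical. It writes the modified script to stdout rather than changing the file in place, so the backup stays untouched.

```shell
# Write a copy of a jobmoninit-style script to stdout with "-debug" added
# after the second port argument on every matching launch line.
# Lines that already carry -debug no longer match, so they are left alone.
add_debug_flag() {
    sed 's/\(JobMonApp \$jobmon_port1 \$jobmon_port2\) >/\1 -debug >/' "$1"
}
```

One possible sequence is "cp jobmoninit jobmoninit.bak" followed by "add_debug_flag jobmoninit.bak > jobmoninit", which reads from the backup and overwrites the original. Both launch lines (with and without -Xrs) are updated if they follow the same pattern.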


Contacting technical support for job monitor problems

When contacting technical support with a job monitor issue, provide the following files and details:
  • OS platform/release info
  • Problem symptoms/errors
  • Version.xml
  • JobMonApp.log
  • jobmon_ports file
  • /etc/services file
  • output of command: ps -ef | grep JobMonApp
  • output of command: netstat -a
