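Description: A WebSphere management (deployment manager) node and a managed node, Custom01, are installed on 192.168.200.201, and a managed node, Custom02, is installed on 192.168.200.202. I started the management node and the managed node Custom01 on 192.168.200.201 with the commands ./startManager.sh and ./startNode.sh. I then tried to start the managed node Custom02 on 192.168.200.202, which failed with the following error: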
************ Start Display Current Environment ************
Host Operating System is Linux, version 2.6.18-128.el5xen
Java version = J2RE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460-20080816_22093 (JIT enabled, AOT enabled)
J9VM - 20080816_022093_LHdSMr
JIT - r9_20080721_1330ifx2
GC - 20080724_AA, Java Compiler = j9jit24, Java VM name = IBM J9 VM
was.install.root = /opt/IBM/WebSphere/AppServer
user.install.root = /opt/IBM/WebSphere/AppServer/profiles/Custom01
Java Home = /opt/IBM/WebSphere/AppServer/java/jre
ws.ext.dirs = /opt/IBM/WebSphere/AppServer/java/lib:/opt/IBM/WebSphere/AppServer/classes:/opt/IBM/WebSphere/AppServer/lib:/opt/IBM/WebSphere/AppServer/installedChannels:/opt/IBM/WebSphere/AppServer/lib/ext:/opt/IBM/WebSphere/AppServer/web/help:/opt/IBM/WebSphere/AppServer/deploytool/itp/plugins/com.ibm.etools.ejbdeploy/runtime
Classpath = /opt/IBM/WebSphere/AppServer/profiles/Custom01/properties:/opt/IBM/WebSphere/AppServer/properties:/opt/IBM/WebSphere/AppServer/lib/startup.jar:/opt/IBM/WebSphere/AppServer/lib/bootstrap.jar:/opt/IBM/WebSphere/AppServer/lib/lmproxy.jar:/opt/IBM/WebSphere/AppServer/lib/urlprotocols.jar:/opt/IBM/WebSphere/AppServer/java/lib/tools.jar
Java Library path = /opt/IBM/WebSphere/AppServer/java/jre/lib/amd64/default:/opt/IBM/WebSphere/AppServer/java/jre/lib/amd64:/opt/IBM/WebSphere/AppServer/bin::/usr/lib
Current trace specification = *=info
************* End Display Current Environment *************
[12/8/10 16:17:11:018 GMT+08:00] 00000000 ManagerAdmin I TRAS0017I: The startup trace state is *=info.
[12/8/10 16:17:11:102 GMT+08:00] 00000000 AdminTool A ADMU3100I: Reading configuration for server: nodeagent
[12/8/10 16:17:11:210 GMT+08:00] 00000000 WsServerLaunc E ADMU3002E: Exception attempting to process server nodeagent
[12/8/10 16:17:11:212 GMT+08:00] 00000000 WsServerLaunc E ADMU3007E: Exception java.io.FileNotFoundException: /opt/IBM/WebSphere/AppServer/profiles/Custom01/config/cells/server-app-mh2Node01Cell/nodes/server-app-mh2Node01/servers/nodeagent/server.xml (No such file or directory)
I looked in that directory, and the file really does not exist.
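The check above can be scripted. The following is a minimal sketch, assuming the profile layout shown in the error message (config/cells/<cell>/nodes/<node>/servers/nodeagent/server.xml). The function name has_nodeagent_config and the PROFILE_ROOT variable are hypothetical names introduced here for illustration. They are not part of WebSphere.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the profile contains a node agent
# configuration file. Its absence suggests the node was never federated.
has_nodeagent_config() {
    profile_root=$1
    # If the glob matches nothing, the literal pattern is tested and -f fails.
    for f in "$profile_root"/config/cells/*/nodes/*/servers/nodeagent/server.xml; do
        [ -f "$f" ] && return 0
    done
    return 1
}

# Assumed default: the profile path taken from the log above.
PROFILE_ROOT=${PROFILE_ROOT:-/opt/IBM/WebSphere/AppServer/profiles/Custom01}

if has_nodeagent_config "$PROFILE_ROOT"; then
    echo "nodeagent configuration found under $PROFILE_ROOT"
else
    echo "no nodeagent configuration under $PROFILE_ROOT; the node may not have been federated yet"
fi
```

Set PROFILE_ROOT to another profile directory to check a different node.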
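I then searched online for the cause. Someone suggested the following:
1. Has your node been added to the cell? Have you run addNode.sh?
2. Update WAS to fix pack 6.1.0.29 and try again.
3. If it still does not work, I suggest you stop struggling with a WAS administration problem. Delete the DMGR and node profiles right away; spending a week on a WAS administration problem is not worth it. Rebuild the cell and node with pmt.sh immediately.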
So I ran ./addNode.sh 192.168.200.201 8879 and entered the user name and password. The output was as follows:
ADMU0001I: Begin federation of node newNode1 with Deployment Manager at
DeploymentMgr:8879.
ADMU0009I: Successfully connected to Deployment Manager Server: DeploymentMgr:8879
ADMU0505I: Servers found in configuration:
ADMU0506I: Server name: newServer1
ADMU2010I: Stopping all server processes for node newNode1
ADMU0512I: Server newServer1 cannot be reached. It appears to be stopped.
ADMU0024I: Deleting the old backup directory.
ADMU0015I: Backing up the original cell repository.
ADMU0012I: Creating Node Agent configuration for node: newNode1
ADMU0027E: An error occurred during federation ADMU0036E: The Deployment
Manager cannot lookup by name host newHost1 at address
10.230.21.184; rolling back to original configuration.
ADMU0211I: Error details may be seen in the file:
/…/profile/logs/addNode.log
ADMU0026I: An error occurred during federation; rolling back to original
configuration.
ADMU0111E: Program exiting with error:
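I kept searching and found a solution:
Solution:
Make sure newHost1 and the Deployment Manager host can reach each other,
or check /etc/hosts on both machines to make sure each file contains both host names.
So I added the other machine's IP address and host name on each of the two machines, as follows:
Run vi /etc/hosts and enter: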
127.0.0.1 localhost.localdomain localhost
192.168.1.100 server1 server1
192.168.1.120 server2 server2
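Before rerunning addNode.sh, it can help to confirm on each machine that both host names are actually listed. The following is a minimal sketch. The function name host_in_hosts_file is hypothetical, and server1 and server2 are the example host names from the entries above. It only inspects the hosts file and does not test network reachability.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if NAME appears as a host name or alias
# (any field after the IP address) in the given hosts file.
host_in_hosts_file() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v name="$name" '
        { sub(/#.*/, "") }    # strip comments
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Check the two example host names against the local /etc/hosts.
for h in server1 server2; do
    if host_in_hosts_file "$h"; then
        echo "$h: listed in /etc/hosts"
    else
        echo "$h: missing from /etc/hosts"
    fi
done
```

Run it on both machines; each one should report both names as listed.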
I ran ./addNode.sh 192.168.200.201 8879 again, and this time it succeeded. The managed node started along with it. Everything is OK.