Handling the "execution failed" error from root.sh in Oracle 11gR2 Grid Infrastructure

On Red Hat Linux 6.1, running the /u01/app/product/11.2.0/crs/root.sh script for Oracle reported the following errors:

/u01/app/product/11.2.0/crs/bin/srvctl start nodeapps -n beiku1 ... failed
FirstNode configuration failed at /u01/app/product/11.2.0/crs/crs/install/ line 9379.
/u01/app/product/11.2.0/crs/perl/bin/perl -I/u01/app/product/11.2.0/crs/perl/lib -I/u01/app/product/11.2.0/crs/crs/install /u01/app/product/11.2.0/crs/crs/install/ execution failed

The error output shows that srvctl start nodeapps -n beiku1 failed, so try running that command manually:

[grid@beiku1 bin]$ ./srvctl start nodeapps -n beiku1
PRCR-1013 : Failed to start resource ora.ons
PRCR-1064 : Failed to start resource ora.ons on node beiku1
CRS-5016: Process "/u01/app/product/11.2.0/crs/opmn/bin/onsctli" spawned by agent "/u01/app/product/11.2.0/crs/bin/oraagent.bin" for action "start" failed: details at "(:CLSN00010:)" in "/u01/app/product/11.2.0/crs/log/beiku1/agent/crsd/oraagent_grid/oraagent_grid.log"
CRS-2674: Start of 'ora.ons' on 'beiku1' failed

The error is Start of 'ora.ons' on 'beiku1' failed, so check the $ORACLE_HOME/cfgtoollogs/crsconfig/rootcrs_$HOSTNAME.log log file:

[grid@beiku1 crs]$ cd $ORACLE_HOME/cfgtoollogs/crsconfig/
[grid@beiku1 crsconfig]$ ls -lrt
total 332
-rwxrwxr-x 1 grid oinstall  81336 Aug 26 15:36 srvmcfg0.log
-rwxrwxr-x 1 grid oinstall  18719 Aug 26 15:36 srvmcfg1.log
-rwxrwxr-x 1 grid oinstall  23213 Aug 26 15:36 srvmcfg2.log
-rwxrwxr-x 1 grid oinstall  24700 Aug 26 15:36 srvmcfg3.log
-rwxrwxr-x 1 grid oinstall  10705 Aug 26 15:36 srvmcfg4.log
-rwxrwxr-x 1 grid oinstall  25594 Aug 26 15:37 srvmcfg5.log
-rwxrwxr-x 1 grid oinstall 132771 Aug 26 15:37 rootcrs_beiku1.log
[grid@beiku1 crsconfig]$ cat rootcrs_beiku1.log
2015-08-26 15:36:52: J2EE (OC4J) Container Resource Add Wallet ... passed ...
2015-08-26 15:36:52: Running as user grid: /u01/app/product/11.2.0/crs/bin/qosctl -autogenerate
2015-08-26 15:36:52: s_run_as_user2: Running /bin/su grid -c ' /u01/app/product/11.2.0/crs/bin/qosctl -autogenerate '
2015-08-26 15:36:54: Removing file /tmp/fileoriV8Q
2015-08-26 15:36:54: Successfully removed file: /tmp/fileoriV8Q
2015-08-26 15:36:54: /bin/su successfully executed

2015-08-26 15:36:54: qosctl output: User qosadmin added successfully.

User oc4jadmin added successfully.

2015-08-26 15:36:54: Running as user grid: /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user oc4jadmin
2015-08-26 15:36:54: s_run_as_user2: Running /bin/su grid -c ' /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user oc4jadmin '
2015-08-26 15:36:55: Removing file /tmp/fileHsIIY7
2015-08-26 15:36:55: Successfully removed file: /tmp/fileHsIIY7
2015-08-26 15:36:55: /bin/su successfully executed

2015-08-26 15:36:55: Running as user grid: /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user qosadmin
2015-08-26 15:36:55: s_run_as_user2: Running /bin/su grid -c ' /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user qosadmin '
2015-08-26 15:36:55: Removing file /tmp/fileQXtLZo
2015-08-26 15:36:55: Successfully removed file: /tmp/fileQXtLZo
2015-08-26 15:36:55: /bin/su successfully executed

2015-08-26 15:36:55: Invoking "/u01/app/product/11.2.0/crs/bin/srvctl add cvu"
2015-08-26 15:36:55: trace file=/u01/app/product/11.2.0/crs/cfgtoollogs/crsconfig/srvmcfg5.log
2015-08-26 15:36:55: Running as user grid: /u01/app/product/11.2.0/crs/bin/srvctl add cvu
2015-08-26 15:36:55:   Invoking "/u01/app/product/11.2.0/crs/bin/srvctl add cvu" as user "grid"
2015-08-26 15:36:55: Executing /bin/su grid -c "/u01/app/product/11.2.0/crs/bin/srvctl add cvu"
2015-08-26 15:36:55: Executing cmd: /bin/su grid -c "/u01/app/product/11.2.0/crs/bin/srvctl add cvu"
2015-08-26 15:36:57: add cvu ... success
2015-08-26 15:36:57: starting nodeapps...
2015-08-26 15:36:57: DHCP_flag=0
2015-08-26 15:36:57: nodes_to_start=beiku1
2015-08-26 15:37:18: exit value of start nodeapps/vip is 1
2015-08-26 15:37:18: output for start nodeapps is  PRCR-1013 : Failed to start resource ora.ons PRCR-1064 : Failed to start resource ora.ons on node beiku1 CRS-5016: Process "/u01/app/product/11.2.0/crs/opmn/bin/onsctli" spawned by agent "/u01/app/product/11.2.0/crs/bin/oraagent.bin" for action "start" failed: details at "(:CLSN00010:)" in "/u01/app/product/11.2.0/crs/log/beiku1/agent/crsd/oraagent_grid/oraagent_grid.log" CRS-2674: Start of 'ora.ons' on 'beiku1' failed
2015-08-26 15:37:18: output of startnodeapp after removing already started mesgs is PRCR-1013 : Failed to start resource ora.ons PRCR-1064 : Failed to start resource ora.ons on node beiku1 CRS-5016: Process "/u01/app/product/11.2.0/crs/opmn/bin/onsctli" spawned by agent "/u01/app/product/11.2.0/crs/bin/oraagent.bin" for action "start" failed: details at "(:CLSN00010:)" in "/u01/app/product/11.2.0/crs/log/beiku1/agent/crsd/oraagent_grid/oraagent_grid.log" CRS-2674: Start of 'ora.ons' on 'beiku1' failed
2015-08-26 15:37:18: /u01/app/product/11.2.0/crs/bin/srvctl start nodeapps -n beiku1 ... failed
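
The rootcrs log is long (about 130 KB here), so it can help to jump straight to the failing step rather than reading the whole file with cat. The following is a minimal sketch and not part of the original procedure: rootcrs_failures is a hypothetical helper name, and the two patterns it searches for are taken only from the log lines shown above, so other failure messages may need additional patterns.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a rootcrs log that record a failed step.
# Usage: rootcrs_failures /path/to/rootcrs_<host>.log
rootcrs_failures() {
    # -n prefixes each match with its line number in the log.
    # Pattern 1: steps reported as "<command> ... failed".
    # Pattern 2: "exit value of <step> is N" with a non-zero N.
    grep -n -E '\.\.\. failed|exit value of .* is [1-9]' "$1"
}
```

For example, rootcrs_failures "$ORACLE_HOME/cfgtoollogs/crsconfig/rootcrs_beiku1.log" should list the "exit value of start nodeapps/vip is 1" line and the final "srvctl start nodeapps -n beiku1 ... failed" line.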

Check the $GRID_HOME/opmn/logs/ons.log.* files for one of the following errors:

[grid@beiku1 oraagent_grid]$ cd $ORACLE_HOME/opmn/logs/
[grid@beiku1 logs]$ ls -lrt
total 8
-rw-r--r-- 1 grid oinstall 576 Aug 26 15:48 ons.log.beiku1
-rw-r--r-- 1 grid oinstall 267 Aug 26 15:48 ons.out
[grid@beiku1 logs]$ cat ons.log.beiku1
[2015-08-26T15:37:02+08:00] [internal] getaddrinfo(::0, 6200, 1) failed (Hostname and service name not provided or found): Connection timed out

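If the log contains the error above, the cause is that localhost does not map to 127.0.0.1 in /etc/hosts. The fix is to make sure that localhost is set up correctly in both DNS and /etc/hosts. Whether DNS or /etc/hosts is consulted depends on the name-resolution settings in /etc/nsswitch.conf or /etc/netsvc.conf, depending on the platform. See MOS documents ID 942166.1 or ID 969254.1 for how to handle this.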
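
The localhost mapping can be checked with a short script. This is only a sketch: localhost_addrs and localhost_ok are hypothetical helper names, and the check looks only at a hosts file, not at DNS or at the lookup order configured in /etc/nsswitch.conf.

```shell
#!/bin/sh
# Hypothetical helper: print every address that a hosts file maps to the name "localhost".
# Usage: localhost_addrs [hosts_file]   (defaults to /etc/hosts)
localhost_addrs() {
    awk '
        { sub(/#.*/, "") }               # drop comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++)      # field 1 is the address, the rest are names
              if ($i == "localhost") { print $1; break } }
    ' "${1:-/etc/hosts}"
}

# Succeed only if 127.0.0.1 is among the addresses mapped to localhost.
localhost_ok() {
    localhost_addrs "$1" | grep -qx '127\.0\.0\.1'
}
```

Running localhost_ok && echo "localhost is fine" on the affected node gives a quick yes or no. On Linux, getent hosts localhost additionally shows what the resolver returns after applying /etc/nsswitch.conf.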


Another error that may appear in ons.log is a bind failure because the port is already in use:

[grid@beiku1 oraagent_grid]$ cd $ORACLE_HOME/opmn/logs/
[grid@beiku1 logs]$ ls -lrt
total 8
-rw-r--r-- 1 grid oinstall 576 Aug 26 15:48 ons.log.beiku1
-rw-r--r-- 1 grid oinstall 267 Aug 26 15:48 ons.out
[grid@beiku1 logs]$ cat ons.log.beiku1
[2015-08-26T15:37:02+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:37:02+08:00] [ons] [ERROR:1] [17] [ons-listener] any: BIND (Address already in use)
[2015-08-26T15:39:42+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:39:42+08:00] [ons] [ERROR:1] [17] [ons-listener] any: BIND (Address already in use)
[2015-08-26T15:48:40+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:48:40+08:00] [ons] [ERROR:1] [17] [ons-listener] any: BIND (Address already in use)


The ONS ports are defined in ons.config, and lsof shows which process is listening on port 6200:

[grid@beiku1 logs]$ grep port $ORACLE_HOME/opmn/conf/ons.config
localport=6100          # line added by Agent
remoteport=6200         # line added by Agent

[root@beiku1 /]# lsof | grep 6200 | grep LISTEN
ons       16413      grid    6u     IPv6     162533                  TCP *:6200 (LISTEN)

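As shown, the ons process with PID 16413 is holding port 6200. The fix is to make sure the port is not in use by any other process. If the port is already occupied before rootupgrade.sh is run for an upgrade, the likely cause is that the ons process of the old version is still running.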
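
Rather than hard-coding 6200, the port can be read from ons.config before checking for a listener. A minimal sketch, assuming the key=value layout shown in the grep output above; ons_port is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric value of one key from an ons.config-style file.
# Usage: ons_port <key> <config_file>
ons_port() {
    # Match "<key>=<digits>" at the start of a line and print only the digits;
    # anything after the number (such as "# line added by Agent") is ignored.
    sed -n "s/^[[:space:]]*$1=\([0-9][0-9]*\).*/\1/p" "$2"
}

# Example (run as root so that lsof can see processes of other users):
#   port=$(ons_port remoteport "$ORACLE_HOME/opmn/conf/ons.config")
#   lsof -iTCP:"$port" -sTCP:LISTEN
```

With the configuration shown above, ons_port remoteport returns 6200 and ons_port localport returns 6100.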


A third error that may appear in ons.log is a bind failure on an IPv6 address:

[grid@beiku1 oraagent_grid]$ cd $ORACLE_HOME/opmn/logs/
[grid@beiku1 logs]$ ls -lrt
total 8
-rw-r--r-- 1 grid oinstall 576 Aug 26 15:48 ons.log.beiku1
-rw-r--r-- 1 grid oinstall 267 Aug 26 15:48 ons.out
[grid@beiku1 logs]$ cat ons.log.beiku1
[2015-08-26T15:48:40+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:48:40+08:00] [ons] [ERROR:1] [17] [ons-listener] 0000:0000:0000:0000:0000:0000:0000:0001,6100: BIND (Cannot assign requested address)

This can happen when IPv6 is only partially configured; 11gR2 Grid Infrastructure does not support IPv6. The fix is to set the following parameter in the $GRID_HOME/opmn/conf/ons.config and ons.config. files:


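In this environment the error was the port conflict ("Address already in use"), so the ons process holding port 6200 was killed:

[root@beiku1 /]# lsof | grep 6200 | grep LISTEN
ons       16413      grid    6u     IPv6     162533                  TCP *:6200 (LISTEN)
[root@beiku1 /]# kill -9 16413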
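
Because grep 6200 also matches PIDs or other fields that merely contain "6200", a stricter filter can be useful before killing anything. The following is a sketch under the assumption that lsof output ends with the address and "(LISTEN)" as in the line above; listening_pids is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read lsof output on stdin and print the PID of every
# process listening on the given TCP port.
# Usage: lsof -nP -iTCP | listening_pids 6200
listening_pids() {
    awk -v port="$1" '
        # Last field must be "(LISTEN)" and the field before it must end in ":<port>".
        $NF == "(LISTEN)" && $(NF-1) ~ (":" port "$") { print $2 }
    ' | sort -u
}
```

A cautious sequence is to confirm the owner of each PID first, for example with ps -p 16413 -o comm=, and to try a plain kill (SIGTERM) before resorting to kill -9.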


The root.sh script was then run again, and this time it completed successfully:

[root@beiku1 /]# ./u01/app/product/11.2.0/crs/
Performing root user operation for Oracle 11g

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/product/11.2.0/crs

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/product/11.2.0/crs/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
PRKO-2190 : VIP exists for node beiku1, VIP name beiku1-vip
Preparing packages for installation...
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
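
The three ons.log signatures discussed in this note can be told apart with a small helper. This is a sketch only: classify_ons_log and the labels it prints are hypothetical, and the patterns are copied from the log excerpts above, so other ONS failures are reported as "unknown".

```shell
#!/bin/sh
# Hypothetical helper: map the errors in an ONS log to one of the causes described above.
# Usage: classify_ons_log /path/to/ons.log.<host>
classify_ons_log() {
    if grep -q 'getaddrinfo.*failed' "$1"; then
        echo "name-resolution"    # localhost does not resolve to 127.0.0.1
    elif grep -q 'BIND (Address already in use)' "$1"; then
        echo "port-in-use"        # another process already holds the ONS port
    elif grep -q 'BIND (Cannot assign requested address)' "$1"; then
        echo "ipv6-bind"          # IPv6 is only partially configured
    else
        echo "unknown"
    fi
}
```

For the log captured in this case, classify_ons_log "$ORACLE_HOME/opmn/logs/ons.log.beiku1" would print port-in-use.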



