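Chapter 6: Error Handling

6.1 Handling a root.sh Script Failure

While installing Grid Infrastructure, running the root.sh script fails with the following error: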

Adding Clusterware entries to inittab

USM driver install actions failed

/u01/app/11.2.0/grid/perl/bin/perl -I/u01/app/11.2.0/grid/perl/lib -I/u01/app/11.2.0/grid/crs/install /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed

A search on My Oracle Support (MOS) shows that this is a known bug: "ROOT.SH OR ACFSROOT INSTALL, FAILS: ACFS-9109: SLES11 SP3". Apply patch 17475946 on node 1 and on node 2 in turn.

unzip p17475946_112040_Linux-x86-64.zip

grid@jason2:/u01/app/11.2.0/grid/OPatch> ./opatch napply -oh /u01/app/11.2.0/grid -local /mnt/17475946

Oracle Interim Patch Installer version 11.2.0.3.4

Copyright (c) 2012, Oracle Corporation.  All rights reserved.

Oracle Home       : /u01/app/11.2.0/grid

Central Inventory : /u01/app/oraInventory

   from           : /u01/app/11.2.0/grid/oraInst.loc

OPatch version    : 11.2.0.3.4

OUI version       : 11.2.0.4.0

Log file location : /u01/app/11.2.0/grid/cfgtoollogs/opatch/opatch2015-05-15_09-04-45AM_1.log

Verifying environment and performing prerequisite checks...

OPatch continues with these patches:   17475946 

Do you want to proceed? [y|n]

y

User Responded with: Y

All checks passed.

Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.

(Oracle Home = '/u01/app/11.2.0/grid')

Is the local system ready for patching? [y|n]

y

User Responded with: Y

Backing up files...

Applying interim patch '17475946' to OH '/u01/app/11.2.0/grid'

Patching component oracle.usm, 11.2.0.4.0...

Verifying the update...

Patch 17475946 successfully applied.

Log file location: /u01/app/11.2.0/grid/cfgtoollogs/opatch/opatch2015-05-15_09-04-45AM_1.log

OPatch succeeded.

grid@jason2:/u01/app/11.2.0/grid/OPatch>
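Since the patch has to be applied on both nodes, it can help to confirm on each node that it is actually recorded in the inventory. The following is a minimal sketch, assuming the output of opatch lsinventory is piped into it. The helper name patch_listed is hypothetical and not part of OPatch. It only searches the text for the patch number, so it makes no assumption about the exact layout of the inventory report.

```shell
#!/bin/sh
# patch_listed PATCH_ID
# Reads an inventory report on stdin and succeeds (exit status 0) if
# PATCH_ID appears in it as a whole word. Hypothetical helper.
patch_listed() {
    grep -qw -- "$1"
}

# Possible usage on a patched node (paths taken from the session above):
#   /u01/app/11.2.0/grid/OPatch/opatch lsinventory -oh /u01/app/11.2.0/grid \
#       | patch_listed 17475946 && echo "patch 17475946 is in the inventory"
```

Matching on whole words keeps a longer number that merely contains the patch ID from being counted as a hit.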

6.2 Errors When Stopping CRS with crsctl

Stopping CRS with crsctl completes with errors:

jason1:/u01/app/11.2.0/grid/bin# ./crsctl stop crs

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'jason1'

CRS-2673: Attempting to stop 'ora.crsd' on 'jason1'

CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'jason1'

CRS-2673: Attempting to stop 'ora.OCR.dg' on 'jason1'

CRS-2673: Attempting to stop 'ora.registry.acfs' on 'jason1'

CRS-2673: Attempting to stop 'ora.oracle.db' on 'jason1'

CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'jason1'

CRS-2673: Attempting to stop 'ora.cvu' on 'jason1'

CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'jason1'

CRS-2673: Attempting to stop 'ora.oc4j' on 'jason1'

CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'jason1' succeeded

CRS-2673: Attempting to stop 'ora.jason1.vip' on 'jason1'

CRS-2677: Stop of 'ora.cvu' on 'jason1' succeeded

CRS-2672: Attempting to start 'ora.cvu' on 'jason2'

CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'jason1' succeeded

CRS-2673: Attempting to stop 'ora.scan1.vip' on 'jason1'

CRS-2677: Stop of 'ora.jason1.vip' on 'jason1' succeeded

CRS-2672: Attempting to start 'ora.jason1.vip' on 'jason2'

CRS-2676: Start of 'ora.cvu' on 'jason2' succeeded

CRS-2677: Stop of 'ora.scan1.vip' on 'jason1' succeeded

CRS-2672: Attempting to start 'ora.scan1.vip' on 'jason2'

CRS-2677: Stop of 'ora.oracle.db' on 'jason1' succeeded

CRS-2673: Attempting to stop 'ora.DATA.dg' on 'jason1'

CRS-2673: Attempting to stop 'ora.FRA.dg' on 'jason1'

CRS-2677: Stop of 'ora.registry.acfs' on 'jason1' succeeded

CRS-2676: Start of 'ora.jason1.vip' on 'jason2' succeeded

CRS-2676: Start of 'ora.scan1.vip' on 'jason2' succeeded

CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'jason2'

CRS-2677: Stop of 'ora.OCR.dg' on 'jason1' succeeded

CRS-2677: Stop of 'ora.FRA.dg' on 'jason1' succeeded

CRS-2677: Stop of 'ora.DATA.dg' on 'jason1' succeeded

CRS-2673: Attempting to stop 'ora.asm' on 'jason1'

CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'jason2' succeeded

CRS-2677: Stop of 'ora.asm' on 'jason1' succeeded

CRS-2677: Stop of 'ora.oc4j' on 'jason1' succeeded

CRS-2672: Attempting to start 'ora.oc4j' on 'jason2'

CRS-2676: Start of 'ora.oc4j' on 'jason2' succeeded

CRS-2673: Attempting to stop 'ora.ons' on 'jason1'

CRS-2677: Stop of 'ora.ons' on 'jason1' succeeded

CRS-2673: Attempting to stop 'ora.net1.network' on 'jason1'

CRS-2677: Stop of 'ora.net1.network' on 'jason1' succeeded

CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'jason1' has completed

CRS-2677: Stop of 'ora.crsd' on 'jason1' succeeded

CRS-2673: Attempting to stop 'ora.crf' on 'jason1'

CRS-2673: Attempting to stop 'ora.ctssd' on 'jason1'

CRS-2673: Attempting to stop 'ora.evmd' on 'jason1'

CRS-2673: Attempting to stop 'ora.asm' on 'jason1'

CRS-2673: Attempting to stop 'ora.mdnsd' on 'jason1'

CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'jason1'

CRS-2677: Stop of 'ora.crf' on 'jason1' succeeded

CRS-2677: Stop of 'ora.evmd' on 'jason1' succeeded

CRS-2677: Stop of 'ora.mdnsd' on 'jason1' succeeded

CRS-2677: Stop of 'ora.ctssd' on 'jason1' succeeded

CRS-5022: Stop of resource "ora.drivers.acfs" failed: current state is "UNKNOWN"

CRS-2675: Stop of 'ora.drivers.acfs' on 'jason1' failed

CRS-2677: Stop of 'ora.asm' on 'jason1' succeeded

CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'jason1'

CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'jason1' succeeded

CRS-2673: Attempting to stop 'ora.cssd' on 'jason1'

CRS-2677: Stop of 'ora.cssd' on 'jason1' succeeded

CRS-2673: Attempting to stop 'ora.gipcd' on 'jason1'

CRS-2677: Stop of 'ora.gipcd' on 'jason1' succeeded

CRS-2673: Attempting to stop 'ora.gpnpd' on 'jason1'

CRS-2677: Stop of 'ora.gpnpd' on 'jason1' succeeded

CRS-2799: Failed to shut down resource 'ora.drivers.acfs' on 'jason1'

CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'jason1' has failed

CRS-4687: Shutdown command has completed with errors.

CRS-4000: Command Stop failed, or completed with errors.

jason1:/u01/app/11.2.0/grid/bin#
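In a listing this long, the one resource that failed to stop is easy to miss among the successful ones. The following is a small sketch, assuming the crsctl output has been captured to a file or is piped in directly. The function name failed_stops is hypothetical. It keys on the CRS-2675 message shown above and prints only the resource names.

```shell
#!/bin/sh
# failed_stops: reads `crsctl stop crs` output on stdin and prints the name
# of every resource whose stop failed (CRS-2675 lines), one per line,
# sorted and de-duplicated. Hypothetical helper, not part of Clusterware.
failed_stops() {
    sed -n "s/^CRS-2675: Stop of *'\([^']*\)'.*failed.*/\1/p" | sort -u
}

# Possible usage as root on the node being stopped:
#   ./crsctl stop crs 2>&1 | failed_stops
```

For the session above, the only CRS-2675 line names ora.drivers.acfs, which is the resource examined in the rest of this section.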

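The CRS log shows that the following entry is written on every shutdown:

CRS-10001:07-Jan-16 23:23 ACFS-9118: oracleacfs.ko driver in use - cannot unload

This shows that each time the ora.drivers.acfs resource is stopped, some program is still using the module, so the stop fails. The relevant MOS note reads as follows: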

SLES: 11.2.0.3 "crsctl stop crs" Fails to Stop ora.drivers.acfs With CRS-2675 (Doc ID 1417294.1)



 

APPLIES TO:
 Oracle Database - Enterprise Edition - Version 11.2.0.3 and later
 SUSE \ UnitedLinux x86-64

SYMPTOMS

11.2.0.3 Grid Infrastructure on Suse Linux, "crsctl stop crs" or "crsctl stop crs -f" fails:

· "crsctl stop crs" or "crsctl stop crs -f" output

CRS-2675: Stop of 'ora.drivers.acfs' on 'racnode1' failed
..
CRS-2799: Failed to shut down resource 'ora.drivers.acfs' on 'racnode1'
CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'racnode1' has failed
CRS-4687: Shutdown command has completed with errors.
CRS-4000: Command Stop failed, or completed with errors.

· $GRID_HOME/log//alert.log

[client(4936)]CRS-10001:01-Feb-12 12:35 ACFS-9290: Waiting for ASM to shutdown.
[client(5098)]CRS-10001:01-Feb-12 12:35 ACFS-9118: oracleacfs.ko driver in use - cannot unload.
[client(5117)]CRS-10001:01-Feb-12 12:35 ACFS-9118: oracleacfs.ko driver in use - cannot unload.
 
 /log/racnode1/agent/ohasd/orarootagent_root/orarootagent_root.log"

· "$GRID_HOME/bin/acfsload stop" output

..
ACFS-9119: oracleacfs driver failed to unload.

· $GRID_HOME/log//agent/ohasd/orarootagent_root/orarootagent_root.log

2012-02-15 09:44:43.219:  [    AGFW][2097063696] {0:0:36062} Agent received the message:  RESOURCE_STOP[ora.drivers.acfs 1 1] ID 4099:217493
 2012-02-15 09:44:43.219: [    AGFW][2097063696]  {0:0:36062} Preparing STOP command for: ora.drivers.acfs 1 1
 2012-02-15 09:44:43.219: [    AGFW][2097063696] {0:0:36062}  ora.drivers.acfs 1 1 state changed from: ONLINE to: STOPPING
 2012-02-15 09:44:43.220: [ora.drivers.acfs][2094962448] {0:0:36062} [stop]  (:CLSN00108:) clsn_agent::stop {
 2012-02-15 09:44:43.220: [ora.drivers.acfs][2094962448] {0:0:36062} [stop]  Arg Value = stop
 2012-02-15 09:44:43.220: [ora.drivers.acfs][2094962448] {0:0:36062} [stop]  getOracleHomeAttrib: oracle_home = /ocw/grid
 2012-02-15 09:44:43.220: [ora.drivers.acfs][2094962448] {0:0:36062} [stop]  Utils::getCrsHome crsHome /ocw/grid
 ..
 2012-02-15 09:44:43.220: [ora.drivers.acfs][2094962448] {0:0:36062} [stop]  Adding Environment Variables _ORA_AGENT_ACTION=TRUE
 2012-02-15 09:44:43.220: [ora.drivers.acfs][2094962448] {0:0:36062} [stop]  Adding Environment Variables __IS_HASD_AGENT=
 2012-02-15 09:44:43.220: [ora.drivers.acfs][2094962448] {0:0:36062} [stop] Utils:execCmd  action = 2 flags = 6 ohome = (null) cmdname = acfsload.
 2012-02-15 09:44:43.220: [ora.drivers.acfs][2094962448] {0:0:36062} [stop]  getOracleHomeAttrib: oracle_home = /ocw/grid
 2012-02-15 09:44:43.321: [ora.drivers.acfs][2094962448] {0:0:36062} [stop]  (:CLSN00010:)ACFS-9290: Waiting for ASM to shutdown.
 2012-02-15 09:44:43.321: [ora.drivers.acfs][2094962448] {0:0:36062} [stop]  (:CLSN00010:)
 ..
 2012-02-15 09:44:53.433: [ora.drivers.acfs][2094962448] {0:0:36062} [stop]  (:CLSN00010:)ACFS-9118:  oracleacfs.ko driver in use - cannot unload.
 2012-02-15 09:44:53.433: [ora.drivers.acfs][2094962448] {0:0:36062} [stop]  (:CLSN00010:)
 2012-02-15 09:44:53.433: [ora.drivers.acfs][2094962448] {0:0:36062} [stop]  execCmd ret = 0
 2012-02-15 09:44:53.434: [ora.drivers.acfs][2094962448] {0:0:36062} [stop]  (:CLSN00108:) clsn_agent::stop }
 2012-02-15 09:44:53.434: [    AGFW][2094962448] {0:0:36062}  Command: stop  for resource: ora.drivers.acfs 1 1 completed with status: SUCCESS

CAUSE

The issue was investigated in bug 13726093 and it's closed as OS issue

SOLUTION


 Engage OS vendor to disable software that scans ACFS devices.

The temporary workaround is to disable the resource if ACFS is not needed. To disable:

# $GRID_HOME/bin/acfsroot disable


Once it's disabled and the node rebooted, its status will be

ora.registry.acfs
            ONLINE   OFFLINE      racnode1

 

Note: 

It's known that multipath daemon may open /dev/ofsctl:

lsof /dev/ofsctl
 COMMAND PID USER  FD TYPE DEVICE SIZE/OFF NODE NAME
 multipath 5356  root 58r BLK 251,0 0t0 41683748 /dev/ofsctl

Adding ofsctl to multipath blacklist solves the issue.

 


 for findability: rootcrs.pl -unlock ; roothas.pl -unlock

REFERENCES

BUG:13606669 - CRSCTL STOP CLUSTERWARE FAILS TO STOP, ACFS-9118
BUG:13613644 - CRSCTL STOP CRS IS NOT STOPPING THE ORA.DRIVERS.ACFS RESOURCE 11.2.0.3
BUG:13726093 - CRSCTL STOP CRS FAILS TO STOP ACFS
BUG:13736590 - CRS-2799: FAILED TO SHUT DOWN RESOURCE 'ORA.DRIVERS.ACFS'
BUG:13810374 - FAILS TO STOP ACFS RESOURCE IN 11.2.0.3

 

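In this environment the ASM disks are disks bound through multipath, so the multipath service opens the /dev/ofsctl device. That makes the stop of ora.drivers.acfs fail, and the whole shutdown command ends up completing with errors. Following the note's advice, add the following to the multipath configuration file (typically /etc/multipath.conf):

blacklist {

   devnode "^sda"

   devnode "ofsctl"

}

After this change, retrying the stop command succeeds and the problem is resolved.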
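Before retrying the stop, the two conditions from the MOS note can be checked mechanically: that the blacklist really contains an ofsctl entry, and that nothing still holds /dev/ofsctl open. The following is a rough sketch under stated assumptions. Both function names are hypothetical. The first reads a multipath configuration file on stdin and only understands the simple layout shown above. The second reads the output of lsof /dev/ofsctl on stdin, in the column layout quoted in the note.

```shell
#!/bin/sh
# blacklists_ofsctl: reads a multipath configuration on stdin and succeeds
# (exit status 0) if a blacklist section has a devnode entry mentioning
# ofsctl. Comment lines are skipped. Hypothetical helper; it does not
# handle nested or single-line sections.
blacklists_ofsctl() {
    awk '
        /^[ \t]*#/ { next }
        /^[ \t]*blacklist[ \t]*[{]/ { inside = 1 }
        inside && /devnode[ \t]+"[^"]*ofsctl[^"]*"/ { found = 1 }
        inside && /[}]/ { inside = 0 }
        END { exit (found ? 0 : 1) }
    '
}

# ofsctl_holders: reads `lsof /dev/ofsctl` output on stdin and prints the
# command name and PID of every process that has the device open.
# The first line is assumed to be the lsof column header.
ofsctl_holders() {
    awk 'NR > 1 && NF >= 2 { print $1, $2 }'
}

# Possible usage as root (the configuration path is an assumption):
#   blacklists_ofsctl < /etc/multipath.conf && echo "ofsctl is blacklisted"
#   lsof /dev/ofsctl | ofsctl_holders
```

If the second command still lists multipath after the configuration change, the running daemon has probably not picked up the new blacklist yet.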