Oracle 11g R2 Clusterware New Feature: HAIP in Detail

(December 18, 2012 at 8:23 AM)

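A while ago, when I was chatting with friends about redundancy for the RAC private network (Private Network), most of them brought up bonding. I mentioned HAIP, but almost none of them knew much about it. I gave a brief introduction to the feature at the time; today I will look at it in more detail.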

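HAIP stands for Highly Available IP. It is the private network redundancy technology that Oracle has provided since version 11.2.0.2. It is Oracle's own technology, which means the HAIP feature is available only to Oracle software.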

MOS has a very detailed note, Doc ID 1210883.1, that introduces HAIP. It describes HAIP as follows:
Redundant Interconnect without any 3rd-party IP failover technology (bond, IPMP or similar) is supported natively by Grid Infrastructure starting from 11.2.0.2.  Multiple private network adapters can be defined either during the installation phase or afterward using the oifcfg.  Oracle Database, CSS, OCR, CRS, CTSS, and EVM components in 11.2.0.2 employ it automatically.

Grid Infrastructure can activate a maximum of four private network adapters at a time even if more are defined. The ora.cluster_interconnect.haip resource will start one to four link local  HAIP on private network adapters for interconnect communication for Oracle RAC, Oracle ASM, and Oracle ACFS etc.

Grid automatically picks free link local addresses from reserved 169.254.*.* subnet for HAIP. According to RFC-3927, link local subnet 169.254.*.* should not be used for any other purpose. With HAIP, by default, interconnect traffic will be load balanced across all active interconnect interfaces, and corresponding HAIP address will be failed over transparently to other adapters if one fails or becomes non-communicative.

The number of HAIP addresses is decided by how many private network adapters are active when Grid comes up on the first node in the cluster. If there's only one active private network, Grid will create one; if two, Grid will create two; and if more than two, Grid will create four HAIPs. The number of HAIPs won't change even if more private network adapters are activated later, a restart of clusterware on all nodes is required for the number to change, however, the newly activated adapters can be used for fail over purpose.
 
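The first sentence of the passage above says that, starting with 11.2.0.2, Grid Infrastructure natively supports a redundant interconnect without any third-party IP failover technology (bonding, IPMP, and so on). Multiple private network adapters can be defined either during installation or afterward with the oifcfg tool. Oracle Database and the CSS, OCR, CRS, CTSS, and EVM components in 11.2.0.2 use the feature automatically.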

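GI activates at most four private network adapters at a time, even if more are defined. The ora.cluster_interconnect.haip resource creates one to four HAIP addresses for components that communicate over the private network, such as Oracle RAC, Oracle ASM, and Oracle ACFS. These addresses are taken from 169.254.*.*, so we must make sure that 169.254.*.* is not used for anything else in our network environment. The HAIP details from my own environment are shown later in this post.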
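Since 169.254.*.* has to stay free for HAIP, it can be worth checking a node's statically configured addresses against that range. The following is a minimal sketch using only Python's standard `ipaddress` module; the function name and the sample address list are hypothetical and are not taken from any Oracle tool.

```python
import ipaddress

def link_local_conflicts(addresses):
    """Return the addresses that fall inside 169.254.0.0/16 (RFC 3927 link-local)."""
    return [a for a in addresses if ipaddress.ip_address(a).is_link_local]

# Hypothetical list of statically configured addresses on a cluster node.
configured = ["192.168.53.1", "10.0.53.10", "169.254.10.1"]
print(link_local_conflicts(configured))
```

Any address the function returns would sit in the range that GI picks HAIP addresses from and should be moved elsewhere.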

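The number of HAIP addresses is decided by how many private network adapters are active when GI starts on the first node of the cluster. With a single private network, GI creates one HAIP; with two, it creates two; with more than two, it creates four. Four is the current maximum. The number does not change even if more private network adapters are activated later (changing it requires restarting Clusterware on all nodes), but the newly activated adapters can still be used for failover.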
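The counting rule quoted from the MOS note can be written down as a small function. This is only an illustration of the rule as stated above, not Oracle code; the function name is made up for the example.

```python
def haip_count(active_private_adapters):
    """Number of HAIP addresses GI creates, per the rule quoted from the MOS note."""
    if active_private_adapters < 1:
        raise ValueError("at least one active private adapter is required")
    if active_private_adapters <= 2:
        return active_private_adapters  # one adapter -> 1 HAIP, two -> 2 HAIPs
    return 4                            # more than two adapters -> 4 HAIPs

for n in (1, 2, 3, 4, 5):
    print(n, "adapter(s) ->", haip_count(n), "HAIP(s)")
```

With the three private adapters used in the environment described in this post, the rule gives four HAIP addresses.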

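From this introduction, you can probably already guess the advantages of HAIP:
1) More bandwidth for the private network
2) Load balancing improves the overall communication capacity of the private network
3) When one private network interface fails, its HAIP address is moved transparently to another interface
4) Private IP addresses do not need to be resolved in the hosts file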

OK, let's now look at the HAIP-related details in my RAC environment.

Environment:

  • 3-NODES Oracle Database 11g R2 RAC(With ASM) on Linux 5
  • VERSION : 11.2.0.3
  • HAIP : 3 private network adapters configured

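First, during the GI installation, OUI shows a screen for configuring the private networks (the original screenshot is not available here).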
 

[root@rac1 ~]# cd $GRID_HOME/bin
[root@rac1 bin]# ./crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       rac1                     Started             
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       rac1                                         
ora.crf
      1        ONLINE  ONLINE       rac1                                         
ora.crsd
      1        ONLINE  ONLINE       rac1                                         
ora.cssd
      1        ONLINE  ONLINE       rac1                                         
ora.cssdmonitor
      1        ONLINE  ONLINE       rac1                                         
ora.ctssd
      1        ONLINE  ONLINE       rac1                     ACTIVE:0            
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.drivers.acfs
      1        ONLINE  ONLINE       rac1                                         
ora.evmd
      1        ONLINE  ONLINE       rac1                                         
ora.gipcd
      1        ONLINE  ONLINE       rac1                                         
ora.gpnpd
      1        ONLINE  ONLINE       rac1                                         
ora.mdnsd
      1        ONLINE  ONLINE       rac1 

-- In the output above, note these two lines:
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       rac1  

They show that the HAIP resource is ONLINE.

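As mentioned in the environment description above, this RAC is configured with 3 private network adapters.

First, check the network interface configuration: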
[root@rac1 bin]# ./oifcfg getif
eth0  192.168.53.0  global  public
eth1  10.0.53.0  global  cluster_interconnect
eth2  10.0.53.0  global  cluster_interconnect
eth3  10.0.53.0  global  cluster_interconnect
-- This shows 3 active private interfaces: eth1, eth2, and eth3
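If this check needs to be scripted, the `oifcfg getif` output can be filtered for the `cluster_interconnect` role. The sketch below assumes the four-column layout shown above (interface, subnet, scope, role); the function name is hypothetical.

```python
def private_interfaces(getif_output):
    """Return the interface names whose role is cluster_interconnect."""
    names = []
    for line in getif_output.splitlines():
        fields = line.split()
        # Expected columns: interface, subnet, scope, role
        if len(fields) == 4 and fields[3] == "cluster_interconnect":
            names.append(fields[0])
    return names

SAMPLE = """\
eth0  192.168.53.0  global  public
eth1  10.0.53.0  global  cluster_interconnect
eth2  10.0.53.0  global  cluster_interconnect
eth3  10.0.53.0  global  cluster_interconnect
"""
print(private_interfaces(SAMPLE))
```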

Next, look at the detailed network interface information:
[root@rac1 bin]# ./oifcfg iflist -p -n
eth0  192.168.53.0  PRIVATE  255.255.255.0
eth1  10.0.53.0  PRIVATE  255.255.255.0
eth1  169.254.192.0  UNKNOWN  255.255.192.0
eth1  169.254.0.0  UNKNOWN  255.255.192.0
eth2  10.0.53.0  PRIVATE  255.255.255.0
eth2  169.254.64.0  UNKNOWN  255.255.192.0
eth3  10.0.53.0  PRIVATE  255.255.255.0
eth3  169.254.128.0  UNKNOWN  255.255.192.0

Check the interfaces on the host:
[root@rac1 bin]# ip a
...partially omitted...
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 08:00:27:89:8d:da brd ff:ff:ff:ff:ff:ff
    inet 10.0.53.10/24 brd 10.0.53.255 scope global eth1
    inet 169.254.199.113/18 brd 169.254.255.255 scope global eth1:1
    inet 169.254.25.108/18 brd 169.254.63.255 scope global eth1:2
    inet6 fe80::a00:27ff:fe89:8dda/64 scope link 
       valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 08:00:27:ad:77:7d brd ff:ff:ff:ff:ff:ff
    inet 10.0.53.11/24 brd 10.0.53.255 scope global eth2
    inet 169.254.81.139/18 brd 169.254.127.255 scope global eth2:1
    inet6 fe80::a00:27ff:fead:777d/64 scope link 
       valid_lft forever preferred_lft forever
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 08:00:27:27:0f:39 brd ff:ff:ff:ff:ff:ff
    inet 10.0.53.111/24 brd 10.0.53.255 scope global eth3
    inet 169.254.154.185/18 brd 169.254.191.255 scope global eth3:1
    inet6 fe80::a00:27ff:fe27:f39/64 scope link 
       valid_lft forever preferred_lft forever
...partially omitted...
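The 255.255.192.0 masks in the output above correspond to 169.254.0.0/16 being split into four /18 subnets, one per HAIP address. The following sketch reproduces that split with Python's standard `ipaddress` module and maps each HAIP address from the `ip a` output to its subnet. It only illustrates the addressing seen in this environment and makes no claim about how GI chooses addresses internally.

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")
SUBNETS = list(LINK_LOCAL.subnets(new_prefix=18))  # four /18 subnets

def haip_subnet(address):
    """Return the /18 subnet of 169.254.0.0/16 that contains the address."""
    ip = ipaddress.ip_address(address)
    for subnet in SUBNETS:
        if ip in subnet:
            return subnet
    raise ValueError(f"{address} is not in {LINK_LOCAL}")

# HAIP addresses taken from the `ip a` output above.
HAIPS = ["169.254.199.113", "169.254.25.108", "169.254.81.139", "169.254.154.185"]

for subnet in SUBNETS:
    print(subnet, subnet.netmask)
for addr in HAIPS:
    print(addr, "->", haip_subnet(addr))
```

Each of the four addresses lands in a different /18 subnet, which matches the four 169.254.x.0 entries listed by `oifcfg iflist -p -n`.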

Below is the HAIP-related information left in the alert log the last time this RAC was started:
Private Interface 'eth1:2' configured from GPnP for use as a private interconnect.
  [name='eth1:2', type=1, ip=169.254.25.108, mac=08-00-27-89-8d-da, net=169.254.0.0/18, mask=255.255.192.0, use=haip:cluster_interconnect/62]
Private Interface 'eth2:1' configured from GPnP for use as a private interconnect.
  [name='eth2:1', type=1, ip=169.254.81.139, mac=08-00-27-ad-77-7d, net=169.254.64.0/18, mask=255.255.192.0, use=haip:cluster_interconnect/62]
Private Interface 'eth3:1' configured from GPnP for use as a private interconnect.
  [name='eth3:1', type=1, ip=169.254.154.185, mac=08-00-27-27-0f-39, net=169.254.128.0/18, mask=255.255.192.0, use=haip:cluster_interconnect/62]
Private Interface 'eth1:1' configured from GPnP for use as a private interconnect.
  [name='eth1:1', type=1, ip=169.254.199.113, mac=08-00-27-89-8d-da, net=169.254.192.0/18, mask=255.255.192.0, use=haip:cluster_interconnect/62]
Public Interface 'eth0' configured from GPnP for use as a public interface.
  [name='eth0', type=1, ip=192.168.53.1, mac=08-00-27-03-53-1b, net=192.168.53.0/24, mask=255.255.255.0, use=public/1]
Public Interface 'eth0:1' configured from GPnP for use as a public interface.
  [name='eth0:1', type=1, ip=192.168.53.11, mac=08-00-27-03-53-1b, net=192.168.53.0/24, mask=255.255.255.0, use=public/1]
Public Interface 'eth0:2' configured from GPnP for use as a public interface.
  [name='eth0:2', type=1, ip=192.168.53.6, mac=08-00-27-03-53-1b, net=192.168.53.0/24, mask=255.255.255.0, use=public/1]
Picked latch-free SCN scheme 3
...parameter listing and other content in between omitted...
Cluster communication is configured to use the following interface(s) for this instance
  169.254.25.108
  169.254.81.139
  169.254.154.185
  169.254.199.113
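To pull the HAIP addresses out of alert log entries like these, a regular expression over the bracketed interface records is enough. This is a rough sketch that assumes the record layout shown above (`name=`, `ip=`, and `use=` fields in that order); the function name is hypothetical, and the sample contains two records copied from the log.

```python
import re

# Matches records such as: [name='eth1:2', type=1, ip=169.254.25.108, ..., use=haip:...]
ENTRY = re.compile(
    r"\[name='(?P<name>[^']+)', type=\d+, ip=(?P<ip>[\d.]+), "
    r".*?use=(?P<use>[^\]]+)\]"
)

def haip_interfaces(log_text):
    """Map interface alias -> IP for records whose use= field starts with 'haip:'."""
    result = {}
    for m in ENTRY.finditer(log_text):
        if m.group("use").startswith("haip:"):
            result[m.group("name")] = m.group("ip")
    return result

SAMPLE = """\
  [name='eth1:2', type=1, ip=169.254.25.108, mac=08-00-27-89-8d-da, net=169.254.0.0/18, mask=255.255.192.0, use=haip:cluster_interconnect/62]
  [name='eth0', type=1, ip=192.168.53.1, mac=08-00-27-03-53-1b, net=192.168.53.0/24, mask=255.255.255.0, use=public/1]
"""
print(haip_interfaces(SAMPLE))
```

The public interface record is skipped because its `use=` field is `public/1` rather than `haip:...`.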


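In addition, to show an environment with a single private network, here is the output from another RAC environment of mine:

Private network configuration: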
[grid@pos1 ~]$ oifcfg getif
eth2  10.0.53.0  global  cluster_interconnect
bond0  192.168.53.0  global  public

Detailed network interface information:
[grid@pos1 ~]$ oifcfg iflist -p -n
eth2  10.0.53.0  PRIVATE  255.255.255.0
eth2  169.254.0.0  UNKNOWN  255.255.0.0
eth3  10.0.0.0  PRIVATE  255.255.255.0
bond0  192.168.53.0  PRIVATE  255.255.255.0

Interfaces on the host:
[grid@pos1 ~]$ /sbin/ip a
...partially omitted...
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether e8:39:35:a8:3f:5a brd ff:ff:ff:ff:ff:ff
    inet 10.0.51.10/24 brd 10.0.51.255 scope global eth2
    inet 169.254.138.86/16 brd 169.254.255.255 scope global eth2:1
...partially omitted...

We can also see the HAIP addresses that GI assigned automatically in the dynamic view v$cluster_interconnects:
SQL> select name, ip_address from v$cluster_interconnects;

NAME                           IP_ADDRESS
------------------------------ --------------------------------

eth2:1                         169.254.138.86

Now let's look at what happens when a private network fails.

To simulate a failure, I manually took down the eth3 interface:
[root@rac1 bin]# ifdown eth3

Checking the network interface information, eth3 no longer appears:
[root@rac1 bin]# ./oifcfg iflist -p -n
eth0  192.168.53.0  PRIVATE  255.255.255.0
eth1  10.0.53.0  PRIVATE  255.255.255.0
eth1  169.254.0.0  UNKNOWN  255.255.192.0
eth1  169.254.128.0  UNKNOWN  255.255.192.0
eth2  10.0.53.0  PRIVATE  255.255.255.0
eth2  169.254.64.0  UNKNOWN  255.255.192.0
eth2  169.254.192.0  UNKNOWN  255.255.192.0

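Checking the host, the HAIP address 169.254.154.185/18 that was originally on eth3 has moved transparently to eth1 (and 169.254.199.113/18 has moved from eth1 to eth2):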
[root@rac1 bin]# ip a
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 08:00:27:89:8d:da brd ff:ff:ff:ff:ff:ff
    inet 10.0.53.10/24 brd 10.0.53.255 scope global eth1
    inet 169.254.25.108/18 brd 169.254.63.255 scope global eth1:2
    inet 169.254.154.185/18 brd 169.254.191.255 scope global eth1:3
    inet6 fe80::a00:27ff:fe89:8dda/64 scope link 
       valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 08:00:27:ad:77:7d brd ff:ff:ff:ff:ff:ff
    inet 10.0.53.11/24 brd 10.0.53.255 scope global eth2
    inet 169.254.81.139/18 brd 169.254.127.255 scope global eth2:1
    inet 169.254.199.113/18 brd 169.254.255.255 scope global eth2:2
    inet6 fe80::a00:27ff:fead:777d/64 scope link 
       valid_lft forever preferred_lft forever
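The effect of the failover is easier to see by comparing HAIP placement before and after eth3 went down. The dictionaries in the sketch below are transcribed by hand from the two `ip a` outputs in this post; the function name is hypothetical.

```python
from collections import Counter

# HAIP address -> interface, transcribed from the `ip a` outputs.
BEFORE = {
    "169.254.199.113": "eth1",
    "169.254.25.108": "eth1",
    "169.254.81.139": "eth2",
    "169.254.154.185": "eth3",
}
AFTER = {
    "169.254.25.108": "eth1",
    "169.254.154.185": "eth1",
    "169.254.81.139": "eth2",
    "169.254.199.113": "eth2",
}

def moved_haips(before, after):
    """Return {ip: (old_interface, new_interface)} for addresses that changed interface."""
    return {
        ip: (before[ip], after.get(ip))
        for ip in before
        if after.get(ip) != before[ip]
    }

print(moved_haips(BEFORE, AFTER))
print(Counter(AFTER.values()))
```

Two addresses moved, not just the one that lived on eth3, and afterwards each surviving interface carries two HAIP addresses.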

Several logs record entries related to the private network failure:
[root@rac1 ~]# cd $GRID_HOME/log/rac1/ohasd
[root@rac1 ohasd]# tail -f ohasd.log 
...relevant entries extracted...
2012-12-18 04:45:49.455: [GIPCHGEN][1147304256]gipchaInterfaceFail: marking interface failing 0x9fae090 { host '', haName 'CLSFRAME_rac-cluster', local (nil), ip '10.0.53.111', subnet '10.0.53.0', mask '255.255.255.0', mac '08-00-27-27-0f-39', ifname 'eth3', numRef 0, numFail 0, idxBoot 0, flags 0x184d }
2012-12-18 04:45:49.713: [GIPCHGEN][1128393024]gipchaInterfaceDisable: disabling interface 0x9fae090 { host '', haName 'CLSFRAME_rac-cluster', local (nil), ip '10.0.53.111', subnet '10.0.53.0', mask '255.255.255.0', mac '08-00-27-27-0f-39', ifname 'eth3', numRef 0, numFail 0, idxBoot 0, flags 0x19cd }
2012-12-18 04:45:49.713: [GIPCHDEM][1128393024]gipchaWorkerCleanInterface: performing cleanup of disabled interface 0x9fae090 { host '', haName 'CLSFRAME_rac-cluster', local (nil), ip '10.0.53.111', subnet '10.0.53.0', mask '255.255.255.0', mac '08-00-27-27-0f-39', ifname 'eth3', numRef 0, numFail 0, idxBoot 0, flags 0x19ed }


[root@rac1 ~]# cd $GRID_HOME/log/rac1/agent/ohasd/orarootagent_root/
[root@rac1 orarootagent_root]# tail -f orarootagent_root.log 
...relevant entries extracted...
2012-12-18 04:45:48.168: [ USRTHRD][1132529984] {0:0:2} failed to receive ARP request
2012-12-18 04:45:48.168: [ USRTHRD][1132529984] {0:0:2} Assigned IP 169.254.154.185 no longer valid on inf eth3
2012-12-18 04:45:48.202: [ USRTHRD][1132529984] {0:0:2} VipActions::startIp {
2012-12-18 04:45:48.231: [ USRTHRD][1132529984] {0:0:2} Adding 169.254.154.185 on eth3:1
2012-12-18 04:45:48.235: [ USRTHRD][1132529984] {0:0:2} VipActions::startIp }
2012-12-18 04:45:48.236: [ USRTHRD][1132529984] {0:0:2} Reassigned IP:  169.254.154.185 on interface eth3
2012-12-18 04:45:48.404: [ USRTHRD][1126226240] {0:0:2} HAIP:  Updating member info HAIP1;10.0.53.0#0;10.0.53.0#1
2012-12-18 04:45:48.422: [ USRTHRD][1126226240] {0:0:2} HAIP:  Moving ip '169.254.154.185' from inf 'eth3' to inf 'eth1'
2012-12-18 04:45:48.422: [ USRTHRD][1126226240] {0:0:2} pausing thread
2012-12-18 04:45:48.423: [ USRTHRD][1126226240] {0:0:2} posting thread
2012-12-18 04:45:48.423: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]start {
2012-12-18 04:45:48.432: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]start }
2012-12-18 04:45:48.432: [ USRTHRD][1126226240] {0:0:2} HAIP:  Moving ip '169.254.199.113' from inf 'eth1' to inf 'eth2'
2012-12-18 04:45:48.432: [ USRTHRD][1126226240] {0:0:2} pausing thread
2012-12-18 04:45:48.432: [ USRTHRD][1126226240] {0:0:2} posting thread
2012-12-18 04:45:48.432: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]start {
2012-12-18 04:45:48.432: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]start }
2012-12-18 04:45:48.433: [ USRTHRD][1140934976] {0:0:2} [NetHAWork] thread started
2012-12-18 04:45:48.433: [ USRTHRD][1140934976] {0:0:2}  Arp::sCreateSocket { 
2012-12-18 04:45:48.433: [ USRTHRD][1143036224] {0:0:2} [NetHAWork] thread started
2012-12-18 04:45:48.433: [ USRTHRD][1143036224] {0:0:2}  Arp::sCreateSocket { 
2012-12-18 04:45:48.435: [ USRTHRD][1140934976] {0:0:2}  Arp::sCreateSocket } 
2012-12-18 04:45:48.436: [ USRTHRD][1140934976] {0:0:2} Starting Probe for ip 169.254.154.185
2012-12-18 04:45:48.436: [ USRTHRD][1140934976] {0:0:2} Transitioning to Probe State
2012-12-18 04:45:48.436: [ USRTHRD][1143036224] {0:0:2}  Arp::sCreateSocket } 
2012-12-18 04:45:48.436: [ USRTHRD][1143036224] {0:0:2} Starting Probe for ip 169.254.199.113
2012-12-18 04:45:48.436: [ USRTHRD][1143036224] {0:0:2} Transitioning to Probe State
2012-12-18 04:45:48.619: [ USRTHRD][1140934976] {0:0:2}  Arp::sProbe { 
2012-12-18 04:45:48.619: [ USRTHRD][1140934976] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:45:48.620: [ USRTHRD][1140934976] {0:0:2}  Arp::sProbe } 
2012-12-18 04:45:49.327: [ USRTHRD][1143036224] {0:0:2}  Arp::sProbe { 
2012-12-18 04:45:49.327: [ USRTHRD][1143036224] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:45:49.328: [ USRTHRD][1143036224] {0:0:2}  Arp::sProbe } 
2012-12-18 04:45:50.390: [ USRTHRD][1140934976] {0:0:2}  Arp::sProbe { 
2012-12-18 04:45:50.390: [ USRTHRD][1140934976] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:45:50.391: [ USRTHRD][1140934976] {0:0:2}  Arp::sProbe } 
2012-12-18 04:45:50.802: [ USRTHRD][1143036224] {0:0:2}  Arp::sProbe { 
2012-12-18 04:45:50.803: [ USRTHRD][1143036224] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:45:50.803: [ USRTHRD][1143036224] {0:0:2}  Arp::sProbe } 
2012-12-18 04:45:52.373: [ USRTHRD][1140934976] {0:0:2}  Arp::sProbe { 
2012-12-18 04:45:52.373: [ USRTHRD][1140934976] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:45:52.388: [ USRTHRD][1140934976] {0:0:2}  Arp::sProbe } 
2012-12-18 04:45:52.388: [ USRTHRD][1140934976] {0:0:2} Transitioning to Announce State
2012-12-18 04:45:52.548: [ USRTHRD][1143036224] {0:0:2}  Arp::sProbe { 
2012-12-18 04:45:52.548: [ USRTHRD][1143036224] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:45:52.549: [ USRTHRD][1143036224] {0:0:2}  Arp::sProbe } 
2012-12-18 04:45:52.549: [ USRTHRD][1143036224] {0:0:2} Transitioning to Announce State
2012-12-18 04:45:54.374: [ USRTHRD][1140934976] {0:0:2}  Arp::sAnnounce { 
2012-12-18 04:45:54.374: [ USRTHRD][1140934976] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:45:54.374: [ USRTHRD][1140934976] {0:0:2}  Arp::sAnnounce } 
2012-12-18 04:45:54.547: [ USRTHRD][1143036224] {0:0:2}  Arp::sAnnounce { 
2012-12-18 04:45:54.547: [ USRTHRD][1143036224] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:45:54.548: [ USRTHRD][1143036224] {0:0:2}  Arp::sAnnounce } 
2012-12-18 04:45:56.370: [ USRTHRD][1140934976] {0:0:2}  Arp::sAnnounce { 
2012-12-18 04:45:56.370: [ USRTHRD][1140934976] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:45:56.371: [ USRTHRD][1140934976] {0:0:2}  Arp::sAnnounce } 
2012-12-18 04:45:56.371: [ USRTHRD][1140934976] {0:0:2} Transitioning to Defend State
2012-12-18 04:45:56.371: [ USRTHRD][1140934976] {0:0:2} VipActions::startIp {
2012-12-18 04:45:56.371: [ USRTHRD][1140934976] {0:0:2} Adding 169.254.154.185 on eth1:3
2012-12-18 04:45:56.371: [ USRTHRD][1140934976] {0:0:2} VipActions::startIp }
2012-12-18 04:45:56.371: [ USRTHRD][1140934976] {0:0:2} Assigned IP:  169.254.154.185 on interface eth1
2012-12-18 04:45:56.421: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]stop {
2012-12-18 04:45:56.546: [ USRTHRD][1143036224] {0:0:2}  Arp::sAnnounce { 
2012-12-18 04:45:56.546: [ USRTHRD][1143036224] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:45:56.547: [ USRTHRD][1143036224] {0:0:2}  Arp::sAnnounce } 
2012-12-18 04:45:56.547: [ USRTHRD][1143036224] {0:0:2} Transitioning to Defend State
2012-12-18 04:45:56.547: [ USRTHRD][1143036224] {0:0:2} VipActions::startIp {
2012-12-18 04:45:56.547: [ USRTHRD][1143036224] {0:0:2} Adding 169.254.199.113 on eth2:2
2012-12-18 04:45:56.547: [ USRTHRD][1143036224] {0:0:2} VipActions::startIp }
2012-12-18 04:45:56.547: [ USRTHRD][1143036224] {0:0:2} Assigned IP:  169.254.199.113 on interface eth2
2012-12-18 04:45:56.750: [ USRTHRD][1132529984] {0:0:2} [NetHAWork] thread stopping
2012-12-18 04:45:56.750: [ USRTHRD][1132529984] {0:0:2} Thread:[NetHAWork]isRunning is reset to false here
2012-12-18 04:45:56.788: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]stop }
2012-12-18 04:45:56.788: [ USRTHRD][1126226240] {0:0:2} VipActions::stopIp {
2012-12-18 04:45:56.788: [ USRTHRD][1126226240] {0:0:2} NetInterface::sStopIp {
2012-12-18 04:45:56.788: [ USRTHRD][1126226240] {0:0:2} Stopping ip '169.254.154.185', inf 'eth3', mask '10.0.53.0'
2012-12-18 04:45:56.825: [ USRTHRD][1126226240] {0:0:2} NetInterface::sStopIp }
2012-12-18 04:45:56.825: [ USRTHRD][1126226240] {0:0:2} VipActions::stopIp }
2012-12-18 04:45:56.825: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]stop {
2012-12-18 04:45:57.085: [ USRTHRD][1136732480] {0:0:2} [NetHAWork] thread stopping
2012-12-18 04:45:57.085: [ USRTHRD][1136732480] {0:0:2} Thread:[NetHAWork]isRunning is reset to false here
2012-12-18 04:45:57.097: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]stop }
2012-12-18 04:45:57.097: [ USRTHRD][1126226240] {0:0:2} VipActions::stopIp {
2012-12-18 04:45:57.097: [ USRTHRD][1126226240] {0:0:2} NetInterface::sStopIp {
2012-12-18 04:45:57.097: [ USRTHRD][1126226240] {0:0:2} Stopping ip '169.254.199.113', inf 'eth1', mask '10.0.53.0'
2012-12-18 04:45:57.097: [ USRTHRD][1126226240] {0:0:2} NetInterface::sStopIp }
2012-12-18 04:45:57.097: [ USRTHRD][1126226240] {0:0:2} VipActions::stopIp }
2012-12-18 04:45:57.102: [ USRTHRD][1126226240] {0:0:2} USING HAIP[  0 ]:  eth1 - 169.254.154.185 
2012-12-18 04:45:57.102: [ USRTHRD][1126226240] {0:0:2} USING HAIP[  1 ]:  eth2 - 169.254.199.113 
2012-12-18 04:45:57.102: [ USRTHRD][1126226240] {0:0:2} USING HAIP[  2 ]:  eth1 - 169.254.25.108 
2012-12-18 04:45:57.102: [ USRTHRD][1126226240] {0:0:2} USING HAIP[  3 ]:  eth2 - 169.254.81.139 
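The `HAIP:  Moving ip ...` lines in orarootagent_root.log are the quickest way to see what was relocated and when. The sketch below parses those lines and measures the time from the first move to the final `USING HAIP` lines. The regular expression is based only on the log format shown above, and the helper names are hypothetical.

```python
import re
from datetime import datetime

MOVE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): .*"
    r"HAIP:\s+Moving ip '(?P<ip>[^']+)' from inf '(?P<src>[^']+)' to inf '(?P<dst>[^']+)'"
)

def parse_ts(text):
    """Parse a log timestamp such as '2012-12-18 04:45:48.422'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")

def haip_moves(lines):
    """Return (timestamp, ip, source_interface, target_interface) for each move line."""
    moves = []
    for line in lines:
        m = MOVE.match(line)
        if m:
            moves.append((parse_ts(m.group("ts")), m.group("ip"), m.group("src"), m.group("dst")))
    return moves

# Two lines copied from the log above.
LOG = [
    "2012-12-18 04:45:48.422: [ USRTHRD][1126226240] {0:0:2} HAIP:  Moving ip '169.254.154.185' from inf 'eth3' to inf 'eth1'",
    "2012-12-18 04:45:48.432: [ USRTHRD][1126226240] {0:0:2} HAIP:  Moving ip '169.254.199.113' from inf 'eth1' to inf 'eth2'",
]
USING_HAIP_TS = "2012-12-18 04:45:57.102"  # timestamp of the final "USING HAIP" lines

moves = haip_moves(LOG)
elapsed = (parse_ts(USING_HAIP_TS) - moves[0][0]).total_seconds()
for ts, ip, src, dst in moves:
    print(ts.time(), ip, src, "->", dst)
print("seconds from first move to USING HAIP:", elapsed)
```

In this log the whole relocation, including the ARP probe and announce phases, took a little under nine seconds.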



[root@rac1 ~]# cd $GRID_HOME/log/rac1/cssd/
[root@rac1 cssd]# tail -f ocssd.log 
...relevant entries extracted...
2012-12-18 04:45:48.391: [ GIPCNET][1094244672] gipcmodNetworkProcessSend: [network]  failed send attempt endp 0x1940a3d0 [0000000000000428] { gipcEndpoint : localAddr 'udp://10.0.53.111:44337', remoteAddr '', numPend 5, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x2, usrFlags 0x4000 }, req 0x2aaaac02d000 [000000000039a324] { gipcSendRequest : addr 'udp://10.0.53.2:10745', data 0x2aaaac03e4b8, len 272, olen 0, parentEndp 0x1940a3d0, ret gipcretEndpointNotAvailable (40), objFlags 0x0, reqFlags 0x2 }
2012-12-18 04:45:48.391: [ GIPCNET][1094244672] gipcmodNetworkProcessSend: slos op  :  sgipcnValidateSocket
2012-12-18 04:45:48.391: [ GIPCNET][1094244672] gipcmodNetworkProcessSend: slos dep :  Invalid argument (22)
2012-12-18 04:45:48.391: [ GIPCNET][1094244672] gipcmodNetworkProcessSend: slos loc :  address not 
2012-12-18 04:45:48.391: [ GIPCNET][1094244672] gipcmodNetworkProcessSend: slos info:  addr '10.0.53.111:44337', len 272, buf 0x2aaaac03e4b8, cookie 0x2aaaac02d000
2012-12-18 04:45:48.391: [GIPCXCPT][1094244672] gipcInternalSendSync: failed sync request, ret gipcretEndpointNotAvailable (40)
2012-12-18 04:45:48.391: [GIPCXCPT][1094244672] gipcSendSyncF [gipchaLowerInternalSend : gipchaLower.c : 781]: EXCEPTION[ ret gipcretEndpointNotAvailable (40) ]  failed to send on endp 0x1940a3d0 [0000000000000428] { gipcEndpoint : localAddr 'udp://10.0.53.111:44337', remoteAddr '', numPend 5, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x2, usrFlags 0x4000 }, addr 0x191edab0 [0000000000001325] { gipcAddress : name 'udp://10.0.53.2:10745', objFlags 0x0, addrFlags 0x1 }, buf 0x2aaaac03e4b8, len 272, flags 0x0
2012-12-18 04:45:48.391: [GIPCHGEN][1094244672] gipchaInterfaceFail: marking interface failing 0x194a8c10 { host 'rac2', haName 'CSS_rac-cluster', local 0x193f9b20, ip '10.0.53.2:10745', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0x6 }
2012-12-18 04:45:48.391: [GIPCHALO][1094244672] gipchaLowerInternalSend: failed to initiate send on interface 0x194a8c10 { host 'rac2', haName 'CSS_rac-cluster', local 0x193f9b20, ip '10.0.53.2:10745', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0x86 }, hctx 0x18e1b240 [0000000000000010] { gipchaContext : host 'rac1', name 'CSS_rac-cluster', luid '8068c448-00000000', numNode 2, numInf 3, usrFlags 0x0, flags 0x67 }
2012-12-18 04:45:48.392: [GIPCHGEN][1094244672] gipchaInterfaceFail: marking interface failing 0x194a3680 { host 'rac3', haName 'CSS_rac-cluster', local 0x193f9b20, ip '10.0.53.3:40268', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 3, flags 0x6 }
2012-12-18 04:45:48.392: [GIPCHGEN][1094244672] gipchaInterfaceDisable: disabling interface 0x193f9b20 { host '', haName 'CSS_rac-cluster', local (nil), ip '10.0.53.111', subnet '10.0.53.0', mask '255.255.255.0', mac '08-00-27-27-0f-39', ifname 'eth3', numRef 0, numFail 2, idxBoot 0, flags 0x194d }
2012-12-18 04:45:48.392: [GIPCHGEN][1094244672] gipchaInterfaceDisable: disabling interface 0x194a8c10 { host 'rac2', haName 'CSS_rac-cluster', local 0x193f9b20, ip '10.0.53.2:10745', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0x86 }
2012-12-18 04:45:48.392: [GIPCHGEN][1094244672] gipchaInterfaceDisable: disabling interface 0x194a3680 { host 'rac3', haName 'CSS_rac-cluster', local 0x193f9b20, ip '10.0.53.3:40268', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 3, flags 0x86 }
2012-12-18 04:45:48.392: [GIPCHALO][1094244672] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x194a8c10 { host 'rac2', haName 'CSS_rac-cluster', local 0x193f9b20, ip '10.0.53.2:10745', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0xa6 }
2012-12-18 04:45:48.392: [GIPCHGEN][1094244672] gipchaInterfaceReset: resetting interface 0x194a8c10 { host 'rac2', haName 'CSS_rac-cluster', local 0x193f9b20, ip '10.0.53.2:10745', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0xa6 }
2012-12-18 04:45:48.392: [GIPCHALO][1094244672] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x194a3680 { host 'rac3', haName 'CSS_rac-cluster', local 0x193f9b20, ip '10.0.53.3:40268', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 3, flags 0xa6 }
2012-12-18 04:45:48.393: [GIPCHGEN][1094244672] gipchaInterfaceReset: resetting interface 0x194a3680 { host 'rac3', haName 'CSS_rac-cluster', local 0x193f9b20, ip '10.0.53.3:40268', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 3, flags 0xa6 }
2012-12-18 04:45:48.705: [GIPCHDEM][1094244672] gipchaWorkerCleanInterface: performing cleanup of disabled interface 0x193f9b20 { host '', haName 'CSS_rac-cluster', local (nil), ip '10.0.53.111', subnet '10.0.53.0', mask '255.255.255.0', mac '08-00-27-27-0f-39', ifname 'eth3', numRef 0, numFail 0, idxBoot 0, flags 0x196d }
2012-12-18 04:45:48.705: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote interface for node 'rac2', haName 'CSS_rac-cluster', inf 'udp://10.0.53.2:10745'
2012-12-18 04:45:48.705: [GIPCHGEN][1094244672] gipchaWorkerAttachInterface: Interface attached inf 0x194a8c10 { host 'rac2', haName 'CSS_rac-cluster', local 0x193fd310, ip '10.0.53.2:10745', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 4, flags 0x6 }
2012-12-18 04:45:48.705: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote interface for node 'rac3', haName 'CSS_rac-cluster', inf 'udp://10.0.53.3:40268'
2012-12-18 04:45:48.705: [GIPCHGEN][1094244672] gipchaWorkerAttachInterface: Interface attached inf 0x194a3680 { host 'rac3', haName 'CSS_rac-cluster', local 0x193fd310, ip '10.0.53.3:40268', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 3, flags 0x6 }
2012-12-18 04:45:49.831: [    CSSD][1125521728]clssnmSendingThread: sending status msg to all nodes

2012-12-18 04:45:49.831: [    CSSD][1125521728]clssnmSendingThread: sent 4 status msgs to all nodes


OK, now I manually bring eth3 back up to simulate recovery, then check the interfaces again:

[root@rac1 bin]# ./oifcfg iflist -p -n
eth0  192.168.53.0  PRIVATE  255.255.255.0
eth1  10.0.53.0  PRIVATE  255.255.255.0
eth1  169.254.0.0  UNKNOWN  255.255.192.0
eth1  169.254.192.0  UNKNOWN  255.255.192.0
eth2  10.0.53.0  PRIVATE  255.255.255.0
eth2  169.254.64.0  UNKNOWN  255.255.192.0
eth3  10.0.53.0  PRIVATE  255.255.255.0
eth3  169.254.128.0  UNKNOWN  255.255.192.0


[root@rac1 bin]# ip a
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 08:00:27:89:8d:da brd ff:ff:ff:ff:ff:ff
    inet 10.0.53.10/24 brd 10.0.53.255 scope global eth1
    inet 169.254.25.108/18 brd 169.254.63.255 scope global eth1:2
    inet 169.254.199.113/18 brd 169.254.255.255 scope global eth1:1
    inet6 fe80::a00:27ff:fe89:8dda/64 scope link 
       valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 08:00:27:ad:77:7d brd ff:ff:ff:ff:ff:ff
    inet 10.0.53.11/24 brd 10.0.53.255 scope global eth2
    inet 169.254.81.139/18 brd 169.254.127.255 scope global eth2:1
    inet6 fe80::a00:27ff:fead:777d/64 scope link 
       valid_lft forever preferred_lft forever
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 08:00:27:27:0f:39 brd ff:ff:ff:ff:ff:ff
    inet 10.0.53.111/24 brd 10.0.53.255 scope global eth3
    inet 169.254.154.185/18 brd 169.254.191.255 scope global eth3:1
    inet6 fe80::a00:27ff:fe27:f39/64 scope link 
       valid_lft forever preferred_lft forever

ohasd.log contents:
2012-12-18 04:58:54.490: [GIPCHGEN][1147304256]gipchaNodeAddInterface: adding interface information for inf 0x9fae090 { host '', haName 'CLSFRAME_rac-cluster', local (nil), ip '10.0.53.111', subnet '10.0.53.0', mask '255.255.255.0', mac '08-00-27-27-0f-39', ifname 'eth3', numRef 0, numFail 0, idxBoot 0, flags 0x1841 }
2012-12-18 04:58:54.943: [GIPCHTHR][1128393024]gipchaWorkerCreateInterface: created local interface for node 'rac1', haName 'CLSFRAME_rac-cluster', inf 'udp://10.0.53.111:36909'
2012-12-18 04:58:54.944: [GIPCHTHR][1128393024]gipchaWorkerCreateInterface: created local bootstrap multicast interface for node 'rac1', haName 'CLSFRAME_rac-cluster', inf 'mcast://224.0.0.251:42424/10.0.53.111'
2012-12-18 04:58:54.944: [GIPCHTHR][1128393024]gipchaWorkerCreateInterface: created local bootstrap multicast interface for node 'rac1', haName 'CLSFRAME_rac-cluster', inf 'mcast://230.0.1.0:42424/10.0.53.111'
2012-12-18 04:58:54.944: [GIPCHTHR][1128393024]gipchaWorkerCreateInterface: created local bootstrap broadcast interface for node 'rac1', haName 'CLSFRAME_rac-cluster', inf 'udp://10.0.53.255:42424'


orarootagent_root.log contents:
2012-12-18 04:58:54.911: [ USRTHRD][1126226240] {0:0:2} HAIP:  Updating member info HAIP1;10.0.53.0#0;10.0.53.0#1;10.0.53.0#2
2012-12-18 04:58:54.932: [ USRTHRD][1126226240] {0:0:2} HAIP:  Moving ip '169.254.154.185' from inf 'eth1' to inf 'eth3'
2012-12-18 04:58:54.932: [ USRTHRD][1126226240] {0:0:2} pausing thread
2012-12-18 04:58:54.932: [ USRTHRD][1126226240] {0:0:2} posting thread
2012-12-18 04:58:54.932: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]start {
2012-12-18 04:58:54.932: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]start }
2012-12-18 04:58:54.932: [ USRTHRD][1126226240] {0:0:2} HAIP:  Moving ip '169.254.199.113' from inf 'eth2' to inf 'eth1'
2012-12-18 04:58:54.932: [ USRTHRD][1126226240] {0:0:2} pausing thread
2012-12-18 04:58:54.932: [ USRTHRD][1126226240] {0:0:2} posting thread
2012-12-18 04:58:54.933: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]start {
2012-12-18 04:58:54.933: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]start }
2012-12-18 04:58:54.934: [ USRTHRD][1136732480] {0:0:2} [NetHAWork] thread started
2012-12-18 04:58:54.936: [ USRTHRD][1136732480] {0:0:2}  Arp::sCreateSocket { 
2012-12-18 04:58:54.936: [ USRTHRD][1132529984] {0:0:2} [NetHAWork] thread started
2012-12-18 04:58:54.936: [ USRTHRD][1132529984] {0:0:2}  Arp::sCreateSocket { 
2012-12-18 04:58:54.942: [ USRTHRD][1136732480] {0:0:2}  Arp::sCreateSocket } 
2012-12-18 04:58:54.942: [ USRTHRD][1136732480] {0:0:2} Starting Probe for ip 169.254.154.185
2012-12-18 04:58:54.942: [ USRTHRD][1136732480] {0:0:2} Transitioning to Probe State
2012-12-18 04:58:54.942: [ USRTHRD][1132529984] {0:0:2}  Arp::sCreateSocket } 
2012-12-18 04:58:54.942: [ USRTHRD][1132529984] {0:0:2} Starting Probe for ip 169.254.199.113
2012-12-18 04:58:54.943: [ USRTHRD][1132529984] {0:0:2} Transitioning to Probe State
2012-12-18 04:58:55.299: [ USRTHRD][1132529984] {0:0:2}  Arp::sProbe { 
2012-12-18 04:58:55.299: [ USRTHRD][1132529984] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:58:55.300: [ USRTHRD][1132529984] {0:0:2}  Arp::sProbe } 
2012-12-18 04:58:55.306: [ USRTHRD][1136732480] {0:0:2}  Arp::sProbe { 
2012-12-18 04:58:55.307: [ USRTHRD][1136732480] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:58:55.307: [ USRTHRD][1136732480] {0:0:2}  Arp::sProbe } 
2012-12-18 04:58:56.677: [ USRTHRD][1136732480] {0:0:2}  Arp::sProbe { 
2012-12-18 04:58:56.677: [ USRTHRD][1136732480] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:58:56.678: [ USRTHRD][1136732480] {0:0:2}  Arp::sProbe } 
2012-12-18 04:58:57.137: [ USRTHRD][1132529984] {0:0:2}  Arp::sProbe { 
2012-12-18 04:58:57.137: [ USRTHRD][1132529984] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:58:57.138: [ USRTHRD][1132529984] {0:0:2}  Arp::sProbe } 
2012-12-18 04:58:57.945: [ USRTHRD][1136732480] {0:0:2}  Arp::sProbe { 
2012-12-18 04:58:57.946: [ USRTHRD][1136732480] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:58:57.946: [ USRTHRD][1136732480] {0:0:2}  Arp::sProbe } 
2012-12-18 04:58:57.946: [ USRTHRD][1136732480] {0:0:2} Transitioning to Announce State
[  clsdmc][1087572288]CLSDMC.C returnbuflen=8, extraDataBuf=CC, returnbuf=6AB34B0
2012-12-18 04:58:58.176: [ora.ctssd][1087572288] {0:0:2} [check] clsdmc_respget return: status=0, ecode=0, returnbuf=[0x6ab34b0], buflen=8
2012-12-18 04:58:58.176: [ora.ctssd][1087572288] {0:0:2} [check] translateReturnCodes, return = 0, state detail = ACTIVE:0Checkcb data [0x6ab34b0]: mode[0xcc] offset[0 ms].
2012-12-18 04:58:58.654: [ USRTHRD][1132529984] {0:0:2}  Arp::sProbe { 
2012-12-18 04:58:58.654: [ USRTHRD][1132529984] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:58:58.655: [ USRTHRD][1132529984] {0:0:2}  Arp::sProbe } 
2012-12-18 04:58:58.655: [ USRTHRD][1132529984] {0:0:2} Transitioning to Announce State
2012-12-18 04:58:59.952: [ USRTHRD][1136732480] {0:0:2}  Arp::sAnnounce { 
2012-12-18 04:58:59.952: [ USRTHRD][1136732480] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:58:59.953: [ USRTHRD][1136732480] {0:0:2}  Arp::sAnnounce } 
2012-12-18 04:59:00.654: [ USRTHRD][1132529984] {0:0:2}  Arp::sAnnounce { 
2012-12-18 04:59:00.654: [ USRTHRD][1132529984] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:59:00.655: [ USRTHRD][1132529984] {0:0:2}  Arp::sAnnounce } 
2012-12-18 04:59:01.951: [ USRTHRD][1136732480] {0:0:2}  Arp::sAnnounce { 
2012-12-18 04:59:01.952: [ USRTHRD][1136732480] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:59:01.952: [ USRTHRD][1136732480] {0:0:2}  Arp::sAnnounce } 
2012-12-18 04:59:01.952: [ USRTHRD][1136732480] {0:0:2} Transitioning to Defend State
2012-12-18 04:59:01.952: [ USRTHRD][1136732480] {0:0:2} VipActions::startIp {
2012-12-18 04:59:01.952: [ USRTHRD][1136732480] {0:0:2} Adding 169.254.154.185 on eth3:1
2012-12-18 04:59:01.953: [ USRTHRD][1136732480] {0:0:2} VipActions::startIp }
2012-12-18 04:59:01.953: [ USRTHRD][1136732480] {0:0:2} Assigned IP:  169.254.154.185 on interface eth3
2012-12-18 04:59:02.656: [ USRTHRD][1132529984] {0:0:2}  Arp::sAnnounce { 
2012-12-18 04:59:02.656: [ USRTHRD][1132529984] {0:0:2} Arp::sSend:  sending type 1
2012-12-18 04:59:02.657: [ USRTHRD][1132529984] {0:0:2}  Arp::sAnnounce } 
2012-12-18 04:59:02.657: [ USRTHRD][1132529984] {0:0:2} Transitioning to Defend State
2012-12-18 04:59:02.657: [ USRTHRD][1132529984] {0:0:2} VipActions::startIp {
2012-12-18 04:59:02.657: [ USRTHRD][1132529984] {0:0:2} Adding 169.254.199.113 on eth1:1
2012-12-18 04:59:02.658: [ USRTHRD][1132529984] {0:0:2} VipActions::startIp }
2012-12-18 04:59:02.658: [ USRTHRD][1132529984] {0:0:2} Assigned IP:  169.254.199.113 on interface eth1
2012-12-18 04:59:02.931: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]stop {
2012-12-18 04:59:03.133: [ USRTHRD][1140934976] {0:0:2} [NetHAWork] thread stopping
2012-12-18 04:59:03.133: [ USRTHRD][1140934976] {0:0:2} Thread:[NetHAWork]isRunning is reset to false here
2012-12-18 04:59:03.133: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]stop }
2012-12-18 04:59:03.133: [ USRTHRD][1126226240] {0:0:2} VipActions::stopIp {
2012-12-18 04:59:03.133: [ USRTHRD][1126226240] {0:0:2} NetInterface::sStopIp {
2012-12-18 04:59:03.133: [ USRTHRD][1126226240] {0:0:2} Stopping ip '169.254.154.185', inf 'eth1', mask '10.0.53.0'
2012-12-18 04:59:03.133: [ USRTHRD][1126226240] {0:0:2} NetInterface::sStopIp }
2012-12-18 04:59:03.133: [ USRTHRD][1126226240] {0:0:2} VipActions::stopIp }
2012-12-18 04:59:03.136: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]stop {
2012-12-18 04:59:03.634: [ USRTHRD][1143036224] {0:0:2} [NetHAWork] thread stopping
2012-12-18 04:59:03.634: [ USRTHRD][1143036224] {0:0:2} Thread:[NetHAWork]isRunning is reset to false here
2012-12-18 04:59:03.635: [ USRTHRD][1126226240] {0:0:2} Thread:[NetHAWork]stop }
2012-12-18 04:59:03.635: [ USRTHRD][1126226240] {0:0:2} VipActions::stopIp {
2012-12-18 04:59:03.635: [ USRTHRD][1126226240] {0:0:2} NetInterface::sStopIp {
2012-12-18 04:59:03.635: [ USRTHRD][1126226240] {0:0:2} Stopping ip '169.254.199.113', inf 'eth2', mask '10.0.53.0'
2012-12-18 04:59:03.635: [ USRTHRD][1126226240] {0:0:2} NetInterface::sStopIp }
2012-12-18 04:59:03.635: [ USRTHRD][1126226240] {0:0:2} VipActions::stopIp }
2012-12-18 04:59:03.641: [ USRTHRD][1126226240] {0:0:2} USING HAIP[  0 ]:  eth3 - 169.254.154.185 
2012-12-18 04:59:03.641: [ USRTHRD][1126226240] {0:0:2} USING HAIP[  1 ]:  eth1 - 169.254.199.113 
2012-12-18 04:59:03.641: [ USRTHRD][1126226240] {0:0:2} USING HAIP[  2 ]:  eth1 - 169.254.25.108 
2012-12-18 04:59:03.641: [ USRTHRD][1126226240] {0:0:2} USING HAIP[  3 ]:  eth2 - 169.254.81.139 
2012-12-18 04:59:04.060: [ora.crsd][1087572288] {0:0:2} [check] clsdmc_respget return: status=0, ecode=10201
2012-12-18 04:59:04.060: [ora.crsd][1087572288] {0:0:2} [check] DaemonAgent::check returned 0
2012-12-18 04:59:10.669: [ora.crf][1113618752] {0:0:2} [check] clsdmc_respget return: status=0, ecode=0
2012-12-18 04:59:10.670: [ora.crf][1113618752] {0:0:2} [check] Check return = 0, state detail = NULL
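
The final HAIP placement can be pulled out of a log like the one above with a few lines of Python. This is only a minimal sketch: the regular expression is written against the `USING HAIP[ n ]: <interface> - <address>` lines shown above, and the names `parse_haip_assignments` and `haips_per_interface` are illustrative helpers, not part of any Oracle tool.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... {0:0:2} USING HAIP[  0 ]:  eth3 - 169.254.154.185
HAIP_RE = re.compile(
    r"USING HAIP\[\s*(\d+)\s*\]:\s+(\S+)\s+-\s+(\d{1,3}(?:\.\d{1,3}){3})"
)


def parse_haip_assignments(log_text):
    """Return {haip_index: (interface, address)}; a later line overrides an earlier one."""
    assignments = {}
    for line in log_text.splitlines():
        match = HAIP_RE.search(line)
        if match:
            index, interface, address = match.groups()
            assignments[int(index)] = (interface, address)
    return assignments


def haips_per_interface(assignments):
    """Count how many HAIP addresses each physical interface currently hosts."""
    return Counter(interface for interface, _ in assignments.values())


# The four summary lines copied from the orarootagent log above.
SAMPLE_LOG = """\
2012-12-18 04:59:03.641: [ USRTHRD][1126226240] {0:0:2} USING HAIP[  0 ]:  eth3 - 169.254.154.185
2012-12-18 04:59:03.641: [ USRTHRD][1126226240] {0:0:2} USING HAIP[  1 ]:  eth1 - 169.254.199.113
2012-12-18 04:59:03.641: [ USRTHRD][1126226240] {0:0:2} USING HAIP[  2 ]:  eth1 - 169.254.25.108
2012-12-18 04:59:03.641: [ USRTHRD][1126226240] {0:0:2} USING HAIP[  3 ]:  eth2 - 169.254.81.139
"""

if __name__ == "__main__":
    haips = parse_haip_assignments(SAMPLE_LOG)
    for index in sorted(haips):
        interface, address = haips[index]
        print(f"HAIP[{index}] {address} on {interface}")
    print(dict(haips_per_interface(haips)))
```

For the sample lines, the count shows three private adapters carrying four HAIP addresses, with eth1 hosting two of them.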

Contents of ocssd.log:
2012-12-18 04:58:54.488: [GIPCHGEN][1095821632] gipchaNodeAddInterface: adding interface information for inf 0x1a2bfec0 { host '', haName 'CSS_rac-cluster', local (nil), ip '10.0.53.111', subnet '10.0.53.0', mask '255.255.255.0', mac '08-00-27-27-0f-39', ifname 'eth3', numRef 0, numFail 0, idxBoot 0, flags 0x1841 }
2012-12-18 04:58:54.714: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created local interface for node 'rac1', haName 'CSS_rac-cluster', inf 'udp://10.0.53.111:33643'
2012-12-18 04:58:54.714: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created local bootstrap multicast interface for node 'rac1', haName 'CSS_rac-cluster', inf 'mcast://224.0.0.251:42424/10.0.53.111'
2012-12-18 04:58:54.715: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created local bootstrap multicast interface for node 'rac1', haName 'CSS_rac-cluster', inf 'mcast://230.0.1.0:42424/10.0.53.111'
2012-12-18 04:58:54.715: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created local bootstrap broadcast interface for node 'rac1', haName 'CSS_rac-cluster', inf 'udp://10.0.53.255:42424'
2012-12-18 04:58:54.715: [GIPCHGEN][1094244672] gipchaNodeAddInterface: adding interface information for inf 0x2aaaac064f20 { host 'rac2', haName 'CSS_rac-cluster', local (nil), ip '10.0.53.10', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1842 }
2012-12-18 04:58:54.715: [GIPCHGEN][1094244672] gipchaNodeAddInterface: adding interface information for inf 0x2aaaac065b90 { host 'rac2', haName 'CSS_rac-cluster', local (nil), ip '10.0.53.11', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1842 }
2012-12-18 04:58:54.715: [GIPCHGEN][1094244672] gipchaNodeAddInterface: adding interface information for inf 0x2aaaac065ef0 { host 'rac2', haName 'CSS_rac-cluster', local (nil), ip '10.0.53.111:33643', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1842 }
2012-12-18 04:58:54.715: [GIPCHGEN][1094244672] gipchaNodeAddInterface: adding interface information for inf 0x2aaaac066250 { host 'rac3', haName 'CSS_rac-cluster', local (nil), ip '10.0.53.10', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1842 }
2012-12-18 04:58:54.715: [GIPCHGEN][1094244672] gipchaNodeAddInterface: adding interface information for inf 0x2aaaac0bed60 { host 'rac3', haName 'CSS_rac-cluster', local (nil), ip '10.0.53.11', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1842 }
2012-12-18 04:58:54.715: [GIPCHGEN][1094244672] gipchaNodeAddInterface: adding interface information for inf 0x2aaaac0bf0c0 { host 'rac3', haName 'CSS_rac-cluster', local (nil), ip '10.0.53.111:33643', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1842 }
2012-12-18 04:58:54.715: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'rac2', haName 'CSS_rac-cluster', inf 'mcast://224.0.0.251:42424/10.0.53.10'
2012-12-18 04:58:54.716: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'rac2', haName 'CSS_rac-cluster', inf 'mcast://230.0.1.0:42424/10.0.53.10'
2012-12-18 04:58:54.716: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap broadcast interface for node 'rac2', haName 'CSS_rac-cluster', inf 'udp://10.0.53.255:42424'
2012-12-18 04:58:54.716: [GIPCHGEN][1094244672] gipchaWorkerAttachInterface: Interface attached inf 0x2aaaac064f20 { host 'rac2', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.10', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
2012-12-18 04:58:54.716: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'rac2', haName 'CSS_rac-cluster', inf 'mcast://224.0.0.251:42424/10.0.53.11'
2012-12-18 04:58:54.716: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'rac2', haName 'CSS_rac-cluster', inf 'mcast://230.0.1.0:42424/10.0.53.11'
2012-12-18 04:58:54.716: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap broadcast interface for node 'rac2', haName 'CSS_rac-cluster', inf 'udp://10.0.53.255:42424'
2012-12-18 04:58:54.716: [GIPCHGEN][1094244672] gipchaWorkerAttachInterface: Interface attached inf 0x2aaaac065b90 { host 'rac2', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.11', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
2012-12-18 04:58:54.716: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'rac2', haName 'CSS_rac-cluster', inf 'mcast://224.0.0.251:42424/10.0.53.111:33643'
2012-12-18 04:58:54.717: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'rac2', haName 'CSS_rac-cluster', inf 'mcast://230.0.1.0:42424/10.0.53.111:33643'
2012-12-18 04:58:54.717: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap broadcast interface for node 'rac2', haName 'CSS_rac-cluster', inf 'udp://10.0.53.255:33643'
2012-12-18 04:58:54.717: [GIPCHGEN][1094244672] gipchaWorkerAttachInterface: Interface attached inf 0x2aaaac065ef0 { host 'rac2', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.111:33643', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
2012-12-18 04:58:54.717: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'rac3', haName 'CSS_rac-cluster', inf 'mcast://224.0.0.251:42424/10.0.53.10'
2012-12-18 04:58:54.717: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'rac3', haName 'CSS_rac-cluster', inf 'mcast://230.0.1.0:42424/10.0.53.10'
2012-12-18 04:58:54.717: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap broadcast interface for node 'rac3', haName 'CSS_rac-cluster', inf 'udp://10.0.53.255:42424'
2012-12-18 04:58:54.717: [GIPCHGEN][1094244672] gipchaWorkerAttachInterface: Interface attached inf 0x2aaaac066250 { host 'rac3', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.10', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
2012-12-18 04:58:54.717: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'rac3', haName 'CSS_rac-cluster', inf 'mcast://224.0.0.251:42424/10.0.53.11'
2012-12-18 04:58:54.717: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'rac3', haName 'CSS_rac-cluster', inf 'mcast://230.0.1.0:42424/10.0.53.11'
2012-12-18 04:58:54.717: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap broadcast interface for node 'rac3', haName 'CSS_rac-cluster', inf 'udp://10.0.53.255:42424'
2012-12-18 04:58:54.717: [GIPCHGEN][1094244672] gipchaWorkerAttachInterface: Interface attached inf 0x2aaaac0bed60 { host 'rac3', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.11', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
2012-12-18 04:58:54.717: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'rac3', haName 'CSS_rac-cluster', inf 'mcast://224.0.0.251:42424/10.0.53.111:33643'
2012-12-18 04:58:54.717: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap multicast interface for node 'rac3', haName 'CSS_rac-cluster', inf 'mcast://230.0.1.0:42424/10.0.53.111:33643'
2012-12-18 04:58:54.717: [GIPCHTHR][1094244672] gipchaWorkerCreateInterface: created remote bootstrap broadcast interface for node 'rac3', haName 'CSS_rac-cluster', inf 'udp://10.0.53.255:33643'
2012-12-18 04:58:54.717: [GIPCHGEN][1094244672] gipchaWorkerAttachInterface: Interface attached inf 0x2aaaac0bf0c0 { host 'rac3', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.111:33643', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
2012-12-18 04:58:54.717: [GIPCHGEN][1094244672] gipchaInterfaceDisable: disabling interface 0x2aaaac065ef0 { host 'rac2', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.111:33643', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
2012-12-18 04:58:54.718: [GIPCHALO][1094244672] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x2aaaac065ef0 { host 'rac2', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.111:33643', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1866 }
2012-12-18 04:58:54.855: [GIPCHGEN][1094244672] gipchaInterfaceDisable: disabling interface 0x2aaaac065b90 { host 'rac2', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.11', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
2012-12-18 04:58:54.855: [GIPCHALO][1094244672] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x2aaaac065b90 { host 'rac2', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.11', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1866 }
2012-12-18 04:58:54.855: [GIPCHGEN][1094244672] gipchaInterfaceDisable: disabling interface 0x2aaaac066250 { host 'rac3', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.10', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
2012-12-18 04:58:54.855: [GIPCHGEN][1094244672] gipchaInterfaceDisable: disabling interface 0x2aaaac0bed60 { host 'rac3', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.11', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
2012-12-18 04:58:54.855: [GIPCHGEN][1094244672] gipchaInterfaceDisable: disabling interface 0x2aaaac0bf0c0 { host 'rac3', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.111:33643', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
2012-12-18 04:58:54.856: [GIPCHALO][1094244672] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x2aaaac066250 { host 'rac3', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.10', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1866 }
2012-12-18 04:58:54.856: [GIPCHALO][1094244672] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x2aaaac0bf0c0 { host 'rac3', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.111:33643', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1866 }
2012-12-18 04:58:54.856: [GIPCHALO][1094244672] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x2aaaac0bed60 { host 'rac3', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.11', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1866 }
2012-12-18 04:58:54.919: [GIPCHGEN][1094244672] gipchaInterfaceDisable: disabling interface 0x2aaaac064f20 { host 'rac2', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.10', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
2012-12-18 04:58:54.920: [GIPCHALO][1094244672] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x2aaaac064f20 { host 'rac2', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.10', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1866 }
2012-12-18 04:58:57.861: [    CSSD][1125521728]clssnmSendingThread: sending status msg to all nodes
2012-12-18 04:58:57.862: [    CSSD][1125521728]clssnmSendingThread: sent 4 status msgs to all nodes
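
The gipcha lines above are easier to follow when grouped by remote interface. The sketch below is one possible way to do that, assuming the `key 'value'` layout inside the braces stays as shown in this log; the labels `added`, `attached`, and `disabled` and the function names are my own shorthand, not Oracle terminology.

```python
import re
from collections import defaultdict

# Log function name -> short label (labels are illustrative shorthand).
EVENTS = {
    "gipchaNodeAddInterface": "added",
    "gipchaWorkerAttachInterface": "attached",
    "gipchaInterfaceDisable": "disabled",
}
# Picks up pairs such as: host 'rac2'   ip '10.0.53.10'   mac ''
FIELD_RE = re.compile(r"(\w+) '([^']*)'")


def parse_gipc_event(line):
    """Return (event, fields) for a gipcha interface line, or None for any other line."""
    for keyword, label in EVENTS.items():
        if keyword + ":" in line and "{" in line:
            record = line[line.index("{"):]
            return label, dict(FIELD_RE.findall(record))
    return None


def summarize(log_text):
    """Map (host, ip) -> ordered list of events seen for that interface."""
    history = defaultdict(list)
    for line in log_text.splitlines():
        parsed = parse_gipc_event(line)
        if parsed:
            event, fields = parsed
            history[(fields.get("host", ""), fields.get("ip", ""))].append(event)
    return dict(history)


# Three lines copied from the ocssd.log excerpt above (rac2, 10.0.53.10).
SAMPLE_LOG = """\
2012-12-18 04:58:54.715: [GIPCHGEN][1094244672] gipchaNodeAddInterface: adding interface information for inf 0x2aaaac064f20 { host 'rac2', haName 'CSS_rac-cluster', local (nil), ip '10.0.53.10', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1842 }
2012-12-18 04:58:54.716: [GIPCHGEN][1094244672] gipchaWorkerAttachInterface: Interface attached inf 0x2aaaac064f20 { host 'rac2', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.10', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
2012-12-18 04:58:54.919: [GIPCHGEN][1094244672] gipchaInterfaceDisable: disabling interface 0x2aaaac064f20 { host 'rac2', haName 'CSS_rac-cluster', local 0x1a2bfec0, ip '10.0.53.10', subnet '10.0.53.0', mask '255.255.255.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x1846 }
"""

if __name__ == "__main__":
    for (host, ip), events in summarize(SAMPLE_LOG).items():
        print(f"{host or '<local>'} {ip}: {' -> '.join(events)}")
```

Feeding the whole excerpt to `summarize` gives one event history per remote interface.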

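One last point: some of Oracle's new features are very useful, but they inevitably come with bugs. MOS note 1210883.1 already lists a number of known bugs for HAIP; please refer to it.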

1 Comment »

  1. luocs  said,
    December 18, 2012 @ 8:27 AM

    The bug section of the MOS note:
    11gR2 Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip [ID 1210883.1]

    It's NOT supported to disable HAIP or stop HAIP while the cluster is up and running, however:
    1. The feature is disabled in 11.2.0.2/11.2.0.3 if Sun Cluster exists
    2. The feature does not exist in Windows 11.2.0.2/11.2.0.3
    3. The feature is disabled in 11.2.0.2/11.2.0.3 if Fujitsu PRIMECLUSTER exists
    
    4. With the fix of bug 11077756 (fixed in 11.2.0.2 GI PSU6, 11.2.0.3), HAIP will be disabled if it fails to start while running root script (root.sh or rootupgrade.sh), for more details, refer to Section bug 11077756
    Known Issues
    Bug 12425730
    Issue: HAIP fails to start, gipcd.log shows rank 0 or "-1" for private network
    
    Fixed in: 11.2.0.3 and onward, refer to note 1374360.1 for details.
    bug 12674817
    
    Issue: HAIP fails to start if root script (root.sh or rootupgrade.sh) is executed via sudo
    
    Symptom:
     • Output of root script:
     CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'racnode1'
     CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:
     Start action for HAIP aborted
     CRS-2674: Start of 'ora.cluster_interconnect.haip' on 'racnode1' failed
     • $GRID_HOME/log//agent/ohasd/orarootagent_root/orarootagent_root.log
     2010-12-04 17:19:54.893: [ USRTHRD][2084] {0:3:37} failed to create arp
     2010-12-04 17:19:54.893: [ USRTHRD][2084] {0:3:37} (null) category: -2, operation: ioctl, loc: bpfopen:2,os, OS error: 14, other:
     
     OR
     
     2011-09-29 16:44:46.770: [ USRTHRD][3600] {0:3:14} failed to create arp
     2011-09-29 16:44:46.771: [ USRTHRD][3600] {0:3:14} (null) category: -2, operation: open, loc: bpfopen:1,os, OS error: 2, other:
     
     OR
     
     2011-09-29 16:44:46.770: [ USRTHRD][3600] {0:3:14} failed to create arp
     2011-09-29 16:44:46.771: [ USRTHRD][3600] {0:3:14} (null) category: -2, operation: open, loc: bpfopen:1,os, OS error: 22, other:
     
     OR
     
     2011-11-03 11:03:01.217: [ USRTHRD][25] {0:0:166} (null) category: -2, operation: open, loc: devopen:1,os, OS error: 2, other:
    Solution/Workaround:
    
    It's known on AIX and Solaris that command executed via sudo etc may not have full root environment, which could cause HAIP startup failure.
    
    The solution is to execute root script (root.sh or rootupgrade.sh) as real root user directly. If root script already failed, it may fail with same error while re-running, and the workaround is to reboot the node and run root script as root user directly.
    
    On AIX, alternative workaround is to execute "/usr/sbin/tcpdump -D" as root and verify that the following exists before re-running root script:
    ls -ltr /dev/bpf*
    cr--------   1 root     system       42,  0 Oct 03 10:32 /dev/bpf0
    ..
     
    Bug 10332426
    Issue: HAIP fails to start while running rootupgrade.sh
    
    Symptom:
      Output of root script:
     CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'racnode1'
     CRS-5017: The resource action "ora.cluster_interconnect.haip start"
     encountered the following error:
     Start action for HAIP aborted
     CRS-2674: Start of 'ora.cluster_interconnect.haip' on 'racnode1' failed
      $GRID_HOME/log/<hostname>/gipcd/gipcd.log
     2010-12-12 09:41:35.201: [ CLSINET][1088543040] Returning NETDATA: 0 interfaces
     2010-12-12 09:41:40.201: [ CLSINET][1088543040] Returning NETDATA: 0 interfaces
    Solution:
    
    The cause is mismatch of private network information in OCR and on OS, output of the following should be consistent with each other regarding network adapter name, subnet and netmask - see note 1296579.1 for what to check.
     oifcfg iflist -p -n
     oifcfg getif
     ifconfig
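
The name and subnet part of this comparison can be sketched in Python. This is a rough illustration, not an Oracle utility: it assumes `oifcfg getif` prints `<name> <subnet> <scope> <type>` and `oifcfg iflist -p -n` prints `<name> <subnet> <kind> <netmask>`, as in the AIX sample quoted elsewhere in this note, and it works on captured text rather than running the commands. The function names are hypothetical, and the netmask still has to be checked against `ifconfig` by hand.

```python
def parse_getif(text):
    """Parse `oifcfg getif` lines: <name> <subnet> <scope> <type>."""
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4:
            name, subnet, _scope, if_type = parts
            entries[name] = (subnet, if_type)
    return entries


def parse_iflist(text):
    """Parse `oifcfg iflist -p -n` lines: <name> <subnet> <kind> <netmask>."""
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4:
            name, subnet, _kind, netmask = parts
            # One adapter can be listed more than once (for example with a HAIP subnet).
            entries.setdefault(name, set()).add((subnet, netmask))
    return entries


def find_mismatches(getif_text, iflist_text):
    """List interconnect interfaces stored in OCR whose name or subnet the OS does not report."""
    os_view = parse_iflist(iflist_text)
    problems = []
    for name, (subnet, if_type) in parse_getif(getif_text).items():
        if "cluster_interconnect" not in if_type:
            continue
        if name not in os_view:
            problems.append(f"{name}: defined in OCR but not reported by the OS")
            continue
        os_subnets = sorted(s for s, _ in os_view[name])
        if subnet not in os_subnets:
            problems.append(f"{name}: OCR subnet {subnet} not among OS subnets {os_subnets}")
    return problems


# Sample lines taken from the AIX example in this note.
GETIF = "en12  10.0.1.0  global  public\nen13  10.1.1.0  global  cluster_interconnect\n"
IFLIST = "en12  10.0.1.0  PUBLIC  255.255.255.0\nen13  10.1.1.0  PUBLIC  255.255.255.0\n"
# Hypothetical OCR content with a wrong subnet, made up purely to show a mismatch.
GETIF_WRONG = "en13  10.1.2.0  global  cluster_interconnect\n"

if __name__ == "__main__":
    print(find_mismatches(GETIF, IFLIST))
    print(find_mismatches(GETIF_WRONG, IFLIST))
```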
    Bug 10363902
    Issue: GIPC HA disabled or HAIP fails to start if cluster interconnect is Infiniband or any other network hardware that has hardware address (MAC) longer than 6 bytes
    
    Fixed in: 11.2.0.3 for Linux and Solaris
    
    Symptom:
      Output of root script:
     CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'racnode1'
     CRS-5017: The resource action "ora.cluster_interconnect.haip start"
     encountered the following error:
     Start action for HAIP aborted
     CRS-2674: Start of 'ora.cluster_interconnect.haip' on 'racnode1' failed
      $GRID_HOME/log/<hostname>/gipcd/gipcd.log
     2010-12-07 13:23:08.560: [ USRTHRD][3858] {0:0:62} Arp::sCreateSocket {
     2010-12-07 13:23:08.560: [ USRTHRD][3858] {0:0:62} failed to create arp
     2010-12-07 13:23:08.561: [ USRTHRD][3858] {0:0:62} (null) category: -2,
     operation: ssclsi_aix_get_phys_addr, loc: aixgetpa:4,n, OS error: 2, other:
     
     2010-12-30 10:52:37.373: [ USRTHRD][15] {0:0:124} (null) category: -2, operation: ssclsi_dlpi_request, loc: dlpireq:8,na, OS error: 7, other:
     2010-12-30 10:52:37.462: [ USRTHRD][15] {0:0:124} Arp::sCreateSocket {
     2010-12-30 10:52:37.463: [ USRTHRD][15] {0:0:124} failed to create arp
     
     # lanscan
     Hardware Station        Crd Hdw   Net-Interface  NM  MAC       HP-DLPI DLPI
     Path     Address        In# State NamePPA        ID  Type      Support Mjr#
     ..
     LinkAgg1 0x0000004CFE8* 901 UP    lan901 snap901  9  IB        Yes     119
     ..
     IPOIB0   0x0000004CFE8* 9000 UP  lan9000 snap9000 5  IB        Yes     119
     
    Bug 10357258
    Issue: many HAIP created after active NIC fails in IPMP
    
    Fixed in: 11.2.0.3, 11.2.0.2 GI PSU3, interim patch 10357258 exists for 11.2.0.2, patch 11865154 for 11.2.0.2.1, affects Solaris only
    
    Symptom:
      ifconfig output:
     nxge3:2: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu  1500 index 5
               inet 169.254.20.88 netmask ffff0000 broadcast 169.254.255.255
     nxge3:3: flags=21000842<BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500  index 5
               inet 169.254.20.88 netmask ffff0000 broadcast 169.254.255.255
     ..
     
     Note the same HAIP shows up multiple times
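
Spotting this symptom in a long `ifconfig -a` listing can be automated. The following is a small sketch, assuming the Solaris-style layout shown above (an `<interface>: flags=` line followed by an indented `inet` line); the helper names are hypothetical.

```python
import re
from collections import defaultdict

IFACE_RE = re.compile(r"^\s*(\S+?): flags=")
INET_RE = re.compile(r"^\s*inet (\d{1,3}(?:\.\d{1,3}){3})\b")


def addresses_by_interface(ifconfig_text):
    """Map each inet address to the list of logical interfaces that carry it."""
    owners = defaultdict(list)
    current = None
    for line in ifconfig_text.splitlines():
        iface = IFACE_RE.match(line)
        if iface:
            current = iface.group(1)
            continue
        inet = INET_RE.match(line)
        if inet and current is not None:
            owners[inet.group(1)].append(current)
    return owners


def duplicated_haips(ifconfig_text):
    """Return link-local (169.254.x.x) addresses plumbed on more than one logical interface."""
    return {
        address: interfaces
        for address, interfaces in addresses_by_interface(ifconfig_text).items()
        if address.startswith("169.254.") and len(interfaces) > 1
    }


# The ifconfig lines quoted in the symptom above.
SAMPLE = """\
     nxge3:2: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu  1500 index 5
               inet 169.254.20.88 netmask ffff0000 broadcast 169.254.255.255
     nxge3:3: flags=21000842<BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500  index 5
               inet 169.254.20.88 netmask ffff0000 broadcast 169.254.255.255
"""

if __name__ == "__main__":
    for address, interfaces in duplicated_haips(SAMPLE).items():
        print(f"{address} appears on: {', '.join(interfaces)}")
```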
     
    Bug 10397652
    Issue: HAIP does not failover even when private network experiences problem (i.e. switch port disabled or such) as OS is not providing reliable link information
    
    Fixed in: 11.2.0.3
    
    Workaround on AIX is to set "MONITOR" flag for all private network adapters
     # ifconfig en1 monitor
     # ifconfig en1
     en1: flags=5e080863,2c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,
     GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN,MONITOR>
             inet 192.168.10.83 netmask 0xfffffc00 broadcast 192.168.11.255
             inet 169.254.74.136 netmask 0xffff8000 broadcast 169.254.127.255
              tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
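
The hexadecimal netmasks in this AIX output are easy to misread. As a worked check, the sketch below uses Python's standard `ipaddress` module to turn an address and hex mask into a network and broadcast address; `describe` is a hypothetical helper name.

```python
import ipaddress


def describe(address, hex_netmask):
    """Return (network, broadcast, is_link_local) for an address and an ifconfig-style hex mask."""
    mask = ipaddress.IPv4Address(int(hex_netmask, 16))
    # strict=False lets a host address be given where a network address is expected.
    network = ipaddress.IPv4Network(f"{address}/{mask}", strict=False)
    is_link_local = ipaddress.IPv4Address(address).is_link_local
    return network, network.broadcast_address, is_link_local


if __name__ == "__main__":
    # Both address/mask pairs are taken from the en1 output above.
    for address, mask in [("192.168.10.83", "0xfffffc00"), ("169.254.74.136", "0xffff8000")]:
        network, broadcast, link_local = describe(address, mask)
        print(f"{address}: network {network}, broadcast {broadcast}, link-local={link_local}")
```

For the HAIP address, the mask `0xffff8000` works out to a /17 network, and the computed broadcast address agrees with the `169.254.127.255` that ifconfig reports.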
    Bug 10253028
    Issue: "oifcfg iflist -p -n" not showing HAIP on AIX
    
    Fixed in: Expected behaviour on AIX
    
    Symptom:
      "oifcfg getif" output
     en12  10.0.1.0  global  public
     en13  10.1.1.0  global  cluster_interconnect
      "ifconfig -a" output
     en13: flags=5e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
            inet 10.1.1.143 netmask 0xffffff00 broadcast 10.1.1.255
            inet 169.254.228.154 netmask 0xffff0000 broadcast 169.254.255.255
             tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
     ..
     Note HAIP exists
      v$cluster_interconnects
     SQL> select * from gv$cluster_interconnects;
     
       INST_ID NAME            IP_ADDRESS       IS_ SOURCE
     ---------- --------------- ---------------- ---
             1 en13            169.254.228.154  NO
             2 en13            169.254.55.162   NO
      "oifcfg iflist -p -n" output
     en12  10.0.1.0  PUBLIC  255.255.255.0
     en13  10.1.1.0  PUBLIC  255.255.255.0
     
     Note usually we expect HAIP to be listed here as well, however it's not listed on AIX
     
      
    Bug 9795321
    Issue: Wrong MTU size for HAIP on Solaris, refer to note 1290585.1 for more details.
    Bug 11077756
    Issue: Startup failure of HAIP fails root script, fix of the bug will allow root script to continue so HAIP issue can be worked later.
    
    Fixed in: 11.2.0.2 GI PSU6, 11.2.0.3 and above
    
    Note: the consequence is that HAIP will be disabled. Once the cause is identified and solution is implemented, HAIP needs to be enabled when there's an outage window. To enable, as root on ALL nodes:
    
    # $GRID_HOME/bin/crsctl modify res ora.cluster_interconnect.haip -attr "ENABLED=1" -init
    # $GRID_HOME/bin/crsctl stop crs
    # $GRID_HOME/bin/crsctl start crs
    Bug 12546712
    Issue: ASM crashes as HAIP does not fail over when two or more private network fails , refer to note 1323995.1 for more details.
    Note 1366211.1
    Issue: HAIP fails to start if default gateway is configured for VLAN for private network on network switch
    
    orarootagent_root.log shows: PROBE: conflict detected src { 169.254.12.247, <gateway MAC on switch> }, target { 0.0.0.0, <private NIC MAC> }
    
    The solution is to remove default gateway setting on network switch for private network (VLAN), refer to note 1366211.1 for more details.
    bug 10114953
    Issue: Only one HAIP created on HP-UX
    
    The bug is fixed in 11.2.0.4, patch 10114953 is required before 11.2.0.4 is released.
    
    OS kernel parameter dlpi_max_ub_promisc must be set to greater than 1 for the patch to be effective.
    Note 1447517.1
    Issue: HAIP fails to start on AIX as other system devices using same major/minor number as bpf devices
    
    orarootagent_root.log shows: category: -2, operation: SETIF, loc: bpfopen:21,o, OS error: 22, other: dev /dev/bpf0, ifr en15
    
    The solution is to ensure no other device is using same major/minor as bpf device, refer to note 1447517.1 for more details.