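I work on database architecture and tuning; I don't do operations.
I deployed this RAC in March and it had been running normally ever since. Then in May one node suddenly went down. I figured a node going down is a simple matter (banks and securities firms excepted).
But after I restarted the database it still would not come up, and the logs showed nothing useful.
Problem description:
This province currently runs a two-node RAC on database servers 150 and 152. 152 is healthy; 150 is down.
Cause:
The disk array is mapped through different paths on servers 152 and 150 (150 uses multipath, 152 uses plain single-path devices), which causes the database startup error.
Solution:
Regarding the 150 machine: does the project have other plans for it? Why was its disk array mapping path changed? Who is the contact person for this?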
[root@DP2 ~]# multipath -ll
[root@DP2 ~]# raw -qa
/dev/raw/raw1: bound to major 8, minor 16
/dev/raw/raw2: bound to major 8, minor 48
/dev/raw/raw3: bound to major 8, minor 112
/dev/raw/raw4: bound to major 8, minor 144
[root@DP2 ~]#
[root@DP1 ~]# multipath -ll
multipath.conf line 12, invalid keyword: devnode
multipath.conf line 13, invalid keyword: wwid
multipath.conf line 14, invalid keyword: }
mpath2 (3690b11c00017bfed00000602511138b0) dm-1 DELL,MD32xx
[size=1.8T][features=2 pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
\_ 7:0:1:1 sdh 8:112 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 7:0:0:1 sdc 8:32 [active][ghost]
mpath1 (3690b11c00017c13400000644511069ed) dm-0 DELL,MD32xx
[size=1.8T][features=2 pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
\_ 7:0:0:0 sdb 8:16 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 7:0:1:0 sdg 8:96 [active][ghost]
mpath5 (3690b11c00017c134000005e151104912) dm-4 DELL,Universal Xport
[size=20M][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
\_ 7:0:0:31 sdf 8:80 [failed][ready]
\_ round-robin 0 [prio=1][enabled]
\_ 7:0:1:31 sdk 8:160 [failed][ready]
mpath4 (3690b11c00017bfed00000604511138c3) dm-3 DELL,MD32xx
[size=1.8T][features=2 pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
\_ 7:0:1:3 sdj 8:144 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 7:0:0:3 sde 8:64 [active][ghost]
mpath3 (3690b11c00017c1340000064751106a0a) dm-2 DELL,MD32xx
[size=1.8T][features=2 pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
\_ 7:0:0:2 sdd 8:48 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 7:0:1:2 sdi 8:128 [active][ghost]
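To check this kind of mismatch without reading the listings by eye, the two outputs can be cross-referenced by major:minor number. The following is only a rough sketch: the regular expressions are modeled on the `raw -qa` and `multipath -ll` output pasted in this post (other multipath-tools versions print a different layout), and the function names are made up for illustration. Device numbers are local to each host, so both commands should be captured on the same node.

```python
import re

# Patterns modeled on the output format shown in this post (assumption).
RAW_RE = re.compile(r"^(/dev/raw/raw\d+):\s+bound to major (\d+), minor (\d+)")
MAP_RE = re.compile(r"^(\S+) \(([0-9a-f]+)\) dm-\d+")
PATH_RE = re.compile(r"\\_ \d+:\d+:\d+:\d+\s+(\S+)\s+(\d+):(\d+)\s+\[")


def parse_raw_bindings(text):
    """Return {raw device: (major, minor)} from `raw -qa` output."""
    bindings = {}
    for line in text.splitlines():
        m = RAW_RE.match(line.strip())
        if m:
            bindings[m.group(1)] = (int(m.group(2)), int(m.group(3)))
    return bindings


def parse_multipath_paths(text):
    """Return {(major, minor): (map name, sd device)} from `multipath -ll` output."""
    paths = {}
    current_map = None
    for line in text.splitlines():
        line = line.strip()
        m = MAP_RE.match(line)
        if m:
            current_map = m.group(1)
            continue
        p = PATH_RE.search(line)
        if p and current_map:
            paths[(int(p.group(2)), int(p.group(3)))] = (current_map, p.group(1))
    return paths


def raw_bound_to_path_devices(raw_text, multipath_text):
    """Return raw devices whose binding is a single path under a multipath map."""
    paths = parse_multipath_paths(multipath_text)
    return {
        raw: paths[devno]
        for raw, devno in parse_raw_bindings(raw_text).items()
        if devno in paths
    }


RAW_SAMPLE = """\
/dev/raw/raw1: bound to major 8, minor 16
/dev/raw/raw3: bound to major 8, minor 112
"""

MP_SAMPLE = r"""
mpath2 (3690b11c00017bfed00000602511138b0) dm-1 DELL,MD32xx
[size=1.8T][features=2 pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
 \_ 7:0:1:1 sdh 8:112 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 7:0:0:1 sdc 8:32 [active][ghost]
"""

for raw, (mp, sd) in raw_bound_to_path_devices(RAW_SAMPLE, MP_SAMPLE).items():
    print(f"{raw} is bound to {sd}, a single path of {mp}")
```

A raw device that shows up in this report is bound to one sd path rather than to the dm device, which is the "plain path" situation described above.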
Node 2 status:
[root@DP2 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DG01DATA.dg
ONLINE ONLINE dp2
ora.LISTENER.lsnr
ONLINE ONLINE dp2
ora.asm
ONLINE ONLINE dp2 Started
ora.gsd
OFFLINE OFFLINE dp2
ora.net1.network
ONLINE ONLINE dp2
ora.ons
ONLINE ONLINE dp2
ora.registry.acfs
ONLINE ONLINE dp2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE dp2
ora.cvu
1 OFFLINE OFFLINE
ora.dp1.vip
1 OFFLINE OFFLINE
ora.dp2.vip
1 ONLINE ONLINE dp2
ora.oc4j
1 OFFLINE OFFLINE
ora.roam.dataclient.svc
1 ONLINE ONLINE dp2
ora.roam.dataldr.svc
1 ONLINE ONLINE dp2
ora.roam.db
1 OFFLINE OFFLINE
2 ONLINE ONLINE dp2 Open
ora.scan1.vip
1 ONLINE ONLINE dp2
[root@DP2 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
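A long `crsctl stat res -t` listing is easy to misread, so here is a small sketch that pulls out the rows that are not ONLINE. It assumes the layout shown above (a line starting with `ora.` names the resource, and the following lines carry an optional instance number, then TARGET, STATE, and optionally the server). The function name and the tuple layout are my own choices, not anything Oracle provides.

```python
# States that crsctl can report for TARGET/STATE (assumed set).
STATES = {"ONLINE", "OFFLINE", "INTERMEDIATE", "UNKNOWN"}


def parse_stat_res(text):
    """Return a list of (resource, instance, target, state, server) rows."""
    rows = []
    name = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].startswith("ora."):
            name = tokens[0]          # start of a new resource
            continue
        if name is None:
            continue                  # still in the header
        instance = None
        if tokens[0].isdigit():       # cluster resources carry an instance number
            instance, tokens = int(tokens[0]), tokens[1:]
        if len(tokens) >= 2 and tokens[0] in STATES and tokens[1] in STATES:
            server = tokens[2] if len(tokens) > 2 else None
            rows.append((name, instance, tokens[0], tokens[1], server))
    return rows


STAT_SAMPLE = """\
ora.asm
ONLINE ONLINE dp2 Started
ora.gsd
OFFLINE OFFLINE dp2
ora.dp1.vip
1 OFFLINE OFFLINE
ora.roam.db
1 OFFLINE OFFLINE
2 ONLINE ONLINE dp2 Open
"""

for name, instance, target, state, server in parse_stat_res(STAT_SAMPLE):
    if state != "ONLINE":
        print(name, instance, target, state)
```

In the node 2 listing above, this would flag `ora.dp1.vip` and instance 1 of `ora.roam.db`, which are exactly the pieces that belong to the failed node.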
Node 1 status:
[root@DP1 ~]# crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
[root@DP1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
[root@DP1 ~]#
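The difference between the two nodes comes down to which message codes `crsctl check crs` prints. As a sketch, the four "is online" codes seen on node 2 can be turned into a per-daemon status. The short daemon labels (`ohas`, `crs`, `css`, `evm`) and the function name are just names I picked for the example.

```python
import re

# "Is online" message codes as they appear in the healthy node's output above.
ONLINE_CODES = {
    "CRS-4638": "ohas",  # Oracle High Availability Services
    "CRS-4537": "crs",   # Cluster Ready Services
    "CRS-4529": "css",   # Cluster Synchronization Services
    "CRS-4533": "evm",   # Event Manager
}


def stack_status(text):
    """Map each daemon label to True if its 'is online' code appears in text."""
    status = {label: False for label in ONLINE_CODES.values()}
    for code in re.findall(r"CRS-\d+", text):
        if code in ONLINE_CODES:
            status[ONLINE_CODES[code]] = True
    return status


NODE1_SAMPLE = """\
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
"""

print(stack_status(NODE1_SAMPLE))
```

On node 1 only OHAS reports online, which fits a stack that starts but cannot bring up the layers above it, CSS included.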
Once the IT department adjusts the disk array paths this afternoon, I will post the results.
/etc/init.d/ohasd stop
root@localhost source /home/oracle/.bash_profile
In case the disk array gets damaged and causes trouble, I have already backed up the data.
[root@DP1 raw]# ls -l
total 0
crw-rw---- 1 grid oinstall 162, 1 May 9 15:20 raw1
crw-rw---- 1 grid oinstall 162, 2 May 9 15:20 raw2
crw-rw---- 1 grid oinstall 162, 3 May 9 15:20 raw3
crw-rw---- 1 grid oinstall 162, 4 May 9 15:20 raw4
[root@DP1 raw]#
[root@DP1 raw]# crsctl check crs
CRS-4639: Could not contact Oracle High Availability Services
[root@DP1 raw]# /etc/init.d/ohasd start
Starting ohasd:
CRS-4123: Oracle High Availability Services has been started.
[root@DP1 raw]#
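I have switched it from the plain path to the multipath setup. Please wait...
After about 5 minutes of waiting it finished, and the first node is up again. The wait was worth it.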
CRS-4123: Oracle High Availability Services has been started.
[root@DP1 raw]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
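Instead of re-running `crsctl check crs` by hand for five minutes, the wait can be scripted. This is a minimal sketch, not something I ran during this incident: `wait_until` and `crs_fully_online` are hypothetical helper names, the timeout and interval are arbitrary, and it assumes `crsctl` is on the PATH of the user running it.

```python
import subprocess
import time

ONLINE_CODES = ("CRS-4638", "CRS-4537", "CRS-4529", "CRS-4533")


def crs_fully_online():
    """Run `crsctl check crs` and report whether all four daemons are online."""
    result = subprocess.run(
        ["crsctl", "check", "crs"], capture_output=True, text=True
    )
    return all(code in result.stdout for code in ONLINE_CODES)


def wait_until(check, timeout=600.0, interval=15.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or timeout seconds have passed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Give the stack up to 10 minutes after `/etc/init.d/ohasd start`.
    if wait_until(crs_fully_online):
        print("cluster stack is fully online")
    else:
        print("timed out waiting for the cluster stack")
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without actually waiting.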
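Moving on: let's try bringing up the second node with the traditional single-path mapping.
If it starts, I will have to revise this blog post, haha...
Please wait...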
CRS-4123: Oracle High Availability Services has been started.
Startup succeeded. This shows that single-path and multipath access can be used side by side.
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE dp1
ora.cvu
1 ONLINE ONLINE dp1
ora.dp1.vip
1 ONLINE ONLINE dp1
ora.dp2.vip
1 ONLINE ONLINE dp2
ora.oc4j
1 OFFLINE OFFLINE
ora.xxx.dataclient.svc
1 ONLINE ONLINE dp1
ora.xxx.dataldr.svc
1 ONLINE ONLINE dp1
ora.xxx.db
1 ONLINE ONLINE dp1 Open
2 ONLINE ONLINE dp2 Open
ora.scan1.vip