Official documentation on recreating the OCR and voting disk:
How to Recreate OCR / Voting Disk Accidentally Deleted [ID 399482.1]
http://blog.csdn.net/tianlesoftware/archive/2010/12/02/6049378.aspx
For backing up and restoring the OCR and voting disk, see:
Oracle 10g RAC: Backup and Recovery of the OCR and Voting Disk
http://blog.csdn.net/tianlesoftware/archive/2010/04/09/5467273.aspx
First, back up the voting disk and the OCR.
[root@rac1 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.orcl.db application ONLINE ONLINE rac2
ora....oltp.cs application ONLINE ONLINE rac2
ora....cl1.srv application ONLINE ONLINE rac1
ora....cl2.srv application ONLINE ONLINE rac2
ora....l1.inst application ONLINE ONLINE rac1
ora....l2.inst application ONLINE ONLINE rac2
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....SM2.asm application ONLINE ONLINE rac2
ora....C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
[root@rac1 bin]# ./crsctl query css votedisk
0. 0 /dev/raw/raw3
1. 0 /dev/raw/raw4
2. 0 /dev/raw/raw5
located 3 votedisk(s)
[root@rac1 bin]# dd if=/dev/raw/raw3 of=/u01/votingdisk.bak
401562+0 records in
401562+0 records out
205599744 bytes (206 MB) copied, 1685.53 seconds, 122 kB/s
[root@rac1 u01]# cd /u01/app/oracle/product/crs/bin/
[root@rac1 bin]# ./ocrconfig -export /u01/ocr.bak
[root@rac1 bin]# ll /u01
total 202132
drwxrwxrwx 3 oracle oinstall 4096 Nov 30 17:08 app
-rwxr-xr-x 1 oracle oinstall 1043097 Nov 30 18:59 clsfmt.bin
-rw-r--r-- 1 root root 103141 Dec 2 08:38 ocr.bak
-rwxr-xr-x 1 oracle oinstall 5542 Nov 30 19:00 srvctl
-rw-r--r-- 1 root root 205599744 Dec 2 08:45 votingdisk.bak
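The transcript above copies only /dev/raw/raw3, although crsctl reports three voting disks. The following is a minimal sketch, not part of the original procedure, that derives one dd command per voting disk from the `crsctl query css votedisk` output. The function names, the BACKUP_DIR variable, and the bs=4k block size are assumptions made for illustration; the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: derive one dd backup command per voting disk.

# list_votedisks reads `crsctl query css votedisk` output on stdin and
# prints the device path (third field) of each numbered entry.
list_votedisks() {
    awk '$1 ~ /^[0-9]+\.$/ { print $3 }'
}

# print_backup_cmds prints (does not run) one dd command per device read
# from stdin. BACKUP_DIR and bs=4k are illustrative assumptions.
print_backup_cmds() {
    backup_dir=${BACKUP_DIR:-/u01}
    while read -r dev; do
        echo "dd if=$dev of=$backup_dir/votedisk_$(basename "$dev").bak bs=4k"
    done
}

# Sample input copied from the transcript above.
sample_output='0. 0 /dev/raw/raw3
1. 0 /dev/raw/raw4
2. 0 /dev/raw/raw5
located 3 votedisk(s)'

printf '%s\n' "$sample_output" | list_votedisks | print_backup_cmds
```

On a live cluster the sample text would be replaced by piping `./crsctl query css votedisk` into list_votedisks, and the printed commands reviewed before being run as root.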
The rebuild procedure is as follows:
1. Stop CRS on all nodes
[root@rac1 bin]# ./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
2. Back up the Clusterware home on every node
[root@rac1 bin]# cd /u01/app/oracle/product/
[root@rac1 product]# ls
10.2.0 crs
[root@rac1 product]# cp -rp crs crs_back
3. Run <CRS_HOME>/install/rootdelete.sh on all nodes
[root@rac1 install]# pwd
/u01/app/oracle/product/crs/install
[root@rac1 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
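4. Run <CRS_HOME>/install/rootdeinstall.sh on the node from which the installation was performed
I performed the installation from rac1, so I run the command on that node. It only needs to be run on that one node.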
[root@rac1 install]# sh /u01/app/oracle/product/crs/install/rootdeinstall.sh
Removing contents from OCR mirror device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 108.972 seconds, 96.2 kB/s
Removing contents from OCR device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 89.2502 seconds, 117 kB/s
5. Check for CRS processes; if the commands return nothing, continue to the next step
[root@rac1 install]# ps -e | grep -i 'ocs[s]d'
[root@rac1 install]# ps -e | grep -i 'cr[s]d.bin'
[root@rac1 install]# ps -e | grep -i 'ev[m]d.bin'
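The three checks above can be folded into a single test. This is a minimal sketch under the assumption that the daemons appear in `ps -e` under the names the original commands search for (ocssd, crsd.bin, evmd.bin); the function name crs_daemons_running is hypothetical.

```shell
#!/bin/sh
# crs_daemons_running reads `ps -e` output on stdin. It returns 0 if any of
# the Clusterware daemons is listed and 1 otherwise.
crs_daemons_running() {
    grep -Eiq 'ocssd|crsd\.bin|evmd\.bin'
}

if ps -e | crs_daemons_running; then
    echo "Clusterware daemons still running; do not continue yet."
else
    echo "No Clusterware daemons found; continue with the next step."
fi
```

Run it on every node, since step 3 was executed on every node.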
6. Run <CRS_HOME>/root.sh on the installation node (the node from step 4)
[root@rac1 crs]# /u01/app/oracle/product/crs/root.sh -- note: run as root
WARNING: directory '/u01/app/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/u01/app/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
assigning default hostname rac1 for node 1.
assigning default hostname rac2 for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: rac1 rac1-priv rac1
node 2: rac2 rac2-priv rac2
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw3
Now formatting voting device: /dev/raw/raw4
Now formatting voting device: /dev/raw/raw5
Format of 3 voting devices complete.
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
rac1
CSS is inactive on these nodes.
rac2
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.
7. Run <CRS_HOME>/root.sh on the remaining nodes
[root@rac2 crs]# /u01/app/oracle/product/crs/root.sh
WARNING: directory '/u01/app/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/u01/app/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
assigning default hostname rac1 for node 1.
assigning default hostname rac2 for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: rac1 rac1-priv rac1
node 2: rac2 rac2-priv rac2
clscfg: Arguments check out successfully.
NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
rac1
rac2
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
Error 0(Native: listNetInterfaces:[3])
[Error 0(Native: listNetInterfaces:[3])]
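An error is reported here. When root.sh runs on the last node it calls vipca, and vipca failed because the network interfaces are not yet registered in the recreated OCR. Register the interfaces with oifcfg, then run vipca manually as root in Xmanager.
[root@rac1 bin]# ./oifcfg getif -- returns no interface information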
[root@rac1 bin]# ./oifcfg iflist
eth1 192.168.6.0
virbr0 192.168.122.0
eth0 192.168.6.0
[root@rac1 bin]# ./oifcfg setif -global eth0/192.168.6.0:public -- note: use the subnet address, ending in 0
[root@rac1 bin]# ./oifcfg setif -global eth1/192.168.6.0:cluster_interconnect
[root@rac1 bin]# ./oifcfg getif -- verify the configuration
eth0 192.168.6.0 global public
eth1 192.168.6.0 global cluster_interconnect
[root@rac1 bin]#
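As a quick sanity check before running vipca, the `oifcfg getif` output can be tested for both interface types. This is a minimal sketch; the function name has_both_interfaces is hypothetical and the sample text is copied from the transcript above.

```shell
#!/bin/sh
# has_both_interfaces reads `oifcfg getif` output on stdin and succeeds only
# if at least one public and one cluster_interconnect entry are present.
has_both_interfaces() {
    awk '$4 == "public" { p = 1 }
         $4 == "cluster_interconnect" { c = 1 }
         END { exit !(p && c) }'
}

# Sample input copied from the transcript above.
sample='eth0 192.168.6.0 global public
eth1 192.168.6.0 global cluster_interconnect'

if printf '%s\n' "$sample" | has_both_interfaces; then
    echo "public and cluster_interconnect are both registered"
else
    echo "interface registration is incomplete"
fi
```

On a live node, pipe `./oifcfg getif` into has_both_interfaces instead of the sample text.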
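Once the interfaces are configured, run vipca as root on any one node. vipca is a graphical tool and needs an X display, which is why Xmanager is used here; any other X server works just as well. When vipca finishes, the nodeapps resources (VIP, ONS, GSD) have been created.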
[root@rac1 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
8. Configure the listeners (netca)
(Recreating the listeners writes the listener information into the OCR.)
[oracle@rac1 ~]$ mv $TNS_ADMIN/listener.ora /tmp/listener.ora.original
[oracle@rac2 ~]$ mv $TNS_ADMIN/listener.ora /tmp/listener.ora.original
Then run netca as the oracle user in Xmanager. This is also a graphical tool.
[root@rac1 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
9. Configure ONS (racgons)
On 10g, use:
<CRS_HOME>/bin/racgons add_config hostname1:port hostname2:port
[oracle@rac1 bin]$ pwd
/u01/app/oracle/product/crs/bin
[oracle@rac1 bin]$ racgons add_config rac1:6251 rac2:6251
On 11g, use:
<CRS_HOME>/install/onsconfig add_config hostname1:port hostname2:port
[oracle@rac1 bin]$ onsconfig add_config rac1:6251 rac2:6251
Verify the configuration:
[oracle@rac1 bin]$ onsctl ping
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
{node = rac1, port = 6251}
Adding remote host rac1:6251
onscfg[1]
{node = rac2, port = 6251}
Adding remote host rac2:6251
ons is running ...
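If ONS is not running, start it with onsctl start.
10. Add the other resources to the OCR
Note: the names you register must match the ones used in the original installation exactly. They are case-sensitive.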
ASM
Syntax: srvctl add asm -n <node_name> -i <asm_instance_name> -o <oracle_home>
[oracle@rac1 bin]$ echo $ORACLE_HOME
/u01/app/oracle/product/10.2.0/db_1
[oracle@rac1 bin]$ srvctl add asm -n rac1 -i +ASM1 -o $ORACLE_HOME
[oracle@rac1 bin]$ srvctl add asm -n rac2 -i +ASM2 -o /u01/app/oracle/product/10.2.0/db_1
DATABASE
Syntax: srvctl add database -d <db_unique_name> -o <oracle_home>
[oracle@rac1 bin]$ srvctl add database -d orcl -o /u01/app/oracle/product/10.2.0/db_1
INSTANCE
Syntax: srvctl add instance -d <db_unique_name> -i <instance_name> -n <node_name>
[oracle@rac1 bin]$ srvctl add instance -d orcl -i orcl1 -n rac1
[oracle@rac1 bin]$ srvctl add instance -d orcl -i orcl2 -n rac2
SERVICE
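Syntax: srvctl add service -d <db_unique_name> -s <service_name> -r <preferred_list> -P <TAF_policy>
-r preferred_list is the list of preferred instances, which are used first; -a can additionally be given to list the available (standby) instances.
TAF_policy can be set to NONE, BASIC, or PRECONNECT.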
[oracle@rac1 bin]$ srvctl add service -d orcl -s oltp -r orcl1,orcl2 -P BASIC
After adding them, check the result:
[oracle@rac1 bin]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.orcl.db application OFFLINE OFFLINE
ora....oltp.cs application OFFLINE OFFLINE
ora....cl1.srv application OFFLINE OFFLINE
ora....cl2.srv application OFFLINE OFFLINE
ora....l1.inst application OFFLINE OFFLINE
ora....l2.inst application OFFLINE OFFLINE
ora....SM1.asm application OFFLINE OFFLINE
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....SM2.asm application OFFLINE OFFLINE
ora....C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
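The ASM, database, and instance registrations in step 10 follow a regular pattern, so they can be generated rather than typed. The sketch below only prints the commands for review. The variable names and the function gen_register_cmds are hypothetical, and the values are the ones used in this walkthrough; it also assumes that instance numbers follow node order (rac1 gets +ASM1 and orcl1, and so on).

```shell
#!/bin/sh
# Values taken from this walkthrough; adjust them for your cluster.
DB_NAME=orcl
DB_HOME=/u01/app/oracle/product/10.2.0/db_1
NODES="rac1 rac2"

# gen_register_cmds prints the srvctl commands in the same order as step 10:
# ASM instances first, then the database, then the database instances.
gen_register_cmds() {
    i=1
    for node in $NODES; do
        echo "srvctl add asm -n $node -i +ASM$i -o $DB_HOME"
        i=$((i + 1))
    done
    echo "srvctl add database -d $DB_NAME -o $DB_HOME"
    i=1
    for node in $NODES; do
        echo "srvctl add instance -d $DB_NAME -i $DB_NAME$i -n $node"
        i=$((i + 1))
    done
}

gen_register_cmds
```

Services are left out on purpose, because their preferred and available instance lists are a design decision rather than a pattern.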
11. Start the resources and verify
[oracle@rac1 bin]$ srvctl start asm -n rac1
[oracle@rac1 bin]$ srvctl start asm -n rac2
[oracle@rac1 bin]$ srvctl start database -d orcl
[oracle@rac1 bin]$ srvctl start service -d orcl
[root@rac1 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.orcl.db application ONLINE ONLINE rac1
ora....oltp.cs application ONLINE ONLINE rac2
ora....cl1.srv application ONLINE ONLINE rac1
ora....cl2.srv application ONLINE ONLINE rac2
ora....l1.inst application ONLINE ONLINE rac1
ora....l2.inst application ONLINE ONLINE rac2
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....SM2.asm application ONLINE ONLINE rac2
ora....C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
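With many resources it is easy to overlook one that did not come up. A minimal sketch, assuming the five-column `crs_stat -t` layout shown above, lists every resource whose State column is not ONLINE; the function name not_online is hypothetical and the sample text is illustrative.

```shell
#!/bin/sh
# not_online reads `crs_stat -t` output on stdin and prints the name of each
# resource (lines starting with "ora.") whose State column is not ONLINE.
not_online() {
    awk '$1 ~ /^ora\./ && $4 != "ONLINE" { print $1 }'
}

# Illustrative sample in the crs_stat -t layout.
sample='Name Type Target State Host
------------------------------------------------------------
ora.orcl.db application ONLINE ONLINE rac1
ora....SM1.asm application OFFLINE OFFLINE
ora.rac1.vip application ONLINE ONLINE rac1'

pending=$(printf '%s\n' "$sample" | not_online)
if [ -n "$pending" ]; then
    echo "Not yet ONLINE:"
    echo "$pending"
else
    echo "All resources ONLINE"
fi
```

On a live cluster, pipe `./crs_stat -t` into not_online instead of the sample text.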
[oracle@rac1 bin]$ cluvfy stage -post crsinst -n rac1,rac2
Performing post-checks for cluster services setup
Checking node reachability...
Node reachability check passed from node "rac1".
Checking user equivalence...
User equivalence check passed for user "oracle".
Checking Cluster manager integrity...
Checking CSS daemon...
Daemon status check passed for "CSS daemon".
Cluster manager integrity check passed.
Checking cluster integrity...
Cluster integrity check passed
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.
Uniqueness check for OCR device passed.
Checking the version of OCR...
OCR of correct Version "2" exists.
Checking data integrity of OCR...
Data integrity check for OCR passed.
OCR integrity check passed.
Checking CRS integrity...
Checking daemon liveness...
Liveness check passed for "CRS daemon".
Checking daemon liveness...
Liveness check passed for "CSS daemon".
Checking daemon liveness...
Liveness check passed for "EVM daemon".
Checking CRS health...
CRS health check passed.
CRS integrity check passed.
Checking node application existence...
Checking existence of VIP node application (required)
Check passed.
Checking existence of ONS node application (optional)
Check passed.
Checking existence of GSD node application (optional)
Check passed.
Post-check for cluster services setup was successful.
[oracle@rac1 bin]$
The rebuild is complete.