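Environment: RHEL 6.5 + 11.2.0.4 RAC, two nodes
Problem: I deliberately deleted the OLR; after a reboot, GI would not start.
Analysis:
1. Determine which stage GI startup reached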
[grid@rac1 ~]$ crsctl status resource -t -init
CRS-4639: Could not contact Oracle High Availability Services
CRS-4000: Command Status failed, or completed with errors.

Analysis: not even OHASD has started. There are two possibilities: (1) the init.ohasd script was never invoked, or (2) the ohasd.bin daemon failed to start. So check the process list:

[grid@rac1 ~]$ ps -ef | grep ohas |grep -v grep
root 960 1 0 09:23 ? 00:00:00 /bin/sh /etc/init.d/init.ohasd run

The script was invoked, but the daemon did not start successfully.
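The two-way check above can be scripted. This is a minimal sketch, not an Oracle tool: `classify_ohasd` is a hypothetical helper name, and it assumes the process names `init.ohasd` and `ohasd.bin` as they appear in this 11.2 environment.

```shell
#!/bin/bash
# classify_ohasd (hypothetical helper): reads process-list text on stdin
# and reports how far OHASD startup got.
#   daemon-running : an ohasd.bin process is present
#   script-only    : init.ohasd was invoked, but no ohasd.bin is running
#   not-invoked    : neither the script nor the daemon is in the list
classify_ohasd() {
    local ps_text
    ps_text=$(cat)
    if printf '%s\n' "$ps_text" | grep -q 'ohasd\.bin'; then
        echo "daemon-running"
    elif printf '%s\n' "$ps_text" | grep -q 'init\.ohasd'; then
        echo "script-only"
    else
        echo "not-invoked"
    fi
}
```

It would be called as `ps -ef | classify_ohasd`. In the situation described here it should report `script-only`, which is the cue to go and read the OHASD log.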
2. Check the OHASD log
2016-04-18 12:26:25.918: [ default][1661986592] OHASD Daemon Starting. Command string :restart
2016-04-18 12:26:25.919: [ default][1661986592] Initializing OLR
2016-04-18 12:26:25.919: [ OCROSD][1661986592]utopen:6m': failed in stat OCR file/disk /u01/app/11.2.0.1/grid/cdata/rac1.olr, errno=2, os err string=No such file or directory
2016-04-18 12:26:25.919: [ OCROSD][1661986592]utopen:7: failed to open any OCR file/disk, errno=2, os err string=No such file or directory
2016-04-18 12:26:25.919: [ OCRRAW][1661986592]proprinit: Could not open raw device
2016-04-18 12:26:25.919: [ OCRAPI][1661986592]a_init:16!: Backend init unsuccessful : [26]
2016-04-18 12:26:25.920: [ CRSOCR][1661986592] OCR context init failure. Error: PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]
2016-04-18 12:26:25.920: [ default][1661986592] Created alert : (:OHAS00106:) : OLR initialization failed, error: PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]
2016-04-18 12:26:25.920: [ default][1661986592][PANIC] OHASD exiting; Could not init OLR
2016-04-18 12:26:25.920: [ default][1661986592] Done.

Analysis: the errors say the OLR cannot be opened, so go and check whether the file exists (I deleted it by hand, so of course it does not).

[grid@rac1 cdata]$ ll
total 12
drwxrwxr-x 2 grid oinstall 4096 Apr 18 07:51 liming-cluster
drwxr-xr-x 2 grid oinstall 4096 Apr 18 07:49 localhost
drwxr-xr-x 2 grid oinstall 4096 Apr 18 08:11 rac1

The OLR file is gone.
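Instead of reading the path out of the log, the configured OLR location can be looked up and tested directly. The sketch below assumes that on Linux the location is recorded in `/etc/oracle/olr.loc` as a line of the form `olrconfig_loc=<path>`; verify that on your own system. `check_olr` is a hypothetical helper name.

```shell
#!/bin/bash
# check_olr (hypothetical helper): read an olr.loc-style file and report
# whether the OLR file it points to exists.
# Assumption: the file contains a line "olrconfig_loc=<path>".
# Return codes: 0 = OLR present, 1 = OLR missing, 2 = no location entry found.
check_olr() {
    local loc_file=${1:-/etc/oracle/olr.loc}
    local olr_path
    olr_path=$(sed -n 's/^olrconfig_loc=//p' "$loc_file" | head -n 1)
    if [ -z "$olr_path" ]; then
        echo "no olrconfig_loc entry in $loc_file"
        return 2
    fi
    if [ -f "$olr_path" ]; then
        echo "present: $olr_path"
    else
        echo "missing: $olr_path"
        return 1
    fi
}
```

Calling `check_olr` with no argument reads the default location file; passing a path as the first argument lets you point it at a copy for testing.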
3. Check whether an OLR backup exists
[grid@rac1 rac1]$ ll
total 6644
-rw------- 1 root root 6803456 Apr 18 08:11 backup_20160418_081108.olr

Good, a backup exists.
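If the backup directory holds several files, the newest one can be picked by name. This is only a sketch: `latest_olr_backup` is a hypothetical helper, and it assumes every backup follows the `backup_YYYYMMDD_HHMMSS.olr` naming seen in the listing, so that sorting by name is the same as sorting by time.

```shell
#!/bin/bash
# latest_olr_backup (hypothetical helper): print the newest OLR backup in a
# directory, assuming names of the form backup_YYYYMMDD_HHMMSS.olr.
# Shell globs expand in sorted order, so the last match is the newest.
# Returns 1 if the directory contains no matching file.
latest_olr_backup() {
    local dir=$1 f newest=""
    for f in "$dir"/backup_*.olr; do
        # When nothing matches, the pattern itself is returned; skip it.
        [ -e "$f" ] && newest=$f
    done
    if [ -n "$newest" ]; then
        echo "$newest"
    else
        return 1
    fi
}
```

In this environment it would be called as `latest_olr_backup /u01/app/11.2.0.1/grid/cdata/rac1`, run as a user who can read that directory.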
4. Restore the OLR
[root@rac1 bin]# ./ocrconfig -local -restore /u01/app/11.2.0.1/grid/cdata/rac1/backup_20160418_081108.olr
PROTL-35: The configured OLR location is not accessible.

Here comes the step the book leaves out: create an empty file at the OLR location first, then run the restore again.

[grid@rac1 cdata]$ touch rac1.olr
[root@rac1 bin]# ./ocrconfig -local -restore /u01/app/11.2.0.1/grid/cdata/rac1/backup_20160418_081108.olr
[root@rac1 bin]#
[grid@rac1 cdata]$ ll
total 6660
drwxrwxr-x 2 grid oinstall 4096 Apr 18 07:51 liming-cluster
drwxr-xr-x 2 grid oinstall 4096 Apr 18 07:49 localhost
drwxr-xr-x 2 grid oinstall 4096 Apr 18 08:11 rac1
-rw-r--r-- 1 grid oinstall 272756736 Apr 18 13:02 rac1.olr
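The two-step fix (create a placeholder, then restore) can be wrapped so the placeholder is never forgotten. This is a rough sketch under stated assumptions: `restore_olr` and the `OCRCONFIG` override variable are hypothetical names introduced here, and the only Oracle command used is the `ocrconfig -local -restore <backup>` call shown above. Note that in the session above the placeholder was created as `grid` and the restore was run as `root`; this sketch does both as whichever user calls it, so check file ownership afterwards.

```shell
#!/bin/bash
# restore_olr (hypothetical helper): make sure a file exists at the OLR
# location, then restore the given backup with "ocrconfig -local -restore".
# OCRCONFIG is a hypothetical override for the ocrconfig binary path
# (for example $GRID_HOME/bin/ocrconfig); it defaults to "ocrconfig" on PATH.
restore_olr() {
    local olr_path=$1 backup=$2
    local ocrconfig=${OCRCONFIG:-ocrconfig}
    if [ ! -f "$backup" ]; then
        echo "backup not found: $backup" >&2
        return 1
    fi
    # Without a file at the configured location the restore failed with
    # PROTL-35 in this environment, so create an empty placeholder first.
    [ -e "$olr_path" ] || touch "$olr_path" || return 1
    "$ocrconfig" -local -restore "$backup"
}
```

A call matching this post would look like `OCRCONFIG=/u01/app/11.2.0.1/grid/bin/ocrconfig restore_olr /u01/app/11.2.0.1/grid/cdata/rac1.olr /u01/app/11.2.0.1/grid/cdata/rac1/backup_20160418_081108.olr`.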
5. Start GI; everything is back to normal
[root@rac1 bin]# ./crsctl start crs