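<1> We have two Oracle servers sharing storage through ASM. This morning the database would not start because the archive logs had filled the disk. After I manually deleted the surplus archivelogs from the +FLASH disk group, it still would not start. The errors and the fixes are described below.
Version: 11.2.0.1. After a duplicate, the standby database reported errors at startup and could not be mounted. The alert log showed: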
ALTER DATABASE MOUNT
Errors in file /u01/app/oracle/diag/rdbms/dg/dg/trace/dg_rbal_16494.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
WARNING: FAILED to load library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
SUCCESS: diskgroup ASMOCR was mounted
<txt>ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '+ASMOCR/dg/controlfile/current.259.871317183'
ORA-15081: failed to submit an I/O operation to a disk
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '+ASMOCR/dg/controlfile/current.258.871317183'
ORA-15081: failed to submit an I/O operation to a disk
ERROR: failed to establish dependency between database dg and diskgroup resource ora.ASMOCR.dg
ORA-205 signalled during: ALTER DATABASE MOUNT...
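These are the main error messages.
Working through them one by one:
1. ORA-205 is raised while the database moves to the mount state. Going from nomount to mount requires reading the control file. I checked the control file parameter in the spfile and it was fine.
2. Reading the control file at startup failed with ORA-204. What caused that?
3. ASM is in use, and the log shows that disk group ASMOCR was mounted successfully.
4. The database failed to associate with the disk group: ERROR: failed to establish dependency between database dg and diskgroup resource ora.ASMOCR.dg
5. id oracle: uid=1101(oracle) gid=1000(oinstall) groups=1000(oinstall),1031(dba),1020(asmadmin),1021(asmdba),1300(oper)
The user's group membership is fine.
6. Loading ASMLib failed.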
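A first pass over an alert log like the one above can be automated by tallying the distinct ORA- codes it contains. This is a minimal sketch and not part of the original troubleshooting: the function name `ora_codes` is made up for illustration, and it assumes a `grep` that supports `-o` (GNU and BSD grep do). Short forms such as `ORA-205` are padded to five digits so they group with `ORA-00205`.

```shell
# Tally the ORA- error codes found on stdin.
# Output: one "ORA-nnnnn count" pair per line, sorted by code.
ora_codes() {
    grep -oE 'ORA-[0-9]+' \
        | awk -F- '{ printf "ORA-%05d\n", $2 }' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}
```

Feed it the alert log on stdin, for example `ora_codes < alert.log`, where `alert.log` stands in for the real alert log path.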
Solution
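<2> I then used RMAN to clean up the state left behind by deleting the archive logs by hand:
Once archive logs have been deleted manually, an RMAN backup detects the missing logs and cannot proceed.
You therefore need to run a crosscheck by hand, after which RMAN backups work normally again.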
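The crosscheck step can also be scripted rather than typed interactively. The sketch below only prints the RMAN commands; the function name `rman_cleanup_script` is hypothetical, and the `noprompt` keyword is an addition of mine so that the delete does not stop for confirmation. Leave it out if you want to review the list first.

```shell
# Print the RMAN commands that mark missing archive logs as expired
# and then remove those expired records from the repository.
rman_cleanup_script() {
    cat <<'EOF'
crosscheck archivelog all;
delete noprompt expired archivelog all;
EOF
}
```

Assuming OS authentication works on the host, the output can be piped straight into RMAN: `rman_cleanup_script | rman target /`.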
1. Crosscheck the logs
$ rman target /
Recovery Manager: Release 9.2.0.4.0 - 64bit Production
Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.
connected to target database: AVATAR2 (DBID=2480694409)
RMAN> crosscheck archivelog all;
using target database controlfile instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=25 devtype=DISK
validation failed for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2714.dbf recid=2702 stamp=545107659
validation failed for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2715.dbf recid=2703 stamp=545108268
...........
validation failed for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2985.dbf recid=2973 stamp=545399327
validation succeeded for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2986.dbf recid=2974 stamp=545400820
validation succeeded for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2987.dbf recid=2975 stamp=545401757
validation succeeded for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2988.dbf recid=2976 stamp=545402716
validation succeeded for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2989.dbf recid=2977 stamp=545403661
validation succeeded for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2990.dbf recid=2978 stamp=545404946
validation succeeded for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2991.dbf recid=2979 stamp=545406220
Crosschecked 278 objects
RMAN>
RMAN> delete expired archivelog all;
released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=12 devtype=DISK
List of Archived Log Copies
Key Thrd Seq S Low Time Name
------- ---- ------- - --------- ----
376 1 2714 X 23-NOV-04 =/opt/oracle/oradata/avatar2/archive/1_2714.dbf
.....
RMAN> report obsolete;
RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
Report of obsolete backups and copies
Type Key Completion Time Filename/Handle
-------------------- ------ ------------------ --------------------
Backup Set 125 01-NOV-04
Backup Piece 125 01-NOV-04 /data1/oracle/orabak/full_1_541045804
Backup Set 131 04-NOV-04
Backup Piece 131 04-NOV-04 /data1/oracle/orabak/full_AVATAR2_20041104_131
....
Backup Set 173 06-DEC-04
Backup Piece 173 06-DEC-04 /data1/oracle/orabak/full_AVATAR2_20041206_173
Backup Set 179 11-DEC-04
Backup Piece 179 11-DEC-04 /data1/oracle/orabak/arch544588206.arc
.....
Backup Piece 189 17-DEC-04 /data1/oracle/orabak/arch545106606.arc
Backup Set 190 17-DEC-04
Backup Piece 190 17-DEC-04 /data1/oracle/orabak/arch545106665.arc
Backup Set 191 20-DEC-04
Backup Piece 191 20-DEC-04 /data1/oracle/orabak/arch_AVATAR2_20041220_194
Archive Log 2973 20-DEC-04 /opt/oracle/oradata/avatar2/archive/1_2985.dbf
Archive Log 2971 20-DEC-04 /opt/oracle/oradata/avatar2/archive/1_2984.dbf
.....
Archive Log 2705 17-DEC-04 /opt/oracle/oradata/avatar2/archive/1_2717.dbf
Archive Log 2704 17-DEC-04 /opt/oracle/oradata/avatar2/archive/1_2716.dbf
Archive Log 2703 17-DEC-04 /opt/oracle/oradata/avatar2/archive/1_2715.dbf
Archive Log 2702 17-DEC-04 /opt/oracle/oradata/avatar2/archive/1_2714.dbf
RMAN> delete obsolete;
RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
using channel ORA_DISK_1
Deleting the following obsolete backups and copies:
Type Key Completion Time Filename/Handle
-------------------- ------ ------------------ --------------------
Backup Set 125 01-NOV-04
Backup Piece 125 01-NOV-04 /data1/oracle/orabak/full_1_541045804
....
Archive Log 2704 17-DEC-04 /opt/oracle/oradata/avatar2/archive/1_2716.dbf
Archive Log 2703 17-DEC-04 /opt/oracle/oradata/avatar2/archive/1_2715.dbf
Archive Log 2702 17-DEC-04 /opt/oracle/oradata/avatar2/archive/1_2714.dbf
Do you really want to delete the above objects (enter YES or NO)? yes
deleted backup piece
backup piece handle=/data1/oracle/orabak/full_AVATAR2_20041206_173 recid=173 stamp=544156241
.....
deleted archive log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2715.dbf recid=2703 stamp=545108268
deleted archive log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2714.dbf recid=2702 stamp=545107659
Deleted 286 objects
RMAN> crosscheck archivelog all;
released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=19 devtype=DISK
specification does not match any archive log in the recovery catalog
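A crosscheck can list hundreds of logs, so a quick tally of the result is useful before deleting anything. The following sketch counts the two message types that appear in the session above. The function name `crosscheck_summary` is made up for illustration, and the message wording is taken from this 9.2 transcript, so other RMAN releases may need different patterns.

```shell
# Count failed and successful validations in saved crosscheck output (stdin).
crosscheck_summary() {
    awk '
        /^validation failed for archived log/    { failed++ }
        /^validation succeeded for archived log/ { ok++ }
        END { printf "failed=%d succeeded=%d\n", failed, ok }
    '
}
```

For example, save the RMAN output to a file and run `crosscheck_summary < crosscheck.out`, where `crosscheck.out` is a placeholder name.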
<3> With that done, I started the Oracle listener and hit another error:
Starting it manually failed with:
$ lsnrctl start
LSNRCTL for IBM/AIX RISC System/6000: Version 9.2.0.4.0 - Production on 17-NOV-2011 00:39:20
Copyright (c) 1991, 2002, Oracle Corporation. All rights reserved.
Starting /moracle/product/9.2.0/bin/tnslsnr: please wait...
TNSLSNR for IBM/AIX RISC System/6000: Version 9.2.0.4.0 - Production
System parameter file is /moracle/product/9.2.0/network/admin/listener.ora
Log messages written to /moracle/product/9.2.0/network/log/listener.log
Error listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC)))
TNS-12542: TNS:address already in use
TNS-12560: TNS:protocol adapter error
TNS-00512: Address already in use
IBM/AIX RISC System/6000 Error: 67: Address already in use
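I went through listener.ora carefully and found nothing wrong, and this file is rarely modified anyway. I suspected that some of the CRS services were in a bad state and tried crs_stat -t, but the shell reported that the crs_stat command was not found. I had to run it from the bin directory of the grid installation, where it showed both databases as OFFLINE.
I brought both databases online with srvctl, and after that the status was fine (the database listener was healthy, and crs_stat was found on the PATH again, so there was no longer any need to run it from the grid bin directory).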
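Spotting the offline resources can be done with a small filter over the `crs_stat -t` output. This is a sketch under one assumption: that the tabular output has the columns Name, Type, Target, State, Host in that order, so the state is the fourth field. The function name `offline_resources` is hypothetical.

```shell
# Print the names of resources whose State column reads OFFLINE.
# stdin: output of "crs_stat -t" (assumed columns: Name Type Target State Host).
offline_resources() {
    awk '$4 == "OFFLINE" { print $1 }'
}
```

Typical use would be `crs_stat -t | offline_resources`, followed by `srvctl start database -d <database name>` for each database resource it reports.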
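<4> Finally, a summary of the common RAC srvctl management commands, because they come up often when solving problems like these:
SRVCTL is the tool for managing Oracle RAC cluster configuration.
SRVM server management:
1. SRVCTL Add
Adds configuration information for a database or an instance. When adding an instance, the name given with -i should match the INSTANCE_NAME and ORACLE_SID parameters.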
srvctl add database -d <database name> [-m domain_name] -o <ORACLE_HOME path> -p <spfile location and name>
srvctl add instance -d <database name> -i <instance 1 name> -n <node 1 name >
srvctl add instance -d <database name> -i <instance 2 name> -n <node 2 name >
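Options:
-m database domain name, in the form "us.oracle.com"
The domain name given must match the DB_DOMAIN and DB_NAME parameters in the database's INIT.ORA or SPFILE. When adding a database, the name given with -d must match the DB_NAME parameter.
-n node name of the instance
-o $ORACLE_HOME (used to locate lsnrctl, oracle and other binaries)
-p SPFILE file name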
Eg:
$srvctl add database -d RAC -o /u01/oracle/product/10.2.0/db_1 -p +RAC_DISK/rac/spfilerac.ora
$srvctl add instance -d RAC -i rac1 -n node1
$srvctl add instance -d RAC -i rac2 -n node2
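2. SRVCTL Config
Displays the configuration information stored in the SRVM configuration file.
srvctl config database
Displays the list of configured databases.
srvctl config database -d database_name
The configuration of a database is displayed in this format: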
nodename1 instancename1 oraclehome
nodename2 instancename2 oraclehome
Eg:
$ srvctl config database
RAC
$srvctl config database -d rac
node1 rac1 /u01/oracle/product/10.2.0/db_1
node2 rac2 /u01/oracle/product/10.2.0/db_1
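Since each line of that output is a node, instance, Oracle home triple, looking up which instance runs on a given node is a one-line filter. A minimal sketch, assuming the three-column format shown above; the function name `instance_on_node` is made up for illustration.

```shell
# Print the instance name(s) configured on the given node.
# $1: node name; stdin: output of "srvctl config database -d <database name>".
instance_on_node() {
    awk -v node="$1" '$1 == node { print $2 }'
}
```

For example, `srvctl config database -d rac | instance_on_node node2` would print `rac2` for the configuration shown above.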
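3. SRVCTL Modify
Changes the node configuration of an instance. The change takes effect the next time the application is restarted, and the modified information is stored permanently.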
srvctl modify instance -d database_name -i instance_name -n node_name
Eg:
$srvctl modify instance -d rac -i rac1 -n new_node
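4. SRVCTL Remove
Removes configuration information from the SRVM repository, along with the environment settings associated with the object. Unless you pass the force flag (-f), Oracle asks you to confirm the removal.
With the force option (-f), the removal proceeds without prompting.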
srvctl remove database -d database_name [-f]
srvctl remove instance -d database_name -i instance_name [-f]
Options:
-f force removal without a confirmation prompt
Eg:
$srvctl remove database -d rac
$srvctl remove instance -d rac -i rac1
$srvctl remove instance -d rac -i rac2
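5. SRVCTL Start
Starts the database, either all of its instances or the named instances, together with any associated listeners that are not already running.
Note: for start and other operations that accept a connect string, if you do not supply one, Oracle uses "/ as sysdba" to perform the operation on the instance. You must also be a member of the OSDBA group to run these operations.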
srvctl start database -d database_name [-o start_options] [-c connect_string]
srvctl start instance -d database_name -i instance_name [,instance_name-list] [-o start_options][-c connect_string]
Options:
-o options passed directly to the SQL*Plus startup command; may include PFILE
-c connect string that SQL*Plus uses to connect to the database instance
Eg:
$srvctl start database -d rac
$ srvctl stop database -d rac -c "SYS/SYS_password as SYSDBA"
$srvctl start instance -d rac -i rac1,rac2
##############################################################
$srvctl start listener -n node1
$srvctl stop listener -n node2
$ srvctl stop listener -n node [-l listenername]
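A small SRVCTL bug I came across today. (http://yangtingkun.itpub.net/post/468/275571)
If you stop the listener with srvctl and then start it again with lsnrctl start, srvctl still believes the listener is stopped. A second srvctl stop then appears to do nothing at all. To make srvctl able to stop the listener, you first have to start it with srvctl and then stop it. I searched Metalink and found no note on this problem. It only affects stopping the listener; starting it works fine. Evidently srvctl only records its own operations and does not check the listener's real state.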
##############################################################
6. SRVCTL Status
Displays the current status of the specified database.
srvctl status database -d database_name
srvctl status instance -d database_name -i instance_name [,instance_name-list]
Eg:
$srvctl status database -d rac
$srvctl status instance -d rac -i rac1,rac2
7. SRVCTL Stop
Stops all instances of a database, or only the named instances.
srvctl stop database -d database_name [-o stop_options] [-c connect_string]
srvctl stop instance -d database_name -i instance_name [,instance_name_list] [-o stop_options][-c connect_string]
Options:
-c connect string that SQL*Plus uses to connect to the database instance
-o options passed directly to the SQL*Plus shutdown command
Eg:
$srvctl stop database -d rac
$srvctl stop instance -d rac -i rac2
$ srvctl stop service -d db_name [-s service_name_list [-i inst_name]]
$ srvctl stop asm -n node
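8. Importing and exporting RAW device configuration with SRVCONFIG
You can use SRVCONFIG to import and export RAW device configuration, whether the configuration file lives on a cluster file system or on a RAW device. This gives you a way to back up and restore the SRVM configuration.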
Eg:
The following command exports the configuration to a text file with the name you specify.
$srvconfig -exp file_name
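The following command imports the configuration from the named text file into the configuration repository of the RAC environment where you run it.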
$srvconfig -imp file_name
9. SRVCTL Getenv
The getenv operation retrieves and displays environment variables from the SRVM configuration file.
srvctl getenv database -d database_name [-t name[,name,...]]
srvctl getenv instance -d database_name -i instance_name [-t name[,name,...]]
Eg:
$srvctl getenv database -d rac
10. SRVCTL Setenv
Sets environment variable values in the SRVM configuration file.
srvctl setenv database -d database_name -t name=value[,name=value,...]
srvctl setenv instance -d database_name [-i instance_name] -t name=value[,name=value,...]
Eg:
$srvctl setenv database -d rac -t LANG=en
11. SRVCTL Unsetenv
Removes environment variable definitions from the SRVM configuration file.
srvctl unsetenv database -d database_name -t name[,name,...]
srvctl unsetenv instance -d database_name [-i instance_name] -t name[,name,...]
Eg:
$srvctl unsetenv database -d rac -t CLASSPATH
Updated @ 11-12-09 11:43
Example: on Windows, the correct startup and shutdown sequence is:
STARTUP:
node1$srvctl start nodeapps -n rac1
node1$srvctl start nodeapps -n rac2
node1$srvctl start asm -n rac1
node1$srvctl start asm -n rac2
node1$srvctl start database -d rac
node1$srvctl start service -d rac
node1$crs_stat -t
SHUTDOWN:
node1$srvctl stop service -d rac
node1$srvctl stop database -d rac
node1$srvctl stop asm -n rac2
node1$srvctl stop asm -n rac1
node1$srvctl stop nodeapps -n rac2
node1$srvctl stop nodeapps -n rac1
node1$crs_stat -t
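The two sequences above mirror each other: shutdown runs the same layers in reverse order, with the nodes reversed as well. A small generator makes that symmetry explicit. It is only a sketch, and the function name `rac_plan` and its argument order are made up for illustration. It prints the commands instead of running them, so the plan can be reviewed first.

```shell
# Print (do not run) the srvctl commands for starting or stopping a RAC stack.
# Usage: rac_plan start|stop <database> <node>...
rac_plan() {
    if [ "$#" -lt 3 ]; then
        echo "usage: rac_plan start|stop <database> <node>..." >&2
        return 2
    fi
    rp_action=$1
    rp_db=$2
    shift 2
    case $rp_action in
        start)
            for rp_n in "$@"; do echo "srvctl start nodeapps -n $rp_n"; done
            for rp_n in "$@"; do echo "srvctl start asm -n $rp_n"; done
            echo "srvctl start database -d $rp_db"
            echo "srvctl start service -d $rp_db"
            ;;
        stop)
            # Reverse the node list so the stop order mirrors the start order.
            rp_rev=
            for rp_n in "$@"; do rp_rev="$rp_n $rp_rev"; done
            echo "srvctl stop service -d $rp_db"
            echo "srvctl stop database -d $rp_db"
            for rp_n in $rp_rev; do echo "srvctl stop asm -n $rp_n"; done
            for rp_n in $rp_rev; do echo "srvctl stop nodeapps -n $rp_n"; done
            ;;
        *)
            echo "usage: rac_plan start|stop <database> <node>..." >&2
            return 2
            ;;
    esac
    # Finish with a status check, as in the sequences above.
    echo "crs_stat -t"
}
```

`rac_plan stop rac rac1 rac2` prints the same seven commands as the SHUTDOWN list above, and `rac_plan start rac rac1 rac2` prints the STARTUP list.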
-The End-