[Issue log] Standby MRP process died with I/O errors, and stopped again by itself after being restarted


Errors on the standby:

Sun Sep 22 07:18:17 2013

KCF: write/open error block=0x58530 online=1

     file=69 /dev/rrac_lv16_5g

     error=27070 txt: 'IBM AIX RISC System/6000 Error: 5: I/O error

Additional information: 1'

Sun Sep 22 07:19:40 2013

KCF: write/open error block=0x5852f online=1

     file=69 /dev/rrac_lv16_5g

...

 

Sun Sep 22 07:22:03 2013

Errors with log /arch/1_104730_751481221.dbf

MRP0: Background Media Recovery terminated with error 12801        -- the MRP process died

Sun Sep 22 07:22:04 2013

Errors in file /u01/oracle/admin/eptdb/bdump/eptdb_mrp0_667684.trc:

ORA-12801: error signaled in parallel query server P002

ORA-01579: write error occurred during recovery

 

 

After noticing the problem, I tried to start MRP again:

Sun Sep 22 09:14:01 2013

Completed: alter database recover managed standby database disconnect from session

Sun Sep 22 09:14:01 2013

Media Recovery Log /arch/1_104730_751481221.dbf

Media Recovery Log /arch/2_98851_751481221.dbf

 

...

 

Shortly afterwards it died again:

Sun Sep 22 09:24:10 2013

KCF: write/open error block=0x4b302 online=1

     file=58 /dev/rrac_lv09_20g

     error=27070 txt: 'IBM AIX RISC System/6000 Error: 5: I/O error

Additional information: 1'

KCF: write/open error block=0x4b303 online=1

     file=58 /dev/rrac_lv09_20g

     error=27070 txt: 'IBM AIX RISC System/6000 Error: 5: I/O error

Additional information: 1'

KCF: write/open error block=0x4b304 online=1

     file=58 /dev/rrac_lv09_20g

     error=27070 txt: 'IBM AIX RISC System/6000 Error: 5: I/O error

-- this looks like an I/O problem

...

Sun Sep 22 09:26:20 2013

Errors in file /u01/oracle/admin/eptdb/bdump/eptdb_p002_610360.trc:

ORA-01579: write error occurred during recovery

Sun Sep 22 09:26:20 2013

Errors in file /u01/oracle/admin/eptdb/bdump/eptdb_p001_921670.trc:

ORA-01579: write error occurred during recovery

Sun Sep 22 09:26:23 2013

Write error has occurred during recovery

Sun Sep 22 09:26:23 2013

Write error has occurred during recovery

Sun Sep 22 09:26:23 2013

Write error has occurred during recovery

Sun Sep 22 09:26:23 2013

Errors with log /arch/2_98852_751481221.dbf

MRP0: Background Media Recovery terminated with error 12801

Sun Sep 22 09:26:23 2013

Errors in file /u01/oracle/admin/eptdb/bdump/eptdb_mrp0_942194.trc:

ORA-12801: error signaled in parallel query server P000

ORA-01579: write error occurred during recovery

 

...

Sun Sep 22 09:56:51 2013

Errors in file /u01/oracle/admin/eptdb/udump/eptdb_rfs_880812.trc:

ORA-00345: redo log write error block 32589 count 2042

ORA-00312: online log 8 thread 1: '/dev/rstb_redo1_2_1g'

ORA-27070: async read/write failed

IBM AIX RISC System/6000 Error: 5: I/O error

Additional information: 1

Sun Sep 22 09:57:59 2013

Primary database is in MAXIMUM PERFORMANCE mode

Sun Sep 22 09:57:59 2013

ARC3: Standby redo logfile selected for thread 2 sequence 98863 for destination LOG_ARCHIVE_DEST_2

Sun Sep 22 09:57:59 2013

RFS[25198]: Successfully opened standby log 10: '/dev/rstb_redo1_4_1g'

Sun Sep 22 10:02:00 2013

Errors in file /u01/oracle/admin/eptdb/udump/eptdb_rfs_880812.trc:

ORA-00600: internal error code, arguments: [kcrrpicc.4], [], [], [], [], [], [], []

Sun Sep 22 10:02:04 2013

Redo Shipping Client Connected as PUBLIC

 

Errors on the primary:

Sun Sep 22 10:00:08 2013

Errors in file /u01/oracle/admin/eptdb/bdump/eptdb1_lns1_1130638.trc:

ORA-00270: error creating archive log

LGWR: Error 270 closing archivelog file 'standby1'

LNS: Standby redo logfile selected for thread 1 sequence 104743 for destination LOG_ARCHIVE_DEST_2

Sun Sep 22 10:03:54 2013

Errors in file /u01/oracle/admin/eptdb/bdump/eptdb1_lns1_1130638.trc:

ORA-00340: IO error processing online log  of thread

Sun Sep 22 10:03:54 2013

LGWR: I/O error 340 archiving log 1 to 'standby1'

 

At this point LOG_ARCHIVE_DEST_2 had gone into ERROR status:

querydb1:/home/oracle>$orz dgarcdest

DEST_NAME            STATUS    DATABASE_MODE   DESTINATION
-------------------- --------- --------------- --------------------
LOG_ARCHIVE_DEST_1   VALID     OPEN            /arch
LOG_ARCHIVE_DEST_2   ERROR     MOUNTED-STANDBY standby1
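
The `orz dgarcdest` command above is a local wrapper script whose contents are not shown in this post. A rough equivalent, assuming it reads V$ARCHIVE_DEST_STATUS (that is my assumption, not something the script confirms), might look like the sketch below. The ERROR column is added because it usually records why a destination failed.

```sql
-- Sketch only: approximates the output of the local "orz dgarcdest" script.
-- Assumes the script reads V$ARCHIVE_DEST_STATUS; the real script may differ.
SELECT dest_name,
       status,
       database_mode,
       destination,
       error
FROM   v$archive_dest_status
WHERE  status <> 'INACTIVE'
ORDER  BY dest_id;
```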

 

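Neither the datafiles nor the archived logs could be written. Since all of these files are stored on raw devices, I suspected a problem with the volumes that hold the raw devices.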
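
To tie the identifiers in the alert log back to devices, a lookup along these lines could be run on the standby. The file numbers 58 and 69 and log group 8 are taken from the messages above. This is only a sketch for confirming that the failing files all sit on raw logical volumes.

```sql
-- Datafiles named in the KCF write/open errors (file=58, file=69)
SELECT file#, name, status
FROM   v$datafile
WHERE  file# IN (58, 69)
ORDER  BY file#;

-- Redo log group named in the ORA-00312 message (online log 8)
SELECT group#, type, member
FROM   v$logfile
WHERE  group# = 8;
```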

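When I reported the problem to the host (system administration) team, they said it was caused by insufficient memory. However, I had checked physical memory and swap at the time, and neither was under much pressure.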

 

Resolution:

The operating system was later rebooted, and that resolved the problem.

 

Restart redo apply on the standby:

SQL> alter database recover managed standby database disconnect from session;

 

SQL> SELECT PROCESS, STATUS, THREAD#, SEQUENCE#, BLOCK#, BLOCKS FROM V$MANAGED_STANDBY order by 1;

PROCESS   STATUS          THREAD#  SEQUENCE#     BLOCK#     BLOCKS
--------- ------------ ---------- ---------- ---------- ----------
ARCH      CLOSING               2      98890     190465        553
ARCH      CLOSING               1     104764     190465       1778
ARCH      CLOSING               1     104766     190465        607
ARCH      CLOSING               2      98891     190465        962
ARCH      CLOSING               1     104767     190465        463
MRP0      APPLYING_LOG          1     104768          0          0

-- There are no RFS processes

-- The reason is that LOG_ARCHIVE_DEST_2 on the primary is still in ERROR status

 

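Re-enable LOG_ARCHIVE_DEST_2 on the primary (the destination state is controlled by the LOG_ARCHIVE_DEST_STATE_2 parameter):

alter system set LOG_ARCHIVE_DEST_STATE_2=enable;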
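
After re-enabling the destination, a quick check on the primary can show whether it has actually left the ERROR state. The queries below are a sketch that assumes destination 2 is the standby, as in this setup, and that its state is held in the `log_archive_dest_state_2` parameter. The status may only refresh once the next log is shipped.

```sql
-- Run on the primary: current state parameter for destination 2
SELECT name, value
FROM   v$parameter
WHERE  name = 'log_archive_dest_state_2';

-- Run on the primary: status and last recorded error for destination 2
SELECT dest_id, status, error
FROM   v$archive_dest
WHERE  dest_id = 2;
```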

 

Query the processes on the standby again:

SQL> SELECT PROCESS, STATUS, THREAD#, SEQUENCE#, BLOCK#, BLOCKS FROM V$MANAGED_STANDBY order by 1;

PROCESS   STATUS          THREAD#  SEQUENCE#     BLOCK#     BLOCKS
--------- ------------ ---------- ---------- ---------- ----------
ARCH      CLOSING               2      98890     190465        553
ARCH      CLOSING               1     104764     190465       1778
ARCH      CLOSING               1     104766     190465        607
ARCH      CLOSING               2      98891     190465        962
ARCH      CLOSING               1     104767     190465        463
MRP0      WAIT_FOR_LOG          1     104768          0          0
RFS       IDLE                  1     104768     142875       6312
RFS       IDLE                  0          0          0          0
RFS       IDLE                  0          0          0          0
RFS       IDLE                  0          0          0          0
RFS       IDLE                  0          0          0          0
RFS       IDLE                  0          0          0          0
RFS       IDLE                  2      98892      67433       1892
RFS       IDLE                  0          0          0          0
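
With the RFS processes back, it may also be worth confirming that the standby is not missing any archived logs from the outage window. The following is a minimal sketch to run on the standby. It is not part of the original troubleshooting, just a follow-up check one might add.

```sql
-- Run on the standby: any range of archived logs that is still missing
SELECT thread#, low_sequence#, high_sequence#
FROM   v$archive_gap;

-- Run on the standby: highest applied sequence per redo thread
SELECT thread#, MAX(sequence#) AS last_applied
FROM   v$archived_log
WHERE  applied = 'YES'
GROUP  BY thread#
ORDER  BY thread#;
```

If the first query returns no rows and the applied sequence numbers keep advancing, redo transport and apply have both recovered.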
