ARCHIVER ERROR ORA-00354: CORRUPT REDO LOG BLOCK HEADER

Problem Description:
ORA-16038: log 2 sequence# 13831 cannot be archived
ORA-00354: corrupt redo log block header
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'

LOG FILE
---------------
Filename = alert_TOOLS5_from_1021.log
See ...

...
Wed Oct 28 11:41:59 2009
Primary database is in MAXIMUM AVAILABILITY mode
Standby controlfile consistent with primary
RFS[1]: Successfully opened standby log 1: '/oradata/3/TOOLS/stdby_redo/srl0.log'
Wed Oct 28 11:42:00 2009
ARC0: Log corruption near block 604525 change 10551037679542 time ?
Wed Oct 28 11:42:00 2009
Errors in file /tools/oracle/admin/TOOLS/bdump/tools_arc0_2143.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 604525 change 10551037679542 time 10/28/2009 11:29:50
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'
ARC0: All Archive destinations made inactive due to error 354
Wed Oct 28 11:42:00 2009
ARC0: Closing local archive destination LOG_ARCHIVE_DEST_2: '/oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc' (error 354)
(TOOLS)
Committing creation of archivelog '/oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc' (error 354)
ARCH: Archival stopped, error occurred. Will continue retrying
Wed Oct 28 11:42:05 2009
ORACLE Instance TOOLS - Archival Error
Wed Oct 28 11:42:05 2009
ORA-16038: log 2 sequence# 13831 cannot be archived
ORA-00354: corrupt redo log block header
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'
Wed Oct 28 11:42:05 2009
Errors in file /tools/oracle/admin/TOOLS/bdump/tools_arc0_2143.trc:
ORA-16038: log 2 sequence# 13831 cannot be archived
ORA-00354: corrupt redo log block header
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'
Wed Oct 28 11:43:04 2009
ARCH: Archival stopped, error occurred. Will continue retrying
Wed Oct 28 11:43:04 2009
ORACLE Instance TOOLS - Archival Error
Wed Oct 28 11:43:04 2009
Primary database is in MAXIMUM AVAILABILITY mode
Changing standby controlfile to RESYNCHRONIZATION level
Wed Oct 28 11:43:04 2009
ORA-16014: log 1 sequence# 13832 not archived, no available destinations
ORA-00312: online log 1 thread 1: '/oradata/3/TOOLS/stdby_redo/srl0.log'
Wed Oct 28 11:43:04 2009
Errors in file /tools/oracle/admin/TOOLS/bdump/tools_arc1_2145.trc:
ORA-16014: log 1 sequence# 13832 not archived, no available destinations
ORA-00312: online log 1 thread 1: '/oradata/3/TOOLS/stdby_redo/srl0.log'
RFS[1]: Successfully opened standby log 2: '/oradata/3/TOOLS/stdby_redo/srl1.log'
Wed Oct 28 11:43:13 2009
RFS[3]: Archived Log: '/oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc'
Wed Oct 28 11:43:14 2009
RFS LogMiner: Registered logfile [/oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc] to LogMiner session id [4]
Wed Oct 28 11:43:15 2009
LOGMINER: Begin mining logfile for session 4 thread 1 sequence 13831, /oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc
Wed Oct 28 11:44:03 2009
RFS[3]: Archived Log: '/oradata/3/TOOLS/archive/dgarc/1_13832_635534096.arc'
...
LOG FILE
---------------
Filename = alert_TOOLS6_from_1021.log
See ...

...
Wed Oct 28 11:16:01 2009
Thread 1 advanced to log sequence 13830 (LGWR switch)
Current log# 8 seq# 13830 mem# 0: /oradata/1/redo/TOOLS/redo1a.log
Current log# 8 seq# 13830 mem# 1: /oradata/2/redo/TOOLS/redo1b.log
Current log# 8 seq# 13830 mem# 2: /oradata/3/redo/TOOLS/redo1c.log
Wed Oct 28 11:29:50 2009
LGWR: Standby redo logfile selected to archive thread 1 sequence 13831
LGWR: Standby redo logfile selected for thread 1 sequence 13831 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:29:50 2009
Thread 1 advanced to log sequence 13831 (LGWR switch)
Current log# 9 seq# 13831 mem# 0: /oradata/1/redo/TOOLS/redo2a.log
Current log# 9 seq# 13831 mem# 1: /oradata/2/redo/TOOLS/redo2b.log
Current log# 9 seq# 13831 mem# 2: /oradata/3/redo/TOOLS/redo2c.log
Wed Oct 28 11:41:59 2009
LGWR: Standby redo logfile selected to archive thread 1 sequence 13832
LGWR: Standby redo logfile selected for thread 1 sequence 13832 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:41:59 2009
Thread 1 advanced to log sequence 13832 (LGWR switch)
Current log# 10 seq# 13832 mem# 0: /oradata/1/redo/TOOLS/redo3a.log
Current log# 10 seq# 13832 mem# 1: /oradata/2/redo/TOOLS/redo3b.log
Current log# 10 seq# 13832 mem# 2: /oradata/3/redo/TOOLS/redo3c.log
Wed Oct 28 11:43:04 2009
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Standby redo logfile selected to archive thread 1 sequence 13833
LGWR: Standby redo logfile selected for thread 1 sequence 13833 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:43:04 2009
Thread 1 advanced to log sequence 13833 (LGWR switch)
Current log# 11 seq# 13833 mem# 0: /oradata/1/redo/TOOLS/redo4a.log
Current log# 11 seq# 13833 mem# 1: /oradata/2/redo/TOOLS/redo4b.log
Current log# 11 seq# 13833 mem# 2: /oradata/3/redo/TOOLS/redo4c.log
Wed Oct 28 11:45:04 2009
Destination LOG_ARCHIVE_DEST_2 is SYNCHRONIZED
LGWR: Standby redo logfile selected to archive thread 1 sequence 13834
LGWR: Standby redo logfile selected for thread 1 sequence 13834 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:45:05 2009
Thread 1 advanced to log sequence 13834 (LGWR switch)
Current log# 8 seq# 13834 mem# 0: /oradata/1/redo/TOOLS/redo1a.log
Current log# 8 seq# 13834 mem# 1: /oradata/2/redo/TOOLS/redo1b.log
Current log# 8 seq# 13834 mem# 2: /oradata/3/redo/TOOLS/redo1c.log
Wed Oct 28 11:46:03 2009
Thread 1 cannot allocate new log, sequence 13835
Checkpoint not complete
Current log# 8 seq# 13834 mem# 0: /oradata/1/redo/TOOLS/redo1a.log
Current log# 8 seq# 13834 mem# 1: /oradata/2/redo/TOOLS/redo1b.log
Current log# 8 seq# 13834 mem# 2: /oradata/3/redo/TOOLS/redo1c.log
Wed Oct 28 11:46:10 2009
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Standby redo logfile selected to archive thread 1 sequence 13835
LGWR: Standby redo logfile selected for thread 1 sequence 13835 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:46:11 2009
Thread 1 advanced to log sequence 13835 (LGWR switch)
Current log# 9 seq# 13835 mem# 0: /oradata/1/redo/TOOLS/redo2a.log
Current log# 9 seq# 13835 mem# 1: /oradata/2/redo/TOOLS/redo2b.log
Current log# 9 seq# 13835 mem# 2: /oradata/3/redo/TOOLS/redo2c.log
Wed Oct 28 11:48:03 2009
Thread 1 cannot allocate new log, sequence 13836
Checkpoint not complete
Current log# 9 seq# 13835 mem# 0: /oradata/1/redo/TOOLS/redo2a.log
Current log# 9 seq# 13835 mem# 1: /oradata/2/redo/TOOLS/redo2b.log
Current log# 9 seq# 13835 mem# 2: /oradata/3/redo/TOOLS/redo2c.log
Wed Oct 28 11:48:06 2009
...

On the standby, at 11:42 on 2009-10-28, the archiver tried to archive the standby
redo logfile and encountered this error:

ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 604525 change 10551037679542 time 10/28/2009 11:29:50
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'

Errors in file /tools/oracle/admin/TOOLS/bdump/tools_arc0_2143.trc

The standby RFS process then retrieved a good copy of the log from the primary, and log apply continued as usual. Because the ARC process identified the standby redo log as corrupt, some kind of I/O error is the likely cause. The alert.log shows that RFS fetched a new copy of the log that had been corrupted, which resolved the issue. What remains is a matter for system administration: check whether there are any problems in the I/O subsystem.

Script to Collect Data Guard Primary Site Diagnostic Information:
Overview
--------
This script is intended to provide an easy method to provide information
necessary to troubleshoot Data Guard issues.

Script Notes
-------------
This script is intended to be run via sqlplus as the SYS or Internal user.

Script
-------
- - - - - - - - - - - - - - - - Script begins here - - - - - - - - - - - - - - - -

-- NAME: dg_prim_diag.sql (Run on PRIMARY with a LOGICAL or PHYSICAL STANDBY)
-- ------------------------------------------------------------------------
-- Copyright 2002, Oracle Corporation
-- LAST UPDATED: 2/23/04
--
-- Usage: @dg_prim_diag
-- ------------------------------------------------------------------------
-- PURPOSE:
-- This script is to be used to assist in collecting information to help
-- troubleshoot Data Guard issues with an emphasis on Logical Standby.
-- ------------------------------------------------------------------------
-- DISCLAIMER:
-- This script is provided for educational purposes only. It is NOT
-- supported by Oracle World Wide Technical Support.
-- The script has been tested and appears to work as intended.
-- You should always run new scripts on a test instance initially.
-- ------------------------------------------------------------------------
-- Script output is as follows:

set echo off
set feedback off
column timecol new_value timestamp
column spool_extension new_value suffix
select to_char(sysdate,'Mondd_hhmi') timecol,
'.out' spool_extension from sys.dual;
column output new_value dbname
select value || '_' output
from v$parameter where name = 'db_name';
spool dg_prim_diag_&&dbname&&timestamp&&suffix
set linesize 79
set pagesize 35
set trim on
set trims on
alter session set nls_date_format = 'MON-DD-YYYY HH24:MI:SS';
set feedback on
select to_char(sysdate) time from dual;

set echo on

-- In the following the database_role should be primary as that is what
-- this script is intended to be run on. If protection_level is different
-- than protection_mode then for some reason the mode listed in
-- protection_mode experienced a need to downgrade. Once the error
-- condition has been corrected the protection_level should match the
-- protection_mode after the next log switch.

column role format a7 tru
column name format a10 wrap

select name,database_role role,log_mode,
protection_mode,protection_level
from v$database;

-- ARCHIVER can be (STOPPED | STARTED | FAILED). FAILED means that the
-- archiver failed to archive a log last time, but will try again within 5
-- minutes. LOG_SWITCH_WAIT The ARCHIVE LOG/CLEAR LOG/CHECKPOINT event log
-- switching is waiting for. Note that if ALTER SYSTEM SWITCH LOGFILE is
-- hung, but there is room in the current online redo log, then value is
-- NULL

column host_name format a20 tru
column version format a9 tru

select instance_name,host_name,version,archiver,log_switch_wait
from v$instance;

-- The following query gives us information about catpatch.
-- This way we can tell if the procedure doesn't match the image.

select version, modified, status from dba_registry
where comp_id = 'CATPROC';

-- Force logging is not mandatory but is recommended. Supplemental
-- logging must be enabled if the standby associated with this primary is
-- a logical standby. During normal operations it is acceptable for
-- SWITCHOVER_STATUS to be SESSIONS ACTIVE or TO STANDBY.

column force_logging format a13 tru
column remote_archive format a14 tru
column dataguard_broker format a16 tru

select force_logging,remote_archive,
supplemental_log_data_pk,supplemental_log_data_ui,
switchover_status,dataguard_broker
from v$database;

-- This query produces a list of all archive destinations. It shows if
-- they are enabled, what process is servicing that destination, if the
-- destination is local or remote, and if remote what the current mount ID
-- is.

column destination format a35 wrap
column process format a7
column archiver format a8
column ID format 99
column mid format 99

select dest_id "ID",destination,status,target,
schedule,process,mountid mid
from v$archive_dest order by dest_id;

-- This select will give further detail on the destinations as to what
-- options have been set. Register indicates whether or not the archived
-- redo log is registered in the remote destination control file.

set numwidth 8
column ID format 99

select dest_id "ID",archiver,transmit_mode,affirm,async_blocks async,
net_timeout net_time,delay_mins delay,reopen_secs reopen,
register,binding
from v$archive_dest order by dest_id;

-- The following select will show any errors that occurred the last time
-- an archive to the destination was attempted. If ERROR is
-- blank and status is VALID then the archive completed correctly.

column error format a55 wrap

select dest_id,status,error from v$archive_dest;

-- The query below will determine if any error conditions have been
-- reached by querying the v$dataguard_status view (view only available in
-- 9.2.0 and above):

column message format a80

select message, timestamp
from v$dataguard_status
where severity in ('Error','Fatal')
order by timestamp;

-- The following query will determine the current sequence number
-- and the last sequence archived. If you are remotely archiving
-- using the LGWR process then the archived sequence should be one
-- higher than the current sequence. If remotely archiving using the
-- ARCH process then the archived sequence should be equal to the
-- current sequence. The applied sequence information is updated at
-- log switch time.

select ads.dest_id,max(sequence#) "Current Sequence",
max(log_sequence) "Last Archived"
from v$archived_log al, v$archive_dest ad, v$archive_dest_status ads
where ad.dest_id=al.dest_id
and al.dest_id=ads.dest_id
group by ads.dest_id;

-- The following select will attempt to gather as much information as
-- possible from the standby. SRLs are not supported with Logical Standby
-- until Version 10.1.

set numwidth 8
column ID format 99
column "SRLs" format 99
column Active format 99

select dest_id id,database_mode db_mode,recovery_mode,
protection_mode,standby_logfile_count "SRLs",
standby_logfile_active ACTIVE,
archived_seq#
from v$archive_dest_status;

-- Query v$managed_standby to see the status of processes involved in
-- the shipping redo on this system. Does not include processes needed to
-- apply redo.

select process,status,client_process,sequence#
from v$managed_standby;

-- The following query is run on the primary to see if SRL's have been
-- created in preparation for switchover.

select group#,sequence#,bytes from v$standby_log;

-- The above SRL's should match in number and in size with the ORL's
-- returned below:

select group#,thread#,sequence#,bytes,archived,status from v$log;

-- Non-default init parameters.

set numwidth 5
column name format a30 tru
column value format a48 wra
select name, value
from v$parameter
where isdefault = 'FALSE';

spool off

- - - - - - - - - - - - - - - - Script ends here - - - - - - - - - - - - - - - -

A second script, for the physical standby side:
Overview
--------
This script is intended to provide an easy method to provide information
necessary to troubleshoot Data Guard issues.

Script Notes
-------------
This script is intended to be run via sqlplus as the SYS or Internal user.

Script
-------
- - - - - - - - - - - - - - - - Script begins here - - - - - - - - - - - - - - - -

-- NAME: DG_phy_stby_diag.sql
-- ------------------------------------------------------------------------
-- AUTHOR:
-- Michael Smith - Oracle Support Services - DataServer Group
-- Copyright 2002, Oracle Corporation
-- ------------------------------------------------------------------------
-- PURPOSE:
-- This script is to be used to assist in collecting information to help
-- troubleshoot Data Guard issues.
-- ------------------------------------------------------------------------
-- DISCLAIMER:
-- This script is provided for educational purposes only. It is NOT
-- supported by Oracle World Wide Technical Support.
-- The script has been tested and appears to work as intended.
-- You should always run new scripts on a test instance initially.
-- ------------------------------------------------------------------------
-- Script output is as follows:

set echo off
set feedback off
column timecol new_value timestamp
column spool_extension new_value suffix
select to_char(sysdate,'Mondd_hhmi') timecol,
'.out' spool_extension from sys.dual;
column output new_value dbname
select value || '_' output
from v$parameter where name = 'db_name';
spool dgdiag_phystby_&&dbname&&timestamp&&suffix
set lines 200
set pagesize 35
set trim on
set trims on
alter session set nls_date_format = 'MON-DD-YYYY HH24:MI:SS';
set feedback on
select to_char(sysdate) time from dual;

set echo on

--
-- ARCHIVER can be (STOPPED | STARTED | FAILED). FAILED means that the archiver failed
-- to archive a log last time, but will try again within 5 minutes. LOG_SWITCH_WAIT
-- The ARCHIVE LOG/CLEAR LOG/CHECKPOINT event log switching is waiting for. Note that
-- if ALTER SYSTEM SWITCH LOGFILE is hung, but there is room in the current online
-- redo log, then value is NULL

column host_name format a20 tru
column version format a9 tru
select instance_name,host_name,version,archiver,log_switch_wait
from v$instance;

-- The following select will give us the generic information about how this standby is
-- set up. The database_role should be standby as that is what this script is intended
-- to be run on. If protection_level is different than protection_mode then for some
-- reason the mode listed in protection_mode experienced a need to downgrade. Once the
-- error condition has been corrected the protection_level should match the protection_mode
-- after the next log switch.

column ROLE format a7 tru
select name,database_role,log_mode,controlfile_type,protection_mode,protection_level
from v$database;

-- Force logging is not mandatory but is recommended. Supplemental logging should be enabled
-- on the standby if a logical standby is in the configuration. During normal
-- operations it is acceptable for SWITCHOVER_STATUS to be SESSIONS ACTIVE or NOT ALLOWED.

column force_logging format a13 tru
column remote_archive format a14 tru
column dataguard_broker format a16 tru
select force_logging,remote_archive,supplemental_log_data_pk,supplemental_log_data_ui,
switchover_status,dataguard_broker
from v$database;

-- This query produces a list of all archive destinations and shows if they are enabled,
-- what process is servicing that destination, if the destination is local or remote,
-- and if remote what the current mount ID is. For a physical standby we should have at
-- least one remote destination that points the primary set but it should be deferred.

COLUMN destination FORMAT A35 WRAP
column process format a7
column archiver format a8
column ID format 99

select dest_id "ID",destination,status,target,
archiver,schedule,process,mountid
from v$archive_dest;

-- If the protection mode of the standby is set to anything higher than max performance
-- then we need to make sure the remote destination that points to the primary is set
-- with the correct options else we will have issues during switchover.

select dest_id,process,transmit_mode,async_blocks,
net_timeout,delay_mins,reopen_secs,register,binding
from v$archive_dest;

-- The following select will show any errors that occurred the last time an archive to
-- the destination was attempted. If ERROR is blank and status is VALID then
-- the archive completed correctly.

column error format a55 tru
select dest_id,status,error
from v$archive_dest;

-- Determine if any error conditions have been reached by querying the v$dataguard_status
-- view (view only available in 9.2.0 and above):

column message format a80
select message, timestamp
from v$dataguard_status
where severity in ('Error','Fatal')
order by timestamp;

-- The following query is run to get the status of the SRL's on the standby. If the
-- primary is archiving with the LGWR process and SRL's are present (in the correct
-- number and size) then we should see a group# active.

select group#,sequence#,bytes,used,archived,status
from v$standby_log;

-- The above SRL's should match in number and in size with the ORL's returned below:

select group#,thread#,sequence#,bytes,archived,status
from v$log;

-- Query v$managed_standby to see the status of processes involved in the
-- configuration.

select process,status,client_process,sequence#,block#,active_agents,known_agents
from v$managed_standby;

-- Verify the last sequence# received and the last sequence# applied to the standby
-- database.

select al.thrd "Thread", almax "Last Seq Received", lhmax "Last Seq Applied"
from (select thread# thrd, max(sequence#) almax
from v$archived_log
where resetlogs_change#=(select resetlogs_change# from v$database)
group by thread#) al,
(select thread# thrd, max(sequence#) lhmax
from v$log_history
where first_time=(select max(first_time) from v$log_history)
group by thread#) lh
where al.thrd = lh.thrd;

-- The V$ARCHIVE_GAP fixed view on a physical standby database only returns the next
-- gap that is currently blocking redo apply from continuing. After resolving the
-- identified gap and starting redo apply, query the V$ARCHIVE_GAP fixed view again
-- on the physical standby database to determine the next gap sequence, if there is
-- one.

select * from v$archive_gap;

-- Non-default init parameters.

set numwidth 5
column name format a30 tru
column value format a50 wra
select name, value
from v$parameter
where isdefault = 'FALSE';

spool off

- - - - - - - - - - - - - - - - Script ends here - - - - - - - - - - - - - - - -
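
For the corruption analysed earlier, ORA-00312 names standby redo log group 2 (srl1.log). As a minimal sketch, assuming a SQL*Plus session as SYS on the standby, the queries below show one way to look at that group directly. The group number is taken from the alert log. The commented-out CLEAR LOGFILE statements are an assumption about a possible manual remedy and were not part of this incident, where RFS re-fetched the log from the primary.

```sql
-- Sketch only: run on the standby via sqlplus as SYS.
-- Group 2 is the group reported in ORA-00312 in the alert log excerpt.

-- Which file backs each standby redo log group?
column member format a50 wrap
select group#,type,member
from v$logfile
where type = 'STANDBY'
order by group#;

-- Current state of the standby redo log groups.
select group#,thread#,sequence#,bytes,used,archived,status
from v$standby_log
order by group#;

-- Assumption: if the group remained unusable after the log had been
-- re-fetched, clearing it would be one option. The UNARCHIVED form
-- discards redo that has not been archived, so use it only when that
-- redo is no longer needed.
-- alter database clear logfile group 2;
-- alter database clear unarchived logfile group 2;
```

If the second query shows the group being reused with new sequence numbers, as the later alert log entries suggest, no manual step should be needed.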
