Oracle安全:SCN可能最大值与耗尽问题
在2012年第一季度的CPU补丁中,包含了一个关于SCN修正的重要变更,这个补丁提示,在异常情况下,Oracle的SCN可能出现异常增长,使得数据库的一切事务停止,由于SCN不能后退,所以数据库必须重建,才能够重用。
我曾经在以下链接中描述过这个问题:
http://www.eygle.com/archives/2012/03/oracle_scn_bug_exhaused.html
Oracle使用6 Bytes记录SCN,也就是48位,其最大值是:
SQL> col scn for 999,999,999,999,999,999 SQL> select power(2,48) scn from dual; SCN ------------------------ 281,474,976,710,656
Oracle在内部控制每秒增减的SCN不超过 16K,按照这样计算,这个数值可以使用大约544年:
SQL> select power(2,48) / 16 / 1024 / 3600 / 24 / 365 from dual; POWER(2,48)/16/1024/3600/24/365 ------------------------------- 544.770078
然而在出现异常时,尤其是当使用DB Link跨数据库查询时,SCN会被同步,在以下链接中,我曾经描述过此问题:
http://www.eygle.com/archives/2006/11/db_link_checkpoint_scn.html
一个数据库当前最大的可能SCN被称为"最大合理SCN",该值可以通过如下方式计算:
col scn for 999,999,999,999,999,999 select ( ( ( ( ( ( to_char(sysdate,'YYYY')-1988 )*12+ to_char(sysdate,'mm')-1 )*31+to_char(sysdate,'dd')-1 )*24+to_char(sysdate,'hh24') )*60+to_char(sysdate,'mi') )*60+to_char(sysdate,'ss') ) * to_number('ffff','XXXXXXXX')/4 scn from dual / 这个算法即SCN算法,以1988年1月1日 00点00时00分开始,每秒计算1个点数,最大SCN为16K。
这个内容可以参考如下链接:
http://www.eygle.com/archives/2006/01/how_big_scn_can_be.html
在CPU补丁中,Oracle提供了一个脚本 scnhealthcheck.sql 用于检查数据库当前SCN的剩余情况。
该脚本的算法和以上描述相同,最终将最大合理SCN 减去当前数据库SCN,计算得出一个指标:HeadRoom。也就是SCN尚余的顶部空间,这个顶部空间最后折合成天数:
以下是这个脚本的内容:
Rem Rem $Header: rdbms/admin/scnhealthcheck.sql st_server_tbhukya_bug-13498243/8 2012/01/17 03:37:18 tbhukya Exp $ Rem Rem scnhealthcheck.sql Rem Rem Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved. Rem Rem NAME Rem scnhealthcheck.sql - Scn Health check Rem Rem DESCRIPTION Rem Checks scn health of a DB Rem Rem NOTES Rem . Rem Rem MODIFIED (MM/DD/YY) Rem tbhukya 01/11/12 - Created Rem Rem define LOWTHRESHOLD=10 define MIDTHRESHOLD=62 define VERBOSE=FALSE set veri off; set feedback off; set serverout on DECLARE verbose boolean:=&&VERBOSE; BEGIN For C in ( select version, date_time, dbms_flashback.get_system_change_number current_scn, indicator from ( select version, to_char(SYSDATE,'YYYY/MM/DD HH24:MI:SS') DATE_TIME, (((( ((to_number(to_char(sysdate,'YYYY'))-1988)*12*31*24*60*60) + ((to_number(to_char(sysdate,'MM'))-1)*31*24*60*60) + (((to_number(to_char(sysdate,'DD'))-1))*24*60*60) + (to_number(to_char(sysdate,'HH24'))*60*60) + (to_number(to_char(sysdate,'MI'))*60) + (to_number(to_char(sysdate,'SS'))) ) * (16*1024)) - dbms_flashback.get_system_change_number) / (16*1024*60*60*24) ) indicator from v$instance ) ) LOOP dbms_output.put_line( '-----------------------------------------------------' || '---------' ); dbms_output.put_line( 'ScnHealthCheck' ); dbms_output.put_line( '-----------------------------------------------------' || '---------' ); dbms_output.put_line( 'Current Date: '||C.date_time ); dbms_output.put_line( 'Current SCN: '||C.current_scn ); if (verbose) then dbms_output.put_line( 'SCN Headroom: '||round(C.indicator,2) ); end if; dbms_output.put_line( 'Version: '||C.version ); dbms_output.put_line( '-----------------------------------------------------' || '---------' ); IF C.version > '10.2.0.5.0' and C.version NOT LIKE '9.2%' THEN IF C.indicator>&MIDTHRESHOLD THEN dbms_output.put_line('Result: A - SCN Headroom is good'); dbms_output.put_line('Apply the latest recommended patches'); dbms_output.put_line('based on your maintenance schedule'); IF (C.version < '11.2.0.2') THEN dbms_output.put_line('AND set _external_scn_rejection_threshold_hours=' || '24 after apply.'); END IF; ELSIF C.indicator<=&LOWTHRESHOLD THEN dbms_output.put_line('Result: C - SCN Headroom is low'); dbms_output.put_line('If you have not already done so apply' ); dbms_output.put_line('the latest recommended patches right now' ); IF (C.version < '11.2.0.2') THEN dbms_output.put_line('set _external_scn_rejection_threshold_hours=24 ' || 'after apply'); END IF; dbms_output.put_line('AND contact Oracle support immediately.' ); ELSE dbms_output.put_line('Result: B - SCN Headroom is low'); dbms_output.put_line('If you have not already done so apply' ); dbms_output.put_line('the latest recommended patches right now'); IF (C.version < '11.2.0.2') THEN dbms_output.put_line('AND set _external_scn_rejection_threshold_hours=' ||'24 after apply.'); END IF; END IF; ELSE IF C.indicator<=&MIDTHRESHOLD THEN dbms_output.put_line('Result: C - SCN Headroom is low'); dbms_output.put_line('If you have not already done so apply' ); dbms_output.put_line('the latest recommended patches right now' ); IF (C.version >= '10.1.0.5.0' and C.version <= '10.2.0.5.0' and C.version NOT LIKE '9.2%') THEN dbms_output.put_line(', set _external_scn_rejection_threshold_hours=24' || ' after apply'); END IF; dbms_output.put_line('AND contact Oracle support immediately.' ); ELSE dbms_output.put_line('Result: A - SCN Headroom is good'); dbms_output.put_line('Apply the latest recommended patches'); dbms_output.put_line('based on your maintenance schedule '); IF (C.version >= '10.1.0.5.0' and C.version <= '10.2.0.5.0' and C.version NOT LIKE '9.2%') THEN dbms_output.put_line('AND set _external_scn_rejection_threshold_hours=24' || ' after apply.'); END IF; END IF; END IF; dbms_output.put_line( 'For further information review MOS document id 1393363.1'); dbms_output.put_line( '-----------------------------------------------------' || '---------' ); END LOOP; end; / 在应用补丁之后,一个新的隐含参数 _external_scn_rejection_threshold_hours 引入,通常设置该参数为 24 小时:
_external_scn_rejection_threshold_hours=24
这个设置降低了SCN Headroom的顶部空间,以前缺省的设置容量至少为31天,降低为 24 小时,tb可以增大SCN允许增长的合理空间。
但是如果不加控制,SCN仍然可能会超过最大的合理范围,导致数据库问题。
这个问题的影响会极其严重,我们建议用户检验当前数据库的SCN使用情况,以下是检查脚本的输出范例:
-------------------------------------- ScnHealthCheck -------------------------------------- Current Date: 2012/01/15 14:17:49 Current SCN: 13194140054241 Version: 11.2.0.2.0 -------------------------------------- Result: C - SCN Headroom is low If you have not already done so apply the latest recommended patches right now AND contact Oracle support immediately. For further information review MOS document id 1393363. -------------------------------------- 这个问题已经出现在客户环境中,需要引起大家的足够重视。