Reporting queries against tables with heavy DML

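 When you run reporting queries against a table with frequent DML, always do it in stages rather than pulling a long time range in a single statement. To guarantee read consistency, the database records a starting point in time when the SQL begins to execute, and for every block modified after that point it has to go to the undo (rollback) segments to find the older version. This causes two problems:
First:
      Undo segments are held for a long time. This is especially true for distributed transactions (for example, fetching data over a dblink), where the query runs as a transaction and occupies an undo segment. If undo runs short, normal DML on the database fails because no undo segment is available.
A long-running non-distributed query can also fail with ORA-01555.
Second:
      The query itself becomes very slow. Why?
The two examples below show the reason: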
Example 1: multiple sessions modify different rows in the same data block:
SQL> drop sequence t1_seq;          -- drop the sequence
SQL> create sequence t1_seq;        -- create the sequence t1_seq
SQL> create table t1 as             -- create table t1 with 11 rows
  2  select
  3  rownum id, rpad('x',10) small_vc
  4  from  all_objects  where   rownum <= 11;
SQL> execute dbms_stats.gather_table_stats(user,'t1')    -- gather table statistics (to rule out the effect of dynamic sampling)
PL/SQL procedure successfully completed.
SQL>
SQL> select a.*,dbms_rowid.rowid_block_number(ROWID) block_no from t1 a;        -- the table has 11 rows, all in the same block (529804)
        ID SMALL_VC     BLOCK_NO
---------- ---------- ----------
         1 x              529804
         2 x              529804
         3 x              529804
         4 x              529804
         5 x              529804
         6 x              529804
         7 x              529804
         8 x              529804
         9 x              529804
        10 x              529804
        11 x              529804
11 rows selected.
Now start 10 new sessions and run the following in each of them:
SQL> column seqval new_value m_seq ;
SQL> select t1_seq.nextval seqval from dual;                                 -- the sequence gives each session a different id
    SEQVAL
----------
         1
SQL>
SQL> update t1 set small_vc = upper(small_vc) where id = &m_seq;
old   1: update t1 set small_vc = upper(small_vc) where id = &m_seq
new   1: update t1 set small_vc = upper(small_vc) where id =          1
1 row updated.
 
......                                                       -- output of the other 8 sessions omitted
SQL> column seqval new_value m_seq
SQL> select t1_seq.nextval seqval from dual;
    SEQVAL
----------
        10
SQL> update t1 set small_vc = upper(small_vc) where id = &m_seq;
old   1: update t1 set small_vc = upper(small_vc) where id = &m_seq
new   1: update t1 set small_vc = upper(small_vc) where id =         10                         -- the 10th session
1 row updated.
Note: none of the sessions commits.
Once the 10 concurrent sessions have made their changes,
run the following in the original session:
SQL> execute snap_my_stats.start_snap ;                                                   -- record the starting point of the statistics
PL/SQL procedure successfully completed.
SQL> update t1 set small_vc = upper('small_vc') where id = 11;                       -- update a single row (the 10 sessions changed ids 1..10; here we update id 11)
1 row updated.
SQL> execute snap_my_stats.end_snap;                                                      -- end the snapshot and show what this single update cost
---------------------------------
Session stats - 14-Aug  12:28:50
Interval:-  8 seconds
---------------------------------
Name                                                                     Value
----                                                                     -----
session logical reads                                                       28
db block gets                                                                1
db block gets from cache                                                     1
consistent gets                                                             27
consistent gets from cache                                                  27
consistent gets - examination                                               24
db block changes                                                             3
consistent changes                                                          10
free buffer requested                                                        1
CR blocks created                                                            1
data blocks consistent reads - undo records applied                         10
cleanouts and rollbacks - consistent read gets                               1
immediate (CR) block cleanout applications                                   1
table scan rows gotten                                                      11
table scan blocks gotten                                                     1
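Look at the last two lines: they describe the table scan, which visited only one table block holding 11 rows, exactly as expected. Yet the statement cost 28 logical reads (1 db block get + 27 consistent gets).
Of the 27 consistent gets, 24 are "consistent gets - examination", that is, the cost of checking consistency and visiting other blocks such as undo (10 undo records were applied), and one CR block was created (note: this CR copy is built for the current session's snapshot, so other sessions generally cannot rely on it).
That is the cost of updating a single row, and it is much higher than the normal cost when undo does not have to be read: in the setup above, even allowing for a full scan of 8 blocks, that would be about 10-12 logical reads.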
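The snap_my_stats package used in these examples is not defined in this article. As a rough substitute, a session can read its own cumulative counters from v$mystat before and after the statement under test and subtract the values by hand. The sketch below assumes the user has SELECT access to v$mystat and v$statname; the statistic names are the ones that appear in the output above.

```sql
-- Sketch: read the current session's cumulative statistics.
-- Run once before and once after the statement being measured,
-- then subtract the "before" values from the "after" values.
select sn.name,
       ms.value
from   v$mystat   ms
join   v$statname sn on sn.statistic# = ms.statistic#
where  sn.name in (
         'session logical reads',
         'db block gets',
         'consistent gets',
         'consistent gets - examination',
         'consistent changes',
         'CR blocks created',
         'data blocks consistent reads - undo records applied'
       )
order  by sn.name;
```

The values are totals since the session started, so only the difference between two runs of the query is meaningful.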
 
Example 2: how changes made by one session affect another session:
SQL> drop table t1 purge;
Table dropped.
SQL>
SQL> create table t1 (id number, n1 number);
Table created.
SQL>
SQL> insert into t1 values (1,0);
1 row created.
SQL>
SQL> insert into t1 values (2,0);
1 row created.
SQL>
SQL> commit;
The statements above build a two-row table:
SQL> execute snap_my_stats.start_snap
PL/SQL procedure successfully completed.
SQL> begin
  2
  3      for i in 1..1000 loop
  4
  5          update t1 set n1 = i where id = 1;
  6
  7      end loop;
  8
  9  end;
 10
 11  /
PL/SQL procedure successfully completed.
SQL> execute snap_my_stats.end_snap
---------------------------------
Session stats - 14-Aug  13:47:56
Interval:-  0 seconds
---------------------------------
Name                                                                     Value
----                                                                     -----
session logical reads                                                    8,040
CPU used when call started                                                   8
CPU used by this session                                                     8
DB time                                                                      8
db block gets                                                            1,031
db block gets from cache                                                 1,031
consistent gets                                                          7,009
consistent gets from cache                                               7,009
consistent gets - examination                                                8
db block changes                                                         2,016
free buffer requested                                                    1,015
hot buffers moved to head of LRU                                             2
switch current to new buffer                                             1,000
calls to kcmgas                                                          1,015
calls to get snapshot scn: kcmgss                                        3,005
no work - consistent read gets                                           5,001
table scan rows gotten                                                   2,000
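PL/SQL procedure successfully completed.
The first session updates one row 1,000 times and uses 8,040 logical reads. Leave this session uncommitted, then open a new session
and run the same kind of loop there, this time against the other row (id = 2):
SQL> execute snap_my_stats.start_snap
PL/SQL procedure successfully completed.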
SQL>
SQL> begin
  2
  3      for i in 1..1000 loop
  4
  5          update t1 set n1 = i where id = 2;
  6
  7      end loop;
  8
  9  end;
 10
 11  /
PL/SQL procedure successfully completed.
SQL> set serverout on;
SQL> execute snap_my_stats.end_snap
---------------------------------
Session stats - 14-Aug  13:56:30
Interval:-  28 seconds
---------------------------------
Name                                                                     Value
----                                                                     -----
session logical reads                                                1,010,041
CPU used when call started                                                 207
CPU used by this session                                                   205
DB time                                                                    209
ges messages sent                                                            4
db block gets                                                            1,031
db block gets from cache                                                 1,031
consistent gets                                                      1,009,010
consistent gets from cache                                           1,009,010
consistent gets - examination                                        1,002,010
db block changes                                                         3,016
consistent changes                                                   1,000,000
change write time                                                            2
free buffer requested                                                    1,015
dirty buffers inspected                                                      1
hot buffers moved to head of LRU                                           252
free buffer inspected                                                      607
CR blocks created                                                        1,000
SQL*Net roundtrips to/from client                                            5
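PL/SQL procedure successfully completed.
Here the session performed more than one million logical reads (1,010,041, roughly 125 times the 8,040 of the first session), even though it ran the same kind of loop. The extra work is spent reading undo: the block holds the first session's 1,000 uncommitted changes, and each of the 1,000 updates in the second session has to build a CR copy of the block by applying those 1,000 undo records. That gives 1,000 CR blocks created and 1,000 x 1,000 = 1,000,000 consistent changes.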
 
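So when you need to run reporting queries against a table with very frequent DML, the general aim is to reduce the number of visits to undo:
For queries filtered on columns such as createtime or modifytime, cut the range of each query from a day down to an hour.
For partitioned tables that need a full scan, scan one partition at a time.
For example, to read the whole subscription table (a full scan of 8 partitions):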
FOR j IN 1..8 LOOP                               -- one query per partition: each iteration opens a new cursor with a new starting point in time, so no single query has to stay consistent with one early snapshot
  l_s_partid := to_char(j); -- partition id
  VarSql     := 'select usernumber,orderstatus
  from odsview.vw_os_user_service_p' || l_s_partid || '
 where serviceid = 10
   and orderstatus in (0, 2, 4)';
   VarSql1     := 'select usernumber
                  from datasync_1.mail_notify_limit a
                 where  serviceid = 10
                   and notifytype = 0';
  /*  VarSql := 'select USERNUMBER,notifytype from tmp_noti_0806';*/
  -- log the SQL that is about to run
  temp_hch_log_insert('tmp_notify_hch_0806_2',
                      'SQL=' || VarSql,
                      varReturnValue);
  l_s_sql := 'null';
  temp_hch_log_insert('tmp_notify_hch_0806_2',
                      'l_s_sql=' || l_s_sql,
                      varReturnValue);
  OPEN cur_cursor1 for VarSql;
  LOOP
    -- fetch and insert in batches of at most 60000 rows
    FETCH cur_cursor1 BULK COLLECT
      INTO list_usernumber, list_orderstatus LIMIT 60000;
    forall i in 1 .. list_usernumber.count
      insert into TMP_HCH_0806
        (USERNUMBER, Orderstatus,partid,PARTDB)
        values(list_usernumber(i), list_orderstatus(i), j,0);
    COMMIT;
    temp_hch_log_insert('tmp_notify_zqs_0806_2',
                        '----processed a batch of up to 60000 records, partition ' || l_s_partid || '=' ||
                        l_i_reccount,
                        varReturnValue);
   EXIT WHEN cur_cursor1%NOTFOUND OR cur_cursor1%NOTFOUND IS NULL;
  END LOOP;
  CLOSE cur_cursor1;
END LOOP;
 
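Likewise, a query that works on a whole day of data should be cut into 24 hourly queries. Another option is to extract the data in logical parallel by rowid range.

For details see: http://blog.csdn.net/huangchao_sky/article/details/8451077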
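The hourly split can be sketched as a PL/SQL loop that runs one INSERT ... SELECT per hour. This is only an illustration under stated assumptions: the tables src_orders and stage_orders and their columns are hypothetical and are created here just to make the block self-contained, and "yesterday" is an arbitrary choice of day.

```sql
-- Hypothetical tables, for illustration only.
create table src_orders   (usernumber varchar2(20), createtime date);
create table stage_orders (usernumber varchar2(20), createtime date);

declare
  l_day  date := trunc(sysdate) - 1;   -- the day to extract (assumed: yesterday)
  l_from date;
  l_to   date;
begin
  for h in 0 .. 23 loop
    l_from := l_day + numtodsinterval(h,     'HOUR');   -- start of the hour (inclusive)
    l_to   := l_day + numtodsinterval(h + 1, 'HOUR');   -- end of the hour (exclusive)

    -- Each INSERT ... SELECT is a separate statement with its own starting
    -- point in time, so it only needs undo for changes made while it runs.
    insert into stage_orders (usernumber, createtime)
    select usernumber, createtime
    from   src_orders
    where  createtime >= l_from
    and    createtime <  l_to;

    commit;   -- finish this hour before starting the next one
  end loop;
end;
/
```

The half-open ranges (>= start, < end) keep the 24 pieces from overlapping or leaving gaps. The trade-off is that the pieces are no longer consistent with one single point in time, so this suits reporting that can tolerate that.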
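Also, if you are reading production data over a dblink and a query runs for a very long time without returning,
the remote database often keeps executing it even after you terminate the client (for example by closing the PL/SQL client or dropping the network connection). This can easily tie up undo on the production database, so contact the DBA promptly when it happens.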
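As a starting point for that kind of investigation, a DBA might list the sessions that currently have an open transaction together with the undo each one holds. This is a minimal sketch, assuming SELECT access to v$session and v$transaction; it shows where to look and is not a complete diagnostic.

```sql
-- Sketch: sessions with an active transaction and the undo they hold.
-- Normally run by a DBA on the database where the work is executing.
select s.sid,
       s.serial#,
       s.username,
       s.program,
       t.start_time,     -- when the transaction started
       t.used_ublk,      -- undo blocks used by the transaction
       t.used_urec       -- undo records used by the transaction
from   v$transaction t
join   v$session     s on s.taddr = t.addr
order  by t.used_ublk desc;
```

A session that shows up here long after its client has gone away is a candidate for the DBA to review.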
 
                            
 
 
