Oracle 如何查找真正占用大量temp表空间的sql

收到一些库有会话占用了大量temp表空间的告警,通常的做法是查询v$sort_usage找到对应sql_id

select * from   
(select username,session_addr,sql_id,contents,segtype,blocks*8/1024/1024 gb   
from v$sort_usage order by blocks desc)   
where rownum<=20;

但有时会发现找出来的sql特别简单,怎么看都不像能占用几十几百G temp表空间的样子

为什么从v$sort_usage找到的sql_id有时是错的?应该怎么找正确的sql_id?

关于第一个问题,查找了网上的一些文章,发现v$sort_usage的sql_id并不对应v$session的sql_id,它对应的是v$session的prev_sql_id

有一个V$FIXED_VIEW_DEFINITION视图可以查看固定视图定义,查询可以看到v$sort_usage定义如下

select * from V$FIXED_VIEW_DEFINITION where view_name='V$SORT_USAGE';

--输出如下
select  USERNAME , "USER" , SESSION_ADDR , SESSION_NUM , SQLADDR , SQLHASH, SQL_ID, TABLESPACE , CONTENTS , SEGTYPE , SEGFILE# , SEGBLK# ,EXTENTS , BLOCKS , SEGRFNO# from GV$SORT_USAGE where inst_id = USERENV('Instance');

可以看到它来自GV$SORT_USAGE,再查看GV$SORT_USAGE的定义

select * from V$FIXED_VIEW_DEFINITION where view_name='GV$SORT_USAGE';

--输出如下
select x$ktsso.inst_id,
       username,
       username,
       ktssoses,
       ktssosno,
       prev_sql_addr,
       prev_hash_value,
       prev_sql_id,  ----------- 注意这个
       ktssotsn,
       decode(ktssocnt, 0, 'PERMANENT', 1, 'TEMPORARY'),
       decode(ktssosegt,
              1,
              'SORT',
              2,
              'HASH',
              3,
              'DATA',
              4,
              'INDEX',
              5,
              'LOB_DATA',
              6,
              'LOB_INDEX',
              'UNDEFINED'),
       ktssofno,
       ktssobno,
       ktssoexts,
       ktssoblks,
       ktssorfno
  from x$ktsso, v$session
 where ktssoses = v$session.saddr
   and ktssosno = v$session.serial#;

可以看到v$sort_usage的sql_id对应的是v$session的prev_sql_id。如果当前sql是最新正在执行的语句,查出来的结果是对的;如果过后执行了其它sql,看到就是错误的语句。

下面有一个例子,来自   老熊的三分地-Oracle及数据恢复 » 一次临时表空间大量占用问题的处理老熊的三分地-Oracle及数据恢复 » 一次临时表空间大量占用问题的处理

--根据dba_objects建一个测试表T1,使其数据量达到2000万行。  
 select count(*) from t1;  
  
   COUNT(*)  
-----------  
   20171200  
   
--然后将SQL工作区设置为手动模式,设置sort内存大小限制为200M:  
 alter session set workarea_size_policy=manual;  
 alter session set sort_area_size=209715200;  
  
--查询得到当前的会话sid:  
 select sid from v$mystat where rownum< =1;  
  
        SID  
-----------  
       2111  
  
--执行这下面的代码:  
 declare  
       v_object_name varchar2(100);  
       v_dummy varchar2(100);  
    begin  
      for rec in (select * from t1 order by object_id,object_name) loop  
         select object_type into v_dummy from t1 where rownum<=1;  
         select object_name into v_object_name from dba_objects where object_id=rec.object_id;  
         dbms_lock.sleep(60*10);  
         exit;  
     end loop;  
   end;  
   /  

--这段代码会打开一个游标,对2000万的数据量进行排序,然后在循环中只取一条数据,然后就进入sleep。在另一个窗口中监控到2111这个会话的event变成了PL/SQL lock timer,就去查询v$sort_usage:  
select a.sql_id sort_sql_id,b.sql_id,b.prev_sql_id, contents,segtype,blocks*8/1024/1024 gb   
from v$sort_usage a,v$session b   
where a.session_addr=b.saddr and b.sid=2111;  
  
SORT_SQL_ID   SQL_ID        PREV_SQL_ID   CONTENTS  SEGTYPE            GB  
------------- ------------- ------------- --------- --------- -----------  
fabh24prgk2sj bhzf316mdc07w fabh24prgk2sj TEMPORARY SORT      1.444824219  

--可以看到v$sort_usage中的SQL_ID(即上述结果中SORT_SQL_ID)与v$session中的pre_sql_id一致,这条SQL是:  
@sqlbyid fabh24prgk2sj  

SQL_FULLTEXT  
--------------------------------------------------------  
SELECT OBJECT_NAME FROM DBA_OBJECTS WHERE OBJECT_ID=:B1  
  
--而实际上当前正在执行的SQL是:  
@sqlbyid bhzf316mdc07w  
  
SQL_FULLTEXT  
---------------------------------------------------------------------------  
declare  
   v_object_name varchar2(100);  
   v_dummy varchar2(100);  
begin  
  for rec in (select * from t1 order by object_id,object_name) loop  
     select object_type into v_dummy from t1 where rownum<=1;  
     select object_name into v_object_name from dba_objects where object_id=rec.object_id;  
     dbms_lock.sleep(60*10);  
     exit;  
  end loop;  
end;

这个问题记录在了mos文档 Bug 17834663 - Include SQL ID for statement that created a temporary segment in GV$SORT_USAGE (文档 ID 17834663.8) 可以看到12.2以下版本都受该bug影响(查了下19c的定义还是用的prev_sql_id)

Affects:
Product (Component)Oracle Server (Rdbms)Range of versions believed to be affectedVersions BELOW 12.2Versions confirmed as being affected

 11.2.0.3https://support.oracle.com/epmos/faces/DocumentDisplay?parent=DOCUMENT&sourceId=17834663.8&id=245840.1#AFFECTS_11.2.0.3
Platforms affectedGeneric (all / most platforms affected)

Fixed:
The fix for 17834663 is first included in

 12.2.0.1 (Base Release)https://support.oracle.com/epmos/faces/DocumentDisplay?parent=DOCUMENT&sourceId=17834663.8&id=245840.1#FIXED_12.2.0.1
 12.1.0.2 (Server Patch Set)https://support.oracle.com/epmos/faces/DocumentDisplay?parent=DOCUMENT&sourceId=17834663.8&id=245840.1#FIXED_12.1.0.2

根据文档,从11.2.0.2版本之后,为v$sort_usage的基表增加了KTSSOSQLID列,它真正代表消耗高temp表空间的sql

x$ktsso is the base table that stores the temp tablespace usage information. With 11.2.0.1 or earlier, it stores no SQLIDs. From 11.2.0.2, a new column KTSSOSQLID had been added.
Oracle 如何查找真正占用大量temp表空间的sql_第1张图片

所以真正查询消耗高temp表空间的sql语句应该为

SELECT S.sid,
       S.serial# sid_serial,
       S.username,
       S.osuser,
       P.spid,
       S.module,
       S.program,
       SUM(T.blocks) * TBS.block_size / 1024 / 1024 mb_used,
       T.tablespace,
       COUNT(*) sort_ops
  FROM v$sort_usage T, v$session S, dba_tablespaces TBS, v$process P
 WHERE T.session_addr = S.saddr
   AND S.paddr = P.addr
   AND T.tablespace = TBS.tablespace_name
 GROUP BY S.sid,
          S.serial#,
          S.username,
          S.osuser,
          P.spid,
          S.module,
          S.program,
          TBS.block_size,
          T.tablespace
HAVING SUM(blocks) > 10000 -- 80MB
 ORDER BY sid_serial;

--将sid代入下面sql查出真正的sql_id
select ktssosqlid from x$ktsso, v$session where ktssoses = v$session.saddr and ktssosno = v$session.serial# and v$session.sid=2111;

 或者直接利用下面的sql

select k.inst_id "INST_ID",  
       ktssoses "SADDR",  
       sid,  
       ktssosno "SERIAL#",  
       username "USERNAME",  
       osuser "OSUSER",   
       ktssosqlid "SQL_ID",  
       ktssotsn "TABLESPACE",  
       decode(ktssocnt, 0, 'PERMANENT', 1, 'TEMPORARY') "CONTENTS",  
       --注意在12c的v$sort_usage定义中TABLESPACE和CONTENTS已经发生变化了。  
       decode(ktssosegt, 1, 'SORT', 2, 'HASH', 3, 'DATA', 4, 'INDEX',   
          5, 'LOB_DATA', 6, 'LOB_INDEX' , 'UNDEFINED') "SEGTYPE",  
       ktssofno "SEGFILE#",  
       ktssobno "SEGBLK#",  
       ktssoexts "EXTENTS",  
       ktssoblks "BLOCKS",  
       round(ktssoblks*p.value/1024/1024, 2) "SIZE_MB",  
       ktssorfno "SEGRFNO#"  
from x$ktsso k, v$session s,   
     (select value from v$parameter where name='db_block_size') p   
where ktssoses = s.saddr  
  and ktssosno = s.serial#;

 

v$sort_usage中的SEGTYPE列含义

  • SORT:SQL排序使用的临时段,包括order by、group by、union、distinct、窗口函数、建索引等产生的排序。
  • DATA:临时表(Global Temporary Table)存储数据使用的段。
  • INDEX:临时表上建的索引使用的段。
  • HASH:hash算法所使用的临时段。
  • LOB_DATA和LOB_INDEX:临时LOB使用的临时段。

另外可以通过以下语句查看一段时间前占用temp表空间高的sql

select *
  from (select t.sample_time,
               s.PARSING_SCHEMA_NAME,
               t.sql_id,
               t.sql_child_number as sql_child,
               round(t.temp_space_allocated / 1024 / 1024 / 1024, 2) || ' G' as temp_used,
               round(t.temp_space_allocated /
                     (select sum(decode(d.autoextensible, 'YES', d.maxbytes, d.bytes))
                        from dba_temp_files d),
                     2) * 100 || ' %' as temp_pct,
               t.program,
               t.module,
               s.SQL_TEXT
          from v$active_session_history t, v$sql s
         where t.sample_time > to_date('2019-07-15 22:00:00', 'yyyy-mm-dd hh24:mi:ss')
           and t.sample_time < to_date('2019-07-15 23:00:00', 'yyyy-mm-dd hh24:mi:ss')
           and t.temp_space_allocated is not null
           and t.sql_id = s.SQL_ID
         order by t.temp_space_allocated desc)
 where rownum < 50
 order by temp_used desc;

参考

Bug 17834663 - Include SQL ID for statement that created a temporary segment in GV$SORT_USAGE (文档 ID 17834663.8)

老熊的三分地-Oracle及数据恢复 » 一次临时表空间大量占用问题的处理

你可能感兴趣的:(oracle,sql,数据库)