如何查询Oracle性能监控

转摘:http://www.cnblogs.com/preftest/archive/2010/11/14/1876856.html

1、监控等待事件
select event,sum(decode(wait_time,0,0,1)) prev, sum(decode(wait_time,0,1,0)) curr,count(*)
from v$session_wait
group by event order by 4;

Lock wait occur when a session attempts to acquire a lock that is already held by another session. A session will be blocked until the blocking session releases the lock. Locks are designed to ensure data integrity by limiting simultaneous data access.

Multi-user database locking generally consists of two levels: exclusive locks and share locks. You want to watch out for exclusive locks (that is, TX) as they prohibit resource sharing. For example, the first transaction that exclusively locks a resource is the only one that can alter the resource (except for the DBA) until the exclusive lock is released. Share locks, unlike exclusive locks, allow a resource to be shared.

Deadlocking is commonly seen in multi-user systems. It typically occurs when all the hung users are waiting to access a table that another user has locked. This situation causes a deadlock, because each user (transaction) is waiting for resources to be freed by the other user (the blocker). Often, many developers attempt to update the same table and many users attempting to update or select from the same table.

Most locking issues are application-specific and can be addressed by tuning the concurrency logic in the application.

也可利用v$system_event视图执行下面的查询查看数据库中某些常见的等待事件:
select * from v$system_event
where event in ('buffer busy waits',
'db file sequential read',
'db file scattered read',
'enqueue',
'free buffer waits',
'latch free',
'log file parallel write',
'log file sync',
'enq: TX - row lock contention');

接着,利用下面对v$session_event和v$session视图进行的查询,研究具有对上面显示的内容有贡献的等待事件的会话:
select se.sid,s.username,se.event,se.total_waits,se.time_waited,se.average_wait
from v$session s,v$session_event se
where s.sid = se.sid
and se.event not like 'SQL*Net%'
and s.status = 'ACTIVE'
and s.username is not null;

还可以组合v$session和v$session_wait视图进行查询:
select sw.sid,s.username,sw.event,sw.wait_time,sw.state,sw.seconds_in_wait SEC_IN_WAIT
from v$session s,v$session_wait sw
where s.sid = sw.sid
and sw.event not like 'SQL*Net%'
and s.username is not null
order by sw.wait_time desc;

查询具体会话等待事件的详细信息
select sid,event,p1text,p1,p2text,p2,p3text,p3
from v$session_wait
where sid between &1 and &2
and event not like '%SQL%'
and event not like '%rdbms%';

在查出会话执行了什么SQL语句发生等待事件:
select s1.sid,s1.event,s2.sql_text
from v$session s1,v$sql s2
where s1.sid = &sid_in
and s1.event in('enq: TX - row lock contention')
and s1.SQL_ID = s2.sql_id ;

 


2、监控表空间的I/O比例:
select df.tablespace_name name,df.file_name "file",f.phyrds pyr,f.phyblkrd pbr,f.phywrts pyw,
f.phyblkwrt pbw
from v$filestat f,dba_data_files df
where f.file#=df.file_id

诊断:
If the number of physical block reads is significantly higher than the number of physical reads, this is an indication that the indexes on these tables may need to be reviewed, or there may be full table scans being performed on the tables within the tablespace.  In general, if the number of block reads is equal to the number of reads, the tables in the tablespace were being accessed by a ROWID, requiring the database to read only one data block.  

If one of the data files is getting a majority of the reads and writes, you may be able to improve performance by creating multiple data files on seperate disks or by striping the data file across multiple disks.

 


3、查询是否有长时间的操作
同时满足以下几个条件,操作信息才会出现在V$SESSION_LONGOPS中:
1)、操作是以下几种操作之一
# Table scan;
# Index Fast Full Scan;
# Hash join;
# Sort/Merge;
# Sort Output;
# Rollback;
# Gather Table's Index Statistics
2)、操作时间大于6秒
3)、读取的block数目大于一定量
如果是TABLE FULL SCAN,读取的block数目至少大于10000
如果是Index Fast Full Scan,读取的block数目至少大于1000
其他操作读取block的数目不明


实验:
create table tt as select * from all_objects;

commit;

Set timing on;

select * from tt order by 1,2,3,4;


用以下语句找出长时间操作的SQL语句:
select longops.sid,longops.elapsed_seconds,longops.opname,sql.sql_text from
 v$session_longops longops , v$sql sql where longops.elapsed_seconds>6 and longo
ps.sql_id=sql.sql_id;

或者:
SELECT SE.SID,
 OPNAME,
 TRUNC(SOFAR / TOTALWORK * 100, 2) || '%' AS PCT_WORK,
 ELAPSED_SECONDS ELAPSED,
 ROUND(ELAPSED_SECONDS * (TOTALWORK - SOFAR) / SOFAR) REMAIN_TIME,
 SQL_TEXT
 FROM V$SESSION_LONGOPS SL, V$SQLAREA SA, V$SESSION SE
 WHERE SL.SQL_HASH_VALUE = SA.HASH_VALUE
 AND SL.SID = SE.SID
 AND SOFAR != TOTALWORK
 ORDER BY START_TIME
;


调整PGA优化排序:
首先查看Oracle的v$pga_target_advice:
SELECT ROUND(pga_target_for_estimate/1024/1024) AS target_mb,
 estd_pga_cache_hit_percentage AS hit_ratio,
 estd_overalloc_count
FROM v$pga_target_advice
ORDER BY target_mb;
然后调整PGA
alter system set pga_aggregate_target=150M;

在OLTP系统中,典型PGA内存设置应该是总内存的较小部分(例如20%),剩下80%分配给SGA。
OLTP:PGA_AGGREGATE_TARGET = (total_mem * 80%) * 20%
在DSS系统中,由于会运行一些很大的查询,典型的PGA内存最多分配70%的内存。
DSS:PGA_AGGREGATE_TARGET = (total_mem * 80%) * 50%

Oracle的排序操作:
服务器首先在sort_area_size指定大小的内存区域里排序,如果所需的空间超过sort_area_size,排序会在临时表空间里进行。在专用服务器模式下,排序空间在PGA中,在共享服务器模式下,排序空间在UGA中。如果没有建立large pool,UGA处于shared pool中,如果建立了large pool,UGA就处于large pool中,而PGA不在sga中,它是与每个进程对应单独存在的。
PGA:program global area,为单个进程(服务器进程或后台进程)保存数据和控制信息的内存区域。PGA与进程一一对应,且只能被起对应的进程读写,PGA在用户登录数据库创建会话的时候建立。

排序诊断1:
Select * from v$sysstat where name like '%sort%';
--Sort(disk):要求IO去临时表空间的排序数目
--Sort(memory):完全在memory中完成的排序数目
--Sort(rows):被排序的行数合计
Sort(disk)/ Sort(memory)<5%, 如果超过5%,增加sort_area_size的值(调整PGA)。

SELECT disk.Value disk,mem.Value mem,(disk.Value/mem.Value)*100 ratio
FROM v$sysstat disk,v$sysstat mem
WHERE mem.NAME='sorts (memory)' AND disk.NAME='sorts (disk)';


*排序操作需要大量CPU时间和内存
Oracle的排序过程分析:
当待排序数据集不是很大时,服务器在内存(排序区)完成排序操作,如果排序需要更过的内存空间,服务器将进行如下处理:
(1)将数据分成多个小的集合,对每一集合进行排序
(2)服务器向磁盘申请临时空间,将排好序的中间结果写入临时段,再对另外的集成进行排序。
(2)在所有的集合均排好序后,服务器再将它们进行合并得到最终的结果,如果排序区尺寸太小,合并无法一次完成时,将分多次进行。


排序诊断2(监控临时表空间的使用情况及其配置):
Select tablespace_name,current_users,total_extents,
used_extents,extent_hits,max_used_blocks,max_sort_blocks
FROM v$sort_segment;

CURRENT_USERS: Number of active users
TOTAL_EXTENTS: Total number of extents
USED_EXTENTS: Extents currently allocated to sorts
EXTENT_HITS: Number of times an unused extent was found in the pool
MAX_USED_BLOCKS: Maximum number of used blocks
MAX_SORT_BLOCKS: Maximum number of blocks used by an individual sort

临时表空间的配置:
A、initial/next设置为sort_area_size的整数倍,允许额外的一个block作为segment的header
B、pctincrease=0
C、基于不同的排序需要建立多个临时表空间
D、将临时表空间文件分散到多个磁盘上
 

 

 

4、临时表空间使用的监控

如果大量排序操作发生,就有可能动用到临时表空间

通过动态性能视图v$sort_usage还可以查询使用排序段的用户与会话信息:
select username,segtype,segfile#,segblk#,extents,segrfno# from v$sort_usage;

从segfile#可以找到动用了哪个表空间:
select tm.file# Fnum ,tf.tfafn AFN,tm.name FName from v$tempfile tm,x$kcctf tf where tm.file# = tf.tfnum;

参考:
http://www.eygle.com/archives/2006/03/tempfile_and_sort_usage.html


如果程序中使用了临时的LOB类型变量,Oracle会分配临时空间,如果并发很高,初始区很大,那么数据库可能产生严重的TEMP表空间的不足问题.
可以通过测试轻易再现这种状况,在多个Session中执行如下代码:
declare
  A CLOB;
BEGIN
  A:='ABC';
  DBMS_LOCK.SLEEP(120);
END;

查询v$sort_usage视图,可以获得临时表空间的使用情况(哪个用户、哪个Session因为什么原因使用了多少临时表空间):
select s.username, s.sid, u.tablespace, u.contents, u.segtype,
round(u.blocks*8192/1024/1024,2) MB
from v$session s, v$sort_usage u
where s.saddr = u.session_addr
and u.contents = 'TEMPORARY'
order by MB DESC ;

SEGTYPE=SORT的是因为排序而用到临时表空间的。
SEGTYPE=LOB_DATA是因为使用了临时的LOB类型变量而用到临时表空间的。

参考:
http://www.eygle.com/archives/2006/03/lob_and_temporary_tablespace.html

实验:
SQL> select max(sid) from v$mystat;
 
  MAX(SID)
----------
        45
 
SQL> DECLARE
  2  a clob;
  3  BEGIN
  4  dbms_lob.createtemporary(a, TRUE,dbms_lob.call);
  5  dbms_lob.freetemporary(a);
  6  END;
  7  /
 
PL/SQL procedure successfully completed.
 
SQL> select count(*) from dual;
 
  COUNT(*)
----------
         1
 
SQL> SELECT se.inst_id,
  2         se.username username,
  3         se.SID sid,
  4         se.status status,
  5         se.sql_hash_value,
  6         se.prev_hash_value,
  7         su.TABLESPACE tablespace,
  8         su.segtype,
  9         su.CONTENTS CONTENTS,
 10         round(su.blocks * 8192 / 1024 / 1024, 2) MB
 11  FROM gv$session se,
 12       gv$sort_usage su
 13  WHERE se.saddr=su.session_addr
 14  AND   se.inst_id=su.inst_id
 15  ORDER BY MB;
 
INST_ID USERNAME SID STATUS SQL_HASH_VALUE PREV_HASH_VALUE TABLESPACE SEGTYPE  CONTENTS  MB
------- -------- --- ------ -------------- --------------- ---------- -------- --------- --
      1 SYS       45 INACTI      317853294       317853294 TEMP       LOB_DATA TEMPORARY  1

此时CLOB占用的TEMP空间不会自动释放,需要等待会话断开,才能释放。但这个空间,在本会话中,还是可以重用的,只是不供其它会话使用。

在10.2.0.3以前,只能让会话退出,以释放这部份空间,在10.2.0.4中当作一个BUG(Bug:5723140)来修复,但默认不激活,需要通过设置60025事件才可以释放这些lob的TEMP空间。
参考:
http://www.dbaroad.me/archives/2009/09/lob_temp.html


LOB类型变量:
数据库中提供了两种字段类型 Blob 和 Clob 用于存储大型字符串或二进制数据(如图片)。 Blob 采用单字节存储,适合保存二进制数据,如图片文件。 Clob 采用多字节存储,适合保存大型文本数据。


临时表空间优化:
(一)、创建用户时要记得为用户创建临时表空间。
(二)、合理设置PGA,减少临时表空间使用的几率。
(三)、要为临时表空间保留足够的硬盘空间。
参考:
http://database.51cto.com/art/200907/132965.htm


查看临时表空间占用率:
select * from v$temp_space_header;

重建临时表空间的方法:
Temporary tablespace是不能直接drop默认的临时表空间的,不过我们可以通过以下方法达到。
查看目前的Temporary Tablespace
SQL> select name from v$tempfile;

NAME
———————————————————————
D:\ORACLE\ORADATA\ORCL\TEMP01.DBF

SQL> select username,temporary_tablespace from dba_users;
USERNAME TEMPORARY_TABLESPACE
------------------------------ ------------------------------
MGMT_VIEW TEMP
SYS TEMP
SYSTEM TEMP
DBSNMP TEMP
SYSMAN TEMP

创建中转临时表空间
create temporary tablespace TEMP1 TEMPFILE 'D:\ORACLE\ORADATA\ORCL\temp02.DBF' SIZE 512M REUSE AUTOEXTEND ON NEXT 1M MAXSIZE UNLIMITED;
改变缺省临时表空间 为刚刚创建的新临时表空间temp1
alter database default temporary tablespace temp1;
删除原来临时表空间
drop tablespace temp including contents and datafiles;
重新创建临时表空间
create temporary tablespace TEMP TEMPFILE 'D:\ORACLE\ORADATA\ORCL\temp01.DBF' SIZE 512M REUSE AUTOEXTEND ON NEXT 1M MAXSIZE UNLIMITED;
重置缺省临时表空间为新建的temp表空间
alter database default temporary tablespace temp;
删除中转用临时表空间
drop tablespace temp1 including contents and datafiles;

以上的方法只是暂时释放了临时表空间的磁盘占用空间,是治标但不是治本的方法,真正的治本的方法是找出数据库中消耗资源比较大的sql语句,然后对其进行优化处理。
参考:
http://lanmh.javaeye.com/blog/643676

 

 

5、监控LibraryCache
SELECT (SUM(PINS - RELOADS)) / SUM(PINS) "LIB CACHE" FROM V$LIBRARYCACHE;

SELECT (SUM(GETS - GETMISSES - USAGE - FIXED)) / SUM(GETS) "ROW CACHE" FROM V$ROWCACHE;

SELECT SUM(PINS) "EXECUTIONS", SUM(RELOADS) "CACHE MISSES WHILE EXECUTING" FROM V$LIBRARYCACHE;

后者除以前者,此比率小于1%,接近0%为好。
SELECT SUM(GETS) "DICTIONARY GETS",SUM(GETMISSES) "DICTIONARY CACHE GET MISSES"
FROM V$ROWCACHE

查找不能被充分共享利用的SQL语句(查询LibraryCache中执行次数偏低的SQL语句):
SELECT sql_text FROM v$sqlarea WHERE executions < 5 ORDER BY UPPER(sql_text);

查找SQL执行次数和SQL解释次数(hard parse),对比两个值的差:
SELECT sql_text , parse_calls , executions FROM v$sqlarea ORDER BY parse_calls;

查询v$librarycache视图的Reloads值(reparsing)的值,值应该接近0,否则应该考虑调整shared pool size
invalidations的值也应该接近0
select namespace,gethitratio,pinhitratio,reloads,invalidations from v$librarycache;

重点关注SQL的命中率:
SELECT gethitratio FROM v$librarycache WHERE namespace = 'SQL AREA';
查看指定某条SQL语句的执行情况(执行次数、加载次数等):
SELECT sql_text , users_executing , executions , loads FROM v$sqlarea where sql_text like 'select * from hr.tt';

*SQL语句运行过程
1).使用hash算法得到sql语句的hash_value值
2).如果hash_value值在内存中,叫做命中执行软解析
3).如果hash_value值不存在,执行硬解析
4).语法解析,查看是否有错误
5).语意解析,查看权限是否符合
6).若有视图,取出视图的定义
7).进行sql语句的自动改写,如将子查询改写为连接
8).选择最优的执行计划
9).变量绑定
10).运行执行计划
11).返回结果给用户
因为软解析是从此11步骤中第9步开始的,因此软解析比硬解析节约大量的系统开销,应该尽量降低硬解析的次数


诊断:
1) 检查v$librarycache中sql area的gethitratio是否超过90%,如果未超过90%,应该检查应用代码,提高应用代码的效率:
Select gethitratio from v$librarycache where namespace='SQL AREA';
2) v$librarycache中reloads/pins的比率应该小于1%,如果大于1%,应该增加参数shared_pool_size的值:
Select sum(pins) "executions", sum(reloads) "cache misses",sum(reloads)/sum(pins) from v$librarycache;
reloads/pins>1%有两种可能,一种是library cache空间不足,一种是sql中引用的对象不合法。
3)查看某个session的hard parse个数:
select a.sid,a.value from v$sesstat a,v$session b ,v$statname c where a.sid=b.sid and a.statistic#=c.statistic# and a.sid = 137 and c.name='parse count (hard)';


调优方法:
1)、调整shared_pool_size
SELECT shared_pool_size_for_estimate AS pool_size,estd_lc_size,estd_lc_time_saved FROM v$shared_pool_advice;

Alter System set shared_pool_size=120M;

2)、书写程序是尽量使用变量不要过多的使用常量
实验:
创建表格
SQL>CREATE TABLE  m(x int);
创建存储过程proc1,使用绑定变量
SQL>CREATE OR REPLACE PROCEDURE proc1
AS
BEGIN
  FOR i IN 1..10000
  LOOP
Execute immediate
  'INSERT INTO m VALUES(:x)' USING i;
  END LOOP;
END;
/
创建存储过程proc2,不使用绑定变量
SQL>CREATE OR REPLACE PROCEDURE proc2
AS
BEGIN
  FOR i IN 1..10000
  LOOP
Execute immediate
  'INSERT INTO m VALUES('||i||')' ;
  END LOOP;
END;
/
执行proc2和proc1,对比执行效率
SQL>SET TIMING ON
SQL> exec proc2;
PL/SQL procedure successfully completed.
Elapsed: 00:00:08.93
SQL> select count(*) from m;
  COUNT(*)
----------
     10000
Elapsed: 00:00:00.01
SQL> TRUNCATE TABLE m;
Table truncated.
Elapsed: 00:00:01.76
SQL> exec proc1;
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.85
SQL> select count(*) from m;
  COUNT(*)
----------
     10000
Elapsed: 00:00:00.00

3)、修改cursor_sharing参数为similar,让类似的SQL语句不做hard parse:
有时候我们的应用程序没有使用绑定变量,而修改程序可能有点困难,我们可能需要设置CURSOR_SHARING=SIMILAR来强制ORACLE使用绑定变量。
Show parameter cursor
Alter system set cursor_sharing=SIMILAR
参考:
http://blog.csdn.net/biti_rainy/archive/2004/07/12/39466.aspx
http://space.itpub.net/519536/viewspace-562987
http://wiki.oracle.com/page/CURSOR_SHARING
实验:
SQL> show parameter cursor_sharing
 cursor_sharing string EXACT
SQL> select * from test where object_id=1;
no rows selected
SQL> select sql_text,parse_calls from v$sqlarea where sql_text like 'select * from test%';
select * from test where object_id=:"SYS_B_0" 2
select * from test where object_id=1 1
SQL> alter system flush shared_pool;
System altered.
SQL> alter system flush shared_pool;
System altered.
SQL> alter session set cursor_sharing=similar; ----second
Session altered.
SQL> select * from test where object_id=1;
no rows selected
SQL> select sql_text,parse_calls from v$sqlarea where sql_text like 'select * from test%';
select * from test where object_id=:"SYS_B_0" 1
SQL> select * from test where object_id=2;
no rows selected
SQL> select sql_text,parse_calls from v$sqlarea where sql_text like 'select * from test%';
select * from test where object_id=:"SYS_B_0" 2

4)、大对象保留
查找没有保存在library cache中的大对象:
Select * from v$db_object_cache where sharable_mem>10000 and type in ('PACKAGE','PROCEDURE','FUNCTION','PACKAGE BODY') and kept='NO';
将这些对象保存在library cache中:
Execute dbms_shared_pool.keep('package_name');
对应脚本:dbmspool.sql

参考:
http://database.51cto.com/art/201004/194003.htm

 

 

6、找使用CPU多的用户session
select a.sid,spid,status,substr(a.program,1,40) prog,a.terminal,osuser,value/60/100 value
from v$session a,v$process b,v$sesstat c
where c.statistic#=12 and c.sid=a.sid and a.paddr=b.addr order by value desc;
*12是cpu used by this session

再找出使用CPU多的SQL语句:
查找指定SPID正在执行的SQL语句:
SELECT P.pid pid,S.sid sid,P.spid spid,S.username username,
S.osuser osname,P.serial# S_#,P.terminal,P.program  program,
P.background,S.status,RTRIM(SUBSTR(a.sql_text, 1, 80))  SQL
FROM v$process P, v$session S,v$sqlarea A WHERE P.addr = s.paddr
AND S.sql_address = a.address (+)  AND P.spid LIKE '%&1%';

*在linux环境可以通过ps查看进程信息包括pid,windows中任务管理器的PID与v$process中pid不能一一对应。windows是多线程服务器,每个进程包含一系列线程。这点于unix等不同,Unix每个Oralce进程独立存在,在Nt上所有线程由Oralce进程衍生。

指定SID查看正在执行的SQL语句:
SELECT P.pid pid,S.sid sid,P.spid spid,S.username username,
S.osuser osname,P.serial# S_#,P.terminal,P.program  program,
P.background,S.status,RTRIM(SUBSTR(a.sql_text, 1, 80))  SQL
FROM v$process P, v$session S,v$sqlarea A WHERE P.addr = s.paddr
AND S.sql_address = a.address (+)  AND s.sid = '136';

 

7、回滚段的争用情况:
select name,waits,gets,waits/gets ratio from v$rollstat a,v$rollname b where a.usn=b.usn;

对含有回滚段块的缓冲区的争用也会影响到对回滚段的争用。这可以通过查询动态性能表V$WAITSTAT来检测是否存在对回滚段的争用,例如:
SELECT class,count FROM V$WAITSTAT 
WHERE class IN('system undo header','system undo block','undo header','undo block');
其中参数含义如下:
◆ system undo header:对含有SYSTEM回滚段标题块的缓冲区的等待次数。
◆ system undo block:对含有SYSTEM回滚段非标题块的缓冲区的等待次数。
◆ undo header:对含有非SYSTEM回滚段标题块的缓冲区的等待次数。
◆ undo block:对含有非SYSTEM回滚段非标题块的缓冲区的等待次数。

如果任何等待次数大于总请求数的1%,则应创建更多的回滚段来减少竞争,可以周期性地检查这些统计数字,并将它与总的请求数据的次数作比较。总的请求数据次数可用如下语句求出:
SELECT SUM(value) FROM V$SYSSTAT 
WHERE name IN('db block gets','consistent gets');

8.查询 Buffer 命中率

 select 1 - ((physical.value - direct.value - lobs.value) / logical.value)
"Buffer Cache Hit Ratio"
from v$sysstat physical,v$sysstat direct,v$sysstat lobs,v$sysstat logical
where physical.name = 'physical reads'
and direct.name='physical reads direct'
and lobs.name='physical reads direct (lob)'
and logical.name='session logical reads';

 

你可能感兴趣的:(oracle)