awr中DB CPU过低的原因分析

前几天在做巡检的时候发现有个库的负载在某一个时间段内极高,高达100倍。一个10分钟的awr报告,得到的db time 却有1000分钟。
Snap Id Snap Time Sessions Cursors/Session
Begin Snap: 27745 13-Jan-15 06:00:02 3500 6.4
End Snap: 27746 13-Jan-15 06:10:03 3512 6.4
Elapsed:   10.01 (mins)    
DB Time:   1,010.21 (mins)    
发生的时间段是早上6点左右。想必不大可能是人为操作的原因。
但是查看time model的时候却让人感觉很奇怪。
Time Model Statistics
Statistic Name Time (s) % of DB Time
sql execute elapsed time 7,303.18 12.05
DB CPU 1,233.88 2.04
PL/SQL execution elapsed time 111.37 0.18
parse time elapsed 14.47 0.02
inbound PL/SQL rpc elapsed time 11.08 0.02
RMAN cpu time (backup/restore) 5.57 0.01
hard parse elapsed time 4.08 0.01
connection management call elapsed time 2.41 0.00
sequence load elapsed time 2.03 0.00
repeated bind elapsed time 0.03 0.00
failed parse elapsed time 0.02 0.00
hard parse (sharing criteria) elapsed time 0.01 0.00
hard parse (bind mismatch) elapsed time 0.01 0.00
PL/SQL compilation elapsed time 0.00 0.00
DB time 60,612.63  
background elapsed time 1,770.76  
background cpu time 40.97  

OS级别的统计信息如下。
Statistic Value End Value
BUSY_TIME 197,613  
IDLE_TIME 4,611,782  
IOWAIT_TIME 10,540  
NICE_TIME 0  
SYS_TIME 34,788  
USER_TIME 150,433  
LOAD 10 1

Host CPU (CPUs: 80 Cores: 40 Sockets: 4)

Load Average Begin Load Average End %User %System %WIO %Idle
9.92 10.06 3.1 0.7 0.2 95.9

Instance CPU

%Total CPU %Busy CPU %DB time waiting for CPU (Resource Manager)
2.7 64.5 0.0

OS Busy% = 197613/( 197613+4611782 ) = 25%
Inst CPU% (using DB CPU) = 1 2 34*100/ ( 197613 + 4611782)=2.57%

查看等待事件的时候发现
Top 5 Timed Foreground Events
Event Waits Time(s) Avg wait (ms) % DB time Wait Class
db file sequential read 1,286,918 4,395 3 7.25 User I/O
DB CPU   1,234   2.04  
direct path read 68,781 870 13 1.43 User I/O
SQL*Net more data from dblink 2,860 438 153 0.72 Network
db file parallel read 15,649 315 20 0.52 User I/O

Inst CPU% (using db file sequential read) = 4395*1000/ ( 197613 + 4611782) = 9% 

可以很明显的看到top 5的等待事件只占用了将近10%左右的DB time,剩下的90%左右的到哪去了呢?

DB CPU+Foreground events=1234+4395+870+438+315=7252 而DB time= 1010.21*60=60612, 差别实在是太大了。
对于这个问题,需要首先关注的就是sql部分,DB time是会把程序等待的时间计算进去的,但是DB cpu不会。在短短的这十分钟里看来还是有不少的文章的。
查看sql的统计情况时,可以明显看到executions为0的sql语句。但是占用的总体比例看起来很低,是因为这部分的时间没有算到DB cpu里面去。
Elapsed Time (s) Executions Elapsed Time per Exec (s) %Total %CPU %IO SQL Id SQL Module SQL Text
602.35 0   0.99 2.39 98.75 9yn2xdw7uy0bq JDBC Thin Client select /*+ index(s SERVICE_AGR...
528.06 0   0.87 2.48 98.51 6kaqc0d7sgvx8 SQL Developer select bc.charge_seq_no, bc.se...
460.43 1 460.43 0.76 1.78 5.04 280b97jfdzvcm   DECLARE job BINARY_INTEGER := ...
460.36 1 460.36 0.76 1.78 5.03 f4s5k0j1whjwx   create table RATED_EVENT NOLOG...
458.74 1 458.74 0.76 13.08 86.54 d6078fp7h1jb6 JDBC Thin Client SELECT CD.TARGET_PCN, SE.AGREE...
446.97 1 446.97 0.74 11.19 90.35 cm7r7w9vvbcbd PL/SQL Developer declare cnt number; begin cnt ...
446.97 1 446.97 0.74 12.18 89.21 fayqg3gtg6ygs PL/SQL Developer declare cnt number; begin cnt ...
446.97 1 446.97 0.74 11.28 90.19 3gbq16harc68h PL/SQL Developer declare cnt number; begin cnt ...
446.94 1 446.94 0.74 11.54 89.88 asc03yk438k67 PL/SQL Developer declare cnt number; begin cnt ...
378.29 8 47.29 0.62 40.37 57.10 00jrngwmz2vxx JDBC Thin Client /* AR9_AR9GetPaymentSMS_select...

继续查看其它的部分,就会发现executions为0的sql远不止那两个。
Buffer Gets Executions Gets per Exec %Total Elapsed Time (s) %CPU %IO SQL Id SQL Module SQL Text
20,877,607 304 68,676.34 15.86 146.80 99.85 0.00 aty7a3bvqfxxx JDBC Thin Client SELECT COUNT(1) FROM (/* null_...
15,363,930 5 3,072,786.00 11.67 10.84 102.02 0.00 1hg2wcuapy3y3 JDBC Thin Client select bl1_run_request.run_mod...
7,445,354 8 930,669.25 5.66 378.29 40.37 57.10 00jrngwmz2vxx JDBC Thin Client /* AR9_AR9GetPaymentSMS_select...
6,700,190 53 126,418.68 5.09 48.73 99.88 0.00 4pbpdtw14s7bg JDBC Thin Client /* null_CL9CommActivity_update...
6,491,924 1 6,491,924.00 4.93 446.97 12.18 89.21 fayqg3gtg6ygs PL/SQL Developer declare cnt number; begin cnt ...
5,813,177 1 5,813,177.00 4.42 446.94 11.54 89.88 asc03yk438k67 PL/SQL Developer declare cnt number; begin cnt ...
5,350,250 1 5,350,250.00 4.06 446.97 11.19 90.35 cm7r7w9vvbcbd PL/SQL Developer declare cnt number; begin cnt ...
5,305,308 1 5,305,308.00 4.03 446.97 11.28 90.19 3gbq16harc68h PL/SQL Developer declare cnt number; begin cnt ...
5,029,500 1 5,029,500.00 3.82 458.74 13.08 86.54 d6078fp7h1jb6 JDBC Thin Client SELECT CD.TARGET_PCN, SE.AGREE...
4,440,670 1 4,440,670.00 3.37 302.84 12.34 86.66 7s8bcw9axdjhy JDBC Thin Client SELECT CD.TARGET_PCN, SE.AGREE...
4,424,502 35 126,414.34 3.36 31.27 99.87 0.00 2uqmgpqwnbu8y JDBC Thin Client /* null_CL9CommActivity_select...
4,179,264 5,709 732.05 3.17 17.78 95.09 3.50 g1p5bjdgv0wkq JDBC Thin Client /* AR9_SearchInvoices_searchIn...
4,162,849 5,631 739.27 3.16 46.95 40.39 59.88 axjaa2n7rx9b0 JDBC Thin Client SELECT COUNT(1) FROM (/* AR9_S...
4,101,017 280,418 14.62 3.12 358.90 9.67 91.72 182gmhv5f5zqv PL/SQL Developer UPDATE TABLE_BPM_STEP_INST SET...
3,916,711 272,996 14.35 2.98 347.54 9.69 91.71 bjqz15a4rqcf4 PL/SQL Developer UPDATE TABLE_BPM_STEP_INST SET...
3,806,950 268,600 14.17 2.89 334.82 9.89 91.30 ad50sx1wjnsxv PL/SQL Developer UPDATE TABLE_BPM_STEP_INST SET...
3,806,852 268,649 14.17 2.89 334.17 9.79 91.54 7499mkmfmu1fr PL/SQL Developer UPDATE TABLE_BPM_STEP_INST SET...
3,792,292 31 122,332.00 2.88 177.30 13.10 85.91 98dyqu3xsgyf5 JDBC Thin Client /* null_CL9CrdMntrExtract_sele...
2,563,257 33 77,674.45 1.95 65.68 8.82 91.91 67mpfp9dfdcvn JDBC Thin Client SELECT /*+ ordered use_nl (bl1...
2,552,705 33 77,354.70 1.94 4.19 99.46 0.17 a331y50gxz8gb JDBC Thin Client SELECT COUNT(*) FROM (SELECT /...
2,388,043 0   1.81 81.54 16.81 85.32 dyn656kdc8w98 PL/SQL Developer SELECT /*+ PARALELL(cl1_coll_e...
2,159,680 389 5,551.88 1.64 7.30 100.02 0.00 cysbbg2h86xc6 JDBC Thin Client /* MSD_SubLogSelect_selectNonQ...
1,967,921 32,264 60.99 1.49 7.24 97.49 0.00 bwjxuug65z19n JDBC Thin Client /* BL_LogicalDate_selectOnLine...
1,893,601 0   1.44 93.11 13.01 89.11 0afh4ddt9bxu7 PL/SQL Developer SELECT /*+ PARALELL(cl1_coll_e...
1,777,723 0   1.35 528.06 2.48 98.51 6kaqc0d7sgvx8 SQL Developer select bc.charge_seq_no, bc.se...
1,540,694 0   1.17 106.64 10.81 91.84 5f43xqgtmz3f2 PL/SQL Developer SELECT /*+ PARALELL(cl1_coll_e...
1,495,550 0   1.14 105.95 10.76 91.94 5z436m8azttg1 PL/SQL Developer SELECT /*+ PARALELL(cl1_coll_e...
1,319,463 30 43,982.10 1.00 5.39 99.97 0.00 gy7n3dzhh90y4 Ac1FtcManager@ccbstbpr1 (TNS V1-V3) /* */ UPDATE /*+ INDEX (AC1_CO...
同时这部分的信息都指向了一个program,都是通过客户端pl/sql Deverloper或者SQL Developer连接的。
查看相应的sql都使用了并行,并行度都在4以上,这样问题似乎就清晰很多了。这部分executions为0的语句通过客户端程序连接,手工启用了并行,这些时间都没有计算进DB CPU中。
稍后和现场,开发部门的同事进行了沟通和确认,他们说在昨天有一个严重的问题,最后由开发提供了脚本进行了修复。
因为持续的时间较短,只有10分钟,所以对系统的负载没有造成严重的影响,最后就在内部做一个总结和分析。

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/23718752/viewspace-1407225/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/23718752/viewspace-1407225/

你可能感兴趣的:(awr中DB CPU过低的原因分析)