一次数据库硬解析的分析全过程

ps:其实这个case很常见,而且在正规的开发流程之下也不会出现这种情况,更不会出现在上线之后,这篇文章主要是体现一个发现问题和分析问题解决问题的思路。希望对大家能有多帮助。

说明:

经过开发方面的努力,以及对问题sql的优化之后,截止目前,之前发现的问题sql已经得到了缓解。之后,我们对数据库2个节点进行了整体分析,目前得出以下问题。

依据材料:2个节点的awr报告,见附件


经过对数据库2个节点的整体分析,发现2个节点的资源承载相差较大。以下为2个节点的对比:

 

负载情况比对

节点1


Per Second Per Transaction
Redo size: 289,642.11 2,459.01
Logical reads: 20,588.01 174.79
Block changes: 1,452.02 12.33
Physical reads: 7.15 0.06
Physical writes: 47.59 0.40
User calls: 730.83 6.20
Parses: 268.36 2.28
Hard parses: 2.26 0.02
Sorts: 211.82 1.80
Logons: 0.51 0.00
Executes: 471.03 4.00
Transactions: 117.79  

节点2


Per Second Per Transaction
Redo size: 115,937.47 4,102.67
Logical reads: 40,494.67 1,432.98
Block changes: 334.81 11.85
Physical reads: 19.53 0.69
Physical writes: 18.26 0.65
User calls: 4,874.41 172.49
Parses: 1,242.13 43.96
Hard parses: 2.03 0.07
Sorts: 203.80 7.21
Logons: 0.66 0.02
Executes: 4,298.22 152.10
Transactions: 28.26  

 

说明:

无论从redosize还是transactions明显可以看出节点1比节点2忙碌,虽然2个节点的hard parses相差貌似不是很大,但是从其executes数以及parses分析,初步断定是节点1的大量硬解析导致的。

 

实例效率比对

节点1

Buffer Nowait %: 99.94 Redo NoWait %: 100.00
Buffer Hit %: 99.97 In-memory Sort %: 100.00
Library Hit %: 57.85 Soft Parse %: 99.16
Execute to Parse %: 43.03 Latch Hit %: 99.94
Parse CPU to Parse Elapsd %: 20.39 % Non-Parse CPU: 98.76

节点2

Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 99.95 In-memory Sort %: 100.00
Library Hit %: 99.92 Soft Parse %: 99.84
Execute to Parse %: 71.10 Latch Hit %: 99.96
Parse CPU to Parse Elapsd %: 2.62 % Non-Parse CPU: 98.64

 

说明:

通过library hit%parse cpu to parse elapsd%来看,基本能够确定是节点1的大量硬解析导致的。

 

经过以上分析,我们在sql统计中,节点1如下所示:

(注:SQL ordered by Elapsed Time

Elapsed Time (s) CPU Time (s) Executions Elap per Exec (s) % Total DB Time SQL Id SQL Module SQL Text
8,866 7 1 8865.61 20.41 4y67m1n5ru9h9 timeTask@xjitnmapp2 (TNS V1-V3) begin PKP_XJ_BUSI_MONITOR_UPD...
2,017 0 11 183.32 4.64 at30z235zpx7a timeTask@xjitnmapp2 (TNS V1-V3) begin xj_db_inas_proc; end;
1,355 1,312 0   3.12 ax33m7btymb2r PL/SQL Developer begin PKP_XJ_CSD_TEST.agent_pe...
1,189 365 716,555 0.00 2.74 765mqyxhskk9d ValveMgmt@xjitnmapp2 (TNS V1-V3) insert into ne_perf_msg (ne_pe...
1,049 0 10 104.90 2.41 cyyyswcsmvfqx timeTask@xjitnmapp2 (TNS V1-V3) SELECT COUNT(1), 'IOM-N' FROM ...
1,022 398 716,629 0.00 2.35 9qfdgm3c702rs ValveMgmt@xjitnmapp2 (TNS V1-V3) begin Proc_NE_Perf_Msg_UpdateR...
940 0 10 94.00 2.16 dzt58v2bnamc4 timeTask@xjitnmapp2 (TNS V1-V3) SELECT COUNT(1), 'T' FROM INAS...
923 935 24 38.45 2.12 4uxbs8fhb6yv5 timeTask@xjitnmapp2 (TNS V1-V3) begin pkp_maint_job.hastenNot...
923 935 24 38.45 2.12 g2zqp9kt9fhhx timeTask@xjitnmapp2 (TNS V1-V3) SELECT A.JOB_NAME, B.DUTY_ID, ...
916 929 32,717 0.03 2.11 657k3nvp6d77r timeTask@xjitnmapp2 (TNS V1-V3) SELECT COUNT(*) FROM (SELECT :...
914 293 716,627 0.00 2.10 607t3jmh1rjgy ValveMgmt@xjitnmapp2 (TNS V1-V3) UPDATE NE_PERF_MSG_REAL SET PE...
517 0 1 517.19 1.19 3v8usax04tqx6 timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
517 0 1 516.51 1.19 f3d7w5j1khh59 timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
490 0 1 490.14 1.13 2aw24jjhsbz1z timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
482 0 1 482.15 1.11 frgk27g6p6auh timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
482 0 1 481.85 1.11 2mjp53agwxb10 timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
479 0 1 479.35 1.10 4ha87wy9k784t timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
479 0 1 478.68 1.10 c8utt9uwda7mv timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
466 0 1 466.07 1.07 g2s1s0vubjuba timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
466 0 1 465.99 1.07 57pq01hujfa6d timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
464 0 1 463.93 1.07 4z5gn1sh2h3p9 timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
463 0 1 463.24 1.07 3xtjfvgq9hw3u timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
454 0 1 453.65 1.04 cyc352n70sndr timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
447 0 1 447.39 1.03 6dz4h3cpngvfk timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
445 0 1 445.12 1.02 3maa9dy6ybfq0 timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
445 0 1 444.96 1.02 5pabqbfux2bdd timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
443 0 1 443.12 1.02 cu3f3fwntq6q5 timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
441 0 1 440.93 1.02 5v6z9vx2anap4 timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
438 0 1 437.93 1.01 3skc5wbqaa13d timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...
437 0 1 437.22 1.01 ffzq6ucvadp5n timeTask@xjitnmapp2 (TNS V1-V3) SELECT /*+parallel(t, 12)*/t.s...

说明:在上图(节点1)中发现,有大量的以下语句出现:

SELECT /*+parallel(t, 12)*/t.s…..

具体分析后,这种情况在节点2中不存在。

对上边的大量相同的sql具体分析如下:

SELECT /*+parallel(t, 12)*/

 t.source_id,

 t.start_date,

 SUM(t.num_cdrs),

 SUM(t.settlement_duration),

 SUM(t.sett_fee)

  FROM (SELECT d.source_id,

               substr(d.start_date, 1, 8) start_date,

               d.num_cdrs,

               d.settlement_duration,

               d.sett_fee

          FROM zhjs_app.tl_y_voice_sett_201208@xjzhjs d,

               (SELECT element_id

                  FROM zhjs_param.tp_relation_element@xjzhjs

                 WHERE rpt_code = '15'

                   AND tnode IN

                       ('415102', '415105', '415108', '416103', '416106')) c

         WHERE d.account_item = c.element_id

        UNION ALL

        SELECT d.source_id,

               substr(d.start_date, 1, 8) start_date,

               d.num_cdrs,

               d.settlement_duration,

               d.sett_fee

          FROM zhjs_app.tl_y_voice_sett_201209@xjzhjs d,

               (SELECT element_id

                  FROM zhjs_param.tp_relation_element@xjzhjs

                 WHERE rpt_code = '15'

                   AND tnode IN

                       ('415102', '415105', '415108', '416103', '416106')) c

         WHERE d.account_item = c.element_id) t

 GROUP BY t.source_id, t.start_date;

 

说明:

在节点1,存在这样大量的sql,其中除了在in 子句中的对象个数不同外,其他基本全部一致。这也是导致节点1大量硬解析,消耗主机资源,拖低节点1各种性能指标的原因。

 

建议:

对以上语句进行绑定变量处理。

 

总结:

这次对XXXX数据库性能的监控主要从其整体结构来进行分析的,并不是将全部精力放在盯某个或者某类sql上,我们认为这样可以避免瞎子摸象的问题产生,至于最后还是落在具体sql上,这个只是一种分析的结果,而非以往的一种惯性“有性能问题就找sql”这样的分析过程。

         最后,希望在和开发等同事的一起努力下,XXXX性能有所大的提高。

ps:很不习惯写文档,而且还那种给人家找事的文档,基本形成了格式,首先肯定人家前期的努力和结果,其次不但提出自己的见解,还得表明自己的分析依据和分析过程,完了再加以一定的鼓励和前瞻性。哎..真劳神,其实也就一句话的事儿..

Normal 0 false 7.8 磅 0 2 false false false EN-US ZH-CN X-NONE

转载于:http://blog.itpub.net/26143577/viewspace-745046/

你可能感兴趣的:(一次数据库硬解析的分析全过程)