一个 sql优化 ([精华] 一个查询优化的分析调整全过程!很值得一看 )

见   http://www.itpub.net/forum.php?mod=viewthread&tid=239011




Web翻页优化实例
提交时间: 2004-6-18 15:37:49      回复    发消息 


环境:
Linux version 2.4.20-8custom (root@web2) (gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)) #3 SMP Thu Jun 5 22:03:36 CST 2003
Mem:  2113466368 
Swap: 4194881536 
CPU:两个超线程的Intel(R) Xeon(TM) CPU 2.40GHz
  
优化前语句在mysql里面查询15秒左右出来,转移到oracle后进行在不调整索引和语句的情况下执行时间大概是4-5秒,调整后执行时间小于0.5秒。
 
翻页语句:
SELECT * FROM  (SELECT T1.*, rownum as linenum FROM  (
SELECT /*+ index(a ind_old)*/
a.category FROM auction_auctions a WHERE a.category =' 170101 ' AND a.closed='0' AND ends > sysdate AND (a.approve_status>=0)  ORDER BY a.ends) T1  WHERE rownum < 18681) WHERE linenum >= 18641
  
被查询的表:auction_auctions(产品表)
表结构:
SQL> desc auction_auctions;                         
   Name                                      Null?    Type
   ----------------------------------------- -------- ----------------------------
  ID                                        NOT NULL VARCHAR2(32)
   USERNAME                                           VARCHAR2(32)
   TITLE                                              CLOB
   GMT_MODIFIED                              NOT NULL DATE
   STARTS                                    NOT NULL DATE
   DESCRIPTION                                        CLOB
   PICT_URL                                           CLOB
   CATEGORY                                  NOT NULL VARCHAR2(11)
   MINIMUM_BID                                        NUMBER
   RESERVE_PRICE                                      NUMBER
   BUY_NOW                                            NUMBER
   AUCTION_TYPE                                       CHAR(1)
   DURATION                                           VARCHAR2(7)
   INCREMENTNUM                              NOT NULL NUMBER
   CITY                                               VARCHAR2(30)
   PROV                                               VARCHAR2(20)
   LOCATION                                           VARCHAR2(40)
   LOCATION_ZIP                                       VARCHAR2(6)
   SHIPPING                                           CHAR(1)
   PAYMENT                                            CLOB
   INTERNATIONAL                                      CHAR(1)
   ENDS                                      NOT NULL DATE
   CURRENT_BID                                        NUMBER
   CLOSED                                             CHAR(2)
   PHOTO_UPLOADED                                     CHAR(1)
   QUANTITY                                           NUMBER(11)
   STORY                                              CLOB
   HAVE_INVOICE                              NOT NULL NUMBER(1)
   HAVE_GUARANTEE                            NOT NULL NUMBER(1)
   STUFF_STATUS                              NOT NULL NUMBER(1)
   APPROVE_STATUS                            NOT NULL NUMBER(1)
   OLD_STARTS                                NOT NULL DATE
   ZOO                                                VARCHAR2(10)
   PROMOTED_STATUS                           NOT NULL NUMBER(1)
   REPOST_TYPE                                        CHAR(1)
   REPOST_TIMES                              NOT NULL NUMBER(4)
   SECURE_TRADE_AGREE                        NOT NULL NUMBER(1)
   SECURE_TRADE_TRANSACTION_FEE                       VARCHAR2(16)
   SECURE_TRADE_ORDINARY_POST_FEE                     NUMBER
   SECURE_TRADE_FAST_POST_FEE                         NUMBER

表记录数及大小
SQL> select count(*) from auction_auctions;
  
    COUNT(*)
----------
537351
  
SQL> select segment_name,bytes,blocks from user_segments where segment_name ='AUCTION_AUCTIONS';
  
SEGMENT_NAME          BYTES     BLOCKS
AUCTION_AUCTIONS      1059061760     129280
  
表上原有的索引
create index ind_old on auction_auctions(closed,approve_status,category,ends) tablespace tbsindex compress 2;
  
SQL> select segment_name,bytes,blocks from user_segments where segment_name = 'IND_OLD';
  
SEGMENT_NAME           BYTES     BLOCKS
IND_OLD                   20971520       2560
表和索引都已经分析过,我们来看一下sql执行的费用
SQL> set autotrace trace;
SQL> SELECT * FROM  (SELECT T1.*, rownum as linenum FROM  (SELECT a.* FROM auction_auctions a WHERE a.category like '18%' AND a.closed='0' AND ends  > sysdate AND (a.approve_status>=0)  ORDER BY a.ends) T1  WHERE rownum <18681) WHERE linenum >= 18641;
  
40 rows selected.
  
Execution Plan
----------------------------------------------------------
     0      SELECT STATEMENT Optimizer=CHOOSE (Cost=19152 Card=18347 Byt
            es=190698718)
  
     1    0   VIEW (Cost=19152 Card=18347 Bytes=190698718)
     2    1     COUNT (STOPKEY)
     3    2       VIEW (Cost=19152 Card=18347 Bytes=190460207)
     4    3         TABLE ACCESS (BY INDEX ROWID) OF 'AUCTION_AUCTIONS'
            (Cost=19152 Card=18347 Bytes=20860539)
  
     5    4           INDEX (RANGE SCAN) OF 'IND_OLD' (NON-UNIQUE) (Cost
            =810 Card=186003)
  
Statistics
----------------------------------------------------------
            0  recursive calls
            0  db block gets
        19437  consistent gets
        18262  physical reads
            0  redo size
       114300  bytes sent via SQL*Net to client
        56356  bytes received via SQL*Net from client
          435  SQL*Net roundtrips to/from client
            0  sorts (memory)
            0  sorts (disk)
           40  rows processed
  
我们可以看到这条sql语句通过索引范围扫描找到最里面的结果集,然后通过两个view操作最后得出数据。其中18502  consistent gets,17901  physical reads

我们来看一下这个索引建的到底合不合理,先看下各个查寻列的distinct值
select count(distinct ends) from auction_auctions;
  
COUNT(DISTINCTENDS)
-------------------
               338965
  
SQL> select count(distinct category) from auction_auctions;
  
COUNT(DISTINCTCATEGORY)
-----------------------
                     1148
  
SQL> select count(distinct closed) from auction_auctions;
  
COUNT(DISTINCTCLOSED)
---------------------
                      2
SQL> select count(distinct approve_status) from auction_auctions;
  
COUNT(DISTINCTAPPROVE_STATUS)
-----------------------------
                              5
  
页索引里列平均存储长度
SQL> select avg(vsize(ends)) from auction_auctions;
  
AVG(VSIZE(ENDS))
----------------
                 7
  
SQL> select avg(vsize(closed)) from auction_auctions;
  
AVG(VSIZE(CLOSED))
------------------
                   2
  
SQL> select avg(vsize(category)) from auction_auctions;
  
AVG(VSIZE(CATEGORY))
--------------------
            5.52313106
  
SQL> select avg(vsize(approve_status)) from auction_auctions;
  
AVG(VSIZE(APPROVE_STATUS))
--------------------------
                  1.67639401

我们来估算一下各种组合索引的大小,可以看到closed,approve_status,category都是相对较低集势的列(重复值较多),下面我们来大概计算下各种页索引需要的空间
 
column               distinct num        column len
ends                  338965              7 
category               1148                5.5
closed                 2                   2 
approve_status          5                   1.7 
  
index1: (ends,closed,category,approve_status) compress 2
ends:distinct number---338965
closed: distinct number---2
index size=338965*2*(9+2)+ 537351*(1.7+5.5+6)=14603998
  
index2: (closed,category,ends,approve_status)
closed: distinct number---2
category: distinct number---1148
index size=2*1148*(2+5.5)+537351*(7+1.7+6)=7916279
  
index3: (closed,approve_status,category,ends)
closed: distinct number---2
approve_status: distinct number―5
index size=2*5*(2+1.7)+537351*(7+5.5+6)=9941030
  
结果出来了,index2: (closed,category,ends,approve_status)的索引最小

我们再来看一下语句
SELECT * FROM  (SELECT T1.*, rownum as linenum FROM  (SELECT a.* FROM auction_auctions a WHERE a.category like '18%' AND a.closed='0' AND ends  > sysdate AND (a.approve_status>=0)  ORDER BY a.ends) T1  WHERE rownum <18681) WHERE linenum >= 18641;
可以看出这个sql语句有很大优化余地,首先最里面的结果集SELECT a.* FROM auction_auctions a WHERE a.category like '18%' AND a.closed='0' AND ends  > sysdate AND (a.approve_status>=0)  ORDER BY a.ends,这里的话会走index range scan,然后table scan by rowid,这样的话如果符合条件的数据多的话相当耗资源,我们可以改写成
SELECT a.rowid FROM auction_auctions a WHERE a.category like '18%' AND a.closed='0' AND ends  > sysdate AND (a.approve_status>=0)  ORDER BY a.ends
这样的话最里面的结果集只需要index fast full scan就可以完成了,再改写一下得出以下语句
 
select * from auction_auctions where rowid in (SELECT rid FROM  (
SELECT T1.rowid rid, rownum as linenum FROM  
(SELECT a.rowid FROM auction_auctions a WHERE a.category like '18%' AND a.closed='0' AND ends > sysdate AND 
(a.approve_status>=0)  ORDER BY a.ends) T1  WHERE rownum < 18681) WHERE linenum >= 18641)
  
下面我们来测试一下这个索引的查询开销
 
select * from auction_auctions where rowid in (SELECT rid FROM  (
SELECT T1.rowid rid, rownum as linenum FROM  
(SELECT a.rowid FROM auction_auctions a WHERE a.category like '18%' AND a.closed='0' AND ends > sysdate AND 
(a.approve_status>=0)  ORDER BY a.closed,a.ends) T1  WHERE rownum < 18681) WHERE linenum >= 18641)
Execution Plan
----------------------------------------------------------
     0      SELECT STATEMENT Optimizer=CHOOSE (Cost=18698 Card=18344 Byt
            es=21224008)
  
     1    0   NESTED LOOPS (Cost=18698 Card=18344 Bytes=21224008)
     2    1     VIEW (Cost=264 Card=18344 Bytes=366880)
     3    2       SORT (UNIQUE)
     4    3         COUNT (STOPKEY)
     5    4           VIEW (Cost=264 Card=18344 Bytes=128408)
     6    5             SORT (ORDER BY STOPKEY) (Cost=264 Card=18344 Byt
            es=440256)
  
     7    6               INDEX (FAST FULL SCAN) OF 'IDX_AUCTION_BROWSE'
             (NON-UNIQUE) (Cost=159 Card=18344 Bytes=440256)
  
     8    1     TABLE ACCESS (BY USER ROWID) OF 'AUCTION_AUCTIONS' (Cost
            =1 Card=1 Bytes=1137)
  
Statistics
----------------------------------------------------------
            0  recursive calls
            0  db block gets
         2080  consistent gets
         1516  physical reads
            0  redo size
       114840  bytes sent via SQL*Net to client
        56779  bytes received via SQL*Net from client
          438  SQL*Net roundtrips to/from client
            2  sorts (memory)
            0  sorts (disk)
           40  rows processed
  
可以看到consistent gets从19437降到2080,physical reads从18262降到1516,查询时间也丛4秒左右下降到0。5秒,可以来说这次sql调整取得了预期的效果。

又修改了一下语句,
 
SQL> select * from auction_auctions where rowid in 
    2  (SELECT rid FROM  (                                              
    3  SELECT T1.rowid rid, rownum as linenum FROM                                                                 
    4  (SELECT a.rowid FROM auction_auctions a 
    5     WHERE a.category like '18%' AND a.closed='0' AND ends > sysdate AND 
a.approve_status>=0
    6    7  ORDER BY a.closed,a.category,a.ends) T1  
    8  WHERE rownum < 18600) WHERE linenum >= 18560)    ;
  
40 rows selected.
  
Execution Plan
----------------------------------------------------------
     0      SELECT STATEMENT Optimizer=CHOOSE (Cost=17912 Card=17604 Byt
            es=20367828)
  
     1    0   NESTED LOOPS (Cost=17912 Card=17604 Bytes=20367828)
     2    1     VIEW (Cost=221 Card=17604 Bytes=352080)
     3    2       SORT (UNIQUE)
     4    3         COUNT (STOPKEY)
     5    4           VIEW (Cost=221 Card=17604 Bytes=123228)
     6    5             INDEX (RANGE SCAN) OF 'IDX_AUCTION_BROWSE' (NON-
            UNIQUE) (Cost=221 Card=17604 Bytes=422496)
  
     7    1     TABLE ACCESS (BY USER ROWID) OF 'AUCTION_AUCTIONS' (Cost
            =1 Card=1 Bytes=1137)
  
Statistics
----------------------------------------------------------
            0  recursive calls
            0  db block gets
          550  consistent gets
           14  physical reads
            0  redo size
       117106  bytes sent via SQL*Net to client
        56497  bytes received via SQL*Net from client
          436  SQL*Net roundtrips to/from client
            1  sorts (memory)
            0  sorts (disk)
           40  rows processed
  
在order by里加上索引前导列,消除了
    6    5             SORT (ORDER BY STOPKEY) (Cost=264 Card=18344 Byt
            es=440256)
,把consistent gets从2080降到550



你可能感兴趣的:(sql)