[20190827]函数索引与选择率.txt

[20190827]函数索引与选择率.txt

--//一般情况下查询条件使用函数,选择率1%.如果建立虚拟列并且分析的情况下,能够获得比较准确的选择率。
--//如果建立对应的函数索引呢,情况会怎么呢?最近遇到的问题,让我注意其中一些细节。

1.环境:
SCOTT@test01p> @ ver1
PORT_STRING                    VERSION        BANNER                                                                               CON_ID
------------------------------ -------------- -------------------------------------------------------------------------------- ----------
IBMPC/WIN_NT64-9.1.0           12.2.0.1.0     Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production              0

SCOTT@test01p> create table t1 as select rownum id,dbms_random.string('U',10) v1 from dual connect by level<=1e4;
Table created.

--//分析略.Method_Opt => 'FOR ALL COLUMNS SIZE 1 '
SCOTT@test01p> select * from t1 where rownum=1;
        ID V1
---------- --------------------
         1 KCIXTFQWHZ

2.测试:
SCOTT@test01p> alter session set statistics_level = all;
Session altered.

SCOTT@test01p> select * from t1 where lower(v1)='kcixtfqwhz';
        ID V1
---------- --------------------
         1 KCIXTFQWHZ

SCOTT@test01p> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  ds718dy9422mr, child number 1
-------------------------------------
select * from t1 where lower(v1)='kcixtfqwhz'
Plan hash value: 3617692013
--------------------------------------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |       |    10 (100)|          |      1 |00:00:00.01 |      41 |
|*  1 |  TABLE ACCESS FULL| T1   |      1 |    100 |  1500 |    10   (0)| 00:00:01 |      1 |00:00:00.01 |      41 |
--------------------------------------------------------------------------------------------------------------------
--//E-Rows=100,
--//选择率100/10000 = .01.

3.如果建立对应的函数索引呢?

SCOTT@test01p> create index if_t1_v1 on t1( lower(v1));
Index created.

SCOTT@test01p> select * from t1 where lower(v1)='kcixtfqwhz';
        ID V1
---------- --------------------
         1 KCIXTFQWHZ

SCOTT@test01p> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  ds718dy9422mr, child number 1
-------------------------------------
select * from t1 where lower(v1)='kcixtfqwhz'
Plan hash value: 544604454
---------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |
---------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |      1 |        |       |     9 (100)|          |      1 |00:00:00.01 |       4 |      1 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1       |      1 |    100 |  1500 |     9   (0)| 00:00:01 |      1 |00:00:00.01 |       4 |      1 |
|*  2 |   INDEX RANGE SCAN                  | IF_T1_V1 |      1 |     40 |       |     1   (0)| 00:00:01 |      1 |00:00:00.01 |       3 |      1 |
---------------------------------------------------------------------------------------------------------------------------------------------------
--//E-Rows=40.
--//选择率40/10000 = .004,这个是我以前没有注意到的,我一直以为选择率还是1%.不知道这个缺省的选择率如何确定的.

4.继续探究:
--//实际上建立函数索引时有1个细节不容忽视,就是该隐含字段缺乏统计信息,我曾经在这里栽过跟头.

SCOTT@test01p> SCOTT@test01p> @ tab_lh scott t1  ''

DISPLAY TABLE_NAME OF COLUMN_NAME INFORMATION.
INPUT   OWNER TABLE_NAME COLUMN
SAMPLE  : @ TAB_LH TABLE_NAME [COLUMN_NAME]
IF NOT INPUT COLUMN_NAME ,USE "" .

COLUMN_NAME   DATA_TYPE DATA_LENGTH N NUM_DISTINCT    DENSITY SAMPLE_SIZE TRANS_LOW  TRANS_HIGH NUM_NULLS NUM_BUCKETS LAST_ANALYZED       HISTOGRAM DATA_DEFAULT
------------- --------- ----------- - ------------ ---------- ----------- ---------- ---------- --------- ----------- ------------------- --------- -------------
ID            NUMBER             22 Y        10000      .0001       10000 1          10000              0           1 2019-08-27 21:35:55 NONE
V1            VARCHAR2         4000 Y        10000      .0001       10000 AAAXTDLLNT ZZZBEIPJYY         0           1 2019-08-27 21:35:55 NONE
SYS_NC00003$  VARCHAR2         4000 Y                                                                                                     NONE       LOWER("V1")
--//必须要重新分析表.

execute sys.dbms_stats.gather_table_stats ( OwnName => nvl('',user),TabName => 't1',Estimate_Percent => NULL,Method_Opt => 'FOR ALL COLUMNS SIZE 1 ',Cascade => True ,No_Invalidate => false)

SCOTT@test01p> @ tab_lh scott t1  SYS_NC00003$
DISPLAY TABLE_NAME OF COLUMN_NAME INFORMATION.
INPUT   OWNER TABLE_NAME COLUMN
SAMPLE  : @ TAB_LH TABLE_NAME [COLUMN_NAME]
IF NOT INPUT COLUMN_NAME ,USE "" .
COLUMN_NAME  DATA_TYPE DATA_LENGTH N NUM_DISTINCT    DENSITY SAMPLE_SIZE TRANS_LOW  TRANS_HIGH NUM_NULLS NUM_BUCKETS LAST_ANALYZED       HISTOGRAM DATA_DEFAULT
------------ --------- ----------- - ------------ ---------- ----------- ---------- ---------- --------- ----------- ------------------- --------- ------------
SYS_NC00003$ VARCHAR2         4000 Y        10000      .0001       10000 aaaxtdllnt zzzbeipjyy         0           1 2019-08-27 21:46:06 NONE      LOWER("V1")

SCOTT@test01p> select * from t1 where lower(v1)='kcixtfqwhz';
        ID V1
---------- --------------------
         1 KCIXTFQWHZ

SCOTT@test01p> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  ds718dy9422mr, child number 1
-------------------------------------
select * from t1 where lower(v1)='kcixtfqwhz'

Plan hash value: 544604454

------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |      1 |        |       |     2 (100)|          |      1 |00:00:00.01 |       4 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1       |      1 |      1 |    26 |     2   (0)| 00:00:01 |      1 |00:00:00.01 |       4 |
|*  2 |   INDEX RANGE SCAN                  | IF_T1_V1 |      1 |      1 |       |     1   (0)| 00:00:01 |      1 |00:00:00.01 |       3 |
------------------------------------------------------------------------------------------------------------------------------------------
--//这样E-Rows估算就比较准确了.

5.附上tab_lh.sql脚本:
/* Formatted on 2014/10/19 20:53:41 (QP5 v5.227.12220.39754) */
PROMPT
PROMPT DISPLAY TABLE_NAME OF COLUMN_NAME INFORMATION.
PROMPT INPUT   OWNER TABLE_NAME COLUMN
PROMPT SAMPLE  : @ TAB_LH TABLE_NAME [COLUMN_NAME]
PROMPT IF NOT INPUT COLUMN_NAME ,USE "" .
PROMPT

column trans_low format a32
column trans_high format a32
column data_default format a20
column column_name format a24
SELECT
--owner,
--         table_name,
         column_name,
         data_type,
         data_length,
         nullable,
         num_distinct,
         density,
         sample_size,
         CASE
            WHEN data_type IN ('CHAR', 'VARCHAR2')
            THEN
               UTL_RAW.cast_to_varchar2 (low_value)
            WHEN data_type = 'NUMBER'
            THEN
               TO_CHAR (UTL_RAW.cast_to_number (low_value))
            WHEN data_type = 'DATE'
            THEN
               RTRIM (
                     LTRIM (
                        TO_CHAR (
                               100
                             * (  TO_NUMBER (SUBSTR (low_value, 1, 2), 'XX')
                                - 100)
                           + (TO_NUMBER (SUBSTR (low_value, 3, 2), 'XX') - 100),
                           '0000'))
                  || '-'
                  || LTRIM (
                        TO_CHAR (TO_NUMBER (SUBSTR (low_value, 5, 2), 'XX'),
                                 '00'))
                  || '-'
                  || LTRIM (
                        TO_CHAR (TO_NUMBER (SUBSTR (low_value, 7, 2), 'XX'),
                                 '00'))
                  || ' '
                  || LTRIM (
                        TO_CHAR (
                           TO_NUMBER (SUBSTR (low_value, 9, 2), 'XX') - 1,
                           '00'))
                  || ':'
                  || LTRIM (
                        TO_CHAR (
                           TO_NUMBER (SUBSTR (low_value, 11, 2), 'XX') - 1,
                           '00'))
                  || ':'
                  || LTRIM (
                        TO_CHAR (
                           TO_NUMBER (SUBSTR (low_value, 13, 2), 'XX') - 1,
                           '00')))
         END
            trans_low,
         CASE
            WHEN data_type IN ('CHAR', 'VARCHAR2')
            THEN
               UTL_RAW.cast_to_varchar2 (high_value)
            WHEN data_type = 'NUMBER'
            THEN
               TO_CHAR (UTL_RAW.cast_to_number (high_value))
            WHEN data_type = 'DATE'
            THEN
               RTRIM (
                     LTRIM (
                        TO_CHAR (
                               100
                             * (  TO_NUMBER (SUBSTR (high_value, 1, 2), 'XX')
                                - 100)
                           + (TO_NUMBER (SUBSTR (high_value, 3, 2), 'XX') - 100),
                           '0000'))
                  || '-'
                  || LTRIM (
                        TO_CHAR (TO_NUMBER (SUBSTR (high_value, 5, 2), 'XX'),
                                 '00'))
                  || '-'
                  || LTRIM (
                        TO_CHAR (TO_NUMBER (SUBSTR (high_value, 7, 2), 'XX'),
                                 '00'))
                  || ' '
                  || LTRIM (
                        TO_CHAR (
                           TO_NUMBER (SUBSTR (high_value, 9, 2), 'XX') - 1,
                           '00'))
                  || ':'
                  || LTRIM (
                        TO_CHAR (
                           TO_NUMBER (SUBSTR (high_value, 11, 2), 'XX') - 1,
                           '00'))
                  || ':'
                  || LTRIM (
                        TO_CHAR (
                           TO_NUMBER (SUBSTR (high_value, 13, 2), 'XX') - 1,
                           '00')))
         END
            trans_high,
         num_nulls,
         num_buckets,
         last_analyzed,
         histogram,
        data_default
    FROM dba_tab_cols
   WHERE  owner = decode('&1','',user,upper('&1'))
AND table_name = upper('&2')
AND column_name = decode('&&3','',column_name,upper('&&3'))
ORDER BY column_id
/

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/267265/viewspace-2655338/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/267265/viewspace-2655338/

你可能感兴趣的:([20190827]函数索引与选择率.txt)