MultiColumn Statistics

   When the WHERE clause of a query specifies multiple columns from a single table (multiple single column predicates), the relationship between the columns can strongly affect the combined selectivity for the column group.For example, consider the customers table in the SH schema. The columns cust_state_province and country_id are related, with cust_state_province determining the country_id for each customer. Suppose you query the customers table where the cust_state_province is California:
SQL> SELECT COUNT(*)
  2  FROM   sh.customers
  3  WHERE  cust_state_province = 'CA';

  COUNT(*)
----------
      3341

SQL>  SELECT COUNT(*)
  2    FROM   sh.customers
  3  WHERE  cust_state_province = 'CA'
  4   AND    country_id=52790;

  COUNT(*)
----------
      3341
SQL>  SELECT COUNT(*)
  2  FROM   sh.customers
  3  WHERE  cust_state_province = 'CA'
  4   AND    country_id=52775;

  COUNT(*)
----------
         0

With individual column statistics, the optimizer has no way of knowing that the cust_state_province and the country_id columns are related. By gathering statistics on these columns as a group (column group), the optimizer has a more accurate selectivity value for the group, instead of having to generate the value based on the individual column statistics.

You can create column groups manually by using the DBMS_STATS package. You can use this package to create a column group, get the name of a column group, or delete a column group from a table.


1
Creating a Column Group

Use the create_extended_statistics function to create a column group. The create_extended_statistics function returns the system-generated name of the newly created column group. Table 1 lists the input parameters for this function.

Table 1 Parameters for the create_extended_statistics Function

Parameter Description

owner

Schema owner. NULL indicates current schema.

tab_name

Name of the table to which the column group is added.

extension

Columns in the column group.


For example, to add a column group consisting of the cust_state_province and country_id columns to the customers table in SH schema, run the following PL/SQL block:

SQL> DECLARE
  2    cg_name varchar2(30);
  3  BEGIN
  4    cg_name := dbms_stats.create_extended_stats('sh','customers',
  5               '(CUST_CITY,cust_state_province,country_id)');
  6  END;
  7  /

PL/SQL procedure successfully completed.

2 Getting a Column Group

Use the show_extended_stats_name function to obtain the name of the column group for a given set of columns. Table 2 lists the input parameters for this function.

Table 2 Parameters for the show_extended_stats_name Function

Parameter Description

owner

Schema owner. NULL indicates current schema.

tab_name

Name of the table to which the column group belongs.

extension

Name of the column group.


For example, use the following query to obtain the column group name for a set of columns on the customers table:

SQL> select sys.dbms_stats.show_extended_stats_name('sh','customers','(CUST_CITY,CUST_STATE_PROVINCE,COUNTRY_ID)') cg_name  from dual;

CG_NAME
--------------------------------------------------------------------------------
SYS_STUMZ$C3AIHLPBROI#SKA58H_N


3 Monitoring Column Groups

Use the dictionary table user_stat_extensions to obtain information about MultiColumn statistics:

SQL>  Select extension_name, extension  from dba_stat_extensions  where table_name='CUSTOMERS';

EXTENSION_NAME                 EXTENSION
------------------------------ --------------------------------------------------------------------------------
SYS_NC00028$                   (UPPER("CUST_LAST_NAME"))
SYS_NC00029$                   (UPPER("CUST_FIRST_NAME"))
SYS_STUMZ$C3AIHLPBROI#SKA58H_N ("CUST_CITY","CUST_STATE_PROVINCE","COUNTRY_ID")

Use the following query to find the number of distinct values and find whether a histogram has been created for a column group:
SQL> select e.extension col_group, t.num_distinct, t.histogram
  2     from dba_stat_extensions e, dba_tab_col_statistics t
  3     where e.extension_name=t.column_name
  4     and e.table_name=t.table_name
  5     and t.table_name='CUSTOMERS';

COL_GROUP                                                                        NUM_DISTINCT HISTOGRAM
-------------------------------------------------------------------------------- ------------ ---------------
("CUST_CITY","CUST_STATE_PROVINCE","COUNTRY_ID")                                          620 HEIGHT BALANCED
(UPPER("CUST_FIRST_NAME"))                                                                170 NONE
(UPPER("CUST_LAST_NAME"))                                                                 176 NONE


4 Gathering Statistics on Column Groups

The METHOD_OPT argument of the DBMS_STATS package enables you to gather statistics on column groups. If you set the value of this argument to FOR ALL COLUMNS SIZE AUTO, then the optimizer gathers statistics on all existing column groups. To collect statistics on a new column group, specify the group using FOR COLUMNS. The column group is automatically created as part of statistic gathering.

For example, the following statement creates a new column group for the customers table on the columns cust_state_province, country_id and gathers statistics (including histograms) on the entire table and the new column group:

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_CITY,CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY');

PL/SQL procedure successfully completed.






来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/13750068/viewspace-2145015/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/13750068/viewspace-2145015/

你可能感兴趣的:(MultiColumn Statistics)