[2023-09-13] Performance problems caused by statistics after migrating a database with EXPDP/IMPDP

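Problem description:

After migrating a database with expdp/impdp, the customer gathered statistics in the new environment. By noon on the day the migration finished, many SQL statements had slowed down and their execution plans had changed. The test case below reproduces the problem.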

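1. Prepare the data

Create table test under the scott user, insert 9999 rows, then update every row with id > 2 to id = 99, which leaves the id column heavily skewed.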


conn scott/tiger
drop table test purge;
create table test (id int, name varchar2(20));
insert into test select level lv, dbms_random.string('l',20) from dual connect by level < 10000;
update test set id = 99 where id > 2;
commit;
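
Before moving on, it can help to confirm the skew. The query below is a minimal sketch that is not part of the original test case; it assumes it is run as scott right after the commit above. With the data as created, it should return one row each for id 1 and id 2 and 9997 rows for id 99.

```sql
-- Count rows per ID value in TEST to confirm the skew.
select id, count(*) as cnt
  from test
 group by id
 order by id;
```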

2. Query TEST by the ID column (so that the ID column is recorded in the col_usage$ table)


SQL> select * from test where id = 1;

        ID NAME
---------- ------------------------------------------------------------
         1 whkefbijsvipefdgnoez

SQL> 

##### Flush the monitoring info, then query col_usage$ by the object_id of the TEST table
SQL> exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;

PL/SQL procedure successfully completed.

SQL> select * from col_usage$ where obj# = (select object_id from dba_objects where object_name = 'TEST' and owner='SCOTT');

      OBJ#    INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS
---------- ---------- -------------- -------------- -----------------
RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP
----------- ---------- ---------- ---------------
     97981          1              1              0                 0
          1          0          0 13-SEP-23

3. Gather statistics on TEST with no extra parameters, so the defaults apply



SQL> exec DBMS_STATS.GATHER_TABLE_STATS(ownname=>'SCOTT',tabname=>'TEST');

PL/SQL procedure successfully completed.

SQL> 

4. Check the column statistics of TEST


The ID column now has a histogram.

SQL> set lin 200
SQL> col COLUMN_NAME for a30
SQL> select a.column_name,
  2  b.num_rows,
  3  a.num_distinct distinct_num,
  4  round(a.num_distinct / b.num_rows * 100, 2) selectivity,
  5  a.histogram,
  6  a.num_buckets
  7  from dba_tab_col_statistics a, dba_tables b
  8  where a.owner = b.owner
  9  and a.table_name = b.table_name
 10  and a.owner = 'SCOTT'
 11  and a.table_name = 'TEST';

COLUMN_NAME                      NUM_ROWS DISTINCT_NUM SELECTIVITY HISTOGRAM                                     NUM_BUCKETS
------------------------------ ---------- ------------ ----------- --------------------------------------------- -----------
NAME                                 9999         9999         100 NONE                                                    1
ID                                   9999            3         .03 FREQUENCY                                               3

SQL> 

5. Export TEST


The statistics are exported along with the table.

[oracle@11g ~]$ cat expdp_full_data.par 
userid="/ as sysdba"
directory=MY_DIR
dumpfile=expdp_full_data_%U.dmp
logfile=expdp_full_data.log
PARALLEL=16
CLUSTER=N
tables=scott.test
compression=all
[oracle@11g ~]$ expdp parfile=expdp_full_data.par

Export: Release 11.2.0.4.0 - Production on Wed Sep 13 14:57:55 2023

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
FLASHBACK automatically enabled to preserve database integrity.
Starting "SYS"."SYS_EXPORT_TABLE_01":  /******** AS SYSDBA parfile=expdp_full_data.par 
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 384 KB
. . exported "SCOTT"."TEST"                              129.0 KB    9999 rows
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Master table "SYS"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYS.SYS_EXPORT_TABLE_01 is:
  /home/oracle/dir/expdp_full_data_01.dmp
  /home/oracle/dir/expdp_full_data_02.dmp
Job "SYS"."SYS_EXPORT_TABLE_01" successfully completed at Wed Sep 13 14:58:03 2023 elapsed 0 00:00:06

[oracle@11g ~]$ 

6. Drop the TEST table


SQL> drop table test purge;

Table dropped.

SQL> 

7. Re-import the TEST table


[oracle@11g ~]$ cat impdp_full_data.par 
userid="/ as sysdba"
directory=MY_DIR
dumpfile=expdp_full_data_%U.dmp
logfile=impdp_full_data.log
PARALLEL=16
CLUSTER=N
full=y
[oracle@11g ~]$ impdp parfile=impdp_full_data.par

Import: Release 11.2.0.4.0 - Production on Wed Sep 13 15:00:23 2023

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Master table "SYS"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
Starting "SYS"."SYS_IMPORT_FULL_01":  /******** AS SYSDBA parfile=impdp_full_data.par 
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "SCOTT"."TEST"                              129.0 KB    9999 rows
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Job "SYS"."SYS_IMPORT_FULL_01" successfully completed at Wed Sep 13 15:00:27 2023 elapsed 0 00:00:03

[oracle@11g ~]$ 

8. Check the statistics of TEST


The histogram is still there after the import.

SQL> set lin 200
SQL> col COLUMN_NAME for a30
SQL> select a.column_name,
  2  b.num_rows,
  3  a.num_distinct distinct_num,
  4  round(a.num_distinct / b.num_rows * 100, 2) selectivity,
  5  a.histogram,
  6  a.num_buckets
  7  from dba_tab_col_statistics a, dba_tables b
  8  where a.owner = b.owner
  9  and a.table_name = b.table_name
 10  and a.owner = 'SCOTT'
 11  and a.table_name = 'TEST';

COLUMN_NAME                      NUM_ROWS DISTINCT_NUM SELECTIVITY HISTOGRAM                                     NUM_BUCKETS
------------------------------ ---------- ------------ ----------- --------------------------------------------- -----------
NAME                                 9999         9999         100 NONE                                                    1
ID                                   9999            3         .03 FREQUENCY                                               3

9. Check col_usage$, then gather statistics again with the defaults


SQL> select * from col_usage$ where obj# = (select object_id from dba_objects where object_name = 'TEST' and owner='SCOTT');


no rows selected

SQL> exec DBMS_STATS.GATHER_TABLE_STATS(ownname=>'SCOTT',tabname=>'TEST');

PL/SQL procedure successfully completed.

SQL> 

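10. Check the statistics of TEST once more


The histogram on ID is gone. With skewed data and no histogram, the optimizer's cardinality estimates, and therefore the execution plans, can be wrong.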

SQL> set lin 200
SQL> col COLUMN_NAME for a30
SQL> select a.column_name,
  2  b.num_rows,
  3  a.num_distinct distinct_num,
  4  round(a.num_distinct / b.num_rows * 100, 2) selectivity,
  5  a.histogram,
  6  a.num_buckets
  7  from dba_tab_col_statistics a, dba_tables b
  8  where a.owner = b.owner
  9  and a.table_name = b.table_name
 10  and a.owner = 'SCOTT'
 11  and a.table_name = 'TEST';

COLUMN_NAME                      NUM_ROWS DISTINCT_NUM SELECTIVITY HISTOGRAM                                     NUM_BUCKETS
------------------------------ ---------- ------------ ----------- --------------------------------------------- -----------
NAME                                 9999         9999         100 NONE                                                    1
ID                                   9999            3         .03 NONE                                                    1

SQL> 
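
To see what the missing histogram means for the optimizer, the estimated row count for the query from step 2 can be compared before and after the default gather. This is a minimal sketch, not output from the original test. TEST has no index, so the access path stays a full table scan in this small case; the value to compare is the Rows column. Without a histogram the estimate should be about num_rows / num_distinct (9999 / 3, roughly 3333), while with the frequency histogram it should be close to 1.

```sql
-- Show the optimizer's row estimate for the predicate used in step 2.
explain plan for
  select * from scott.test where id = 1;

-- Display the plan just explained; compare the Rows column
-- with and without the histogram on ID.
select * from table(dbms_xplan.display);
```

On a real schema, where the skewed column is usually indexed, an estimate that is off by this much is what pushes the optimizer to a different access path or join order.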

##########

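The steps up to the import are the same. If statistics are gathered with SIZE REPEAT after the import instead, the histogram is kept. The first query below shows the statistics right after the import, and the second shows them after the gather. (NUM_ROWS changes to 10009 in the second output because ESTIMATE_PERCENT=>10 samples the table, so the row count is an estimate.)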


SQL> set wrap off
SQL> set lin 200
SQL> col COLUMN_NAME for a30
SQL> select a.column_name,
  2  b.num_rows,
  3  a.num_distinct distinct_num,
  4  round(a.num_distinct / b.num_rows * 100, 2) selectivity,
  5  a.histogram,
  6  a.num_buckets
  7  from dba_tab_col_statistics a, dba_tables b
  8  where a.owner = b.owner
  9  and a.table_name = b.table_name
 10  and a.owner = 'SCOTT'
 11  and a.table_name = 'TEST';

COLUMN_NAME                      NUM_ROWS DISTINCT_NUM SELECTIVITY HISTOGRAM                                     NUM_BUCKETS
------------------------------ ---------- ------------ ----------- --------------------------------------------- -----------
NAME                                 9999         9999         100 NONE                                                    1
ID                                   9999            3         .03 FREQUENCY                                               3

SQL> exec DBMS_STATS.GATHER_TABLE_STATS(ownname=>'SCOTT',tabname=>'TEST',ESTIMATE_PERCENT=>10,method_opt=>'for all columns size repeat',cascade=>true,force=>true,degree=>8);

PL/SQL procedure successfully completed.

SQL> set lin 200
SQL> col COLUMN_NAME for a30
SQL> select a.column_name,
  2  b.num_rows,
  3  a.num_distinct distinct_num,
  4  round(a.num_distinct / b.num_rows * 100, 2) selectivity,
  5  a.histogram,
  6  a.num_buckets
  7  from dba_tab_col_statistics a, dba_tables b
  8  where a.owner = b.owner
  9  and a.table_name = b.table_name
 10  and a.owner = 'SCOTT'
 11  and a.table_name = 'TEST';

COLUMN_NAME                      NUM_ROWS DISTINCT_NUM SELECTIVITY HISTOGRAM                                     NUM_BUCKETS
------------------------------ ---------- ------------ ----------- --------------------------------------------- -----------
NAME                                10009        10009         100 NONE                                                    1
ID                                  10009            3         .03 FREQUENCY                                               3

SQL> 
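
If later gathers, including the automatic statistics job, should also use REPEAT for this table, one option is a table-level preference. This is a sketch under the assumption that it runs after the import and before the first gather; it is not part of the original test. The arguments to SET_TABLE_PREFS are passed positionally in the order owner, table, preference name, preference value.

```sql
-- Make SIZE REPEAT the default METHOD_OPT for SCOTT.TEST only.
begin
  dbms_stats.set_table_prefs('SCOTT', 'TEST', 'METHOD_OPT',
                             'FOR ALL COLUMNS SIZE REPEAT');
end;
/

-- Confirm the preference now in effect for this table.
select dbms_stats.get_prefs('METHOD_OPT', 'SCOTT', 'TEST') as method_opt
  from dual;
```

REPEAT only rebuilds histograms that already exist, so it relies on the histograms carried over by the import.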


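In conclusion: after migrating a database with expdp/impdp, gathering statistics with the defaults drops the histograms on the columns. The imported table is a new object with a new object_id, so col_usage$ has no rows for it yet, and the default method (for all columns size auto) sees no column usage to justify a histogram. Execution plans then differ from those in the source database and SQL runs slower. As the database keeps running, more and more columns are recorded in col_usage$, and the default method gradually builds the histograms again. If a SQL statement hits a problem in production, gather statistics manually: the statement has already run, so the columns in its WHERE clause are recorded in col_usage$, and the auto method will build histograms on them.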
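
When a column is already known to be skewed, another option is to request its histogram explicitly instead of waiting for col_usage$ to fill up. The call below is a minimal sketch, not something the original test ran. It assumes ID is the only column that needs a histogram and uses 254 buckets, the maximum in 11g.

```sql
-- Gather stats with no histograms by default (size 1),
-- but force a histogram on the skewed ID column.
begin
  dbms_stats.gather_table_stats(
    ownname    => 'SCOTT',
    tabname    => 'TEST',
    method_opt => 'for all columns size 1 for columns id size 254',
    cascade    => true);
end;
/
```

After this call, the statistics query used in the steps above should report a histogram on ID again, regardless of what col_usage$ contains.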
