Oracle Real Application Testing: A DB Replay Case Study

I. Introduction to Oracle Real Application Testing
Real Application Testing (RAT) is used to assess the impact of a system change comprehensively in a test environment. RAT has two main components: Database Replay and SQL Performance Analyzer (SPA).

What is it?

- Database Replay: Replays real database workload on a test system
- SQL Performance Analyzer: Predicts SQL performance deviations before end users can be impacted

What is its purpose?

- Database Replay: Assess the impact of change on workload throughput
- SQL Performance Analyzer: Assess the impact of change on SQL response time

How does it work?

Database Replay Workflow:

  1. Workload Capture
  2. Workload Processing
  3. Workload Replay (including DML and SQL queries)
  4. Analysis and Reporting

SQL Performance Analyzer Workflow:

  1. Capture the SQL workload that you want to analyze with SPA
  2. Measure the performance of the workload before a change by executing SPA on the SQL tuning set
  3. Make the change, such as a database upgrade or gathering optimizer statistics
  4. Measure the performance of the workload after the change by executing SPA on the SQL tuning set again
  5. Compare performance of the two executions of the SQL tuning set to identify the SQL statements that have regressed, improved, or were unchanged

When should I use it?

Database Replay: comprehensive testing of all sub-systems of the database server using real production workload, including:

- Database and operating system upgrades
- Configuration changes, such as conversion of a database from a single instance to an Oracle Real Application Clusters (Oracle RAC) environment
- Storage, network, and interconnect changes
- Operating system and hardware migrations

SQL Performance Analyzer: unit testing of SQL with the goal to identify the set of SQL statements with improved/regressed performance, including:

- Database upgrade
- Configuration changes to the operating system or hardware
- Schema changes
- Changes to database initialization parameters
- Refreshing optimizer statistics
- SQL tuning actions

[Figure: Database Replay Workflow]

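II. Environment Requirements for DB Replay
1. Database releases and patches required for DB Replay
Reference: Using Real Application Testing Functionality in Earlier Releases (Doc ID 560977.1)
On releases earlier than 11g, Real Application Testing functionality is installed with the OPatch utility. To use only Database Replay or only SQL Performance Analyzer, apply just the corresponding patches; to use both, apply the patches for both.
Database Replay
For a database that is being upgraded to 11g or later, the capture functionality of Database Replay is available on the earlier releases listed in the tables below.
Note: The replay of the captured workload can only be done on Oracle Database 11g and higher. Release 12.1.0.1 and higher do not require any mandatory patches.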
Table 1: Database Replay Availability Information for All Platforms except Windows

Note:
  • These are mandatory patches.
  • Table may not show all possible combinations.
  • Any new Fixes, patches or merge patches for Real Application Testing will be provided as per Oracle’s Error Correction Support Policy documented in Database, FMW, EM Grid Control, and OCS Software Error Correction Support Policy Document 209768.1.

Source DB (upgrade from release) | Destination DB (upgrade to release) | What patch you need to apply
-------------------------------- | ----------------------------------- | ----------------------------
9.2.0.8.0 | >=11.1.0.6.0 | 9.2.0.8.0 + one-off patch 9373986 (AIX: one-off patch 13370576, which includes 9373986)
9.2.0.8.0 | >=11.1.0.7.0 | 9.2.0.8.0 + one-off patch 9373986 (AIX: one-off patch 13370576, which includes 9373986) + one-off patch 8712466 on top of 11.1.0.7.0
9.2.0.8.0 | >=11.2.0.1.0 | 9.2.0.8.0 + one-off patch 9373986 (AIX: one-off patch 13370576, which includes 9373986)
9.2.0.8.0 | >=11.2.0.2.0 | 9.2.0.8.0 + one-off patch 9373986 (AIX: one-off patch 13370576, which includes 9373986) and 11.2.0.2.0 + patch 13947480
9.2.0.8.0 | >=11.2.0.3.0 | 9.2.0.8.0 + one-off patch 9373986 (AIX: one-off patch 13370576, which includes 9373986) and 11.2.0.4.0 + patch 17411249
9.2.0.8.0 | >=11.2.0.4.0 | 9.2.0.8.0 + one-off patch 9373986 (AIX: one-off patch 13370576, which includes 9373986) and 11.2.0.4.0 + patch 17411249
10.2.0.2.0 | >=11.1.0.6.0 | 10.2.0.2.0 + one-off patch 9373986
10.2.0.2.0 | >=11.1.0.7.0 | 10.2.0.2.0 + one-off patch 9373986 and one-off patch 8712466 on top of 11.1.0.7.0
10.2.0.2.0 | >=11.2.0.1.0 | 10.2.0.2.0 + one-off patch 9373986
10.2.0.2.0 | >=11.2.0.2.0 | 10.2.0.2.0 + one-off patch 9373986 and 11.2.0.2.0 + patch 13947480
10.2.0.2.0 | >=11.2.0.3.0 | 10.2.0.2.0 + one-off patch 9373986 and 11.2.0.3.0 + patch 17411249
10.2.0.2.0 | >=11.2.0.4.0 | 10.2.0.2.0 + one-off patch 9373986 and 11.2.0.4.0 + patch 17411249
10.2.0.3.0 | >=11.1.0.6.0 | 10.2.0.3.0 + one-off patch 9373986
10.2.0.3.0 | >=11.1.0.7.0 | 10.2.0.3.0 + one-off patch 9373986 + one-off patch 8712466 on top of 11.1.0.7.0
10.2.0.3.0 | >=11.2.0.1.0 | 10.2.0.3.0 + one-off patch 9373986
10.2.0.3.0 | >=11.2.0.2.0 | 10.2.0.3.0 + one-off patch 9373986 and 11.2.0.2.0 + patch 13947480
10.2.0.3.0 | >=11.2.0.3.0 | 10.2.0.3.0 + one-off patch 9373986 and 11.2.0.3.0 + patch 17411249
10.2.0.3.0 | >=11.2.0.4.0 | 10.2.0.3.0 + one-off patch 9373986 and 11.2.0.4.0 + patch 17411249
10.2.0.4.0 | >=11.1.0.6.0 | 10.2.0.4.0 Patchset + one-off patch 10239989
10.2.0.4.0 | >=11.1.0.7.0 | 10.2.0.4.0 Patchset + one-off patch 10239989 + one-off patch 8712466 on top of 11.1.0.7.0
10.2.0.4.0 | >=11.2.0.1.0 | 10.2.0.4.0 Patchset + one-off patch 10239989
10.2.0.4.0 | >=11.2.0.2.0 | 10.2.0.4.0 Patchset + one-off patch 10239989 and 11.2.0.2.0 + patch 13947480
10.2.0.4.0 | >=11.2.0.3.0 | 10.2.0.4.0 Patchset + one-off patch 10239989 and 11.2.0.3.0 + patch 17411249
10.2.0.4.0 | >=11.2.0.4.0 | 10.2.0.4.0 Patchset + one-off patch 10239989 and 11.2.0.4.0 + patch 17411249
10.2.0.5.0 | >=11.2.0.1.0 | 10.2.0.5.0 Patchset + one-off patch 9373986
10.2.0.5.0 | >=11.2.0.2.0 | 10.2.0.5.0 Patchset + one-off patch 9373986 and 11.2.0.2.0 + patch 13947480
10.2.0.5.0 | >=11.2.0.3.0 | 10.2.0.5.0 Patchset + one-off patch 9373986 and 11.2.0.3.0 + patch 17411249
10.2.0.5.0 | >=11.2.0.4.0 | 10.2.0.5.0 Patchset + one-off patch 9373986 and 11.2.0.4.0 + patch 17411249
11.1.0.6.0 | >=11.2.0.1.0 | 11.1.0.6.0 + one-off patch 8712466 + 9373986
11.1.0.6.0 | >=11.2.0.2.0 | 11.1.0.6.0 + one-off patch 8712466 + 9373986 and 11.2.0.2.0 + patch 13947480
11.1.0.6.0 | >=11.2.0.3.0 | 11.1.0.6.0 + one-off patch 8712466 + 9373986 and 11.2.0.3.0 + patch 17411249
11.1.0.6.0 | >=11.2.0.4.0 | 11.1.0.6.0 + one-off patch 8712466 + 9373986 and 11.2.0.4.0 + patch 17411249
11.1.0.7.0 | >=11.2.0.1.0 | 11.1.0.7.0 + one-off patch 8712466 + 9373986
11.1.0.7.0 | >=11.2.0.2.0 | 11.1.0.7.0 + one-off patch 8712466 + 9373986 and 11.2.0.2.0 + patch 13947480
11.1.0.7.0 | >=11.2.0.3.0 | 11.1.0.7.0 + one-off patch 8712466 + 9373986 and 11.2.0.3.0 + patch 17411249
11.1.0.7.0 | >=11.2.0.4.0 | 11.1.0.7.0 + one-off patch 8712466 + 9373986 and 11.2.0.4.0 + patch 17411249
11.2.0.1.0 | >=11.2.0.2.0 | 11.2.0.1.0 + one-off patch 9373986 and 11.2.0.2.0 + patch 13947480
11.2.0.1.0 | >=11.2.0.3.0 | 11.2.0.1.0 + one-off patch 9373986 and 11.2.0.3.0 + patch 17411249
11.2.0.1.0 | >=11.2.0.4.0 | 11.2.0.1.0 + one-off patch 9373986 and 11.2.0.4.0 + patch 17411249
11.2.0.2.0 | >=11.2.0.2.0 | Capture side: no patches needed for 11.2.0.2.0. Replay side: 11.2.0.2.0 + patch 13947480
11.2.0.2.0 | >=11.2.0.3.0 | Capture side: no patches needed for 11.2.0.2.0. Replay side: 11.2.0.3.0 + patch 17411249
11.2.0.2.0 | >=11.2.0.4.0 | Capture side: no patches needed for 11.2.0.2.0. Replay side: 11.2.0.4.0 + patch 17411249
11.2.0.3.0 | >=11.2.0.3.0 | Capture side: no patches needed for 11.2.0.3.0. Replay side: 11.2.0.3.0 + patch 17411249
11.2.0.3.0 | >=11.2.0.4.0 | Capture side: no patches needed for 11.2.0.3.0. Replay side: 11.2.0.4.0 + patch 17411249
11.2.0.4.0 | >=11.2.0.4.0 | Capture side: no patches needed for 11.2.0.4.0. Replay side: 11.2.0.4.0 + patch 17411249

Download information and comments for Table 1:
  • One-off patches can be downloaded or requested from MOS.
  • 8712466 is a merge patch on top of 11.1.0.7.0.
  • For source release 10.2.0.4.0, the patchset can be downloaded from MOS and the functionality already exists in the patchset; enable workload capture by following the instructions below.

Table 2: Database Replay Availability Information for the Windows Platform

Note:

  • These are mandatory patches.
  • Table may not show all possible combinations.
  • Any new Fixes, patches or merge patches for Real Application Testing will be provided as per Oracle’s Error Correction Support Policy documented in Database, FMW, EM Grid Control, and OCS Software Error Correction Support Policy Document 209768.1.

Source DB (upgrade from release) | Destination DB (upgrade to release) | What patch you need to apply | Download information and comments
-------------------------------- | ----------------------------------- | ---------------------------- | ---------------------------------
9.2.0.8.0 for Windows 32-bit | >=11.1.0.6.0, including 11.2.X | 9.2.0.8.0 + patch bundle 7047008 | Patch bundle can be downloaded from MOS. Patch available as part of Bundle 18.
9.2.0.8.0 for Windows Itanium 64-bit | >=11.1.0.6.0, including 11.2.X | 9.2.0.8.0 + patch bundle 7047015 | Patch bundle can be downloaded from MOS. Patch available as part of Bundle 18.
10.2.0.3.0 for Windows 32-bit | >=11.1.0.6.0, including 11.2.X | 10.2.0.3.0 + patch bundle 6998002 | Patch bundle can be downloaded from MOS. Patch available as part of Bundle 23.
10.2.0.3.0 for Windows Itanium 64-bit | >=11.1.0.6.0, including 11.2.X | 10.2.0.3.0 + patch bundle 6998003 | Patch bundle can be downloaded from MOS. Patch available as part of Bundle 23.
10.2.0.3.0 for Windows 64-bit AMD64 and Intel EM64T (XP and 2003) | >=11.1.0.6.0, including 11.2.X | 10.2.0.3.0 + patch bundle 6998004 | Patch bundle can be downloaded from MOS. Patch available as part of Bundle 23.
10.2.0.4.0 (Windows 32-bit, Windows Itanium, Windows 64-bit AMD64 and Intel EM64T XP and 2003) | >=11.1.0.6.0, including 11.2.X | 10.2.0.4.0 Patchset + one-off patch 8542772 | Patchset can be downloaded from MOS: 6810189. Functionality already exists in the patchset. One-off patch 8542772 needs to be requested. Enable workload capture by following the instructions below.
10.2.0.5.0 (all Windows platforms) | >=11.1.0.6.0, including 11.2.X | No mandatory patches needed | Functionality already exists in the patchset. Patchset can be downloaded from MOS.

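In this case study both the source and the target databases are 11.2.0.3, so only patch 17411249 needs to be applied, on the target (replay) side.
Check whether the patch has been applied on the target: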
opatch lsinventory |grep 17411249
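
A plain grep also matches a longer number that merely contains the patch ID. The following is a minimal sketch of a stricter check, assuming the inventory listing prints the patch number as a standalone token. The function name has_patch is illustrative and is not part of OPatch.

```shell
# has_patch PATCH_ID
# Reads `opatch lsinventory` output on stdin and succeeds only if PATCH_ID
# appears as a whole number, not as part of a longer one.
# (has_patch is a hypothetical helper, not an OPatch command.)
has_patch() {
  grep -Eq "(^|[^0-9])$1([^0-9]|\$)"
}
```

It could be used as `$ORACLE_HOME/OPatch/opatch lsinventory | has_patch 17411249 && echo "patch 17411249 is applied"`.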

2. Enable Workload Capture (on 11g it is already enabled, so there is no need to set PRE_11G_ENABLE_CAPTURE=TRUE)
Enabling Workload Capture (Only Required for 10.2.0.4.0)

By default, the workload capture is disabled on the pre-11g database releases. Database workload capture functionality is enabled on the system by specifying the PRE_11G_ENABLE_CAPTURE initialization parameter. In order to enable workload capture on 10.2.0.4  run the wrrenbl.sql script at the SQL prompt as SYS or SYSTEM as follows:

@$ORACLE_HOME/rdbms/admin/wrrenbl.sql

The wrrenbl.sql script calls the ALTER SYSTEM SQL statement to set the PRE_11G_ENABLE_CAPTURE initialization parameter to TRUE.
This step is NOT required for versions lower than 10.2.0.4.0.
Please check the OTN documentation mentioned above for more details.


3. Verify the component
Use the following methods to confirm that the Real Application Testing component is installed:
(1) Check the inventory:
$ $ORACLE_HOME/OPatch/opatch lsinventory -details|grep "Oracle Real Application Testing"
(2) Use the AIX archiver (ar) command:
$ cd $ORACLE_HOME/rdbms/lib 
$ ar -X64 -t libknlopt.a | grep -c kecwr.o 
1
A count of 0 means RAT is disabled; a count greater than 0 means it is enabled.
(3) Check the data dictionary:
select value from v$option where parameter = 'Real Application Testing';
(4) Check the packages:
select object_name, object_type, status from dba_objects where object_name like '%DBMS_WORKLOAD_%';

4. Workload Capture restrictions

The following types of client requests are not supported.

  • Direct path load of data from external files using utilities such as SQL*Loader
  • Non-PL/SQL based Advanced Queuing (AQ)
  • Flashback queries
  • Oracle Call Interface (OCI) based object navigations
  • Non SQL-based object access
  • Distributed transactions (any distributed transactions that are captured will be replayed as local transactions)
  • Oracle Streams/Advanced Replication workload is not supported prior to 11.2.
  • Database session migration
  • Database Resident Connection Pooling ( DRCP )
  • XA transactions
  • Workloads having Object Out Bind
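Note: Unsupported request types may fail during replay, so the best approach is to filter them out at capture time. By default everything is captured, but the unsupported types are not replayed.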

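III. Performing DB Replay
Reference: Using Workload Capture and Replay (Doc ID 445116.1)
3.1 Before performing Database Replay, prepare the hosts and databases in both the production and the test environments:
Production environment (hosts c4oyy3a/c4oyy3b), host and database preparation:
1. Install host performance monitoring software such as OSW or NMON, so that host load metrics from the capture phase and the replay phase can be compared.
2. Allocate a 500 GB file system on the two RAC database hosts (estimated to be enough for a 3-hour capture; to be confirmed) to store the workload capture files, AWR dump files, and so on. It must be a shared file system such as CFS or NFS. (Open question: can the NAS storage provide tens of TB?)
3. The start time of the production workload capture is the baseline time for Database Replay and must be recorded precisely. At that point in time, start a BCV synchronization to keep an image of the production database; the image must be retained for the whole Database Replay test period. The time is tentatively set to 12:30 in May for c4oyy3a/c4oyy3b.
4. If a BCV image cannot be kept, retain an older full database backup set and the related archived logs. The backup must have completed before the capture start time, so that the database can later be restored and recovered to that point.
5. The capture length can be controlled in two ways: set the duration parameter so that the capture stops automatically when the time is up, or set duration to NULL (or omit it) and stop the capture manually at any time after it has started.

Test environment (hosts toyy3a_cs/toyy3b_cs), host and database preparation:
1. Install host performance monitoring software such as OSW or NMON, so that host load metrics from the capture phase and the replay phase can be compared.
2. Allocate a 500 GB shared file system on the 4 RAC database hosts of the test environment to hold the workload capture files and AWR dump files transferred from production, as well as the preprocessing files and the replay log files. It must be a shared file system such as CFS or NFS.
3. Allocate another 500 GB shared file system on the 4 RAC database hosts as the database Flash Recovery Area. It is used to enable Flashback Database, so that the database can be rewound to the same consistent initial state before each replay (assuming several replays). It must be a shared file system such as CFS or NFS. If the database is rewound by reverse BCV replication instead, this Flash Recovery Area file system is not needed.
4. The clocks of the 4 RAC database hosts are reset to the production capture start time at the beginning of every replay. Make sure this time change does not affect other systems.
5. Have the application vendor review and clean up the database's external access paths (database links, external tables, URLs, directory objects, and so on) and the application connection strings, so that the test environment cannot accidentally connect to or modify other production systems. Drop all database links, or leave tnsnames.ora unconfigured; note that EZCONNECT strings such as host1:1521/orcl still resolve without tnsnames.ora.
6. The AWR retention period must cover the capture window, so that AWR data from the capture period is not purged before it is exported. This item has already been completed.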

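3.2 DB Replay workload capture

Only one workload capture can run per database at any time (in a RAC environment it is likewise started from a single node). To make sure the captured workload is accurate and useful, consider the following:
1. Set capture filters. By default, the workload of all user sessions is captured. Inclusion filters or exclusion filters restrict the capture to particular users, applications, modules, and so on. The following example adds and then deletes a filter: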
SQL> conn /as sysdba;
SQL> BEGIN
  DBMS_WORKLOAD_CAPTURE.ADD_FILTER (
                           fname => 'user_ichan',
                           fattribute => 'USER',
                           fvalue => 'ICHAN');
END;
/


SQL> BEGIN
  DBMS_WORKLOAD_CAPTURE.DELETE_FILTER (fname => 'user_ichan');
END;
/




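2. Create and configure the capture directory. Make sure the directory is empty and has enough space for the capture files. For a RAC database, create it on a cluster-wide shared file system. For example:
$ mkdir -p /dbreplay/capture
SQL> conn /as sysdba;
SQL> drop directory db_replay;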
SQL> create directory db_replay as '/dbreplay/capture';
SQL> grant read,write on directory db_replay to public;
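
Because the capture directory has to be empty when the capture starts, a small pre-flight check can catch leftovers from an earlier run. This is a minimal sketch; capture_dir_ready is a hypothetical helper name, and the path in the usage note is the one created above.

```shell
# capture_dir_ready DIR
# Succeeds only if DIR exists, is a directory, and contains no entries
# (hidden files included). Hypothetical helper for a pre-capture check.
capture_dir_ready() {
  [ -d "$1" ] && [ -z "$(ls -A "$1" 2>/dev/null)" ]
}
```

For example: `capture_dir_ready /dbreplay/capture || echo "capture directory is missing or not empty" >&2`.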


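3. Oracle recommends restarting the database in RESTRICTED mode before the capture begins. This is not mandatory, but it protects the integrity of the captured transactions and the accuracy of the replay. By default, once the capture starts, the database switches automatically from RESTRICTED to UNRESTRICTED mode and application users can connect normally. Because a downtime window was hard to obtain, this step was skipped for this capture.
4. The capture length can be controlled in two ways: set the duration parameter so that the capture stops automatically when the time is up, or set duration to NULL (or omit it) and stop the capture manually at any time after it has started. The following example starts and stops the capture manually.
Points to note: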
o The directory name should always be upper case. 
o It is possible to start the capture without restricted mode using the argument no_restart_mode=TRUE but this is not advisable as it relies on transactional integrity
-- Start the capture
SQL> conn /as sysdba;

SQL> BEGIN
  DBMS_WORKLOAD_CAPTURE.START_CAPTURE (name => 'db_replay_peak', 
                           dir => 'DB_REPLAY',
                           duration => null);
END;
/

-- Finish the capture
SQL> BEGIN
  DBMS_WORKLOAD_CAPTURE.FINISH_CAPTURE (); 
END;
/

-- Delete capture information (if leftover capture records exist)
BEGIN
DBMS_WORKLOAD_CAPTURE.DELETE_CAPTURE_INFO(15);
END;
/
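
For the timed mode described in item 4, duration is specified in seconds. The sketch below only prints the PL/SQL block so that it can be reviewed before it is run in SQL*Plus. The function name print_timed_capture and the 3-hour figure in the usage note are illustrative assumptions.

```shell
# print_timed_capture NAME DIR HOURS
# Prints a START_CAPTURE block whose duration (in seconds) makes the capture
# stop by itself after HOURS hours. Nothing is executed against the database.
# (print_timed_capture is a hypothetical helper; HOURS must be a whole number.)
print_timed_capture() {
  name=$1
  dir=$2
  hours=$3
  seconds=$((hours * 3600))
  cat <<EOF
BEGIN
  DBMS_WORKLOAD_CAPTURE.START_CAPTURE (name => '$name',
                           dir => '$dir',
                           duration => $seconds);
END;
/
EOF
}
```

For example, `print_timed_capture db_replay_peak DB_REPLAY 3` prints a block with `duration => 10800`.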
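5. Export the AWR data of the production database so that database performance during the capture and during the replay can be compared in detail. The following example looks up the capture ID and exports the AWR data that belongs to it: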
SQL> conn /as sysdba;
SQL> select id,AWR_BEGIN_SNAP,AWR_END_SNAP from dba_workload_captures;
SQL> BEGIN
  DBMS_WORKLOAD_CAPTURE.EXPORT_AWR (capture_id => 14);
END;
/
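6. Monitor the capture
While the capture is running, it can be monitored through the DBA_WORKLOAD_CAPTURES and DBA_WORKLOAD_FILTERS views. A capture report can also be generated for an overview of the capture. For example: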
SQL> conn /as sysdba;
SQL> select id,name,status,start_time,end_time,connects,user_calls,dir_path from dba_workload_captures where id = (select max(id) from dba_workload_captures) ;

SQL> set pagesize 0 long 30000000 longchunksize 1000
SQL> select dbms_workload_capture.report(1,'TEXT') from dual;

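3.3 Database Replay (replay)

1. Preprocessing
Once the capture has finished and the test environment has been built, the capture files can be preprocessed. Preprocessing creates the metadata needed for the replay and is a mandatory step.
Preprocessing is resource-intensive and time-consuming. All capture files must be transferred from production into a single directory in the test environment and preprocessed there. For a RAC test database, run the preprocessing on one instance only. The following example creates the directory object and preprocesses the capture files in it:
$ mkdir -p /cluster_file_system/db_replay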
SQL> conn /as sysdba;
SQL> create directory db_replay as '/cluster_file_system/db_replay';
SQL> grant read,write on directory db_replay to public;
SQL> BEGIN
  DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE (capture_dir => 'DB_REPLAY');
END;
/

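Note: On 11.2.0.3 you may hit Bug 14500154, which makes the SYSTEM tablespace grow rapidly; the space is not released even after DB Replay preprocessing has finished.
Reference: Pre-Processing Using DBMS_WORKLOAD_REPLAY Uses a Lot of System Tablespace Space (Doc ID 1497607.1)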
SQL> select sum(bytes/1024/1024) from dba_segments where segment_name='WRR$_REPLAY_DEPENDENCIES_TMP';
SUM(BYTES/1024/1024)
--------------------
                3443

Solution

1. Bug 14500154 is fixed in 12.1.

2. The Workaround is to drop all these temporary tables after preprocessing.
The selects from dba_segments before and after are simply to help show the space that has been freed.

select sum(blocks) from dba_segments where segment_name like 'WRR$%';

drop table sys.WRR$_REPLAY_COMMITS_TMP cascade constraints purge;
drop table sys.WRR$_REPLAY_CONN_DATA_TMP cascade constraints purge;
drop table sys.WRR$_REPLAY_DATA_TMP cascade constraints purge;
drop table sys.WRR$_REPLAY_DEPENDENCIES_TMP cascade constraints purge;
drop table sys.WRR$_REPLAY_DEP_GRAPH_TMP cascade constraints purge;
drop table sys.WRR$_REPLAY_LOGIN_QUEUE_TMP cascade constraints purge;
drop table sys.WRR$_REPLAY_REFERENCES_TMP cascade constraints purge;
drop table sys.WRR$_REPLAY_SCN_ORDER_TMP cascade constraints purge;
drop table sys.WRR$_REPLAY_SEQ_DATA_TMP cascade constraints purge;

select sum(blocks) from dba_segments where segment_name like 'WRR$%';



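2. Replay the workload
After preprocessing, replay the workload with the following steps:
(1) Initialize the replay. This step loads the metadata produced by preprocessing into the database. For example: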
SQL> conn /as sysdba;
SQL> BEGIN
  DBMS_WORKLOAD_REPLAY.INITIALIZE_REPLAY (replay_name => 'db_replay_102',
                           replay_dir => 'DB_REPLAY');
END;
/
(2) Remap connections. This step maps the production connection strings to the connection strings of the test environment. For example:
SQL> conn /as sysdba;
SQL> BEGIN
  DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION (connection_id => 101,
                           replay_connection => 'dlsun244:3434/bjava21');
END;
/
To list the captured connections:
col CAPTURE_CONN for a50
col REPLAY_CONN for a50
set lines 200
select * from DBA_WORKLOAD_CONNECTION_MAP where rownum<3;

 REPLAY_ID    CONN_ID CAPTURE_CONN                                       REPLAY_CONN
---------- ---------- -------------------------------------------------- --------------------------------------------------
         1         97 (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(Host=oyy3a.or
                      a.boss)(Port=1521))(CONNECT_DATA=(SERVICE_NAME=ora
                      yy3)(INSTANCE_NAME=oyy3a)(server=dedicated)(FAILOV
                      ER_MODE=(BACKUP=oyy3b)(TYPE=NONE))(CID=(PROGRAM=ci
                      csas@c4m3a)(HOST=c4m3a)(USER=cics))))

         1         98 (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(Host=oyy3a.or
                      a.boss)(Port=1521))(CONNECT_DATA=(SERVICE_NAME=ora
                      yy3)(INSTANCE_NAME=oyy3a)(server=dedicated)(FAILOV
                      ER_MODE=(BACKUP=oyy3b)(TYPE=NONE))(CID=(PROGRAM=ci
                      csas@c4m3b)(HOST=c4m3b)(USER=cics))))

For each conn_id, look up its capture_conn string and remap it to the corresponding instance of the test environment, as in the following call. If the workload was captured on instance A, it is advisable to replay it on node A of the test cluster as well.
BEGIN
  DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION (connection_id => 97,
                           replay_connection => '(description_list=(load_balance=off)(failover=on)                                             
            (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(Host=10.19.243.23)(Port=1521)))    
                         (CONNECT_DATA=(SERVICE_NAME=orayy3)(INSTANCE_NAME=oyy3a)(server=dedicated)
                                       (FAILOVER_MODE=(BACKUP=toyy3b_cs)(TYPE=SESSION))))              
            (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(Host=10.19.243.25)(Port=1521)))    
                         (CONNECT_DATA=(SERVICE_NAME=orayy3)(INSTANCE_NAME=oyy3b)(server=dedicated)
                                       (FAILOVER_MODE=(BACKUP=toyy3a_cs)(TYPE=SESSION)))))');
END;
/
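
When many captured connections map to the same target, the REMAP_CONNECTION calls can be generated rather than typed by hand. The following is a minimal sketch that prints one call per connection ID for review. print_remap_calls is a hypothetical helper, and the connect string in the usage note is a placeholder, not a host from this environment.

```shell
# print_remap_calls REPLAY_CONNECTION CONN_ID...
# Prints one REMAP_CONNECTION block per connection ID, all pointing at the
# same replay connect string. The output is meant to be reviewed and then
# run in SQL*Plus. (print_remap_calls is a hypothetical helper.)
print_remap_calls() {
  target=$1
  shift
  for id in "$@"; do
    cat <<EOF
BEGIN
  DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION (connection_id => $id,
                           replay_connection => '$target');
END;
/
EOF
  done
}
```

For example: `print_remap_calls 'test-scan:1521/orayy3' 97 98 > remap.sql`. The connection IDs themselves come from DBA_WORKLOAD_CONNECTION_MAP.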



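(3) Set the replay options. This step prepares the replay by setting the replay parameters (see the earlier "replay conditions" section for a description of these parameters). For example:
SQL> conn /as sysdba;
-- With OBJECT_ID, each call waits only for the captured commits that modified the objects it references; the 11.2 default, 'SCN', preserves the commit order observed during capture
execute DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY(SYNCHRONIZATION => 'OBJECT_ID'); 
--The most common changes occur in the sync settings where FALSE/'OFF' or 'SCN' may provide better results.
-- Check the replay status (the replay can only be started while the status is PREPARE)
SQL> select name,status from  dba_workload_replays;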


NAME                 STATUS
-------------------- --------------------
test_capture_1       PREPARE

select name,STATUS,CAPTURE_ID,PREPARE_TIME,START_TIME,NUM_CLIENTS,THINK_TIME_SCALE,SYNCHRONIZATION,PREPROCESSING_ID from WRR$_REPLAYS;

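Note:
When PREPARE_REPLAY is run on 11.2.0.3 without the corresponding patch, it is very likely to hit a bug. Normally, for a 5-hour capture, PREPARE_REPLAY takes only about 3 to 5 minutes; with the bug it runs for more than 10 hours.
Workaround: create the following two indexes: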
create index wrr$_replay_dep_calls ON wrr$_replay_dep_graph(file_id,call_ctr, sync_point,file_id_dep,call_ctr_dep); 
create index wrr$_replay_dep_commits ON wrr$_replay_dep_graph(file_id,call_ctr_dep,file_id_dep,call_ctr,sync_point); 
SQL> exec DBMS_WORKLOAD_REPLAY.prepare_replay (synchronization => FALSE);
PL/SQL procedure successfully completed.

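(4) Define filters and a filter set. Inclusion filters or exclusion filters restrict the replay to the workload of particular users, applications, modules, and so on. The following example adds and deletes a filter, then creates a filter set and enables it: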
SQL> conn /as sysdba;
SQL> BEGIN
  DBMS_WORKLOAD_REPLAY.ADD_FILTER (
                           fname => 'user_ichan',
                           fattribute => 'USER',
                           fvalue => 'ICHAN');
END;
/


BEGIN
  DBMS_WORKLOAD_REPLAY.DELETE_FILTER (fname => 'user_ichan');
END;
/


BEGIN
  DBMS_WORKLOAD_REPLAY.CREATE_FILTER_SET (
                           replay_dir => 'DB_REPLAY',
                           filter_set => 'replayfilters',
                           default_action => 'INCLUDE');
END;
/


BEGIN
  DBMS_WORKLOAD_REPLAY.USE_FILTER_SET (filter_set => 'replayfilters');
END;
/
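(5) Estimate how many replay clients are needed (in the new test environment)
Note: To simulate the workload more accurately, prepare a separate machine with Oracle Client installed to host all WRC connections. In this case study the calibration recommends 97 clients.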
wrc mode=calibrate replaydir=/dbreplay/capture
Report for Workload in: /dbreplay/capture
-----------------------
Recommendation:
Consider using at least 97 clients divided among 25 CPU(s)  
You will need at least 135 MB of memory per client process.
If your machine(s) cannot match that number, consider using more clients.
Workload Characteristics:
- max concurrency: 3449 sessions
- total number of sessions: 286891
Assumptions:
- 1 client process per 50 concurrent sessions
- 4 client process per CPU
- 256 KB of memory cache per concurrent session
- think time scale = 100
- connect time scale = 100
- synchronization = TRUE
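
The recommendation is consistent with the listed assumptions: at 4 client processes per CPU, 97 clients need 25 CPUs, and at 135 MB per client process the client host needs about 13,095 MB of memory. A small sketch of that arithmetic follows; the function names are illustrative, and the inputs are the figures from the report above.

```shell
# cpus_for_clients CLIENTS CLIENTS_PER_CPU
# Rounds up, because a partly used CPU still has to be available.
cpus_for_clients() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# memory_mb_for_clients CLIENTS MB_PER_CLIENT
# Total memory, in MB, needed by all client processes together.
memory_mb_for_clients() {
  echo $(( $1 * $2 ))
}
```

For example, `cpus_for_clients 97 4` and `memory_mb_for_clients 97 135` size the client machine for this workload.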

(6) Start the replay clients
   Because this case study uses RAC, wrc connects through a SCAN TNS alias to the SCAN listener of the test database, and the replayed sessions are then directed to an instance according to the DBA_WORKLOAD_CONNECTION_MAP view.
 6.1 Configure the SCAN listener in the test database environment and make sure it is available:
 Reference: 11.2 Scan and Node TNS Listener Setup Examples (Doc ID 1070607.1)
 6.2 Test database client
 Configure TNS:
  toyy3_scan =(DESCRIPTION =(ADDRESS_LIST =(ADDRESS =(PROTOCOL= TCP)(HOST= SCAN-IP)(PORT=1521)))(CONNECT_DATA =(SERVICE_NAME = orayy3)))  
 
 6.3 Start the client connections
Note: First set the operating system time to the capture start time.
Before we can start the replay, we need to calibrate and start a replay client using the "wrc" utility. The calibration step tells us the number of replay clients and hosts necessary to faithfully replay the workload.
Run the command once for each replay client (two clients are started here):
$nohup  wrc system/system#123@toyy3_scan mode=replay replaydir=/dbreplay/capture &
$nohup  wrc system/system#123@toyy3_scan mode=replay replaydir=/dbreplay/capture &
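
With 97 clients recommended, repeating the command by hand is impractical. The following is a minimal sketch that prints the launch commands for review before they are run. print_wrc_commands is a hypothetical helper, the per-client log file names are an assumption, and the password is left for the operator to supply.

```shell
# print_wrc_commands COUNT CONNECT REPLAYDIR
# Prints COUNT background wrc launch commands, one per replay client, each
# with its own log file. CONNECT is the user/password@alias string.
# (print_wrc_commands and the wrc_N.log names are illustrative.)
print_wrc_commands() {
  i=1
  while [ "$i" -le "$1" ]; do
    echo "nohup wrc $2 mode=replay replaydir=$3 > wrc_$i.log 2>&1 &"
    i=$((i + 1))
  done
}
```

For example: `print_wrc_commands 97 'system/<password>@toyy3_scan' /dbreplay/capture > start_wrc.sh`.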



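(7) Start, pause, resume, or cancel the replay. Once the capture files have been preprocessed, the replay initialized, the replay options set, and the replay clients started, the replay can begin. The following example shows how to start, pause, resume, and cancel a replay: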
SQL> conn /as sysdba;
SQL> BEGIN
  DBMS_WORKLOAD_REPLAY.START_REPLAY ();
END;
/


BEGIN
  DBMS_WORKLOAD_REPLAY.PAUSE_REPLAY ();
END;
/


BEGIN
  DBMS_WORKLOAD_REPLAY.RESUME_REPLAY ();
END;
/


BEGIN
  DBMS_WORKLOAD_REPLAY.CANCEL_REPLAY ();
END;
/

(8) Scripts for diagnosing a slow DB Replay

Query to Obtain Information on Blocking Sessions 

--------------------------------------
--              Query 1             --
--------------------------------------

connect sys/ as sysdba 

set linesize 500 pagesize 200 

col inst_id format 99999 
col sid format 99999 
col spid format a6 
col blocking_session_status format a6 heading 'BS' 
col blocking_instance format 99 heading 'BI' 
col blocking_session format 99999 heading 'BLKSID' 
col session_type format a11 
col event format a31 
col file_name format a21 
col file_id format 9999999999999999999 
col call_counter format 9999999 
col wait_for_scn format 99999999999999 heading 'WAITING FOR' 
col wfscn format 99999999999999 heading 'WFSCN' 
col commit_wait_scn format 99999999999999 heading 'CWSCN' 
col post_commit_scn format 99999999999999 heading 'PCSCN' 
col clock format 99999999999999999999 heading 'CLOCK' 
col next_ticker format 999999999999999999999 heading 'NEXT TICKER' 

select wrt.inst_id, wrt.sid, wrt.serial#, wrt.spid, 
s.BLOCKING_SESSION_STATUS, s.BLOCKING_INSTANCE, 
s.blocking_session, 
wrt.session_type, wrt.event, 
wrt.file_name, wrt.file_id, wrt.call_counter, 
wrt.wait_for_scn, 
greatest(wrt.dependent_scn, wrt.statement_scn) as wfscn, 
wrt.commit_wait_scn, wrt.post_commit_scn, 
wrt.clock, wrt.next_ticker 
from gv$workload_replay_thread wrt, gv$session s 
where wrt.sid = s.sid 
and wrt.serial# = s.serial# 
order by inst_id, sid; 




 


Query to Monitor the Approximate Progress of the Replay


--------------------------------------
--              Query 2             --
--------------------------------------


set echo off 


connect sys/ as sysdba 


set serveroutput on 


DECLARE 
my_next_ticker NUMBER; 
clock NUMBER; 
wait_for_scn NUMBER; 
counts NUMBER; 
replay_id NUMBER; 
thr_failure NUMBER; 
start_time DATE; 
num_tickers NUMBER; 
min_scn NUMBER; 
max_scn NUMBER; 
done NUMBER; 
total_time INTERVAL DAY TO SECOND; 


CURSOR get_next_ticker(my_next_ticker NUMBER) IS 
SELECT spid, event, inst_id, wrc_id, client_pid 
FROM gv$workload_replay_thread 
WHERE file_id = my_next_ticker; 


BEGIN 
dbms_output.put_line('********************************'); 
dbms_output.put_line('* Replay Status Report *'); 
dbms_output.put_line('********************************'); 


----------------------------------------- 
-- Make sure that a replay is in progress 
----------------------------------------- 
SELECT count(*) INTO counts 
FROM dba_workload_replays 
WHERE status='IN PROGRESS'; 


if (counts = 0) then 
dbms_output.put_line('No replay in progress!'); 
return; 
end if; 


------------------- 
-- Get replay state 
------------------- 
SELECT id,start_time INTO replay_id, start_time 
FROM dba_workload_replays 
WHERE status='IN PROGRESS'; 


SELECT count(*) INTO counts 
FROM gv$workload_replay_thread 
WHERE session_type = 'REPLAY'; 


SELECT min(wait_for_scn), max(next_ticker), max(clock) 
INTO wait_for_scn, my_next_ticker, clock 
FROM v$workload_replay_thread 
WHERE wait_for_scn <> 0 
AND session_type = 'REPLAY'; 


dbms_output.put_line('Replay has been running for: ' || 
to_char(systimestamp - start_time)); 
dbms_output.put_line('Current clock is: ' || clock); 
dbms_output.put_line('Replay is waiting on clock: ' || 
wait_for_scn); 
dbms_output.put_line(counts || ' threads are currently being replayed.'); 


---------------------------------------- 
-- Find info about the next clock ticker 
---------------------------------------- 
num_tickers := 0; 
for rec in get_next_ticker(my_next_ticker) loop 
-- We only want the next clock ticker 
num_tickers := num_tickers + 1; 
exit when num_tickers > 1; 


dbms_output.put_line('Next ticker is process ' || rec.spid || 
' (' || rec.wrc_id || ',' || rec.client_pid || 
') in instance ' || rec.inst_id || 
' and is waiting on '); 
dbms_output.put_line(' ' || rec.event); 


end loop; 


--------------------------------------------------------------------------------------- 
-- Compute the replay progression and estimate the time left 
-- Note: This is an estimated time only, not an absolute value as it is based on SCN.
--------------------------------------------------------------------------------------- 
SELECT min(post_commit_scn), max(post_commit_scn) 
INTO min_scn,max_scn 
FROM wrr$_replay_scn_order; 
done := (clock - min_scn) / (max_scn - min_scn); 
total_time := (systimestamp - start_time) / done; 
dbms_output.put_line('Estimated progression in replay: ' || 
to_char(100*done, '00') || '% done.'); 
dbms_output.put_line('Estimated time before completion: ' || 
((1 - done) * total_time)); 
dbms_output.put_line('Estimated total time for replay: ' || 
total_time); 
dbms_output.put_line('Estimated final time for replay: ' || 
to_char(start_time + total_time, 
'DD-MON-YY HH24:MI:SS')); 
END; 
/


 


Query to Provide Summary of Wait Events


--------------------------------------------------------------
--                           Query 3                        --
--------------------------------------------------------------- 


column event format a40 
select event, count(*), min(wait_for_scn) 
from gv$workload_replay_thread 
where session_type = 'REPLAY' 
group by event;



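(9). Export the AWR data from the test database so that database performance during the workload capture and during the workload replay can be compared in detail. For example, the following exports the AWR data for a given workload replay ID: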
SQL> conn /as sysdba;
SQL> BEGIN
  DBMS_WORKLOAD_REPLAY.EXPORT_AWR (replay_id => 1);
END;
/


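(10). Import the AWR data from the workload capture period, then in the test database use dbms_workload_repository.awr_diff_report_text or dbms_workload_repository.awr_diff_report_html to compare database performance during capture and during replay in detail. For example: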
SQL> conn /as sysdba;
SQL> select dbms_workload_capture.import_awr(capture_id => 1, staging_schema => 'AWRRPT') from dual; 
SQL>  -- get the capture data details 
SQL>  select id, awr_begin_snap, awr_end_snap from dba_workload_captures; 

SQL>  -- get the replay data details 
SQL>  select id, awr_begin_snap, awr_end_snap from dba_workload_replays;
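
Once both sets of snapshot IDs are known, the comparison itself might look like the sketch below. It assumes that the table function dbms_workload_repository.awr_diff_report_html takes a DBID, instance number, begin snapshot and end snapshot for each of the two periods. Every &... substitution variable is a placeholder to be supplied at run time, not a value from this case, and the spool file name is arbitrary. For the capture side, the DBID should be the one under which the capture AWR data was imported (the value returned by import_awr).

```sql
-- Sketch only: all &... values are placeholders prompted for by SQL*Plus.
-- Period 1 = workload capture (imported AWR data), period 2 = workload replay.
set pagesize 0 linesize 1500 trimspool on heading off feedback off verify off
spool awr_diff_capture_vs_replay.html

select *
from table(dbms_workload_repository.awr_diff_report_html(
       &capture_dbid, &capture_inst_num, &capture_begin_snap, &capture_end_snap,
       &replay_dbid,  &replay_inst_num,  &replay_begin_snap,  &replay_end_snap));

spool off
```

The same call shape should apply to awr_diff_report_text if a plain-text report is preferred.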

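(11) Monitoring the replay
During a replay, any error, or any data mismatch between the production and test environments, is recorded as a divergent call. Details of these divergent calls, including the SQL ID, SQL text, and bind variable values, can be retrieved from the DBA_WORKLOAD_REPLAY_DIVERGENCE view and with the GET_DIVERGING_STATEMENT function. For example:
-- Method 1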
SQL> conn /as sysdba;
SQL> SELECT REPLAY_ID,STREAM_ID,CALL_COUNTER FROM DBA_WORKLOAD_REPLAY_DIVERGENCE;
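-- Method 2
sqlplus / as sysdba
set serveroutput on
set long 30000000 longchunksize 1000

DECLARE
r CLOB;
-- The replay ID was left blank in the original; &replay_id is a SQL*Plus
-- substitution variable that prompts for it at run time.
ls_replay_id NUMBER := &replay_id;
ls_stream_id NUMBER;
ls_call_counter NUMBER;
CURSOR c IS
SELECT stream_id,call_counter
FROM DBA_WORKLOAD_REPLAY_DIVERGENCE
WHERE replay_id = ls_replay_id;
BEGIN
OPEN c;
LOOP
FETCH c INTO ls_stream_id, ls_call_counter;
EXIT when c%notfound;
DBMS_OUTPUT.PUT_LINE (ls_stream_id||' '||ls_call_counter);
r:=DBMS_WORKLOAD_REPLAY.GET_DIVERGING_STATEMENT(replay_id => ls_replay_id,
stream_id => ls_stream_id, call_counter => ls_call_counter);
-- PUT_LINE cannot print an arbitrarily large CLOB, so show only its first 4000 characters
DBMS_OUTPUT.PUT_LINE (DBMS_LOB.SUBSTR(r, 4000, 1));
END LOOP;
CLOSE c;
END;
/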
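
Before drilling into individual calls, it can help to see which statements diverge most often. The query below is a minimal sketch of such a summary. It assumes that the SQL_ID, IS_ERROR_DIVERGENCE and OBSERVED_ERROR# columns exist in DBA_WORKLOAD_REPLAY_DIVERGENCE in your release, and &replay_id is a placeholder for the replay being examined.

```sql
-- Sketch only: &replay_id is a placeholder prompted for by SQL*Plus.
-- Groups divergent calls by statement and error so the worst offenders sort first.
select sql_id,
       is_error_divergence,
       observed_error#,
       count(*) as divergent_calls
from dba_workload_replay_divergence
where replay_id = &replay_id
group by sql_id, is_error_divergence, observed_error#
order by divergent_calls desc;
```

The stream_id and call_counter values of the highest-ranked rows can then be passed to GET_DIVERGING_STATEMENT for the full statement and bind values.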

A workload replay report can also be generated to give an overview of the replay as a whole. For example:
SQL> conn /as sysdba;
SQL> select id,name,status,start_time,end_time,num_clients,user_calls,dir_path from dba_workload_replays where id = (select max(id) from dba_workload_replays) ;

SQL> set pagesize 0 long 30000000 longchunksize 2000 
SQL> select dbms_workload_replay.report(replay_id => 1,format => 'TEXT') from dual;
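
If a browsable report is more convenient, the same function can be asked for HTML and the result spooled to a file. This is a sketch: it assumes 'HTML' is accepted as the format value, reuses replay_id 1 from the example above, and the file name replay_report.html is arbitrary.

```sql
-- Sketch only: writes the replay report for replay_id 1 to an HTML file.
set pagesize 0 long 30000000 longchunksize 2000 linesize 2000
set trimspool on heading off feedback off
spool replay_report.html
select dbms_workload_replay.report(replay_id => 1, format => 'HTML') from dual;
spool off
```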

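In addition, the following views provide more information about the workload capture and workload replay periods:
1. DBA_WORKLOAD_CAPTURES lists all captured workloads;
2. DBA_WORKLOAD_FILTERS lists all filters defined for workload captures;
3. DBA_WORKLOAD_REPLAYS lists all workload replays that have been replayed in the current database;
4. DBA_WORKLOAD_REPLAY_DIVERGENCE lists all divergent calls, including the replay identifier, stream identifier, and call counter;
5. DBA_WORKLOAD_REPLAY_FILTER_SET lists all filters defined for workload replays;
6. DBA_WORKLOAD_CONNECTION_MAP lists the connection string mappings used by replays;
7. V$WORKLOAD_REPLAY_THREAD lists session information for all current replay clients.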











From "ITPUB Blog", link: http://blog.itpub.net/29446986/viewspace-1650269/ . If you repost this article, please credit the source; otherwise legal liability will be pursued.