Resolve contention on "workspace_lru_latch"


Problem:

In high-workload OLTP systems, contention on "SQLO_LT_sqlra_workspace_mhashcb__workspace_lru_latch" ("workspace_lru_latch" for short) could happen and slow down applications.


Symptom:

When the latch contention happens, applications are observed in state of "UOW Executing" but have worse performance than usual.

The "db2pd -latch" output indicates EDUs waiting on "SQLO_LT_sqlra_workspace_mhashcb__workspace_lru_latch", similar to the following:

===========================================
Latches:
Address Holder Waiter Filename LOC LatchType HoldCount
0x0700000070012345 100001 200001 sqlra_sql_workspace_mem.C 2012 SQLO_LT_sqlra_workspace_mhashcb__workspace_lru_latch 1
0x0700000070012345 100001 200002 sqlra_sql_workspace_mem.C 2012 SQLO_LT_sqlra_workspace_mhashcb__workspace_lru_latch 1
0x0700000070012345 100001 200003 sqlra_sql_workspace_mem.C 2012 SQLO_LT_sqlra_workspace_mhashcb__workspace_lru_latch 1
0x0700000070012345 100001 200004 sqlra_sql_workspace_mem.C 2012 SQLO_LT_sqlra_workspace_mhashcb__workspace_lru_latch 1
===========================================

Three typical types of stack traces will be observed in stack files for latch waiters:

1) EDUs trying to load static section into work space:
===========================================
getConflictComplex
getConflict
sqlra_sqlW_get_static_section
===========================================

2) EDUs trying to load dynamic section into work space:
===========================================
getConflictComplex
getConflict
sqlra_sqlW_get_dynamic_section
===========================================

3) EDUs trying to free up memory in work space, to load new static/dynamic section:
===========================================
getConflictComplex
getConflict
sqlra_sqlW_mem_freespace
sqlra_sqlW_check_for_cleanup
sqlra_sqlW_get_static_section

or

getConflictComplex
getConflict
sqlra_sqlW_mem_freespace
sqlra_sqlW_check_for_cleanup
sqlra_sqlW_get_dynamic_section
===========================================

The "LatchInformation" section in stack files indicate EDUs waiting on "SQLO_LT_sqlra_workspace_mhashcb__workspace_lru_latch".



Resolving the problem

There are some changes in DB2 V10.1, which can alleviate the contention on the latch and improve performance. However we are not going to back-port the changes to old versions.

To resolve the problem in versions prior to V10.1 (V9.5 and V9.7), we need to use DB2 registry variable named "DB2_SQLWORKSPACE_CACHE".

The default value for this registry variable is 30. Increasing it to a bigger value such as "500" will make it have a much smaller chance to hit latch contention again and significantly improve the performance.

To make the new value take effect, the DB2 instance must be recycled. For example:
db2set DB2_SQLWORKSPACE_CACHE=500
db2 terminate
db2stop force
db2start

你可能感兴趣的:(Resolve contention on "workspace_lru_latch")