在数据库监控过程中发现考勤数据库上Employees_ControlData存储过程执行时间需20分钟。这个存储过程逻辑很简单,就是打开一个游标,做LOOP循环,再删除重复数据,结构如下:
CREATE OR REPLACE PROCEDURE Employees_ControlData
IS
tmpVar NUMBER(6);
tmpVar1 NUMBER(6);
tmpVar2 NUMBER (6);
tmpVar3 NUMBER(6);
CURSOR EMP_NO
IS
SELECT WORKNO FROM ZZ_EMPLOYEES;
BEGIN
--LINE 12行
FOR USERID INEMP_NO
LOOP
......
END LOOP;
--Line128行,删除重复数据
Delete .... ;
COMMIT;
EXCEPTION
WHEN NO_DATA_FOUND
THEN
NULL;
WHEN OTHERS
THEN
-- Consider logging the error and then re-raise
RAISE;
END Employees_Controldata;
ZZ_EMPLOYEES有4万多笔数据,LOOP也会执行4万多次。AWR报告与GridControl监控都显示效能瓶颈在LOOP循环中六处SQL,再看一下LOOP循环中六处问题SQL:
--Line14行
SELECT COUNT(*)
INTO tmpVar
FROM CPYTGL.EMPLOYEE_CONTROL
WHERE EMP_NO = USERID.WORKNO
ANDTO_CHAR(INOROUT_TIME,'YYYY/MM/DD')=
TO_CHAR(SYSDATE-2,'YYYY/MM/DD')
AND TYPE='F'
AND INOROUT_TIME >SYSDATE-3
AND ROWNUM<6;
--Line38行
INSERT INTOEMPLOYEE_CONTROL_EXCEPTION
SELECT DISTINCT *
FROM EMPLOYEE_CONTROL
WHERE EMP_NO = USERID.WORKNO
AND TO_CHAR(INOROUT_TIME,'YYYY/MM/DD')=
TO_CHAR(SYSDATE-2,'YYYY/MM/DD')
AND TYPE='F'
AND INOROUT_TIME >SYSDATE-3;
--Line49行
INSERT INTOEMPLOYEE_CONTROL_EXCEPTION
SELECT DISTINCT *
FROM EMPLOYEE_CONTROL
WHERE EMP_NO = USERID.WORKNO
AND TO_CHAR(INOROUT_TIME,'YYYY/MM/DD')=
TO_CHAR(SYSDATE-2,'YYYY/MM/DD')
AND TYPE='F'
AND INOROUT_TIME >SYSDATE-3
AND INOROUT_TIME NOTIN
(SELECT INOROUT_TIME
FROM EMPLOYEE_CONTROL_EXCEPTION
WHERE EMP_NO = USERID.WORKNO
AND TO_CHAR(INOROUT_TIME,
'YYYY/MM/DD')=
TO_CHAR(SYSDATE-2,
'YYYY/MM/DD')
AND TYPE='F'
AND INOROUT_TIME >SYSDATE-3);
--Line72行
SELECT COUNT(*)
INTO tmpVar1
FROM CPYTGL.EMPLOYEE_CONTROL
WHERE EMP_NO = USERID.WORKNO
ANDTO_CHAR(INOROUT_TIME,'YYYY/MM/DD')=
TO_CHAR(SYSDATE-2,'YYYY/MM/DD')
AND TYPE='L'
AND INOROUT_TIME >SYSDATE-3
AND ROWNUM<6;
--Line82行
INSERT INTO EMPLOYEE_CONTROL_EXCEPTION
SELECT DISTINCT *
FROM EMPLOYEE_CONTROL
WHERE EMP_NO = USERID.WORKNO
AND TO_CHAR(INOROUT_TIME,'YYYY/MM/DD')=
TO_CHAR(SYSDATE-2,'YYYY/MM/DD')
AND TYPE='L'
AND INOROUT_TIME >SYSDATE-3;
--Line87行
INSERT INTO EMPLOYEE_CONTROL_EXCEPTION
SELECT DISTINCT *
FROM EMPLOYEE_CONTROL
WHERE EMP_NO = USERID.WORKNO
AND TO_CHAR(INOROUT_TIME,'YYYY/MM/DD')=
TO_CHAR(SYSDATE-2,'YYYY/MM/DD')
AND TYPE='L'
AND INOROUT_TIME >SYSDATE-3
AND INOROUT_TIME NOTIN(SELECTINOROUT_TIME
FROM EMPLOYEE_CONTROL_EXCEPTION
WHERE EMP_NO = USERID.WORKNO
AND TO_CHAR(INOROUT_TIME,
'YYYY/MM/DD')=
TO_CHAR(SYSDATE-2,
'YYYY/MM/DD')
AND TYPE='L'
AND INOROUT_TIME >SYSDATE-3);
这六处SQL都查询了EMPLOYEE_CONTROL考勤信息表。此表已有近两亿笔数据,根据INOROUT_TIME字段进行分区,并对相关字段建立了索引。
CREATE INDEX EMPLOYEE_CONTROL_EMPNOINOROUT ON EMPLOYEE_CONTROL
(TO_CHAR("INOROUT_TIME",'YYYY/MM/DD'),EMP_NO) LOCAL;
SQL> SELECT column_name
2 FROM dba_part_key_columns
3 WHERE name = 'EMPLOYEE_CONTROL';
COLUMN_NAME
--------------------------------------------------------------------------------
INOROUT_TIME
SQL> SELECT num_rows
2 FROM dba_tables
3 WHERE table_name = 'EMPLOYEE_CONTROL';
NUM_ROWS
----------
193585044
EMPLOYEE_CONTROL考勤信息表至少需保留一年数据备查。六条SQL已加INOROUT_TIME >SYSDATE-3条件,执行计划中可进行分区裁剪,删减数据这条路行不通。
WHERE中的条件也正确使用了索引,似乎所有常规优化方法都已用上,如何才能进一步提升存储过程LOOP循环执行速度?
我们再分析这六条SQL,在WHERE条件中都出现了对EMPLOYEE_CONTROL表以下限定条件:
TO_CHAR(INOROUT_TIME,'YYYY/MM/DD')=
TO_CHAR(SYSDATE-2,'YYYY/MM/DD')
AND INOROUT_TIME >SYSDATE-3
AND TYPE='L'
AND TYPE='F'
存储过程实际要读取的只有一天的数据,这部分数据一般只有10万笔为热点数据。如果我们先把此部分数据单独读出,在LOOP循环中就可只读取临时表内容,避免4万次读取有两亿笔数据的EMPLOYEE_CONTROL考勤资料表。
依此思路,我们先建立一个临时表,并为临时表EMP_NO字段添加索引:
CREATE GLOBAL TEMPORARY TABLE CPYTGL.EMPLOYEE_CONTROL_TEMP
ON COMMIT DELETE ROWS
AS
SELECT *
FROMCPYTGL.EMPLOYEE_CONTROL
WHERE 1 = 0;
CREATE INDEX CPYTGL.EMPLOYEE_CONTROL_TEMP_NOON CPYTGL.EMPLOYEE_CONTROL_TEMP
(EMP_NO);
再在存储过程头部将符合条件的数据取出:
INSERT INTO cpytgl.EMPLOYEE_CONTROL_TEMP
SELECT*
FROM CPYTGL.EMPLOYEE_CONTROL
WHERE TO_CHAR (INOROUT_TIME, 'YYYY/MM/DD') =
TO_CHAR(SYSDATE - 2, 'YYYY/MM/DD')
AND INOROUT_TIME > SYSDATE- 3
AND TYPE IN('F', 'L');
这样就可改写消耗资源的六条SQL查询临时表。以第14行SQL为例,需改写为:
--Line14行
SELECT COUNT (*)
INTO tmpVar
FROM cpytgl.EMPLOYEE_CONTROL_TEMP
WHERE EMP_NO = USERID.WORKNO AND TYPE = 'F' AND ROWNUM< 6;
LOOP循环中六条SQL改为查询10万笔记录的临时表后,存储过程只需1分钟即可跑完。相较之前20分钟运行时间有大幅度提升。
此例核心为使用临时表隔离冷热数据。DBA一次调优不一定能想出最佳方法,通过对应用的不断深入观察,以及Oracle工具的合理使用,加上一点点灵光一现那些看似解决不了的难题都可一一化解。