1. Session status
The STATUS column of the v$session view shows the state of each session. For details on the view, see the online documentation:
V$SESSION
http://download.oracle.com/docs/cd/E11882_01/server.112/e17110/dynviews_3016.htm#REFRN30223
Column | Datatype | Description
STATUS | VARCHAR2(8) | Status of the session:
    ACTIVE - Session currently executing SQL
    INACTIVE
    KILLED - Session marked to be killed
    CACHED - Session temporarily cached for use by Oracle*XA
    SNIPED - Session inactive, waiting on the client
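To see how the sessions on an instance are spread across these states, a query along the following lines can be used. It is only a sketch: it assumes the account can select from v$session, and the username filter (which hides background processes) is my own choice rather than something taken from the documentation quoted above.

```sql
-- Count user sessions per STATUS value.
-- USERNAME is NULL for background processes, so they are excluded.
SELECT status, COUNT(*) AS session_count
  FROM v$session
 WHERE username IS NOT NULL
 GROUP BY status
 ORDER BY session_count DESC;
```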
Notes on the states:
(1) ACTIVE: the session is currently executing and is considered active.
The official documentation says:
Any session that is connected to the database and is waiting for an event that does not belong to the Idle wait class is considered as an active session.
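(2) KILLED: the session has been marked for termination; Oracle is rolling back its uncommitted work and releasing what it holds.
A KILLED session still occupies system resources. The status can also persist for a long time, typically while a large transaction rolls back or until the client next contacts the server. In my experience, killing the session from the Windows tool PL/SQL Developer did not help; issue the statement directly instead: alter system kill session 'sid,serial#';
(3) INACTIVE: the session is not executing anything at the moment.
It is waiting for work (that is, for the next SQL statement), typically after a DML statement has completed while the connection stays open. The application may simply not have released the connection; if the connection goes through middleware, the middleware's configuration or a bug in it may be the cause.
INACTIVE sessions have little impact on the database itself, but if the application does not commit and release connections promptly, they accumulate and the database can easily reach its session limit.
I asked a few friends. Their practice is to leave INACTIVE sessions alone and, if the session limit is reached, raise the processes and sessions parameters. Killing INACTIVE sessions may affect the middleware. I am not very familiar with the middleware side and will write more once I understand it better.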
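Before raising processes and sessions, it helps to check how close the instance actually is to its limits. The following is a minimal sketch assuming access to v$resource_limit; the value 500 is a hypothetical placeholder, not a recommendation.

```sql
-- Current and peak usage against the configured limits.
SELECT resource_name, current_utilization, max_utilization, limit_value
  FROM v$resource_limit
 WHERE resource_name IN ('processes', 'sessions');

-- Hypothetical new value. processes is a static parameter, so the change
-- is written to the spfile and takes effect after an instance restart.
ALTER SYSTEM SET processes = 500 SCOPE = SPFILE;
```

If max_utilization stays well below limit_value, the limit is probably not the problem and the connection pool settings deserve a look first.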
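2. Handling INACTIVE sessions
The previous section said INACTIVE sessions are usually left alone, but there are ways to deal with them. There are two.
2.1 Set the expire_time parameter in sqlnet.ora
The official description of this parameter: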
http://download.oracle.com/docs/cd/B19306_01/network.102/b14213/sqlnet.htm
SQLNET.EXPIRE_TIME
Purpose
Use parameter SQLNET.EXPIRE_TIME to specify a time interval, in minutes, to send a probe to verify that client/server connections are active. Setting a value greater than 0 ensures that connections are not left open indefinitely, due to an abnormal client termination. If the probe finds a terminated connection, or a connection that is no longer in use, it returns an error, causing the server process to exit. This parameter is primarily intended for the database server, which typically handles multiple connections at any one time.
How sqlnet.expire_time works: the Oracle server sends probe packets to detect dead connections; if a connection has been closed or is no longer in use, the corresponding server process is terminated.
Limitations on using this terminated connection detection feature are:
(1)It is not allowed on bequeathed connections.
(2)Though very small, a probe packet generates additional traffic that may downgrade network performance.
(3)Depending on which operating system is in use, the server may need to perform additional processing to distinguish the connection probing event from other events that occur. This can also result in degraded network performance.
Default :0
Minimum Value :0
Recommended Value :10
Example
SQLNET.EXPIRE_TIME=10
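2.2 Set the idle_time parameter in the user profile
An earlier article of mine on profiles:
Oracle user profile attributes
http://blog.csdn.net/tianlesoftware/archive/2011/03/10/6238279.aspx
Note: for idle_time to be enforced, the RESOURCE_LIMIT parameter must be enabled first. In the 11g documentation quoted here it defaults to FALSE. The official description: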
RESOURCE_LIMIT
Property | Description
Parameter type | Boolean
Default value | false
Modifiable | ALTER SYSTEM
Range of values | true | false
RESOURCE_LIMIT determines whether resource limits are enforced in database profiles.
Values:
TRUE: Enables the enforcement of resource limits
FALSE:Disables the enforcement of resource limits
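Since the table above lists the parameter as modifiable with ALTER SYSTEM, enabling it can be sketched as follows. This assumes a session with the ALTER SYSTEM privilege; the verification query against v$parameter is my own addition.

```sql
-- Enable enforcement of profile resource limits (dynamic parameter).
ALTER SYSTEM SET resource_limit = TRUE;

-- Confirm the current value.
SELECT name, value
  FROM v$parameter
 WHERE name = 'resource_limit';
```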
The following blog post covers this clearly and provides related scripts:
sqlnet.expire_time and IDLE_TIME
http://space.itpub.net/10687595/viewspace-420407
IDLE_TIME Specify the permitted periods of continuous inactive time during a session, expressed in minutes. Long-running queries and other operations are not subject to this limit.
A valid database connection that is idle will respond to the probe packet causing no action on the part of the Server , whereas the resource_limit will snipe the session when idle_time is exceeded. The 'sniped' session will get disconnected when the user(or the user process) tries to communicate with the server again.
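-- idle_time limits how long a session may stay idle. Once a session has been idle longer than the limit, its status in v$session becomes SNIPED, but the OS process is not released. When the session (user process) next communicates with the server process, the corresponding server process is shut down.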
What does 'SNIPED' status in v$session mean?
When IDLE_TIME is set in the users' profiles or the default profile. This will kill the sessions in the database (status in v$session now becomes SNIPED) and they will eventually disconnect. It does not always clean up the Unix session (LOCAL=NO sessions).
At this time all oracle resources are released but the shadow processes remains and OS resources are not released. This shadow process is still counted towards the parameters of init.ora.
This process is killed and entry from v$session is released only when user again tries to do something. Another way of forcing disconnect (if your users come in via SQL*Net) is to put the file sqlnet.ora on every client machine and include the parameter "SQLNET.EXPIRE_TIME" in it to force the close of the SQL*Net session
sqlnet.expire_time
sqlnet.expire_time actually works on a different principle and is used to detect dead connections as opposed to disconnecting(actually 'sniping') a session based on idle_time which the profile accomplishes.
But again, as you mentioned, expire_time works globally while idle_time profile works for that user. You can use both of them to make sure that the client not only gets sniped but also gets disconnected if the user process abnormally terminates.
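Because a SNIPED session keeps its shadow process until the client next contacts the server, it can be useful to list such sessions together with their OS process ids. A minimal sketch, assuming access to v$session and v$process (the column alias is mine):

```sql
-- SNIPED sessions and the OS process id (SPID) of each server process.
SELECT s.sid, s.serial#, s.username, s.machine, p.spid AS os_process_id
  FROM v$session s
  JOIN v$process p ON p.addr = s.paddr
 WHERE s.status = 'SNIPED';
```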
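Example change:
SQL> alter profile default limit idle_time 10;
-- No instance restart is needed: the new limit applies to sessions created after the change; existing sessions keep their old limit until they reconnect.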
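To verify the setting, or to avoid changing the DEFAULT profile for every user, something like the following may help. It is a sketch under assumptions: the profile name app_idle_limit, the 30-minute value, and the user TEST are hypothetical, and dba_profiles requires DBA-level access.

```sql
-- Show the IDLE_TIME limit of every profile.
SELECT profile, resource_name, limit
  FROM dba_profiles
 WHERE resource_name = 'IDLE_TIME';

-- Hypothetical dedicated profile for one application account,
-- so the DEFAULT profile does not have to change.
CREATE PROFILE app_idle_limit LIMIT IDLE_TIME 30;
ALTER USER test PROFILE app_idle_limit;
```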
SQL to count connections per application:
/* Formatted on 2011/6/12 13:06:23 (QP5 v5.163.1008.3004) */
SELECT b.MACHINE, b.PROGRAM, COUNT(*)
  FROM v$process a, v$session b
 WHERE a.ADDR = b.PADDR AND b.USERNAME IS NOT NULL
 GROUP BY b.MACHINE, b.PROGRAM
 ORDER BY COUNT(*) DESC;
====================================
Column | Datatype | Description
STATUS | VARCHAR2(8) | Status of the session:
    ACTIVE - Session currently executing SQL
    INACTIVE
    KILLED - Session marked to be killed
    CACHED - Session temporarily cached for use by Oracle*XA
    SNIPED - Session inactive, waiting on the client
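In practice we mostly encounter three of these states: ACTIVE, INACTIVE, and KILLED.
1. ACTIVE: the session is active and is currently executing a SQL statement.
2. INACTIVE: the session is idle. Its SQL statement has finished, but for some reason the session and its server process have not been released. Exiting SQL*Plus, or logging out of or closing a tool such as PL/SQL Developer, closes the session outright rather than leaving it INACTIVE. The SQL*Plus example below shows this.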
[oracle@oracle11g ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on Thu May 23 23:09:30 2013
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select count(*) from v$process;
COUNT(*)
----------
30
SQL>
Open another session window and count the sessions:
SQL> select count(*) from v$session;
The query returns 27 sessions.
Now exit the first session:
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@oracle11g ~]$
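Counting the sessions again gives 26, which shows that the session was released. Large numbers of INACTIVE sessions usually appear when middleware such as JBoss or WebLogic maintains a connection pool: after use, a connection is not closed but returned to the pool.
On my machine a WebLogic connection pool connects to Oracle with an initial size of 5. Querying shows: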
SQL> set linesize 150
SQL> select s.SID,s.SERIAL#,s.OSUSER,s.USERNAME,s.STATUS,s.PROGRAM from v$session s where s.STATUS='INACTIVE' and s.USERNAME='TEST';
SID SERIAL# OSUSER USERNAME STATUS PROGRAM
---------- ---------- ------------------------------ ------------------------------ -------- ------------------------------------------------
16 64 Administrator TEST INACTIVE JDBC Connect Client
21 28 Administrator TEST INACTIVE JDBC Connect Client
141 164 Administrator TEST INACTIVE JDBC Connect Client
142 93 Administrator TEST INACTIVE JDBC Connect Client
145 30 Administrator TEST INACTIVE JDBC Connect Client
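Because creating and releasing sessions costs resources, applications usually cache sessions in a connection pool. If there are too many INACTIVE sessions, killing them directly is not recommended, since application stability comes first. Consider adjusting the size of the application's connection pool or raising processes instead.
3. KILLED: the session has been marked for termination; Oracle is rolling back its uncommitted work.
A KILLED session still occupies system resources. The status can also persist for a long time, and in my experience killing the session from the Windows tool PL/SQL Developer did not help; issue the statement directly instead: alter system kill session 'sid,serial#'; Killing a blocking session this way also releases the locks it holds, which is how a lock wait (often loosely called a deadlock) can be cleared.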
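To clear a lock wait as described above, the blocking session has to be identified first. The following is a sketch only: it assumes a release where v$session exposes BLOCKING_SESSION (10g and later), and the values 16 and 64 are placeholders for a real sid and serial#.

```sql
-- Sessions that are waiting on a lock, with the session blocking them.
SELECT sid, serial#, username, blocking_session, seconds_in_wait
  FROM v$session
 WHERE blocking_session IS NOT NULL;

-- Look up the blocker's serial#, then kill it. 16 and 64 are placeholder
-- values; IMMEDIATE asks Oracle to roll back and release resources at once.
ALTER SYSTEM KILL SESSION '16,64' IMMEDIATE;
```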
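====================================
An Oracle database session has one of five statuses: ACTIVE, INACTIVE, KILLED, CACHED, and SNIPED. INACTIVE means the session is idle and waiting. For example, after PL/SQL Developer connects to the database and executes one SQL statement, the session stays INACTIVE until another statement is run. A small number of INACTIVE sessions normally has no effect on the database. If application design or some other cause leaves a large number of sessions INACTIVE for long periods, however, they consume significant system resources and can push the session count past the sessions limit, raising ORA-00018: maximum number of sessions exceeded.
It is sometimes necessary to clean up sessions that have been INACTIVE for a long time. Checking for and killing them by hand on a regular basis is impractical, so periodic cleanup has to be done by a job. Killing sessions also calls for care: a small mistake can take out legitimate sessions. So how should the sessions to clean up be defined? These are the criteria I settled on from our business rules:
1: The session's STATUS must be INACTIVE. Sessions that are ACTIVE, KILLED, CACHED, or SNIPED are not considered.
2: The session must have been INACTIVE for a long time. For example, only sessions that have been INACTIVE for more than two hours are candidates. The threshold depends on the business; in some cases half an hour is enough. The idle time can be read from the LAST_CALL_ET column of V$SESSION, which for an INACTIVE session holds the number of seconds since it became inactive. To find sessions that have been INACTIVE for two hours or more, use the condition S.LAST_CALL_ET >= 60*60*2, or better, S.LAST_CALL_ET >= 7200.
3: The program that owns the session. For example, only INACTIVE sessions created by particular applications, such as Toad or PL/SQL Developer, are cleaned up. The PROGRAM values depend on the project at hand; TOAD.EXE and W3WP.EXE are used below purely as examples.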
SELECT SID, SERIAL#, MODULE, STATUS
  FROM V$SESSION S
 WHERE S.USERNAME IS NOT NULL
   AND UPPER(S.PROGRAM) IN ('TOAD.EXE', 'W3WP.EXE')
   AND S.LAST_CALL_ET >= 60*60*2
   AND S.STATUS = 'INACTIVE'
 ORDER BY SID DESC;
In a RAC environment, prefer the following statement, which uses the global view GV$SESSION.
SELECT SID, SERIAL#, INST_ID, MODULE, STATUS
  FROM gv$session S
 WHERE S.USERNAME IS NOT NULL
   AND UPPER(S.PROGRAM) IN ('TOAD.EXE', 'W3WP.EXE')
   AND S.LAST_CALL_ET >= 2 * 60*60
   AND S.STATUS = 'INACTIVE'
 ORDER BY INST_ID DESC;
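Before settling on a program list and an idle threshold, it may help to look at which programs actually hold INACTIVE sessions and for how long. The summary below is a sketch of that kind of dry run; the column aliases are mine, and it assumes access to gv$session.

```sql
-- INACTIVE user sessions per instance and program, with the longest
-- idle time in hours (LAST_CALL_ET is in seconds).
SELECT inst_id,
       program,
       COUNT(*)                          AS inactive_sessions,
       ROUND(MAX(last_call_et) / 3600, 1) AS max_idle_hours
  FROM gv$session
 WHERE username IS NOT NULL
   AND status = 'INACTIVE'
 GROUP BY inst_id, program
 ORDER BY inactive_sessions DESC;
```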
Next, create the stored procedure SYS.DB_KILL_IDLE_CLIENTS so that killing INACTIVE sessions is easy to invoke. Note: replace the XXX placeholders with the PROGRAM values of the actual application, in upper case because the query compares UPPER(PROGRAM).
CREATE OR REPLACE PROCEDURE SYS.DB_KILL_IDLE_CLIENTS AUTHID DEFINER AS
   num_of_kills NUMBER := 0;
BEGIN
   -- Candidates: user sessions of the listed programs, INACTIVE for 2+ hours.
   FOR rec IN
      (SELECT sid, serial#, inst_id, module, status
         FROM gv$session s
        WHERE s.username IS NOT NULL
          AND UPPER(s.program) IN ('XXX', 'XXX.EXE')
          AND s.last_call_et >= 2*60*60
          AND s.status = 'INACTIVE'
        ORDER BY inst_id ASC
      ) LOOP
      ---------------------------------------------------------------------------
      -- kill inactive sessions immediately
      ---------------------------------------------------------------------------
      DBMS_OUTPUT.PUT('SID ' || rec.sid || ' (' || rec.module || ')');
      -- '@inst_id' targets the owning RAC instance (supported from 11g on).
      EXECUTE IMMEDIATE 'alter system kill session ''' || rec.sid || ',' ||
                        rec.serial# || ',@' || rec.inst_id || ''' immediate';

      DBMS_OUTPUT.PUT_LINE('. killed on instance ' || rec.inst_id);
      num_of_kills := num_of_kills + 1;
   END LOOP;
   DBMS_OUTPUT.PUT_LINE('Number of killed xxxx system sessions: ' || num_of_kills);
END DB_KILL_IDLE_CLIENTS;
/
Also, because kill session terminates the session outright, any transaction in progress is rolled back. An alternative is alter system disconnect session with the POST_TRANSACTION option, which waits for the session's current transaction to finish before disconnecting it. It does not commit on the session's behalf: the session itself must commit or roll back, and disconnect session ... immediate rolls back just as kill session does. With POST_TRANSACTION this approach is gentler than alter system kill session. The version below uses POST_TRANSACTION and, since the documented syntax takes only 'sid,serial#', restricts itself to sessions on the local instance.
CREATE OR REPLACE PROCEDURE SYS.DB_KILL_IDLE_CLIENTS AUTHID DEFINER AS
   num_of_kills NUMBER := 0;
BEGIN
   -- Candidates: local user sessions of the listed programs, INACTIVE for 2+ hours.
   FOR rec IN
      (SELECT sid, serial#, inst_id, module, status
         FROM gv$session s
        WHERE s.username IS NOT NULL
          AND UPPER(s.program) IN ('XXXX', 'XXXX')
          AND s.last_call_et >= 2*60*60
          AND s.status = 'INACTIVE'
          AND s.inst_id = TO_NUMBER(SYS_CONTEXT('USERENV', 'INSTANCE'))
        ORDER BY inst_id ASC
      ) LOOP
      ---------------------------------------------------------------------------
      -- disconnect inactive sessions once their current transaction ends
      ---------------------------------------------------------------------------
      DBMS_OUTPUT.PUT('LOCAL SID ' || rec.sid || ' (' || rec.module || ')');
      EXECUTE IMMEDIATE 'alter system disconnect session ''' || rec.sid || ',' ||
                        rec.serial# || ''' post_transaction';

      DBMS_OUTPUT.PUT_LINE('. disconnected locally');
      num_of_kills := num_of_kills + 1;
   END LOOP;
   DBMS_OUTPUT.PUT_LINE('Number of disconnected system sessions: ' || num_of_kills);
END DB_KILL_IDLE_CLIENTS;
/
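Whether a kill causes a rollback depends on whether the idle session still has an open transaction. A check along these lines can show that before anything is terminated. It is a sketch, assuming access to v$session and v$transaction; a session with no open transaction has a NULL TADDR and does not appear in the result.

```sql
-- INACTIVE sessions that still hold an open transaction.
-- USED_UBLK (undo blocks in use) hints at how much a rollback would cost.
SELECT s.sid, s.serial#, s.username, s.program,
       t.start_time, t.used_ublk
  FROM v$session s
  JOIN v$transaction t ON t.addr = s.taddr
 WHERE s.status = 'INACTIVE'
 ORDER BY t.used_ublk DESC;
```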
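Then we can call the procedure periodically from a job (JOB) or the Scheduler, or combine a cron job with a shell script to clean up idle sessions on a schedule. For example:
Create the script killSession.sh, which calls the stored procedure SYS.DB_KILL_IDLE_CLIENTS:
#!/bin/bash

# Set ORACLE_HOME, ORACLE_SID and PATH here if cron does not inherit them.

logfile=/home/oracle/cron/session/log/killSession.log

echo " " >> "$logfile" 2>&1
echo "START ----$(date)" >> "$logfile" 2>&1
# -s suppresses the banner; serveroutput sends the DBMS_OUTPUT lines to the log.
sqlplus -s /nolog <<STATS >> "$logfile" 2>&1
connect / as sysdba
set serveroutput on
exec sys.db_kill_idle_clients;
exit;
STATS

echo "END ------$(date)" >> "$logfile" 2>&1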
Configure a cron job in crontab to run the script every 15 minutes and clean up the idle sessions that meet the criteria.
0,15,30,45 * * * * /home/oracle/cron/session/bin/killSession.sh >/dev/null 2>&1
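The text above also mentions the Scheduler as an alternative to cron. A minimal sketch of that route with DBMS_SCHEDULER follows; the job name KILL_IDLE_CLIENTS_JOB is hypothetical, and the 15-minute interval simply mirrors the crontab entry.

```sql
BEGIN
   DBMS_SCHEDULER.CREATE_JOB(
      job_name        => 'KILL_IDLE_CLIENTS_JOB',   -- hypothetical name
      job_type        => 'STORED_PROCEDURE',
      job_action      => 'SYS.DB_KILL_IDLE_CLIENTS',
      start_date      => SYSTIMESTAMP,
      repeat_interval => 'FREQ=MINUTELY;INTERVAL=15',
      enabled         => TRUE);
END;
/
```

Running the cleanup inside the database removes the dependency on the OS environment that cron needs, at the cost of losing the simple log file the shell script writes.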