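In the early hours of this morning the production database hit the error below. The details are as follows:
The alert was raised by the unified monitoring platform (192.168.1.80), which reported that the database connection RECON_102 was invalid.
• Check the alert log on 102:
The problem occurred at around 05:51.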
WARNING: inbound connection timed out (ORA-3136)
Thu Sep 22 05:51:24 2011
WARNING: inbound connection timed out (ORA-3136)
Thu Sep 22 05:51:24 2011
WARNING: inbound connection timed out (ORA-3136)
Thu Sep 22 05:51:29 2011
WARNING: inbound connection timed out (ORA-3136)
Thu Sep 22 05:51:29 2011
WARNING: inbound connection timed out (ORA-3136)
Thu Sep 22 05:51:30 2011
WARNING: inbound connection timed out (ORA-3136)
Thu Sep 22 05:51:41 2011
WARNING: inbound connection timed out (ORA-3136)
Thu Sep 22 05:51:41 2011
WARNING: inbound connection timed out (ORA-3136)
Thu Sep 22 05:51:43 2011
WARNING: inbound connection timed out (ORA-3136)
Thu Sep 22 05:51:49 2011
WARNING: inbound connection timed out (ORA-3136)
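About one minute later the log mining returned to normal. The database itself was fine: it did not go down, and no other ORA errors were reported.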
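To see how tightly the warnings cluster, the alert-log excerpt above can be tallied per minute. The following is only a minimal sketch and was not part of the original investigation: the function name count_ora3136_per_minute is hypothetical, and it assumes the layout seen in the excerpt, where each timestamp line applies to the message line that follows it. A warning with no preceding timestamp (like the first line of the excerpt) is counted under "unknown".

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp layout, e.g. "Thu Sep 22 05:51:24 2011".
TIMESTAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def count_ora3136_per_minute(lines):
    """Count ORA-3136 warnings per minute (hypothetical helper)."""
    counts = Counter()
    current = None  # most recent timestamp seen so far
    for raw in lines:
        line = raw.strip()
        try:
            current = datetime.strptime(line, TIMESTAMP_FORMAT)
            continue  # a timestamp line carries no message itself
        except ValueError:
            pass  # not a timestamp, so treat it as a message line
        if "ORA-3136" in line:
            key = current.strftime("%Y-%m-%d %H:%M") if current else "unknown"
            counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "WARNING: inbound connection timed out (ORA-3136)",
        "Thu Sep 22 05:51:24 2011",
        "WARNING: inbound connection timed out (ORA-3136)",
        "Thu Sep 22 05:51:29 2011",
        "WARNING: inbound connection timed out (ORA-3136)",
    ]
    for minute, n in sorted(count_ora3136_per_minute(sample).items()):
        print(minute, n)
```

Run against the full alert log, a tally like this makes it easy to confirm that the warnings are confined to a one- or two-minute window rather than spread over the night.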
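• Check CPU usage on 102:
At 05:51 the WIO value (the third numeric column; the layout appears to be that of AIX sar -u output: time, %usr, %sys, %wio, %idle, physc) was very high, clearly above normal.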
05:49:01 10 4 2 84 6.00
05:50:00 9 3 9 78 6.00
05:51:24 4 3 49 44 6.00
05:52:00 4 3 39 54 6.00
05:53:00 15 4 7 75 6.00
05:54:00 5 2 0 92 6.00
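The same check can be done mechanically. The sketch below flags samples whose I/O-wait percentage exceeds a threshold. It rests on two assumptions that are not stated in the original: that the columns are time, %usr, %sys, %wio, %idle, physc, and that 20 percent is a reasonable cut-off for "abnormal" on this machine. The name high_wio_samples is hypothetical.

```python
def high_wio_samples(lines, threshold=20.0):
    """Return (time, wio) pairs whose %wio exceeds the threshold.

    Assumed column order: time %usr %sys %wio %idle physc.
    """
    flagged = []
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # too short to be a data row
        try:
            wio = float(fields[3])
        except ValueError:
            continue  # header or other non-numeric row
        if wio > threshold:
            flagged.append((fields[0], wio))
    return flagged


SAMPLE = """\
05:49:01 10 4 2 84 6.00
05:50:00 9 3 9 78 6.00
05:51:24 4 3 49 44 6.00
05:52:00 4 3 39 54 6.00
05:53:00 15 4 7 75 6.00
05:54:00 5 2 0 92 6.00
""".splitlines()

if __name__ == "__main__":
    print(high_wio_samples(SAMPLE))
```

With the figures above, only the 05:51:24 and 05:52:00 samples (49 and 39) are flagged, which matches the window of the ORA-3136 warnings.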
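• Check disk I/O on 102:
At 05:51 the I/O on the disks holding the system was also very high, clearly above normal (the first numeric column after the device name, apparently %busy, reached 99 to 100).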
05:50:00 hdisk0 7 1.6 41 293 282.7 21.1
hdisk1 6 1.5 32 254 368.3 26.2
05:51:24 hdisk0 99 29.8 508 3754 451.4 31.0
hdisk1 99 27.4 509 3743 426.8 31.1
05:52:00 hdisk0 100 24.8 551 5929 293.0 28.3
hdisk1 99 20.7 562 5989 237.6 27.2
05:53:00 hdisk0 29 0.0 106 427 0.0 4.9
hdisk1 21 0.0 71 286 0.1 5.1
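A similar sketch can pick out the saturated disks. It assumes, without confirmation from the original, that the first number after the device name is %busy, and that rows without a timestamp belong to the most recent timestamp above them. The name busy_disks and the 90 percent threshold are hypothetical choices.

```python
def busy_disks(lines, threshold=90.0):
    """Return (time, device, busy) for rows whose %busy is at or above threshold."""
    flagged = []
    timestamp = None  # carried over to rows that omit the time column
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if ":" in fields[0]:
            timestamp, fields = fields[0], fields[1:]
        if len(fields) < 2:
            continue
        try:
            busy = float(fields[1])
        except ValueError:
            continue  # header or other non-numeric row
        if busy >= threshold:
            flagged.append((timestamp, fields[0], busy))
    return flagged


DISK_SAMPLE = """\
05:50:00 hdisk0 7 1.6 41 293 282.7 21.1
hdisk1 6 1.5 32 254 368.3 26.2
05:51:24 hdisk0 99 29.8 508 3754 451.4 31.0
hdisk1 99 27.4 509 3743 426.8 31.1
05:52:00 hdisk0 100 24.8 551 5929 293.0 28.3
hdisk1 99 20.7 562 5989 237.6 27.2
05:53:00 hdisk0 29 0.0 106 427 0.0 4.9
hdisk1 21 0.0 71 286 0.1 5.1
""".splitlines()

if __name__ == "__main__":
    for row in busy_disks(DISK_SAMPLE):
        print(row)
```

On the sample above, both hdisk0 and hdisk1 are flagged at 05:51:24 and 05:52:00 and at no other time.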
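• Findings from references online:
i. The main possible causes of this WARNING are:
1) The server gets a connection request from a malicious client which is not supposed to connect to the database, in which case the error thrown is the correct behavior. You can get the client address for which the error was thrown via the sqlnet log file. // Clearly this does not apply to our case
2) The server receives a valid client connection request, but the client takes longer than the default 60 seconds to authenticate. // possible
3) The DB server is so heavily loaded that it cannot finish the client logon within the specified timeout. // possible; after all, the share of CPU time spent waiting on I/O was far above normal at that moment
So Oracle, like MySQL, has a defence against malicious connection attempts, although the two implement it differently. MySQL has a parameter max_connect_errors (default 10 in the versions current at the time of writing): once the number of connection errors from a given host reaches max_connect_errors, MySQL blocks further attempts from host 'host_name'. This is an effective guard against DoS attacks, and the block is lifted with 'mysqladmin flush-hosts'.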
ii. How, then, do we pinpoint what caused this WARNING?
1. Check whether a local connection on the database server is successful and quick. // local logins were fast
2. If local connections are quick, then check for underlying network delay with the help of your network administrator. // we do not seem to have any monitoring of the network between the machines at present, so I cannot say for certain that the network was at fault
3. Check whether your database performance has degraded in any way.
4. Check the alert log for any critical errors, e.g. ORA-600 or ORA-7445, and get them resolved first.
These critical errors might have triggered the slowness of the database server.
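Steps 1 and 2 above compare how quickly a connection is established locally and from a remote machine. As a rough aid, the sketch below times only the raw TCP connect to the listener port. It does not perform an Oracle logon, so it can hint at network delay but says nothing about authentication time. The function name time_tcp_connect is hypothetical, and the host name in the usage comment is a placeholder, not a value from this incident.

```python
import socket
import time


def time_tcp_connect(host, port, timeout=5.0):
    """Return the seconds taken to open (and immediately close) a TCP connection.

    Raises OSError (including socket.timeout) if the connection fails.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; nothing is sent
    return time.perf_counter() - start


# Example usage (placeholder host; 1521 is the usual default listener port):
#   print(time_tcp_connect("db-host.example", 1521))
```

Running it once on the database server itself and once from the monitoring host gives a crude local-versus-remote comparison. A large gap would be a reason to involve the network administrator, as step 2 suggests.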
iii. Solution: set the parameter SQLNET.INBOUND_CONNECT_TIMEOUT to lengthen the time Oracle allows a client to connect and complete authentication. The default is 60 seconds; when it is exceeded, the warning "WARNING: inbound connection timed out (ORA-3136)" is raised.
As a workaround that only suppresses this warning message, you can set the parameters SQLNET.INBOUND_CONNECT_TIMEOUT
and INBOUND_CONNECT_TIMEOUT_<listenername> to a value greater than 60,
for example 120, so that the client has more time to provide its authentication information to the database. You may have to tune these parameter values further according to your setup.
To set these parameters:
1. In the server-side sqlnet.ora file, add SQLNET.INBOUND_CONNECT_TIMEOUT
For example:
SQLNET.INBOUND_CONNECT_TIMEOUT = 120
2. In the listener.ora file, add INBOUND_CONNECT_TIMEOUT_<listenername> = 110
For example, if the listener name is LISTENER:
INBOUND_CONNECT_TIMEOUT_LISTENER = 110
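In the example above the listener value (110) is lower than the sqlnet.ora value (120). To my knowledge this matches Oracle's general recommendation that the listener timeout be the smaller of the two. The sketch below checks that relationship in the text of the two files. It is a deliberately simple reader that only understands flat "NAME = value" lines, which is an assumption: real listener.ora files also contain nested parenthesised entries, and those are simply ignored here. The function names read_params and check_timeouts are hypothetical.

```python
def read_params(text):
    """Collect flat NAME = value lines into a dict (names upper-cased)."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip().upper()] = value.strip()
    return params


def check_timeouts(sqlnet_text, listener_text, listener_name="LISTENER"):
    """Return (listener_is_lower, listener_timeout, sqlnet_timeout).

    Raises KeyError if either parameter is missing.
    """
    sqlnet = read_params(sqlnet_text)
    listener = read_params(listener_text)
    server_timeout = int(sqlnet["SQLNET.INBOUND_CONNECT_TIMEOUT"])
    key = "INBOUND_CONNECT_TIMEOUT_" + listener_name.upper()
    listener_timeout = int(listener[key])
    return listener_timeout < server_timeout, listener_timeout, server_timeout


if __name__ == "__main__":
    print(check_timeouts(
        "SQLNET.INBOUND_CONNECT_TIMEOUT = 120",
        "INBOUND_CONNECT_TIMEOUT_LISTENER = 110",
    ))
```

For the values shown in this article the check yields (True, 110, 120). Keep in mind that raising the timeouts only hides the warning. If the underlying cause is I/O saturation, as the sar figures suggest, that still needs to be addressed separately.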
-- This article is reposted from: http://wangwei.cao.blog.163.com/blog/static/102362526201182292630397/