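A customer's real-time trading system suddenly found that its applications could not connect. They checked the database and it was running normally,
so they called me in urgently. I found that the listener status could not be queried: lsnrctl status hung, and the listener could not be stopped either. Because it was urgent, I killed the listener process directly and restarted the listener, after which application connections returned to normal.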
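The emergency steps above can be sketched roughly as follows. This is only a sketch, not the exact commands used that day: the function names listener_pids and restart_hung_listener are hypothetical, the listener name LISTENER is an assumed default, and timeout (GNU coreutils) is assumed to be available so that a hung lsnrctl status does not block the shell.

```shell
#!/bin/sh
# Read `ps -ef` output on stdin and print the PID of every tnslsnr
# process started for the given listener name (hypothetical helper).
listener_pids() {
    awk -v n="$1" '{
        for (i = 1; i < NF; i++)
            if ($i ~ /tnslsnr$/ && $(i + 1) == n) print $2
    }'
}

# Hypothetical helper: defined only, call it by hand as the oracle OS user.
restart_hung_listener() {
    name="${1:-LISTENER}"            # assumed listener name
    # If status answers within 30 seconds the listener is not hung.
    if timeout 30 lsnrctl status "$name" >/dev/null 2>&1; then
        echo "listener $name responds, nothing to do"
        return 0
    fi
    # lsnrctl stop hangs as well in this situation, so kill the processes.
    for pid in $(ps -ef | listener_pids "$name"); do
        kill -9 "$pid"
    done
    lsnrctl start "$name"
}
```

Matching on the listener name keeps the kill from touching other listeners that may be running on the same host.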
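Working out the root cause afterwards was less straightforward: this system runs 10.2.0.4, whereas the well-known listener bug that spawns child listener processes is normally seen on 10.2.0.1 and 10.2.0.2.
Later I found the message 'WARNING: Subscription for node down event still pending' in listener.log. Searching Metalink with that clue turned up document 372959.1, which says this warning affects versions from 10g through 12c.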
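A quick way to check whether a listener log contains this warning is to count the matching lines. The sketch below is illustrative only: count_node_down_warnings is a hypothetical helper name, and the log path in the usage comment assumes the default 10g location under $ORACLE_HOME/network/log.

```shell
#!/bin/sh
# Print how many lines of the given log file contain the warning.
# -F matches the text literally, -c prints the count instead of the lines.
count_node_down_warnings() {
    grep -cF 'WARNING: Subscription for node down event still pending' "$1"
}

# Usage (path is an assumption, adjust to the actual log location):
#   count_node_down_warnings "$ORACLE_HOME/network/log/listener.log"
```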
The note reads as follows:
Oracle Net Services - Version 10.2.0.1 to 12.1.0.2 [Release 10.2 to 12.1]
Information in this document applies to any platform.
***Checked for relevance on 29-JAN-2014***
This issue affects only 10g and newer listeners.
SYMPTOMS
You are receiving the following warning messages in the listener log file constantly:
'WARNING: Subscription for node down event still pending'
CHANGES
This may be a new installation or a recent upgrade to 10g or newer.
CAUSE
These messages are related to the Oracle TNS Listener's default subscription to the Oracle Notification Service (ONS). In a non-RAC environment it is recommended to disable this subscription. This feature was introduced in Oracle 10g.
SOLUTION
Set the following parameter in the listener.ora:
SUBSCRIBE_FOR_NODE_DOWN_EVENT_<listener_name>=OFF
Where <listener_name> should be replaced with the actual listener name configured in the
LISTENER.ORA file.
SUBSCRIBE_FOR_NODE_DOWN_EVENT_<listener_name> parameter is to be placed by itself on an empty line.
It will be necessary to restart or reload the listener following the addition of this parameter.
This will prevent the messages from being written to the log file and may also prevent the TNS
Listener from hanging periodically. See (10g only) NOTE 340091.1 Intermittent TNS Listener Hang, New Child Listener Process Forked
Please Note: Setting SUBSCRIBE_FOR_NODE_DOWN_<listener_name> to OFF disables a necessary RAC functionality. The above workaround is recommended only for non-RAC environments.
The issue may be present in all 10g and newer installations.
Note: The use of this undocumented parameter may cause problems with the use of the Net Manager (NetMgr) configuration utility. See Note 437598.1 NetMGR May Error When Listener.ora File Contains:SUBSCRIBE_FOR_NODE_DOWN_EVENT
For this problem Oracle recommends setting SUBSCRIBE_FOR_NODE_DOWN_EVENT_<listener_name>=OFF in listener.ora to keep the Oracle listener from hanging intermittently.
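As a rough illustration of that recommendation, the sketch below appends the parameter on its own line to a listener.ora file, once only, after taking a backup copy. The function name set_node_down_off and the .bak suffix are hypothetical choices, and LISTENER in the usage comment is an assumed listener name. Per the note, this is meant for non-RAC environments.

```shell
#!/bin/sh
# Append SUBSCRIBE_FOR_NODE_DOWN_EVENT_<listener_name>=OFF to listener.ora
# unless a line for that parameter is already present (hypothetical helper).
set_node_down_off() {
    ora_file="$1"
    name="$2"
    param="SUBSCRIBE_FOR_NODE_DOWN_EVENT_${name}"
    # Parameter names are matched case-insensitively.
    if grep -qi "^[[:space:]]*${param}[[:space:]]*=" "$ora_file"; then
        echo "$param already present in $ora_file"
        return 0
    fi
    cp "$ora_file" "$ora_file.bak"            # keep a copy before editing
    # Leading newline puts the parameter by itself on a fresh line.
    printf '\n%s=OFF\n' "$param" >> "$ora_file"
}

# Usage (listener name is an assumption), followed by a reload:
#   set_node_down_off "$ORACLE_HOME/network/admin/listener.ora" LISTENER
#   lsnrctl reload LISTENER
```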
---------------------------------------
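ps: a bit of venting. The last couple of days have been really busy, with several incidents to analyze and reports to write...
---------------------------------a plain divider line----------------------------------------------------------------------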
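I read the document again more carefully. This is what Oracle says about the parameter above: This will prevent the messages from being written to the log file and may also prevent the TNS
Listener from hanging periodically.
In other words, setting the parameter stops the warning from being written to listener.log. As for the intermittent hang, Oracle only says "may also prevent", which means the parameter might stop the listener from hanging, or it might not...
So let me reorganize my thinking. For the listener hang bug on 10.2.0.1, the workaround Oracle provides is to set the parameter above to OFF and also to rename the ONS configuration file.
As I understand it, that amounts to stopping the ONS service: once ONS is stopped, the Oracle listener no longer spawns another child listener process by itself, so the bug is not triggered...
Put that way, wouldn't it be enough to simply stop ONS???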
# Rename the ONS configuration file, keeping the original as ons.config.orig
cd $ORACLE_HOME/opmn/conf
mv ons.config ons.config.orig
I would not dare experiment on a production system, so I applied both measures together: SUBSCRIBE_FOR_NODE_DOWN_EVENT_<listener_name>=OFF and renaming the ons.config file.
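On a production box a slightly more defensive version of the rename may be preferable. The sketch below is an assumption-laden illustration, not part of Oracle's workaround: disable_ons_config is a hypothetical helper that renames ons.config only when it exists and refuses to overwrite an earlier ons.config.orig.

```shell
#!/bin/sh
# Rename ons.config to ons.config.orig inside the given directory,
# without clobbering an existing backup (hypothetical helper).
disable_ons_config() {
    conf_dir="$1"
    if [ ! -f "$conf_dir/ons.config" ]; then
        echo "no ons.config in $conf_dir, nothing to rename"
        return 0
    fi
    if [ -e "$conf_dir/ons.config.orig" ]; then
        echo "ons.config.orig already exists, refusing to overwrite" >&2
        return 1
    fi
    mv "$conf_dir/ons.config" "$conf_dir/ons.config.orig"
}

# Usage:
#   disable_ons_config "$ORACLE_HOME/opmn/conf"
```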