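Recently a Windows RAC node has frequently become unreachable to clients. After a reboot it accepts connections again, and the problem recurs roughly once a month.
The log files under $CRS_HOME\log\diag\tnslsnr\<hostname>\listener_scan1\trace\ show: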
13-7月 -2013 17:23:34 * 12531
TNS-12531: TNS:cannot allocate memory
Sat Jul 13 17:23:48 2013
13-7月 -2013 17:23:48 * service_update * thrly1 * 0
Sat Jul 13 17:27:28 2013
13-7月 -2013 17:27:28 * service_update * thrly 2 * 0
The log contains many of these TNS-12531 errors, and Task Manager shows a large number of cmd.exe processes in a hung state.
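To see how often the error occurs, the listener log can be tallied by day. The following is a minimal sketch, not an Oracle tool: the function name `count_errors_by_day` is made up for illustration, and it assumes that each log entry starts with a locale-formatted date and time and ends with the return code as its last `*`-separated field, as in the excerpt above.

```python
import re
from collections import Counter

# Assumed entry layout (taken from the excerpt above):
#   "<date> HH:MM:SS * field * field * <return code>"
# The date token is locale-dependent (e.g. "13-7月 -2013" or "13-JUL-2013"),
# so it is matched loosely as "anything ending in -YYYY".
ENTRY = re.compile(r"^(\S+\s*-\d{4}) \d{2}:\d{2}:\d{2} \* (.*)$")


def count_errors_by_day(lines, code="12531"):
    """Map each date string to the number of entries whose last field is `code`."""
    counts = Counter()
    for line in lines:
        m = ENTRY.match(line.strip())
        if m and m.group(2).split(" * ")[-1].strip() == code:
            counts[m.group(1)] += 1
    return counts
```

A typical call would pass an open file object, for example `count_errors_by_day(open(path, errors="replace"))`, where `path` points at the listener log and the encoding matches the server locale.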
Trying to start the listener:
C:\app\11.2.0\grid>lsnrctl start listener
LSNRCTL for 64-bit Windows: Version 11.2.0.3.0 - Production on 10-APR-2013 09:53:34
Copyright (c) 1991, 2010, Oracle. All rights reserved.
Starting tnslsnr: please wait...
TNSLSNR for 64-bit Windows: Version 11.2.0.3.0 - Production
System parameter file is C:\app\11.3.0\grid\network\admin\listener.ora
Log messages written to C:\app\11.3.0\grid\log\diag\tnslsnr\RACNODE-ORCL02\listener\alert\log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\LISTENERipc)))
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
D:\app\11.3.0\grid\log\diag\tnslsnr\oradb1\listener_scan1\trace
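At this point the command hangs with no response. After rebooting the system, everything returns to normal.
Not knowing the real cause, I first suspected a network problem. Searching My Oracle Support (MOS) turned up the following: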
Description
After applying 11.2.0.2 Patch19 or 11.2.0.3 Patch7 CVU healthcheck leaves behind
cmd.exe and might report "%%i was unexpected ..." when cluvfy is executed.
Rediscovery Notes:
eg:
C:\Users\oradba>%ORACLE_HOME%\bin\cluvfy comp nodecon -n all
%%i was unexpected at this time.
The issue here seems to be that we are running out of Desktop Heap memory.
In short: failed cluvfy runs leave cmd.exe processes behind, these exhaust the desktop heap memory, and that in turn produces the TNS-12531 errors.
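Since the leftover cmd.exe processes are the visible symptom, counting them periodically gives an early warning before the listener stops responding. This is a rough sketch under assumptions: the helper names are hypothetical, and it relies on the standard Windows `tasklist` command with CSV output (`/FO CSV`) and no header row (`/NH`). What count is "too many" is left to the reader.

```python
import csv
import io
import subprocess


def count_processes(tasklist_csv, image="cmd.exe"):
    """Count rows of `tasklist /FO CSV /NH` output whose image name matches."""
    rows = csv.reader(io.StringIO(tasklist_csv))
    return sum(1 for row in rows if row and row[0].lower() == image.lower())


def running_cmd_count():
    """Run tasklist on the local Windows host and count cmd.exe processes."""
    out = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq cmd.exe", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return count_processes(out)
```

Keeping the parsing in `count_processes` separate from the `tasklist` call means the logic can be checked on saved output from another machine.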
There are three ways to address the problem:
1. Increase the desktop heap size, following this document:
http://support.microsoft.com/default.aspx?scid=kb;EN-US;126962
Registry key (under HKEY_LOCAL_MACHINE): \System\CurrentControlSet\Control\Session Manager\SubSystems
Edit the SharedSection setting inside the Windows string value:
E.g.:
SharedSection=1024,20480,1024
2. Short term: reboot the affected node.
3. Longer term: apply a patch for the known desktop-heap-related Bug 14245094, which is fixed in patch bundle 22 and higher.
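For option 1, the SharedSection setting is one token inside a longer string, so it is easy to damage the rest of the value when editing by hand. The sketch below shows only the string rewrite, with a hypothetical helper name `set_shared_section`. It does not touch the registry. Reading and writing the value (for example with Python's `winreg` module) and backing up the key first are left out on purpose.

```python
import re


def set_shared_section(windows_value, sizes):
    """Return `windows_value` with its SharedSection setting replaced.

    `sizes` is a 3-tuple of integers (in KB), written in the same order
    as they appear after "SharedSection=".
    """
    a, b, c = (int(x) for x in sizes)
    pattern = re.compile(r"SharedSection=\d+(?:,\d+){1,2}")
    updated, n = pattern.subn(f"SharedSection={a},{b},{c}", windows_value, count=1)
    if n == 0:
        raise ValueError("no SharedSection setting found in the value")
    return updated
```

Calling `set_shared_section(current_value, (1024, 20480, 1024))` would produce the setting shown in the example above while leaving every other token in the string unchanged.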
Related documents:
1. CRS Unable To Start The Listener (11.2 RAC) [ID 1551240.1]
2. Windows: Ora.LISTENER.lsnr Status goes to INTERMEDIATE CHECK TIMED OUT [ID 1523366.1]