Resolving Fatal NI connect error 12170

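Error description

Over the past year, across Oracle Database versions from 11.2.0.2 to 11.2.0.4 and on both RAC and single-instance databases, I kept finding the following error repeated in the database alert log: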

Fatal NI connect error 12170.

  VERSION INFORMATION:
        TNS for Linux: Version 11.2.0.4.0 - Production
        Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
  Time: 06-NOV-2016 18:09:13
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12535

TNS-12535: TNS:operation timed out
    ns secondary err code: 12560
    nt main err code: 505

TNS-00505: Operation timed out
    nt secondary err code: 110
    nt OS err code: 0
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=23.4.23.124)(PORT=51707))
Sun Nov 06 18:12:17 2016

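Users had not reported any performance problems, but these messages had inflated the alert log to several gigabytes, which made it awkward to read every time, so I looked into a fix.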

Cause

Searching My Oracle Support (MOS) for "Fatal NI connect error 12170" turns up the cause. The original English text reads:

   These time out related messages are mostly informational in nature.  The messages indicate the specified client connection (identified by the 'Client address:' details) has experienced a time out. The 'nt secondary err code' identifies the underlying network transport, such as (TCP/IP) timeout limits after a client has abnormally terminated the database connection.
   The 'nt secondary err code' translates to underlying network transport timeouts for the following Operating Systems:


  For the Solaris system: nt secondary err code: 145:
     ETIMEDOUT 145 /* Connection timed out */
  For the Linux operating system: nt secondary err code: 110
     ETIMEDOUT 110 Connection timed out
  For the HP-UX system: nt secondary err code: 238:
     ETIMEDOUT 238 /* Connection timed out */
  For AIX: nt secondary err code: 78:
     ETIMEDOUT 78 /* Connection timed out */
  For Windows based platforms: nt secondary err code: 60         (which translates to Winsock Error: 10060)

  Description:  A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
  The reason the messages are written to the alert log is related to the use of the new 11g Automatic Diagnostic Repository (ADR) feature being enabled by default.  See (Doc ID 454927.1).
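The platform codes listed in the quoted note can be put in a small lookup table. This is only a sketch: the values are copied from the MOS text above, while the dictionary and function names are made up for illustration.

```python
# Platform-specific ETIMEDOUT values, copied from the MOS text quoted above.
# The names below are illustrative and not part of any Oracle tool.
ETIMEDOUT_BY_PLATFORM = {
    "Solaris": 145,
    "Linux": 110,
    "HP-UX": 238,
    "AIX": 78,
    "Windows": 60,  # the note says this translates to Winsock error 10060
}


def is_transport_timeout(platform: str, nt_secondary_err_code: int) -> bool:
    """Return True if the code is the ETIMEDOUT value listed for the platform."""
    return ETIMEDOUT_BY_PLATFORM.get(platform) == nt_secondary_err_code


if __name__ == "__main__":
    # The alert log excerpt in this post comes from Linux and shows code 110.
    print(is_transport_timeout("Linux", 110))
```

For the excerpt in this post (Linux, nt secondary err code 110), the check confirms an ordinary TCP connection timeout.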

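Roughly, this says that a specific client connection experienced a timeout, which can happen in one of two ways:
- A connection attempt failed because the host being connected to did not respond properly
- An already established connection failed because the connected host stopped responding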

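Because the Automatic Diagnostic Repository (ADR) feature introduced in 11g is enabled by default and writes these connection failures to the alert log, the alert log ends up containing a large number of such messages.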
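Before suppressing the messages, it can be useful to see which clients are producing the timeouts. The following is a minimal sketch that tallies the HOST value from the "Client address" line of each "Fatal NI connect error 12170" block. The function name and the sample data are assumptions, and the regular expression is based only on the log excerpt shown in this post.

```python
import re
from collections import Counter

# Matches the host in lines such as:
#   Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=23.4.23.124)(PORT=51707))
HOST_RE = re.compile(
    r"Client address:\s*\(ADDRESS=\(PROTOCOL=\w+\)\(HOST=([^)]+)\)"
)


def count_timeout_clients(lines):
    """Count client hosts reported inside 'Fatal NI connect error 12170' blocks."""
    counts = Counter()
    in_block = False
    for line in lines:
        if "Fatal NI connect error" in line:
            # A new error block starts; only track it if the code is 12170.
            in_block = "Fatal NI connect error 12170" in line
            continue
        if in_block:
            match = HOST_RE.search(line)
            if match:
                counts[match.group(1)] += 1
                in_block = False  # one client address per error block
    return counts


if __name__ == "__main__":
    sample = """Fatal NI connect error 12170.
    nt secondary err code: 110
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=23.4.23.124)(PORT=51707))
Fatal NI connect error 12170.
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=23.4.23.124)(PORT=51999))
"""
    for host, hits in count_timeout_clients(sample.splitlines()).most_common():
        print(host, hits)
```

For a real alert log of several gigabytes, passing the open file object (for example `open(path, errors="replace")`, where `path` is your alert log location) instead of a list lets the function read line by line without loading the whole file into memory.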

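Solution

Set the Oracle Net configuration parameters so that this diagnostic information is no longer written to the alert log. Two files need to be changed: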

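1. Add the following line to sqlnet.ora on the server:
        DIAG_ADR_ENABLED=OFF
2. Add the following line to listener.ora on the server, replacing <listenername> with the name of your own listener:
        DIAG_ADR_ENABLED_<listenername>=OFF
3. Use lsnrctl to apply the configuration:
        lsnrctl reload
   (reload does not interrupt service; if the workload can tolerate it, a full listener restart is the safer way to make sure the parameter takes effect)
   or
        lsnrctl stop
        lsnrctl start
   (restarting the listener interrupts service, since new connections cannot be made while it is down)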
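If the change has to be applied on several servers, the file edits in steps 1 and 2 can be scripted. Below is a rough sketch of a helper that sets a single-line `name=value` parameter in the text of a configuration file without duplicating it. The function name is hypothetical, the listener name `LISTENER` in the usage lines is an assumption (substitute your own), and the helper does not handle multi-line parenthesized entries.

```python
import re


def set_parameter(config_text: str, name: str, value: str) -> str:
    """Return config_text with a single-line `name=value` entry set.

    An existing entry for `name` is replaced (case-insensitively);
    otherwise the entry is appended on a new line at the end.
    """
    pattern = re.compile(
        rf"^[ \t]*{re.escape(name)}[ \t]*=.*$",
        re.IGNORECASE | re.MULTILINE,
    )
    new_line = f"{name}={value}"
    if pattern.search(config_text):
        # A function replacement avoids backslash interpretation in new_line.
        return pattern.sub(lambda _match: new_line, config_text)
    if config_text and not config_text.endswith("\n"):
        config_text += "\n"
    return config_text + new_line + "\n"


if __name__ == "__main__":
    # Example contents only; real files live under $ORACLE_HOME/network/admin
    # or the directory that TNS_ADMIN points to.
    sqlnet = "SQLNET.EXPIRE_TIME=10\n"
    print(set_parameter(sqlnet, "DIAG_ADR_ENABLED", "OFF"), end="")

    # "LISTENER" is an assumed listener name.
    print(set_parameter("", "DIAG_ADR_ENABLED_LISTENER", "OFF"), end="")
```

The helper works on strings rather than writing files directly, so the result can be reviewed or diffed against the original before it is saved; step 3 still has to be run afterwards for the listener to pick up the change.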
