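Symptoms
The LOG LED on the light path diagnostics panel of an IBM x3950 server (machine type 8878) was lit. I logged in through the MGMT port on the back of the server at http://192.168.70.125 (MGMT port management IP: 192.168.70.125, user name: USERID, password: PASSW0RD; note that the 0 in the password is the digit zero, as in 0123, not the capital letter O of PASSWORD). The event log showed errors from an earlier boot. I clicked the button at the lower right to clear the log; to turn off the LOG LED on the light path diagnostics panel, the server also has to be shut down and its power cords unplugged.
After clearing the log, I powered the server off immediately from the power management option on the management page, then unplugged both power cords, waited a moment, plugged them back in, and powered the server on. This time three LEDs were lit on the light path diagnostics panel: NMI, PCI, and LOG. All of the server's fans ran continuously at 97%-100%, which was very loud.
I logged in to the MGMT port again to check the log and found the following errors: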
22 WARN SERVPROC 01/03/12 19:06:39 Software NMI
23 ERR SERVPROC 01/03/12 19:06:36 Address of special cycle DPE on PCI primary Chassis#=1 Slot#=2 Bus#=4 Dev.ID=0xfd00 Vend.ID=0x10df Status=0xc238 DevFun#=0x8
24 ERR SERVPROC 01/03/12 19:06:36 System Error PCI Bus
25 ERR SERVPROC 01/03/12 19:06:36 SMI handler has reported a PCI SERR.
26 ERR SERVPROC 01/03/12 19:06:36 Uncorrectable ECC error on PCI primary Chassis#=1 Slot#=2 Bus#=4 Dev.ID=0xfd00 Vend.ID=0x10df Status=0xc238 DevFun#=0x8
27 ERR SERVPROC 01/03/12 19:06:35 Parity Error PCI Bus
28 ERR SERVPROC 01/03/12 19:06:35 SMI handler has reported a PCI PERR.
29 ERR SERVPROC 01/03/12 19:06:35 Additional uncorrectable ECC error on PCI primary Chassis#=1 Slot#=2 Bus#=4 Dev.ID=0xfd00 Vend.ID=0x10df Status=0xc238 DevFun#=0x8
30 ERR SERVPROC 01/03/12 19:06:35 Parity Error PCI Bus
31 ERR SERVPROC 01/03/12 19:06:35 SMI handler has reported a PCI PERR.
32 ERR SERVPROC 01/03/12 19:06:35 Device signaled SERR on PCI primary. Chassis#=1 Slot#=2 Bus#=4 Dev.ID=0x 2a 1 Vend.ID=0x1014 Status=0x64b0 DevFun#=0x0
33 ERR SERVPROC 01/03/12 19:06:35 System Error PCI Bus
34 ERR SERVPROC 01/03/12 19:06:35 SMI handler has reported a PCI SERR.
35 ERR SERVPROC 01/03/12 19:06:35 PCI Bus SERR# Detected Chassis#=1 Slot#=2 Bus#=4 Dev.ID=0x 2a 1 Vend.ID=0x1014 Status=0x64b0 DevFun#=0x0
36 ERR SERVPROC 01/03/12 19:06:34 System Error PCI Bus
37 ERR SERVPROC 01/03/12 19:06:34 SMI handler has reported a PCI SERR.
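To see at a glance which device the entries above point at, they can be tallied by slot and PCI vendor/device ID. The following is a minimal sketch, assuming the log has been copied out of the management page as plain text in exactly the layout shown. `parse_rsa_line` and `summarize` are names made up for this example, not part of any IBM tool, and the vendor names in `KNOWN_VENDORS` are my reading of the public PCI ID list, so check them before relying on them.

```python
import re
from collections import Counter

# index, severity, source, date, time, free-text message
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\w+)\s+(\w+)\s+(\d{2}/\d{2}/\d{2})\s+(\d{2}:\d{2}:\d{2})\s+(.*)$"
)
# key=value pairs such as Slot#=2; a value runs until the next key or end of line,
# so space-padded values like "0x 2a 1" are kept exactly as logged
FIELD_RE = re.compile(r"(\S+?)=(.*?)(?=\s+\S+=|\s*$)")

# Assumption: vendor names taken from the public PCI ID list, verify before use
KNOWN_VENDORS = {"0x1014": "IBM", "0x10df": "Emulex", "0x14e4": "Broadcom"}


def parse_rsa_line(line):
    """Split one log line into its columns plus a dict of key=value fields."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    index, severity, source, date, time, message = m.groups()
    return {
        "index": int(index),
        "severity": severity,
        "source": source,
        "date": date,
        "time": time,
        "message": message,
        "fields": dict(FIELD_RE.findall(message)),
    }


def summarize(lines):
    """Count entries per (slot, vendor ID, device ID); lines without IDs are skipped."""
    counts = Counter()
    for line in lines:
        entry = parse_rsa_line(line)
        if entry and "Vend.ID" in entry["fields"]:
            f = entry["fields"]
            counts[(f.get("Slot#"), f["Vend.ID"], f.get("Dev.ID"))] += 1
    return counts


SAMPLE = [
    "22 WARN SERVPROC 01/03/12 19:06:39 Software NMI",
    "23 ERR SERVPROC 01/03/12 19:06:36 Address of special cycle DPE on PCI primary Chassis#=1 Slot#=2 Bus#=4 Dev.ID=0xfd00 Vend.ID=0x10df Status=0xc238 DevFun#=0x8",
    "26 ERR SERVPROC 01/03/12 19:06:36 Uncorrectable ECC error on PCI primary Chassis#=1 Slot#=2 Bus#=4 Dev.ID=0xfd00 Vend.ID=0x10df Status=0xc238 DevFun#=0x8",
    "32 ERR SERVPROC 01/03/12 19:06:35 Device signaled SERR on PCI primary. Chassis#=1 Slot#=2 Bus#=4 Dev.ID=0x 2a 1 Vend.ID=0x1014 Status=0x64b0 DevFun#=0x0",
]

for (slot, vendor, device), n in summarize(SAMPLE).items():
    name = KNOWN_VENDORS.get(vendor, "unknown vendor")
    print(f"slot {slot}: vendor {vendor} ({name}), device {device!r}: {n} entries")
```

The field values are kept as raw strings on purpose: the log prints one device ID as `0x 2a 1`, and it is not clear how the padding should be collapsed, so the sketch does not guess.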
Searching for the cause turned up the following:
PCIe unsupported request and fatal flow control errors, which produce PCI SERR and Software NMI events, were reported in the RSA event log and investigated. These events occur intermittently during manual or scheduled restarts in Microsoft Windows Server 2003.
The root cause was determined to be memory read/write requests that were inadvertently sent to the on-board Broadcom devices after the devices were already put to the PCIe D3hot low power state in preparation for the restart.
A fix was provided in the Broadcom driver to reject any memory requests to the onboard Broadcom devices when they are in the D3hot state. The fix is included in Broadcom driver version 4.6.55 or higher as seen in the Broadcom Advanced Control Suite (BACS). See the image below for an example of how to see the driver version in BACS.
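Checking "4.6.55 or higher" by eye is easy to get wrong once a version component reaches two or three digits, so here is a minimal sketch of a numeric comparison. It assumes the version read from BACS is a purely numeric dotted string; `parse_version` and `has_d3hot_fix` are hypothetical helper names for this example only.

```python
def parse_version(text):
    """Turn '4.6.55' into (4, 6, 55) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


# First driver version that contains the fix, per the tip quoted above
FIXED_IN = parse_version("4.6.55")


def has_d3hot_fix(installed):
    """True if the installed driver version is 4.6.55 or higher."""
    return parse_version(installed) >= FIXED_IN


for version in ("4.6.54", "4.6.55", "4.10.2"):
    print(version, has_d3hot_fix(version))
```

Comparing tuples of integers rather than the raw strings matters here: as plain text, "4.10.2" sorts before "4.6.55", which would wrongly flag a newer driver as too old.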
The system may be any of the following IBM servers:
· System x3850 M2, type 7141, any model
· System x3850 M2, type 7144, any model
· System x3850 M2, type 7233, any model
· System x3850 M2, type 7234, any model
· System x3950 M2, type 7141, any model
· System x3950 M2, type 7233, any model
· System x3950 M2, type 7234, any model
This tip is not option specific.
· The Windows device driver for the on-board Broadcom 5709 is affected.
The system is configured with at least one of the following:
· Microsoft Windows 2003 Server for 32-bit Servers, any service pack
· Microsoft Windows 2003 Server for 64-bit Servers, any service pack
· Microsoft Windows 2003 Server, EE x64, any service pack
· Microsoft Windows 2003 Server, x64 Edition, any service pack
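The server list above can be written down as a small lookup, which is handy when checking several machines against the tip. This is only a sketch built from the list as quoted here; `AFFECTED` and `is_listed` are names invented for the example, and the model and machine type are assumed to be typed in by hand (for instance from the label on the chassis).

```python
# Model -> machine types named in the tip quoted above
AFFECTED = {
    "x3850 M2": {"7141", "7144", "7233", "7234"},
    "x3950 M2": {"7141", "7233", "7234"},
}


def is_listed(model, machine_type):
    """True if this model/machine-type pair appears in the tip's server list."""
    return machine_type in AFFECTED.get(model, set())


print(is_listed("x3950 M2", "7141"))
print(is_listed("x3950", "8878"))
```

A result of False only means the pair is not named in this particular tip; it says nothing about whether a similar fault can occur on other hardware.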
Note: This does not imply that the network operating system will work under all combinations of hardware and software.
Please see the compatibility page for more information:
http://www.ibm.com/servers/eserver/serverproven/compat/us/
Solution
This symptom is resolved in the Broadcom Windows driver available for download at the following URL:
http://www.ibm.com/support/docview.wss?uid=psg1MIGR-5070012
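In my case, the errors were caused by the flow control traffic generated when I powered the server off through the remote management page, which left the network adapter in a high-power state; that is why all the fans were running.
The fix is to update the network adapter driver. However, when I searched the IBM website for the Broadcom NIC driver for the x3950, the link turned out to be dead, and when I called the IBM 400 support line I was told the server was out of warranty. Out of luck, I uninstalled the server's NIC driver and rescanned for hardware, unplugged every network cable from the server, cleared the log, powered the server off, unplugged the power cords, and then started it again. The fans now ran normally and every LED on the light path diagnostics panel was off.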