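Fixing intermittent packet-receive stalls on the Broadcom NetXtreme 5709 NIC on Dell R710/R910 (RHEL 5.3-5.5)

Problem description: during stress testing on RHEL 5.5, heavy network traffic caused the NIC to intermittently stop receiving packets. The cause was traced to a bug in the NIC driver shipped with Red Hat, so the driver has to be updated. The Red Hat description of the problem and the driver update procedure follow: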

Why does the Broadcom NetXtreme 5709 NIC stop receiving packets intermittently on RHEL 5.3 and newer?

Article ID: 26837- Created on: Mar 2, 2010 9:41 PM- Last Modified:  Mar 24, 2011 5:05 PM

Issue

 In certain situations under heavy loads, the network interface card can stop accepting packets from remote devices.

 This problem has been reported on Red Hat Enterprise Linux 5.3 (RHEL 5.3) and newer when using a Broadcom NetXtreme 5709 network interface card.

Environment

  Red Hat Enterprise Linux 5.3 to 5.5

 Network Interface Cards (NIC) using the bnx2 driver including:

Broadcom Corporation NetXtreme II BCM5709S Gigabit Ethernet

Resolution

  Red Hat has released kernel-2.6.18-194.3.1.el5 which will address this issue in RHEL 5. It can be downloaded from the following link:
https://rhn.redhat.com/errata/RHSA-2010-0398.html

* in certain circumstances, under heavy load, certain network interface cards using the bnx2 driver and configured to use MSI-X, could stop processing interrupts and then network connectivity would cease.  (BZ#587799)

  If upgrading the kernel is not an option, review the following workarounds:

 Disable MSI-X in the bnx2 driver. To do this, add the following line to /etc/modprobe.conf:

options bnx2 disable_msi=1

Disable MSI completely by booting with the pci=nomsi boot parameter. This disables MSI on all devices that are able to use it.
Note: MSI-X increases network performance, so disabling it returns performance to the level available before MSI-X was introduced.

 Disable C-States in BIOS. Refer to the vendor system documentation in order to learn how to do this.
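
The first workaround can be applied idempotently with a small helper. This is a minimal sketch: the function name add_bnx2_option is illustrative and not part of any tool, and the configuration file is passed as an argument (on RHEL 5 that would be /etc/modprobe.conf). Module options are read when the module is loaded, so the setting takes effect only after bnx2 is reloaded or the system is rebooted.

```shell
#!/bin/bash
# Append "options bnx2 disable_msi=1" to a modprobe configuration file,
# unless an identical line is already present.
# add_bnx2_option is a hypothetical helper name used for illustration.
add_bnx2_option() {
    conf="$1"
    line="options bnx2 disable_msi=1"
    # -x matches whole lines only, -F treats the pattern as a fixed string
    if grep -qxF "$line" "$conf" 2>/dev/null; then
        echo "already set in $conf"
    else
        echo "$line" >> "$conf"
        echo "added to $conf"
    fi
}

# Example (as root on RHEL 5):
#   add_bnx2_option /etc/modprobe.conf
```

Running the helper twice leaves a single options line in the file, which keeps repeated provisioning runs from accumulating duplicates.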

Root Cause

 The kernel gets out of sync with interrupts generated by the network interface card, which results in an inability to process interrupts, causing packets to be dropped and, ultimately, lost connectivity.

 When this situation occurs, the rx_fw_discards counter keeps increasing as remote devices unsuccessfully attempt to communicate with the system via the NIC.
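
To see whether a system shows this symptom, the counter can be sampled twice and compared. A minimal sketch, assuming the interface is eth0 and that `ethtool -S` reports the statistic as a line of the form "rx_fw_discards: N"; the helper name fw_discards and the 10-second interval are illustrative choices.

```shell
#!/bin/bash
# Extract the rx_fw_discards value from `ethtool -S` output read on stdin.
# fw_discards is a hypothetical helper name used for illustration.
fw_discards() {
    # Split each line on the colon; print the value of the matching counter
    awk -F': *' '$1 ~ /rx_fw_discards$/ { print $2; exit }'
}

# Example (assumes interface eth0 and a 10-second sampling interval):
#   before=$(ethtool -S eth0 | fw_discards)
#   sleep 10
#   after=$(ethtool -S eth0 | fw_discards)
#   [ "$after" -gt "$before" ] && echo "rx_fw_discards is increasing"
```

A counter that keeps climbing while remote hosts cannot reach the machine matches the behaviour described above; a counter that stays flat points elsewhere.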

 It has been reported that under certain heavy traffic conditions in MSI-X mode, the bnx2 driver can lose an MSI-X vector, causing all packets in the associated rx/tx ring pair to be dropped. The problem is caused by the chip dropping the kernel's write to unmask the MSI-X vector (when migrating the IRQ, for example). This can be prevented by increasing the GRC timeout value for these register read and write operations.

  The upstream patch resolving this issue is available here.



The driver update procedure is as follows:


1 - Before building the driver, check the name of the driver module the NICs use.

[root@localhost ~]# cat /etc/modprobe.conf

alias eth0 bnx2      <- the NIC driver module is bnx2

alias eth1 bnx2

alias eth2 bnx2

alias eth3 bnx2



2 - Check whether the bnx2 driver module is currently loaded.

[root@localhost ~]# lsmod | grep bnx2

bnx2                  179021  0    <- the NIC driver module is already loaded

bnx2i                  40413  0

cnic                   44877  1 bnx2i

libiscsi2              42693  6 be2iscsi,ib_iser,iscsi_tcp,bnx2i,cxgb3i,libiscsi_tcp

scsi_transport_iscsi2    37709  8 be2iscsi,ib_iser,iscsi_tcp,bnx2i,cxgb3i,libiscsi2

scsi_mod              141973  15 be2iscsi,ib_iser,iscsi_tcp,bnx2i,cxgb3i,libiscsi2,scsi_transport_iscsi2,scsi_dh,sg,pvscsi,libata,mptspi,mptscsih,scsi_transport_spi,sd_mod



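3 - Display the current driver module information to obtain its version, so the upgrade can be verified later.

[root@localhost redhat]# modinfo bnx2

filename:       /lib/modules/2.6.18-194.el5/kernel/drivers/net/bnx2.ko <- path of the driver module; after the upgrade the module is stored in a different path

version:        2.0.2   <- module version 2.0.2, the default driver version shipped with the system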

license:        GPL

description:    Broadcom NetXtreme II BCM5706/5708/5709/5716 Driver

author:         Michael Chan <[email protected]>

srcversion:     7025AAF3645EE432EAF1C00

alias:          pci:v000014E4d0000163Csv*sd*bc*sc*i*

alias:          pci:v000014E4d0000163Bsv*sd*bc*sc*i*

alias:          pci:v000014E4d0000163Asv*sd*bc*sc*i*

alias:          pci:v000014E4d00001639sv*sd*bc*sc*i*

alias:          pci:v000014E4d000016ACsv*sd*bc*sc*i*

alias:          pci:v000014E4d000016AAsv*sd*bc*sc*i*

alias:          pci:v000014E4d000016AAsv0000103Csd00003102bc*sc*i*

alias:          pci:v000014E4d0000164Csv*sd*bc*sc*i*

alias:          pci:v000014E4d0000164Asv*sd*bc*sc*i*

alias:          pci:v000014E4d0000164Asv0000103Csd00003106bc*sc*i*

alias:          pci:v000014E4d0000164Asv0000103Csd00003101bc*sc*i*

depends:      

vermagic:       2.6.18-194.el5 SMP mod_unload 686 REGPARM 4KSTACKS gcc-4.1

parm:           disable_msi:Disable Message Signaled Interrupt (MSI) (int)

parm:           enable_entropy:Allow bnx2 to populate the /dev/random entropy pool (int)

module_sig:     883f3504ba037551e1fa4939f6a62931127b30a0e5a160a7ad7a7b9b2c162b309b3316fddc41f280a0cbecbd80e777d961e16218019c365c4b328d1a8



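4 - Download the Broadcom NIC driver package from the Dell website:

ftp://ftp.dell.com/FOLDER29291M/1/Bcom_LAN_16.2.0_Linux_Source_A01.tar.gz

Extract the package and begin the installation.

[root@localhost mnt]# tar -vzxf Bcom_LAN_16.2.0_Linux_Source_A01.tar.gz   <- extract the driver package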

Bcom_LAN_16.2.0_Linux_Source_A01/

Bcom_LAN_16.2.0_Linux_Source_A01/Linux_Readme/

Bcom_LAN_16.2.0_Linux_Source_A01/Linux_Readme/linux_readme.txt

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtreme/

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtreme/ChangeLog

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtreme/README.TXT

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtreme/tg3-3.115j-1.src.rpm

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtreme/tg3-3.115j.tar.gz

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtreme/tg3_sup-3.115j-1.ISO.tar.gz

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/brcm_iscsi_uio-0.6.2.13.tar.gz

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/netxtreme2-6.2.23-1.src.rpm

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/netxtreme2-6.2.23.tar.gz

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/netxtreme2_sup-6.2.23-1.ISO.tar.gz

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/README

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/RELEASE.bnx2.TXT

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/RELEASE.bnx2i.TXT

Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/RELEASE.bnx2x.TXT


[root@localhost mnt]# cd Bcom_LAN_16.2.0_Linux_Source_A01   <- enter the extracted directory


[root@localhost Bcom_LAN_16.2.0_Linux_Source_A01]# ls  <- list the directory contents

Linux_Readme  NetXtreme  NetXtremeII


[root@localhost Bcom_LAN_16.2.0_Linux_Source_A01]# cd NetXtremeII/  <- enter the NetXtreme II driver source directory


[root@localhost NetXtremeII]# ls   <- list the contents of this directory

brcm_iscsi_uio-0.6.2.13.tar.gz      README

netxtreme2-6.2.23-1.src.rpm         RELEASE.bnx2i.TXT

netxtreme2-6.2.23.tar.gz            RELEASE.bnx2.TXT

netxtreme2_sup-6.2.23-1.ISO.tar.gz  RELEASE.bnx2x.TXT


[root@localhost NetXtremeII]# rpm -ivh netxtreme2-6.2.23-1.src.rpm   <- install the sources needed to build the driver

  1:netxtreme2             ########################################### [100%]

***************************************************************************

Note

(1) -

Contents of netxtreme2-6.2.23-1.src.rpm and the directories they are installed to:

[root@localhost NetXtremeII]# rpm -qlp netxtreme2-6.2.23-1.src.rpm

/usr/src/redhat/SOURCES/netxtreme2-6.2.23.tar.bz2

/usr/src/redhat/SPECS/netxtreme2.spec

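(2) -

Question

Installing netxtreme2-6.2.23-1.src.rpm fails with the message "error: cannot create %sourcedir /usr/src/redhat/SOURCES".

Cause

The rpm-build package (rpm-build-4.4.2.3-18.el5.i386.rpm) is not installed.

Resolution:

Install rpm-build together with the two packages it depends on, binutils and elfutils. This resolves the problem.

### The /usr/src/redhat/SOURCES directory is created when the rpm-build package is installed ###

[root@localhost redhat]# rpm -qlp rpm-build-4.4.2.3-18.el5.i386.rpm  <- list the contents of the rpm-build package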

/usr/bin/rpmbuild

/usr/src/redhat

/usr/src/redhat/BUILD

/usr/src/redhat/RPMS

/usr/src/redhat/RPMS/athlon

/usr/src/redhat/RPMS/geode

/usr/src/redhat/RPMS/i386

/usr/src/redhat/RPMS/i486

/usr/src/redhat/RPMS/i586

/usr/src/redhat/RPMS/i686

/usr/src/redhat/RPMS/noarch

/usr/src/redhat/SOURCES

/usr/src/redhat/SPECS

/usr/src/redhat/SRPMS

***************************************************************************



5 - Build the NIC driver.

[root@localhost redhat]# rpmbuild -bb SPECS/netxtreme2.spec

Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.98884

+ umask 022

+ cd /usr/src/redhat/BUILD

+ LANG=C

+ export LANG

+ unset DISPLAY

+ cd /usr/src/redhat/BUILD

+ rm -rf netxtreme2-6.2.23

+ /usr/bin/bzip2 -dc /usr/src/redhat/SOURCES/netxtreme2-6.2.23.tar.bz2

+ tar -xvvf -

drwxr-xr-x root/root         0 2011-02-11 04:25:14 netxtreme2-6.2.23/

drwxr-xr-x root/root         0 2011-02-11 04:25:14 netxtreme2-6.2.23/bnx2x-1.62.15/

…… <- build output omitted

Requires(interp): /bin/sh /bin/sh

Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1

Requires(post): /bin/sh

Requires(postun): /bin/sh

Checking for unpackaged file(s): /usr/lib/rpm/check-files /var/tmp/netxtreme2-buildroot

Wrote: /usr/src/redhat/RPMS/i386/netxtreme2-6.2.23-1.i386.rpm   <- the generated driver RPM package: netxtreme2-6.2.23-1.i386.rpm

Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.28263

+ umask 022

+ cd /usr/src/redhat/BUILD

+ cd netxtreme2-6.2.23

+ rm -rf /var/tmp/netxtreme2-buildroot /usr/src/redhat/BUILD/file.list.netxtreme2

+ exit 0

***************************************************************************

Note

(1) -

Question:

Building the NIC driver fails: the build aborts with an error and cannot complete.

[root@localhost redhat]# rpmbuild -bb SPECS/netxtreme2.spec

/var/tmp/rpm-tmp.9077: line 32: make: command not found

error: Bad exit status from /var/tmp/rpm-tmp.9077 (%build)

RPM build errors:

   Bad exit status from /var/tmp/rpm-tmp.9077 (%build)

Cause

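The build environment is missing or incomplete.

Resolution:

Install the "kernel-devel" RPM package and the "Development Tools" RPM package group.

Installing them with yum is recommended, because yum resolves the RPM dependencies. The commands are: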

[root@localhost ~]# yum groupinstall "Development Tools"

[root@localhost ~]# yum install kernel-devel
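
Before rerunning rpmbuild, it can help to confirm that the build environment is now complete. A minimal sketch: the helper name missing_tools is illustrative, and the list of tools checked (make, gcc, rpmbuild) is an assumption based on the error shown above rather than an exhaustive list of what the spec file needs. The kernel-devel package has to match the running kernel, which is why the example also checks the build tree for `uname -r`.

```shell
#!/bin/bash
# Print the name of each given command that is not found in PATH.
# missing_tools is a hypothetical helper name used for illustration.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example: check the build tools and the kernel build tree before rpmbuild
#   missing_tools make gcc rpmbuild
#   [ -d "/lib/modules/$(uname -r)/build/" ] || echo "kernel-devel for $(uname -r) is missing"
```

Empty output from both checks suggests the "make: command not found" failure should not recur.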

***************************************************************************



6 - Install the newly built driver package.

[root@localhost redhat]# rpm -ivh RPMS/i386/netxtreme2-6.2.23-1.i386.rpm

Preparing...                ########################################### [100%]

  1:netxtreme2             ########################################### [100%]



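7 - After installing the new driver RPM, check the driver module information again.

[root@localhost redhat]# modinfo bnx2

filename:       /lib/modules/2.6.18-194.el5/updates/bnx2.ko          <- the driver module path has changed

version:        2.0.23b                                              <- the driver version has been upgraded from 2.0.2 to 2.0.23b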

license:        GPL

description:    Broadcom NetXtreme II BCM5706/5708/5709/5716 Driver

author:         Michael Chan <[email protected]>

srcversion:     6E0DD070AB24C11F50B2712

alias:          pci:v000014E4d0000163Csv*sd*bc*sc*i*

alias:          pci:v000014E4d0000163Bsv*sd*bc*sc*i*

alias:          pci:v000014E4d0000163Asv*sd*bc*sc*i*

alias:          pci:v000014E4d00001639sv*sd*bc*sc*i*

alias:          pci:v000014E4d000016ACsv*sd*bc*sc*i*

alias:          pci:v000014E4d000016AAsv*sd*bc*sc*i*

alias:          pci:v000014E4d000016AAsv0000103Csd00003102bc*sc*i*

alias:          pci:v000014E4d0000164Csv*sd*bc*sc*i*

alias:          pci:v000014E4d0000164Asv*sd*bc*sc*i*

alias:          pci:v000014E4d0000164Asv0000103Csd00003106bc*sc*i*

alias:          pci:v000014E4d0000164Asv0000103Csd00003101bc*sc*i*

depends:      

vermagic:       2.6.18-194.el5 SMP mod_unload 686 REGPARM 4KSTACKS gcc-4.1

parm:           disable_msi:Disable Message Signaled Interrupt (MSI) (int)

parm:           stop_on_tx_timeout:For debugging purposes, prevent a chip  reset when a tx timeout occurs (int)


8 - Reboot the system, then run modinfo again to verify the upgrade. If it shows the new module, the upgrade succeeded.
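
The final check can be scripted. A minimal sketch: the helper name check_version is illustrative, and 2.0.23b is the version reported in step 7. Note that modinfo reads the module file on disk, while `ethtool -i` reports the driver actually bound to an interface, so the example shows both (eth0 is an assumed interface name).

```shell
#!/bin/bash
# Compare a reported driver version with the expected one.
# check_version is a hypothetical helper name used for illustration.
check_version() {
    actual="$1"
    expected="$2"
    if [ "$actual" = "$expected" ]; then
        echo "OK: bnx2 driver is version $actual"
    else
        echo "MISMATCH: found $actual, expected $expected"
        return 1
    fi
}

# Example (after the reboot):
#   check_version "$(modinfo -F version bnx2)" 2.0.23b
#   ethtool -i eth0    # shows the driver and version bound to eth0
```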

