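Problem description: During stress testing on a RHEL 5.5 system, heavy network traffic caused the NIC to intermittently stop receiving packets. Investigation showed that the NIC driver shipped with Red Hat has a bug, so the driver needs to be updated. The problem description and the driver update procedure follow: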
Why does the Broadcom NetXtreme 5709 NIC stop receiving packets intermittently on RHEL 5.3 and newer?
Article ID: 26837- Created on: Mar 2, 2010 9:41 PM- Last Modified: Mar 24, 2011 5:05 PM
Issue
In certain situations under heavy loads, the network interface card can stop accepting packets from remote devices.
This problem has been reported on Red Hat Enterprise Linux 5.3 (RHEL 5.3) and newer when using a Broadcom NetXtreme 5709 network interface card.
Environment
Red Hat Enterprise Linux 5.3 to 5.5
Network Interface Cards (NIC) using the bnx2 driver including:
Broadcom Corporation NetXtreme II BCM5709S Gigabit Ethernet
Resolution
Red Hat has released kernel-2.6.18-194.3.1.el5 which will address this issue in RHEL 5. It can be downloaded from the following link:
https://rhn.redhat.com/errata/RHSA-2010-0398.html
* in certain circumstances, under heavy load, certain network interface cards using the bnx2 driver and configured to use MSI-X, could stop processing interrupts and then network connectivity would cease. (BZ#587799)
If upgrading the kernel is not an option, review the following workarounds:
Disable MSI-X in the bnx2 driver. To do this, add the following line to /etc/modprobe.conf:
options bnx2 disable_msi=1
Disable MSI completely by booting with the pci=nomsi boot parameter. This disables MSI on all devices that are able to use it.
Note: MSI-X increases network performance so disabling it means that the performance will return to the level available before MSI-X was introduced.
Disable C-States in BIOS. Refer to the vendor system documentation in order to learn how to do this.
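The modprobe.conf workaround can be applied so that repeated runs do not append duplicate lines. The following is a minimal sketch, not part of the Red Hat article: the function name ensure_bnx2_msi_disabled is hypothetical, and the file path is passed as an argument so the change can be tried on a copy first.

```shell
# Hypothetical helper: append the bnx2 option line to a modprobe config
# file only if an equivalent line is not already present.
ensure_bnx2_msi_disabled() {
    conf="$1"
    # Look for "options bnx2 ... disable_msi=1" with flexible whitespace.
    if grep -Eq '^[[:space:]]*options[[:space:]]+bnx2[[:space:]].*disable_msi=1([[:space:]]|$)' "$conf" 2>/dev/null; then
        echo "already set"
    else
        echo "options bnx2 disable_msi=1" >> "$conf"
        echo "added"
    fi
}

# Usage (as root, against the real file):
#   ensure_bnx2_msi_disabled /etc/modprobe.conf
```

Module options are read when the module is loaded, so the setting only takes effect after bnx2 is reloaded or the system is rebooted.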
Root Cause
The kernel gets out of sync with interrupts generated by the network interface card which results in an inability to process interrupts, causing packets to be dropped and ultimately, lost connectivity.
When this situation occurs, the rx_fw_discards counter will keep increasing as remote devices unsuccessfully attempt to communicate with the system via the NIC.
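One way to watch for this symptom is to sample the counter from the output of ethtool -S. The sketch below assumes the driver prints its statistics in the usual "name: value" layout; the function name rx_fw_discards is hypothetical and eth0 is only an example interface.

```shell
# Hypothetical helper: read `ethtool -S <iface>` output on stdin and
# print the value of the rx_fw_discards counter (prints nothing if the
# counter is not present in the input).
rx_fw_discards() {
    awk -F': *' '$1 ~ /^[ \t]*rx_fw_discards$/ { print $2 }'
}

# Usage: sample twice and compare the two values.
#   ethtool -S eth0 | rx_fw_discards
#   sleep 10
#   ethtool -S eth0 | rx_fw_discards
```

A value that keeps growing between samples while remote hosts are sending traffic is consistent with the behaviour described above.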
It has been reported that under certain heavy traffic conditions in MSI-X mode, the bnx2 driver can lose an MSI-X vector, causing all packets in the associated rx/tx ring pair to be dropped. The problem is caused by the chip dropping the kernel's write to unmask the MSI-X vector (when migrating the IRQ, for example). This can be prevented by increasing the GRC timeout value for these register read and write operations.
The upstream patch resolving this issue is available here.
The driver update procedure is as follows:
1 - Before building the driver, check the name of the driver module the NIC uses.
[root@localhost ~]#cat /etc/modprobe.conf
alias eth0 bnx2 <- the NIC driver module name is bnx2
alias eth1 bnx2
alias eth2 bnx2
alias eth3 bnx2
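The same lookup can be scripted for a single interface. This is a small sketch that assumes alias lines of the form shown above; the function name nic_module is hypothetical.

```shell
# Hypothetical helper: print the driver module aliased to an interface
# in a modprobe.conf-style file (lines of the form "alias eth0 bnx2").
nic_module() {
    iface="$1"
    conf="$2"
    awk -v ifc="$iface" '$1 == "alias" && $2 == ifc { print $3; exit }' "$conf"
}

# Usage:
#   nic_module eth0 /etc/modprobe.conf
```

For an interface that is already up, `ethtool -i eth0` reports the driver in use as well.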
2 - Check whether the bnx2 driver module is currently loaded.
[root@localhost ~]#lsmod | grep bnx2
bnx2 179021 0 <- the driver module is already loaded
bnx2i 40413 0
cnic 44877 1 bnx2i
libiscsi2 42693 6 be2iscsi,ib_iser,iscsi_tcp,bnx2i,cxgb3i,libiscsi_tcp
scsi_transport_iscsi2 37709 8 be2iscsi,ib_iser,iscsi_tcp,bnx2i,cxgb3i,libiscsi2
scsi_mod 141973 15 be2iscsi,ib_iser,iscsi_tcp,bnx2i,cxgb3i,libiscsi2,scsi_transport_iscsi2,scsi_dh,sg,pvscsi,libata,mptspi,mptscsih,scsi_transport_spi,sd_mod
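3 - View the current driver module information to get its version, so that the upgrade can be verified later.
[root@localhost redhat]#modinfo bnx2
filename: /lib/modules/2.6.18-194.el5/kernel/drivers/net/bnx2.ko <- path where the driver is stored; after the upgrade the driver is stored under a different path
version: 2.0.2 <- the module version is 2.0.2, the default driver version shipped with the system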
license: GPL
description: Broadcom NetXtreme II BCM5706/5708/5709/5716 Driver
author: Michael Chan <[email protected]>
srcversion: 7025AAF3645EE432EAF1C00
alias: pci:v000014E4d0000163Csv*sd*bc*sc*i*
alias: pci:v000014E4d0000163Bsv*sd*bc*sc*i*
alias: pci:v000014E4d0000163Asv*sd*bc*sc*i*
alias: pci:v000014E4d00001639sv*sd*bc*sc*i*
alias: pci:v000014E4d000016ACsv*sd*bc*sc*i*
alias: pci:v000014E4d000016AAsv*sd*bc*sc*i*
alias: pci:v000014E4d000016AAsv0000103Csd00003102bc*sc*i*
alias: pci:v000014E4d0000164Csv*sd*bc*sc*i*
alias: pci:v000014E4d0000164Asv*sd*bc*sc*i*
alias: pci:v000014E4d0000164Asv0000103Csd00003106bc*sc*i*
alias: pci:v000014E4d0000164Asv0000103Csd00003101bc*sc*i*
depends:
vermagic: 2.6.18-194.el5 SMP mod_unload 686 REGPARM 4KSTACKS gcc-4.1
parm: disable_msi:Disable Message Signaled Interrupt (MSI) (int)
parm: enable_entropy:Allow bnx2 to populate the /dev/random entropy pool (int)
module_sig: 883f3504ba037551e1fa4939f6a62931127b30a0e5a160a7ad7a7b9b2c162b309b3316fddc41f280a0cbecbd80e777d961e16218019c365c4b328d1a8
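To keep the pre-upgrade values for later comparison, individual fields can be pulled out of the modinfo output. The sketch below parses the output from stdin; the function name modinfo_field is hypothetical.

```shell
# Hypothetical helper: read `modinfo <module>` output on stdin and print
# the value of one field, e.g. "version" or "filename".
modinfo_field() {
    field="$1"
    awk -v f="$field" '
        index($0, f ":") == 1 {
            sub(/^[^:]*:[ \t]*/, "")   # strip the "name:" label and padding
            print $1                    # keep only the first word of the value
            exit
        }'
}

# Usage: record the version before upgrading.
#   modinfo bnx2 | modinfo_field version
```

Matching on the start of the line keeps "version" from also matching the "srcversion" line.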
4 - Download the Broadcom NIC driver package from the Dell website:
ftp://ftp.dell.com/FOLDER29291M/1/Bcom_LAN_16.2.0_Linux_Source_A01.tar.gz
Extract it and begin the installation.
[root@localhost mnt]#tar -vzxf Bcom_LAN_16.2.0_Linux_Source_A01.tar.gz <- extract the driver package
Bcom_LAN_16.2.0_Linux_Source_A01/
Bcom_LAN_16.2.0_Linux_Source_A01/Linux_Readme/
Bcom_LAN_16.2.0_Linux_Source_A01/Linux_Readme/linux_readme.txt
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtreme/
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtreme/ChangeLog
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtreme/README.TXT
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtreme/tg3-3.115j-1.src.rpm
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtreme/tg3-3.115j.tar.gz
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtreme/tg3_sup-3.115j-1.ISO.tar.gz
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/brcm_iscsi_uio-0.6.2.13.tar.gz
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/netxtreme2-6.2.23-1.src.rpm
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/netxtreme2-6.2.23.tar.gz
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/netxtreme2_sup-6.2.23-1.ISO.tar.gz
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/README
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/RELEASE.bnx2.TXT
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/RELEASE.bnx2i.TXT
Bcom_LAN_16.2.0_Linux_Source_A01/NetXtremeII/RELEASE.bnx2x.TXT
[root@localhost mnt]#cd Bcom_LAN_16.2.0_Linux_Source_A01 <- enter the extracted directory
[root@localhost Bcom_LAN_16.2.0_Linux_Source_A01]#ls <- list the directory contents
Linux_Readme NetXtreme NetXtremeII
[root@localhost Bcom_LAN_16.2.0_Linux_Source_A01]#cd NetXtremeII/ <- enter the NIC driver source directory
[root@localhost NetXtremeII]#ls <- list the contents of this directory
brcm_iscsi_uio-0.6.2.13.tar.gz README
netxtreme2-6.2.23-1.src.rpm RELEASE.bnx2i.TXT
netxtreme2-6.2.23.tar.gz RELEASE.bnx2.TXT
netxtreme2_sup-6.2.23-1.ISO.tar.gz RELEASE.bnx2x.TXT
[root@localhost NetXtremeII]#rpm -ivh netxtreme2-6.2.23-1.src.rpm <- install the sources needed to build the driver
1:netxtreme2 ########################################### [100%]
***************************************************************************
Note:
(1) -
Contents of the netxtreme2-6.2.23-1.src.rpm package and the directories they are installed to:
[root@localhost NetXtremeII]# rpm -qlp netxtreme2-6.2.23-1.src.rpm
/usr/src/redhat/SOURCES/netxtreme2-6.2.23.tar.bz2
/usr/src/redhat/SPECS/netxtreme2.spec
(2) -
Question:
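Installing netxtreme2-6.2.23-1.src.rpm reports "error: cannot create %sourcedir /usr/src/redhat/SOURCES".
Cause:
The rpm-build package (rpm-build-4.4.2.3-18.el5.i386.rpm) is not installed.
Resolution:
Install rpm-build together with the two packages it depends on, binutils and elfutils. This resolves the problem.
### The /usr/src/redhat/SOURCES directory is created when the rpm-build package is installed ###
[root@localhost redhat]# rpm -qlp rpm-build-4.4.2.3-18.el5.i386.rpm <- check the contents of the rpm-build package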
/usr/bin/rpmbuild
/usr/src/redhat
/usr/src/redhat/BUILD
/usr/src/redhat/RPMS
/usr/src/redhat/RPMS/athlon
/usr/src/redhat/RPMS/geode
/usr/src/redhat/RPMS/i386
/usr/src/redhat/RPMS/i486
/usr/src/redhat/RPMS/i586
/usr/src/redhat/RPMS/i686
/usr/src/redhat/RPMS/noarch
/usr/src/redhat/SOURCES
/usr/src/redhat/SPECS
/usr/src/redhat/SRPMS
***************************************************************************
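Before installing the source RPM, the build tree can be checked for the directories listed above. This is a minimal sketch; the function name missing_rpm_dirs is hypothetical, and /usr/src/redhat is the top directory taken from the listing above.

```shell
# Hypothetical helper: print each standard rpmbuild directory that is
# missing under the given top directory.
missing_rpm_dirs() {
    top="$1"
    for d in BUILD RPMS SOURCES SPECS SRPMS; do
        [ -d "$top/$d" ] || echo "$top/$d"
    done
}

# Usage:
#   rpm -q rpm-build                  # is the package installed at all?
#   missing_rpm_dirs /usr/src/redhat  # empty output means the tree is complete
```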
5 - Build the NIC driver
[root@localhost redhat]#rpmbuild -bb SPECS/netxtreme2.spec
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.98884
+ umask 022
+ cd /usr/src/redhat/BUILD
+ LANG=C
+ export LANG
+ unset DISPLAY
+ cd /usr/src/redhat/BUILD
+ rm -rf netxtreme2-6.2.23
+ /usr/bin/bzip2 -dc /usr/src/redhat/SOURCES/netxtreme2-6.2.23.tar.bz2
+ tar -xvvf -
drwxr-xr-x root/root 0 2011-02-11 04:25:14 netxtreme2-6.2.23/
drwxr-xr-x root/root 0 2011-02-11 04:25:14 netxtreme2-6.2.23/bnx2x-1.62.15/
…… <- build output omitted
Requires(interp): /bin/sh /bin/sh
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires(post): /bin/sh
Requires(postun): /bin/sh
Checking for unpackaged file(s): /usr/lib/rpm/check-files /var/tmp/netxtreme2-buildroot
Wrote: /usr/src/redhat/RPMS/i386/netxtreme2-6.2.23-1.i386.rpm <- the generated driver RPM: netxtreme2-6.2.23-1.i386.rpm
Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.28263
+ umask 022
+ cd /usr/src/redhat/BUILD
+ cd netxtreme2-6.2.23
+ rm -rf /var/tmp/netxtreme2-buildroot /usr/src/redhat/BUILD/file.list.netxtreme2
+ exit 0
***************************************************************************
Note:
(1)-
Question:
Building the driver fails with an error and the build exits without producing a package:
[root@localhost redhat]# rpmbuild -bb SPECS/netxtreme2.spec
/var/tmp/rpm-tmp.9077: line 32: make: command not found
error: Bad exit status from /var/tmp/rpm-tmp.9077 (%build)
RPM build errors:
Bad exit status from /var/tmp/rpm-tmp.9077 (%build)
Cause:
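The build environment is missing or incomplete.
Resolution:
Install the "kernel-devel" RPM package and the "Development Tools" package group.
Installing them with yum is recommended so that package dependencies are resolved automatically. The commands are: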
[root@localhost ~]# yum groupinstall "Development Tools"
[root@localhost ~]# yum install kernel-devel
***************************************************************************
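A quick pre-flight check can catch a missing build environment before rpmbuild is run. The sketch below only tests whether commands are on PATH; the function name missing_tools is hypothetical, and the list of tools to check is an assumption based on the error shown above.

```shell
# Hypothetical helper: print the name of each given command that cannot
# be found on PATH.
missing_tools() {
    for t in "$@"; do
        command -v "$t" >/dev/null 2>&1 || echo "$t"
    done
}

# Usage before building:
#   missing_tools make gcc rpmbuild
# The kernel build tree for the running kernel should also be present:
#   [ -d "/lib/modules/$(uname -r)/build" ] || echo "kernel-devel missing or version mismatch"
```

Empty output from missing_tools means every listed command was found. The installed kernel-devel version has to match the running kernel for the build directory check to pass.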
6 - Install the newly built driver
[root@localhost redhat]#rpm -ivh RPMS/i386/netxtreme2-6.2.23-1.i386.rpm
Preparing... ########################################### [100%]
1:netxtreme2 ########################################### [100%]
7 - After installing the new driver RPM, check the driver module information again.
[root@localhost redhat]#modinfo bnx2
filename: /lib/modules/2.6.18-194.el5/updates/bnx2.ko <- the driver module path has changed
version: 2.0.23b <- the driver version has been upgraded from 2.0.2 to 2.0.23b
license: GPL
description: Broadcom NetXtreme II BCM5706/5708/5709/5716 Driver
author: Michael Chan <[email protected]>
srcversion: 6E0DD070AB24C11F50B2712
alias: pci:v000014E4d0000163Csv*sd*bc*sc*i*
alias: pci:v000014E4d0000163Bsv*sd*bc*sc*i*
alias: pci:v000014E4d0000163Asv*sd*bc*sc*i*
alias: pci:v000014E4d00001639sv*sd*bc*sc*i*
alias: pci:v000014E4d000016ACsv*sd*bc*sc*i*
alias: pci:v000014E4d000016AAsv*sd*bc*sc*i*
alias: pci:v000014E4d000016AAsv0000103Csd00003102bc*sc*i*
alias: pci:v000014E4d0000164Csv*sd*bc*sc*i*
alias: pci:v000014E4d0000164Asv*sd*bc*sc*i*
alias: pci:v000014E4d0000164Asv0000103Csd00003106bc*sc*i*
alias: pci:v000014E4d0000164Asv0000103Csd00003101bc*sc*i*
depends:
vermagic: 2.6.18-194.el5 SMP mod_unload 686 REGPARM 4KSTACKS gcc-4.1
parm: disable_msi:Disable Message Signaled Interrupt (MSI) (int)
parm: stop_on_tx_timeout:For debugging purposes, prevent a chip reset when a tx timeout occurs (int)
8 - Reboot the system, then run modinfo again to verify the upgrade. If the new module is shown, the upgrade succeeded.
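The final check can be scripted by testing whether the module path reported by modinfo sits under an updates directory, as the rebuilt bnx2.ko does in the output above. This is a minimal sketch; the function name from_updates_dir is hypothetical.

```shell
# Hypothetical check: succeed if a module path lies under an "updates"
# directory, which is where the rebuilt bnx2.ko was installed above.
from_updates_dir() {
    case "$1" in
        */updates/*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Usage after the reboot:
#   from_updates_dir "$(modinfo -F filename bnx2)" && echo "new driver in place"
```

modinfo describes the module file on disk. On kernels that expose it, /sys/module/bnx2/version shows the version of the module that is actually loaded, which is a useful cross-check.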