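Network environment:
Two Cisco 4006 switches are connected to each other through two fiber links on ports 1/1-2, both configured as trunks. Other network devices and hosts connect to these switches.
Symptoms:
CPU utilization on the Cisco 4006 switches was excessively high, and services were intermittent and could not run normally. The switch log showed the following: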
2007 May 24 03:55:40 %SYS-4-P2_WARN: 1/Host 00:02:fd:06:d0:b0 is flapping between port 1/2 and port 1/1
2007 May 24 03:55:42 %SYS-4-P2_WARN: 1/Host 00:04:de:17:28:20 is flapping between port 1/2 and port 4/45
2007 May 24 03:55:44 %SYS-4-P2_WARN: 1/Host 00:00:0c:07:ac:01 is flapping between port 1/2 and port 4/47
2007 May 24 03:55:45 %SYS-4-P2_WARN: 1/Host 00:05:9a:20:78:20 is flapping between port 1/2 and port 4/47
2007 May 24 03:55:48 %SYS-4-P2_WARN: 1/Host 00:02:fd:06:d0:b0 is flapping between port 1/1 and port 1/2
2007 May 24 03:55:49 %SYS-4-P2_WARN: 1/Host 00:11:25:19:c3:c2 is flapping between port 1/2 and port 4/13
2007 May 24 03:55:53 %PAGP-5-PORTFROMSTP:Port 4/45 left bridge port 4/45
2007 May 24 03:55:54 %SYS-4-P2_WARN: 1/Host 00:06:29:ec:aa:f2 is flapping between port 1/2 and port 4/37
2007 May 24 03:55:54 %SYS-4-P2_WARN: 1/Host 00:10:5c:c5:6a:ca is flapping between port 1/1 and port 4/7
2007 May 24 03:55:54 %SYS-4-P2_WARN: 1/Host 00:09:6b:f5:0f:33 is flapping between port 1/1 and port 4/13
2007 May 24 03:55:54 %SYS-4-P2_WARN: 1/Host 00:10:5c:45:6a:ca is flapping between port 1/2 and port 1/1
2007 May 24 03:55:54 %SYS-4-P2_WARN: 1/Host 00:16:ec:7b:6c:b4 is flapping between port 1/1 and port 1/2
2007 May 24 03:55:55 %SYS-4-P2_WARN: 1/Host 00:10:5c:c5:6a:ca is flapping between port 1/1 and port 4/7
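When the log contains many flapping warnings, tallying which ports appear most often can help narrow down where the loop is. The following is a minimal sketch, assuming the log has been saved as plain text in the format quoted above; the function name summarize_flaps and the regular expression are illustrative and not part of any Cisco tool.

```python
import re
from collections import Counter

# Matches host-flapping warnings in the format shown in the log excerpt.
FLAP_RE = re.compile(
    r"%SYS-4-P2_WARN: \d+/Host (?P<mac>[0-9a-f:]{17}) is flapping "
    r"between port (?P<a>\d+/\d+) and port (?P<b>\d+/\d+)"
)

def summarize_flaps(lines):
    """Return (per-port counter, per-MAC counter) for flapping warnings."""
    ports, macs = Counter(), Counter()
    for line in lines:
        m = FLAP_RE.search(line)
        if not m:
            continue  # ignore any other message type
        ports.update([m["a"], m["b"]])
        macs[m["mac"]] += 1
    return ports, macs

# The first four warnings from the excerpt above.
sample = [
    "2007 May 24 03:55:40 %SYS-4-P2_WARN: 1/Host 00:02:fd:06:d0:b0 is flapping between port 1/2 and port 1/1",
    "2007 May 24 03:55:42 %SYS-4-P2_WARN: 1/Host 00:04:de:17:28:20 is flapping between port 1/2 and port 4/45",
    "2007 May 24 03:55:44 %SYS-4-P2_WARN: 1/Host 00:00:0c:07:ac:01 is flapping between port 1/2 and port 4/47",
    "2007 May 24 03:55:45 %SYS-4-P2_WARN: 1/Host 00:05:9a:20:78:20 is flapping between port 1/2 and port 4/47",
]

ports, macs = summarize_flaps(sample)
print(ports.most_common(1))  # [('1/2', 4)]
```

In the excerpt above, every flapping warning involves port 1/1 or 1/2, the two inter-switch links, which is consistent with a loop between the two switches.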
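Root cause analysis:
A loop formed between the two Cisco 4006 switches. For some reason STP stopped working, which led to a broadcast storm on the network.
Resolution steps:
1. First, both Cisco 4006 switches were rebooted. (Two IBM minicomputers running as an HACMP cluster were also attached to the network. Because the cluster protects its shared resources, a shutdown command was issued to the standby node. The correct approach would have been to shut down one switch first, or to stop HACMP on the standby node before shutting down both switches.) After the reboot, CPU utilization dropped and services returned to normal.
2. Next, the network was checked for loops on each of the ports named in the error messages. Apart from the loop between the two 4006 switches, no other loop was found, and the output of every command looked normal. Commands used: show spantree active, show trunk, show config, show vlan, show port, and others.
3. Port mirroring was used to capture the traffic passing through the switch, to look for suspicious ARP packets and determine whether an ARP virus had caused the loop. Nothing suspicious was found. Command used: set span. Tool used: Sniffer.
4. Having previously run into a bug in Cisco's STP implementation, we decided to change the configuration between the two switches: bundle the two fiber ports 1/1-2 into one channel, and then configure the trunk on top of it. This keeps the redundancy of the connection between the two switches while eliminating the loop. Commands used: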
set port channel 1/1-2 53
set port channel 1/1-2 mode on
After both sides were configured, show port channel reported the ports as notconnect on 4006-2 and errdisable on 4006-1. After running set port enable 1/1-2 on 4006-1, show port channel reported both sides as connected.
Set the trunk on one of the switches:
set trunk 1/1-2 on 1
The output of show trunk shows a normal state.
The output of show spantree active also looks normal:
4006-2> (enable) show spantree
VLAN 1
Spanning tree enabled
Spanning tree type ieee
Designated Root 00-05-32-db-b0-00
Designated Root Priority 32768
Designated Root Cost 3
Designated Root Port 1/1-2 (agPort 13/1)
Root Max Age 20 sec Hello Time 2 sec Forward Delay 15 sec
Bridge ID MAC ADDR 00-05-32-db-b4-00
Bridge ID Priority 32768
Bridge Max Age 20 sec Hello Time 2 sec Forward Delay 15 sec
Port                     Vlan Port-State    Cost      Prio Portfast Channel_id
------------------------ ---- ------------- --------- ---- -------- ----------
1/1-2                    1    forwarding    3         32   disabled 769
With this configuration, STP treats 1/1-2 as a single port in its calculation, which eliminates the loop.
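The effect of the channel on the topology can be illustrated with a small cycle check. This is a simplified sketch of the link graph only, not of the STP algorithm itself: each pair in the list is assumed to be one logical link as spanning tree would see it, and the function name has_loop and the switch labels are illustrative.

```python
def has_loop(links):
    """Return True if the undirected multigraph given by links has a cycle.

    links is a list of (bridge_a, bridge_b) pairs, one per logical link.
    Uses a union-find structure: a link whose two ends are already
    connected closes a loop.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        ra, rb = find(a), find(b)
        if ra == rb:
            return True  # a second path between connected bridges
        parent[ra] = rb
    return False

# Ports 1/1 and 1/2 as two separate trunks: two parallel links.
separate_trunks = [("4006-1", "4006-2"), ("4006-1", "4006-2")]
# Ports 1/1-2 bundled into one channel: a single logical link.
channel = [("4006-1", "4006-2")]

print(has_loop(separate_trunks))  # True
print(has_loop(channel))          # False
```

With two separate trunks, the topology depends on STP blocking one of the links. With the channel there is only one logical link, so no port has to be blocked for the topology to be loop-free.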