A Discussion of IP SLA and Its Integration with EEM (Repost)

http://hi.baidu.com/tzcasha/item/c2ec6fe47916e9295b7cfb2d

Reposted from: http://www.zhaocs.info/sla_eem_1.html

SLA Overview

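Put simply, SLA (Service-Level Agreement) measures network performance parameters. When a measurement crosses a threshold, SLA can trigger an action in combination with track or EEM. For example:
1. Monitor the reachability of the next hop, and invalidate a static route if the next hop becomes unreachable.
2. Monitor a neighbor's interface address, and shut down the port if it is unreachable three times in a row.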

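SLA Application Example

Suppose a customer's line quality is poor and cannot be improved. We then need a way to reset the port as soon as line quality degrades past a certain threshold, so that the link reset improves things.
How do we meet this requirement? This is where SLA comes in. So how should SLA be deployed?

Analysis of the First Method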

ip sla 2
 icmp-echo 1.1.1.2
 timeout 3000
 frequency 10 <--- send one probe every 10 s
ip sla schedule 2 life forever start-time now <--- start the SLA immediately, with an unlimited lifetime
track 1 rtr 2 <--- configure the track object; a track has two states, up and down
event manager session cli username "username"
event manager applet test_track_1 <--- EEM configuration
 event track 1 state down <--- when track 1 goes down, run the actions below
 action 1.0 cli command "enable"
 action 2.0 cli command "conf t"
 action 3.0 cli command "int g4/3"
 action 3.1 cli command "shut"
 action 3.2 cli command "no shut"
 action 4.0 cli command "end"

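With the configuration above, the device sends one ping every 10 seconds. When a ping times out, track 1 goes down, which triggers shut/no shut on the port. Does that achieve the goal? In a real network, occasionally losing a single packet is normal and unavoidable, and the port is still working properly when it happens. With this configuration, however, the port is reset anyway and service is affected.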
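The problem can be illustrated with a small Python sketch. It is only a toy model, not IOS behavior: `count_resets_naive` is a hypothetical helper, each list element stands for one probe (True = reply, False = timeout), and the track state is assumed to start Up and to follow every probe result directly.

```python
def count_resets_naive(probe_results):
    """Count port resets when the track state follows every single probe.

    probe_results: iterable of booleans, True = reply, False = timeout.
    A reset is counted on each Up -> Down transition of the modeled track.
    """
    resets = 0
    state_up = True  # assumption: the track object starts Up
    for ok in probe_results:
        if state_up and not ok:
            resets += 1  # the EEM applet would run shut / no shut here
        state_up = ok
    return resets


# One isolated loss in an otherwise healthy minute of probes (10 s apart).
probes = [True, True, False, True, True, True]
print(count_resets_naive(probes))  # prints 1
```

A single lost packet is enough to produce a reset in this model, which is the behavior the following optimization tries to avoid.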

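To avoid this kind of unnecessary service impact, the configuration needs to be optimized so that track 1 goes down only when there is a real network fault.
The most common criterion for a real network fault is consecutive timeouts.
So add the following commands (only the added and modified commands are listed):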

ip sla 4
 icmp-echo 1.1.1.2
 timeout 1000
 frequency 10
ip sla schedule 4 life 5 start-time pending <--- not started immediately, and its lifetime is only 5 seconds
ip sla reaction-configuration 2 react timeout threshold-type consecutive 3 action-type trapandtrigger <--- on 3 consecutive timeouts, trigger another SLA and send an SNMP trap
ip sla reaction-trigger 2 4 <--- three consecutive timeouts on SLA 2 trigger SLA 4
track 1 rtr 4 <--- track SLA 4 rather than SLA 2. Why?

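If you configure track 1 rtr 2, track 1 goes down every time SLA 2 times out, and every down transition makes EEM reset the port, which is just as unreasonable as before. So configure track 1 rtr 4 instead: SLA 4 is pending and is only triggered after SLA 2 has timed out three times in a row (10*3 + 5 = 35 s).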
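The "three consecutive timeouts" rule can be sketched in Python as a simple counter that resets on every successful probe. This is a simplified model under stated assumptions: `first_trigger_index` is a hypothetical name, and the real reaction logic in IOS is not reproduced here.

```python
def first_trigger_index(probe_results, threshold=3):
    """Return the index of the probe that completes `threshold` consecutive
    timeouts, or None if that never happens.

    probe_results: iterable of booleans, True = reply, False = timeout.
    """
    consecutive = 0
    for i, ok in enumerate(probe_results):
        # A successful probe resets the counter; a timeout increments it.
        consecutive = 0 if ok else consecutive + 1
        if consecutive == threshold:
            return i
    return None


isolated_losses = [True, False, True, True, False, True]
real_outage = [True, False, False, False, False]
print(first_trigger_index(isolated_losses))  # prints None
print(first_trigger_index(real_outage))      # prints 3
```

Isolated losses never reach the threshold, so the pending operation would not be started; only a sustained outage does.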


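So does the first method analyzed in the previous article actually work?
Testing shows that it does avoid the original problem of EEM firing after a single lost packet.
There is one catch, though. The method needs an additional SLA operation (777 in the example below) that sits in the pending state, so 777 is started only when operation 17 loses three packets in a row.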

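There are two cases:

1. The line is already in service. When the commands above are configured, operation 17 cannot lose three consecutive packets, so 777 is never started and track 17 stays Down. As a result, EEM never fires no matter how many packets are lost. (Think about why.)

Workaround: after entering the commands above, shut the upstream or downstream port for 30 s (the probe runs every 10 s) so that 777 is started, then no shut the port so that track 17 can go Up. Only then will EEM fire properly when the leased line fails. In other words, applying this configuration to a line that is already in service means interrupting the primary line for at least 30 s.

2. The line is not yet in service. Wait at least 30 s after entering the commands above before bringing up the MSTP line; otherwise the same problem occurs.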
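One plausible way to read both cases is that the applet reacts to a state change, not to a state. The Python sketch below models that reading: `TrackedApplet` is a hypothetical toy class, and it assumes the applet runs only on an Up -> Down transition of the track object.

```python
class TrackedApplet:
    """Toy model: the applet runs only on an Up -> Down transition."""

    def __init__(self, initial_up):
        self.up = initial_up
        self.runs = 0

    def update(self, new_up):
        if self.up and not new_up:
            self.runs += 1  # the EEM applet would execute here
        self.up = new_up


# Track starts Down because the pending operation has never run.
stuck = TrackedApplet(initial_up=False)
for result in [False, False, False]:
    stuck.update(result)
print(stuck.runs)  # prints 0

# After the workaround the track has been brought Up once.
primed = TrackedApplet(initial_up=False)
for result in [False, True, False]:
    primed.update(result)
print(primed.runs)  # prints 1
```

In this model a track that has never been Up cannot produce a Down transition, which matches the need to bring track 17 Up once before the setup can work.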

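Analysis and explanation of the problem:
I won't explain what the commands mean here. For that, see the previous article, "A Discussion of IP SLA and Its Integration with EEM <2>".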

Config:

ip sla monitor 17
 type echo protocol ipIcmpEcho 12.1.1.2
 timeout 3000
 frequency 10
!
ip sla monitor reaction-configuration 17 react timeout threshold-type consecutive 3 action-type trapAndTrigger
ip sla monitor reaction-trigger 17 777
ip sla monitor schedule 17 life forever start-time now
!
ip sla monitor 777
 type echo protocol ipIcmpEcho 12.1.1.2
 timeout 1000
 frequency 10
!
ip sla monitor schedule 777 life 5 start-time pending
track 17 rtr 777
!
event manager session cli username "username"
event manager applet test_track_17
 event track 17 state down
 action 1.0 cli command "enable"
 action 2.0 cli command "conf t"
 action 3.0 cli command "int s1/0"
 action 3.1 cli command "shut"
 action 3.2 cli command "no shut"
 action 4.0 cli command "end"
!
R1#sh debugging
Track debugging is on
Embedded Event Manager:
  Debug EEM action cli debugging is on
IP SLA Monitor:
  TRACE debugging for all operations is on

Debug Information:

1. After the initial configuration, the track state is Down:

R1(config)#track 17 rtr 777
*Feb 25 10:15:06.979: Track: 17 Adding rtr object
*Feb 25 10:15:06.979: Track: Initialise
*Feb 25 10:15:06.983: Track: 17 New rtr 777, state Down
*Feb 25 10:15:06.987: Track: Starting process

R1#sh track
Track 17
  Response Time Reporter 777 state
  State is Down
    1 change, last change 00:01:19
  Latest operation return code: Unknown
  Tracked by:
     applet test_track_17

2. Shut down the local or remote port to activate 777; track 17 remains Down, and the return code changes to Timeout

*Feb 25 10:16:25.107: IP SLA Monitor(777) Scheduler: Starting an operation
*Feb 25 10:16:25.107: IP SLA Monitor(777) echo operation: Sending an echo operation
*Feb 25 10:16:26.107: IP SLA Monitor(777) echo operation: Timeout
*Feb 25 10:16:26.107: IP SLA Monitor(777) Scheduler: Updating result
*Feb 25 10:16:26.777: IP SLA Monitor(777) Scheduler: Ageout

R1#sh track
Track 17
  Response Time Reporter 777 state
  State is Down
    1 change, last change 00:01:55
  Latest operation return code: Timeout
  Tracked by:
     applet test_track_17

3. no shut the port and activate 777 again, which brings track 17 Up

*Feb 25 10:17:42.159: IP SLA Monitor(777) Scheduler: Starting an operation
*Feb 25 10:17:42.159: IP SLA Monitor(777) echo operation: Sending an echo operation
*Feb 25 10:17:42.171: IP SLA Monitor(777) echo operation: RTT=12
*Feb 25 10:17:42.175: IP SLA Monitor(777) Scheduler: Updating result
*Feb 25 10:17:42.175: IP SLA Monitor(777) Scheduler: Ageout
*Feb 25 10:17:46.983: Track: 17 Change #2 rtr 777, state Down->Up

R1#sh track
Track 17
  Response Time Reporter 777 state
  State is Up
    2 changes, last change 00:09:16
  Latest operation return code: OK
  Latest RTT (millisecs) 12
  Tracked by:
     applet test_track_17

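4. Shut down the local port to test whether the desired effect is achieved
Note: the timestamps below do not continue from those above; the output comes from two separate test runs.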

R1#config ter
Enter configuration commands, one per line.  End with CNTL/Z.
R1(config)#int s1/0
R1(config-if)#
R1(config-if)#shutdown
R1(config-if)#end
R1#
*Feb 25 09:47:27.911: %LINK-5-CHANGED: Interface Serial1/0, changed state to administratively down
*Feb 25 09:47:27.915: %ENTITY_ALARM-6-INFO: ASSERT INFO Se1/0 Physical Port Administrative State Down
*Feb 25 09:47:28.911: %LINEPROTO-5-UPDOWN: Line protocol on Interface Serial1/0, changed state to down

*Feb 25 09:47:31.775: IP SLA Monitor(17) Scheduler: Starting an operation
*Feb 25 09:47:31.775: IP SLA Monitor(17) echo operation: Sending an echo operation
*Feb 25 09:47:34.779: IP SLA Monitor(17) echo operation: Timeout
*Feb 25 09:47:34.779: IP SLA Monitor(17) Scheduler: Updating result

*Feb 25 09:47:41.775: IP SLA Monitor(17) Scheduler: Starting an operation
*Feb 25 09:47:41.779: IP SLA Monitor(17) echo operation: Sending an echo operation
*Feb 25 09:47:44.779: IP SLA Monitor(17) echo operation: Timeout
*Feb 25 09:47:44.779: IP SLA Monitor(17) Scheduler: Updating result

*Feb 25 09:47:51.775: IP SLA Monitor(17) Scheduler: Starting an operation
*Feb 25 09:47:51.775: IP SLA Monitor(17) echo operation: Sending an echo operation
*Feb 25 09:47:54.779: IP SLA Monitor(17) echo operation: Timeout
*Feb 25 09:47:54.779: IP SLA Monitor(17) Scheduler: Updating result

*Feb 25 09:47:54.827: IP SLA Monitor(777) Scheduler: Starting an operation
*Feb 25 09:47:54.827: IP SLA Monitor(777) echo operation: Sending an echo operation
*Feb 25 09:47:55.831: IP SLA Monitor(777) echo operation: Timeout
*Feb 25 09:47:55.831: IP SLA Monitor(777) Scheduler: Updating result
*Feb 25 09:47:55.835: IP SLA Monitor(777) Scheduler: Ageout

*Feb 25 09:47:56.231: Track: 17 Change #5 rtr 777, state Up->Down
*Feb 25 09:47:56.251: fh_schedule_callback: EEM callback policy EEM Policy Director has been scheduled to run

*Feb 25 09:47:56.275: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : CTL : cli_open called.
*Feb 25 09:47:56.291: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT :
*Feb 25 09:47:56.291: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT : R1>
*Feb 25 09:47:56.295: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : IN  : >enable
*Feb 25 09:47:56.311: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT :
*Feb 25 09:47:56.311: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT : R1#
*Feb 25 09:47:56.311: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : IN  : #conf t

*Feb 25 09:47:56.327: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT :
*Feb 25 09:47:56.331: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT : Enter configuration commands, one per line.  End with CNTL/Z.
*Feb 25 09:47:56.335: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT : R1(config)#
*Feb 25 09:47:56.339: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : IN  : #int s1/0

*Feb 25 09:47:56.355: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT :
*Feb 25 09:47:56.355: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT : R1(config-if)#
*Feb 25 09:47:56.355: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : IN  : #shut

*Feb 25 09:47:56.371: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT :
*Feb 25 09:47:56.375: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT : R1(config-if)#
*Feb 25 09:47:56.379: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : IN  : #no shut

*Feb 25 09:47:56.411: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT :
*Feb 25 09:47:56.411: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT : R1(config-if)#
*Feb 25 09:47:56.415: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : IN  : #end

*Feb 25 09:47:56.435: %SYS-5-CONFIG_I: Configured from console by name on vty1

*Feb 25 09:47:56.447: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT :
*Feb 25 09:47:56.447: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : OUT : R1#
*Feb 25 09:47:56.451: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : IN  : #exit
*Feb 25 09:47:56.455: %HA_EM-6-LOG: test_track_17 : DEBUG(cli_lib) : : CTL : cli_close called.

*Feb 25 09:47:58.387: %LINK-3-UPDOWN: Interface Serial1/0, changed state to up
*Feb 25 09:47:58.391: %ENTITY_ALARM-6-INFO: CLEAR INFO Se1/0 Physical Port Administrative State Down
*Feb 25 09:47:59.399: %LINEPROTO-5-UPDOWN: Line protocol on Interface Serial1/0, changed state to up

R1#sh track
R1#sh track 17
Track 17
  Response Time Reporter 777 state
  State is Up
    6 changes, last change 00:15:41
  Latest operation return code: OK
  Latest RTT (millisecs) 3
  Tracked by:
     applet test_track_17
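The timestamps in the step 4 log can be turned into intervals to see how long the whole chain took in this test run. The sketch below copies four timestamps from the log above; `elapsed` is a hypothetical helper, and it assumes all timestamps fall on the same day.

```python
from datetime import datetime


def elapsed(start, end, fmt="%H:%M:%S.%f"):
    """Seconds between two same-day log timestamps such as '09:47:27.911'."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


port_down = "09:47:27.911"      # %LINK-5-CHANGED, administratively down
third_timeout = "09:47:54.779"  # third consecutive timeout of operation 17
track_down = "09:47:56.231"     # Track: 17 ... state Up->Down
line_up = "09:47:59.399"        # %LINEPROTO-5-UPDOWN ... changed state to up

print(f"{elapsed(port_down, third_timeout):.3f}")  # prints 26.868
print(f"{elapsed(port_down, track_down):.3f}")     # prints 28.320
print(f"{elapsed(track_down, line_up):.3f}")       # prints 3.168
```

In this run, roughly 28 s passed between the manual shutdown and the track going Down, and the line protocol was back up about 3 s after that.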

