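A recent task at work involved using SNMP to retrieve the uptime of network devices. At first it looked like simple legwork: contact the vendor -> ask for the relevant OID -> test -> report back. I expected to finish in an afternoon, but a "mysterious problem" that surfaced during testing dragged it out to two or three days. Here is a brief record of the investigation.
1. Problem description
According to Cisco's official website, the sysUpTime OID is the same across all series of Cisco IOS routers and switches: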
Name | Value |
---|---|
Object | sysUpTime |
OID | 1.3.6.1.2.1.1.3 |
Type | TimeTicks |
Permission | read-only |
MIB | SNMPv2-MIB |
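I tested a recently deployed device with an SNMP tool, and the value returned for this OID matched the uptime shown by show version on the device itself. That should have been the end of it, but on a whim I picked a 6509 that had gone into service a few years earlier and tested that as well. The result: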
sysUpTime.0 = Timeticks: (254716234) 29 days, 11:32:42.34
I was stunned.
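For reference, the number in parentheses is the raw tick count in hundredths of a second, and the text after it is the same value rendered as days and time of day. A minimal sketch of that conversion follows; the function name `format_timeticks` is made up for illustration and is not part of any SNMP tool.

```python
def format_timeticks(ticks: int) -> str:
    """Render a TimeTicks value (hundredths of a second) as 'D days, HH:MM:SS.cc'."""
    seconds, centis = divmod(ticks, 100)    # split off the hundredths
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return f"{days} days, {hours:02d}:{minutes:02d}:{seconds:02d}.{centis:02d}"


# Raw value taken from the SNMP output above
print(format_timeticks(254716234))  # 29 days, 11:32:42.34
```

So the tool was decoding the value correctly; the 29 days really were what the device reported.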
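2. Troubleshooting
My first reaction was that the OID was wrong and I was reading some other timestamp, such as the time of the last configuration change. But after checking every time-related value I could think of and find, none of them looked anything like the one above.
Looking more closely, the MIB given on the official site is SNMPv2-MIB, the object is a standard OID defined in RFC 3418, and the value type is "TimeTicks":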
sysUpTime OBJECT-TYPE
SYNTAX TimeTicks
MAX-ACCESS read-only
STATUS current
DESCRIPTION
"The time (in hundredths of a second) since the
network management portion of the system was last
re-initialized."
::= { system 3 }
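I don't know what motivated a resolution of 1/100 of a second, but a clue is emerging: this thing is an unsigned 32-bit integer. As our teachers drilled into us, always work out the range of the data you are handed; any computer representation without a stated maximum, minimum, and precision is asking for trouble. A rough estimate: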
![][1]
[1]: http://latex.codecogs.com/gif.latex?\frac{2^{32}}{100\times&space;24\times&space;60&space;\times&space;60}=497.1026963=497&space;days,02:27:53
Add the 29 days from earlier, then check show version:
uptime is 1 year, 23 weeks, 13 hours, 32 minutes
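365 + 23*7 = 526 days, and 497 + 29 = 526, so the day count matches. But why is the time of day off?
I turned it over for a while...
The roughly 28-minute difference (14:00 from the sum versus 13:32 from show version) is simply the time gap between running the two commands!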
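The arithmetic above can be checked with a short script. This is only a sketch under one assumption, namely that the counter wrapped exactly once; the helper `ticks_to_timedelta` and the constant names are mine, not from any SNMP library.

```python
from datetime import timedelta

WRAP_TICKS = 2 ** 32            # TimeTicks is taken modulo 2^32
OBSERVED_TICKS = 254716234      # raw sysUpTime value from the 6509


def ticks_to_timedelta(ticks: int) -> timedelta:
    """Convert hundredths of a second to a timedelta without float rounding."""
    return timedelta(seconds=ticks // 100, milliseconds=(ticks % 100) * 10)


# How long it takes the counter to wrap around once
wrap_period = ticks_to_timedelta(WRAP_TICKS)
print(wrap_period)    # 497 days, 2:27:52.960000

# Uptime implied by the SNMP reading, ASSUMING exactly one wrap has occurred
actual_uptime = ticks_to_timedelta(WRAP_TICKS + OBSERVED_TICKS)
print(actual_uptime)  # 526 days, 14:00:35.300000

# "1 year, 23 weeks" from show version, counting the year as 365 days
show_version_days = 365 + 23 * 7
print(show_version_days == actual_uptime.days)  # True
```

The remaining difference between 14:00 and the 13 hours, 32 minutes reported by show version is the interval between the two commands.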
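3. Understanding why
As the saying goes, when in doubt, ask Google. In networking, though, the RFCs are your last line of defense and your ultimate weapon.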
TimeTicks
7.1.8. TimeTicks
The TimeTicks type represents a non-negative integer which represents
the time, modulo 2^32 (4294967296 decimal), in hundredths of a second
between two epochs. When objects are defined which use this ASN.1
type, the description of the object identifies both of the reference
epochs.
For example, [3] defines the TimeStamp textual convention which is
based on the TimeTicks type. With a TimeStamp, the first reference
epoch is defined as the time when sysUpTime [5] was zero, and the
second reference epoch is defined as the current value of sysUpTime.
The TimeTicks type may not be sub-typed.
Now look at this one:
Counter32
7.1.6. Counter32
The Counter32 type represents a non-negative integer which
monotonically increases until it reaches a maximum value of 2^32-1
(4294967295 decimal), when it wraps around and starts increasing
again from zero.
Counters have no defined "initial" value, and thus, a single value of
a Counter has (in general) no information content. Discontinuities
in the monotonically increasing value normally occur at re-
initialization of the management system, and at other times as
specified in the description of an object-type using this ASN.1 type.
If such other times can occur, for example, the creation of an object
instance at times other than re-initialization, then a corresponding
object should be defined, with an appropriate SYNTAX clause, to
indicate the last discontinuity. Examples of appropriate SYNTAX
clause include: TimeStamp (a textual convention defined in [3]),
DateAndTime (another textual convention from [3]) or TimeTicks.
The value of the MAX-ACCESS clause for objects with a SYNTAX clause
value of Counter32 is either "read-only" or "accessible-for-notify".
A DEFVAL clause is not allowed for objects with a SYNTAX clause value
of Counter32.
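So this behavior is defined by the standard; it is not a bug. TimeTicks is taken modulo 2^32, so after reaching 2^32 - 1 the value wraps to 0 and starts counting again. I ran around in circles for nothing. In the end, I simply did not know the material well enough.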
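One practical consequence of the modulo 2^32 definition is that the interval between two readings can still be computed across a wrap. The sketch below assumes at most one wrap and no reboot between the two samples (the raw value alone cannot tell a wrap from a reboot); the function name and the sample values are hypothetical.

```python
WRAP_TICKS = 2 ** 32  # TimeTicks is taken modulo 2^32


def elapsed_ticks(earlier: int, later: int) -> int:
    """Ticks elapsed between two sysUpTime samples.

    Assumes at most one wrap and no reboot between the samples.
    """
    # Python's % always returns a non-negative result for a positive modulus,
    # so a wrapped (smaller) later value still yields the right difference.
    return (later - earlier) % WRAP_TICKS


# Hypothetical samples taken just before and just after a wrap
before_wrap = 4294967000
after_wrap = 704
print(elapsed_ticks(before_wrap, after_wrap))  # 1000, i.e. 10 seconds
```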
Update:
While writing this post, I found that there is further discussion of this issue on the official forum.