Description of the capture file format

That's the file header.  Many capture file formats have such a header.

The header of libpcap-format files (as used by tcpdump, Ethereal,
Analyzer, and a number of other programs) contains:

 a 32-bit "magic number";

 a 16-bit major version number;

 a 16-bit minor version number;

 an unused 32-bit time zone offset field;

 an unused 32-bit time stamp accuracy field;

 a 32-bit field giving the maximum length of the saved data in
     packets;

 a 32-bit field giving the link-layer type of the packets in the
     capture.

All numbers are in the same byte order, which is typically the byte
order of the machine that wrote the capture file.

The magic number has the value hex A1B2C3D4.  On a big-endian machine,
such as a SPARC machine, the four bytes of that number are A1, B2, C3,
and D4, in order.  On a little-endian machine, such as a PC, the four
bytes of that number are D4, C3, B2, and A1, in order.
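
As a quick illustration of the two byte layouts described above, here is a
minimal sketch using Python's standard "struct" module.  The variable names
are chosen only for this example.

```python
import struct

MAGIC = 0xA1B2C3D4

# Bytes as a big-endian writer (e.g. a SPARC machine) would store the number.
big_endian_bytes = struct.pack(">I", MAGIC)
# Bytes as a little-endian writer (e.g. a PC) would store the number.
little_endian_bytes = struct.pack("<I", MAGIC)

print(big_endian_bytes.hex())     # a1b2c3d4
print(little_endian_bytes.hex())  # d4c3b2a1
```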

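(Big-endian and little-endian are two ways of laying out data in a file.  The word "endian" comes from Gulliver's Travels: the civil war in Lilliput began over whether an egg should be cracked at the big end (Big-Endian) or the little end (Little-Endian), which led to six rebellions in which one emperor lost his life and another his crown.  In Chinese, "endian" is usually translated as "字节序" (byte order), and big endian and little endian are called "大尾" and "小尾".)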

That number serves two purposes:

 1) it indicates that the file is a libpcap-format file;

 2) it indicates the byte order of the numbers in the file header
    and the header written in front of the packet data.

If, when a program or library routine reads the file header, the number
is hex A1B2C3D4, the other numbers in the header are in the byte order
of the machine reading the file, and do not need to be byte-swapped.
If, however, it's D4C3B2A1, they're in the opposite byte order of the
machine reading the file, so the program or library routine needs to
byte-swap them.
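
The check above can be sketched roughly as follows.  This is only an
illustration, not libpcap's own code: it reads the magic number in a fixed
(little-endian) order rather than the host's order, which comes to the same
decision, and the function name, dictionary keys, and sample values are made
up for the example.  The time zone offset is read as a signed number, which
is an assumption of this sketch.

```python
import struct

MAGIC = 0xA1B2C3D4
SWAPPED_MAGIC = 0xD4C3B2A1
FILE_HEADER_SIZE = 24  # 4 + 2 + 2 + 4 + 4 + 4 + 4 bytes


def parse_file_header(data):
    """Return (byte order prefix, header fields) for a per-file header."""
    if len(data) < FILE_HEADER_SIZE:
        raise ValueError("file header is truncated")
    # Read the magic number as little-endian first, then see which way it came out.
    (magic,) = struct.unpack("<I", data[:4])
    if magic == MAGIC:
        order = "<"  # the file was written in little-endian order
    elif magic == SWAPPED_MAGIC:
        order = ">"  # the bytes are reversed, so the file is big-endian
    else:
        raise ValueError("not a libpcap-format file")
    major, minor, tz_offset, ts_accuracy, snaplen, linktype = struct.unpack(
        order + "HHiIII", data[4:FILE_HEADER_SIZE]
    )
    return order, {
        "major": major,
        "minor": minor,
        "tz_offset": tz_offset,
        "ts_accuracy": ts_accuracy,
        "snaplen": snaplen,
        "linktype": linktype,
    }


# A made-up header as a big-endian machine would write it
# (version 2.4, snapshot length 65535, link-layer type 1).
sample = struct.pack(">IHHiIII", MAGIC, 2, 4, 0, 0, 65535, 1)
order, header = parse_file_header(sample)
print(order, header["major"], header["minor"], header["snaplen"])
```

Once the byte order is known, the same prefix can be reused for every
per-packet header in the file.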

The current major and minor version numbers for libpcap-format files are
2 and 4, respectively.

The two unused fields are set to 0 by libpcap (as used by tcpdump and
many other programs) and the internal library Ethereal uses to write
capture files.  I don't know whether they were ever used.

The maximum length of the saved data in packets is the "snapshot length"
specified when the capture was done, e.g. with "-s" for tcpdump or
Tethereal, and "-s" or the appropriate dialog box option for Ethereal,
causing no more than that many bytes of packet data to be saved to the
file.

The link-layer type is a number specifying the type of link-layer
headers in the capture, e.g. Ethernet, FDDI, Token Ring, etc.

Following that header is a sequence of records, one per packet.  Each
record consists of a per-packet header followed by the raw packet data.

The per-packet header contains:

 a time stamp, consisting of 2 32-bit numbers, giving the time
 the packet arrived, in seconds since January 1, 1970, 00:00:00
 GMT in the first number, and microseconds since the second in
 question in the second number;

 a 32-bit number giving the number of bytes of data for that
 packet that are in the file (the frame length stored in the
 capture file);

 a 32-bit number giving the number of bytes of data that were in
 the packet (the frame length on the wire) - this could be larger
 than the previous number if the packet was cut off at the
 snapshot length.
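
A rough sketch of walking the records, assuming the byte order has already
been worked out from the magic number and is passed in as a "struct" prefix
("<" or ">").  The function name and the sample record are invented for the
example, and the 24 zero bytes only stand in for a real per-file header.

```python
import struct

FILE_HEADER_SIZE = 24
PACKET_HEADER_SIZE = 16  # four 32-bit numbers


def iter_records(data, order, offset=FILE_HEADER_SIZE):
    """Yield (ts_sec, ts_usec, captured_len, original_len, packet_bytes)."""
    while offset + PACKET_HEADER_SIZE <= len(data):
        ts_sec, ts_usec, captured_len, original_len = struct.unpack_from(
            order + "IIII", data, offset
        )
        offset += PACKET_HEADER_SIZE
        packet = data[offset:offset + captured_len]
        if len(packet) < captured_len:
            raise ValueError("packet data is truncated")
        yield ts_sec, ts_usec, captured_len, original_len, packet
        offset += captured_len


# Placeholder per-file header followed by one made-up record: 4 bytes were
# saved out of a packet that was 60 bytes long on the wire.
placeholder_header = bytes(FILE_HEADER_SIZE)
record = struct.pack("<IIII", 1000, 500, 4, 60) + b"\xde\xad\xbe\xef"
records = list(iter_records(placeholder_header + record, "<"))
print(records)
```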

So the data before the first MAC address, in an Ethernet capture,
consists of *two separate* pieces:

 1) the per-file header;

 2) the per-packet header.

There is only one per-file header, at the beginning of the file.  There
is one per-packet header before *each* packet's data.
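
To make the layout concrete, the sketch below computes where each packet's
data starts, given the 24-byte per-file header and the 16-byte per-packet
headers implied by the field sizes listed earlier.  The function name and
the packet lengths are hypothetical.

```python
FILE_HEADER_SIZE = 24
PACKET_HEADER_SIZE = 16


def packet_data_offsets(captured_lengths):
    """Return the file offset at which each packet's data begins."""
    offsets = []
    position = FILE_HEADER_SIZE  # the per-file header appears only once
    for length in captured_lengths:
        position += PACKET_HEADER_SIZE  # one per-packet header before each packet
        offsets.append(position)
        position += length
    return offsets


offsets = packet_data_offsets([60, 1514, 42])
print(offsets)  # [40, 116, 1646]
```

In an Ethernet capture, the first MAC address of the first packet would
therefore sit at offset 40.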

Note that libpcap includes routines to read and write these files, so
one rarely needs to know the details of this - if you want to write a
program to read or write those files, you should try to use the libpcap
routines to read them ("pcap_open_offline()", "pcap_loop()",
"pcap_close()") or to write them ("pcap_dump_open()", "pcap_dump()",
"pcap_dump_close()") if you can.
 
