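When Wireshark starts, every dissector is initialized and registered. The registered information includes the protocol name, the description of each field, the keywords used in display filters, and the lower-layer protocol and port to attach to (the handoff). During dissection, each dissector parses only its own protocol's portion and then passes the encapsulated upper-layer data to the next dissector, which forms a complete dissection chain.
The chain starts with the Frame dissector, which parses the pcap frame header. Which dissector to call next is found by looking up a hash table that belongs to the current protocol; upper-layer protocols write their entries into that table when they register their handoff information.
For example, suppose the ipv4 dissector has a hash table holding entries like those in the table below. After parsing the IPv4 header, it takes the protocol number field, say 6, and looks it up in this hash table to find the next dissector, tcp.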
Protocol number | Dissector pointer |
6 | *tcp |
17 | *udp |
... |
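To make the lookup concrete, here is a minimal C++ sketch of such a chain. It is a toy model, not Wireshark code: the `Dissector` struct, the `parse_*` functions, `dissect` and `dissect_sample_frame` are all hypothetical names, and each toy dissector simply owns one integer-keyed table that upper layers fill in, in the spirit of handoff registration. The bytes are the sample frame dissected later in this article.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
struct Dissector;
// Integer-keyed table: selector (EtherType, protocol number, port) -> next dissector.
using UintTable = std::map<std::uint32_t, const Dissector*>;
// Parses one header at `off`; reports where the payload starts and the selector value.
using ParseFn = bool (*)(const Bytes& d, std::size_t off, std::size_t& next, std::uint32_t& sel);

struct Dissector {
    std::string name;
    ParseFn parse;
    UintTable subdissectors;  // filled in by upper-layer protocols ("handoff")
};

static bool parse_eth(const Bytes& d, std::size_t off, std::size_t& next, std::uint32_t& sel) {
    if (d.size() < off + 14) return false;
    sel = (static_cast<std::uint32_t>(d[off + 12]) << 8) | d[off + 13];  // EtherType
    next = off + 14;
    return true;
}

static bool parse_ipv4(const Bytes& d, std::size_t off, std::size_t& next, std::uint32_t& sel) {
    if (d.size() < off + 20 || (d[off] >> 4) != 4) return false;
    const std::size_t header_len = (d[off] & 0x0F) * 4u;
    if (header_len < 20 || d.size() < off + header_len) return false;
    sel = d[off + 9];  // Protocol field: 6 = TCP, 17 = UDP
    next = off + header_len;
    return true;
}

static bool parse_udp(const Bytes& d, std::size_t off, std::size_t& next, std::uint32_t& sel) {
    if (d.size() < off + 8) return false;
    sel = (static_cast<std::uint32_t>(d[off + 2]) << 8) | d[off + 3];  // destination port
    next = off + 8;
    return true;
}

// Leaf protocols in this toy model consume the rest of the frame.
static bool parse_leaf(const Bytes& d, std::size_t, std::size_t& next, std::uint32_t& sel) {
    sel = 0;
    next = d.size();
    return true;
}

// Walk the chain: parse, look up the selector in the current table, hand over.
std::string dissect(const Dissector& first, const Bytes& data) {
    std::string chain;
    std::size_t off = 0;
    for (const Dissector* cur = &first; cur != nullptr;) {
        std::size_t next = 0;
        std::uint32_t sel = 0;
        if (!cur->parse(data, off, next, sel)) break;
        if (!chain.empty()) chain += ':';
        chain += cur->name;
        const auto it = cur->subdissectors.find(sel);
        cur = (it == cur->subdissectors.end()) ? nullptr : it->second;
        off = next;
    }
    return chain;
}

std::string dissect_sample_frame() {
    Dissector dns{"dns", parse_leaf, {}}, tcp{"tcp", parse_leaf, {}};
    Dissector udp{"udp", parse_udp, {}}, ip{"ip", parse_ipv4, {}}, eth{"eth", parse_eth, {}};
    // "Handoff": each upper layer registers itself in its parent's table.
    udp.subdissectors[53] = &dns;
    ip.subdissectors[6] = &tcp;
    ip.subdissectors[17] = &udp;
    eth.subdissectors[0x0800] = &ip;
    const Bytes frame = {
        0x7E, 0x6D, 0x20, 0x00, 0x01, 0x00, 0x01, 0x00, 0x01, 0x00, 0x00, 0x00, 0x08, 0x00, 0x45, 0x00,
        0x00, 0x3B, 0x5F, 0x15, 0x00, 0x00, 0x40, 0x11, 0xF1, 0x51, 0x73, 0xAB, 0x4F, 0x08, 0xDB, 0x8D,
        0x8C, 0x0A, 0x9B, 0x90, 0x00, 0x35, 0x00, 0x27, 0xEF, 0x4D, 0x43, 0x07, 0x01, 0x00, 0x00, 0x01,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x74, 0x04, 0x73, 0x69, 0x6E, 0x61, 0x03, 0x63, 0x6F,
        0x6D, 0x02, 0x63, 0x6E, 0x00, 0x00, 0x01, 0x00, 0x01};
    return dissect(eth, frame);
}
```

For this frame, `dissect_sample_frame()` returns `eth:ip:udp:dns`: EtherType 0x0800 selects ip, protocol number 17 selects udp, and destination port 53 selects dns.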
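Wireshark actually has three kinds of dissector tables: string tables, integer tables, and heuristic dissector tables, as shown in the figure below:
The following uses the ip protocol as an example to walk through its registration.
The important data structures and global variables involved are as follows.
proto.c: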
/* Name hashtables for fast detection of duplicate names */
static GHashTable* proto_names = NULL;
static GHashTable* proto_short_names = NULL;
static GHashTable* proto_filter_names = NULL;
/** Register a new protocol.
@param name the full name of the new protocol
@param short_name abbreviated name of the new protocol
@param filter_name protocol name used for a display filter string
@return the new protocol handle */
int
proto_register_protocol(const char *name, const char *short_name, const char *filter_name);
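The three global hash tables hold, respectively, the protocol's full name, its short name, and the name used in display filters; as the comment says, they exist so that duplicate names can be detected quickly.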
packet.c:
struct dissector_table {
    GHashTable *hash_table;
    GSList *dissector_handles;
    const char *ui_name;
    ftenum_t type;
    int base;
};
static GHashTable *dissector_tables = NULL;
/*
* List of registered dissectors.
*/
static GHashTable *registered_dissectors = NULL;
static GHashTable *heur_dissector_lists = NULL;
/* Register a dissector by name. */
dissector_handle_t
register_dissector(const char *name, dissector_t dissector, const int proto);
/** A protocol uses this function to register a heuristic sub-dissector list.
* Call this in the parent dissectors proto_register function.
*
* @param name the name of this protocol
* @param list the list of heuristic sub-dissectors to be registered
*/
void register_heur_dissector_list(const char *name,
heur_dissector_list_t *list);
/* a protocol uses the function to register a sub-dissector table */
dissector_table_t register_dissector_table(const char *name, const char *ui_name, const ftenum_t type, const int base);
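dissector_tables is effectively a "hash table of hash tables": its keys are dissector table names (such as "ip.proto") and its values are pointers to dissector_table structures. For an integer table, the hash table inside each dissector_table is keyed by an unsigned integer stored in a pointer-sized key (for example the protocol number; the key is a pointer because the glib hash table API takes pointer arguments), and its values are dissector handles. heur_dissector_lists belongs to heuristic dissection, which is left for later study. registered_dissectors is the hash table of dissectors: its keys are dissector names (such as "ip") and its values are dissector handles.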
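The two-level layout can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `ToyRegistry`, `ToyDissectorTable` and `uint_key` are hypothetical names, dissector handles are replaced by plain name strings, and `uint_key` mimics the idea behind glib's `GUINT_TO_POINTER` (an integer packed into a pointer that is never dereferenced).

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>

// Pack an unsigned integer into a pointer-sized key; the pointer is never dereferenced.
static const void* uint_key(std::uint32_t v) {
    return reinterpret_cast<const void*>(static_cast<std::uintptr_t>(v));
}

// Inner level: one table such as "ip.proto".
struct ToyDissectorTable {
    std::string ui_name;                                    // e.g. "IP protocol"
    std::unordered_map<const void*, std::string> by_value;  // key -> dissector name
};

// Outer level: table name -> table.
class ToyRegistry {
public:
    void register_table(const std::string& name, const std::string& ui_name) {
        tables_[name].ui_name = ui_name;
    }

    // Returns false if no table with that name has been registered.
    bool add_uint(const std::string& table, std::uint32_t value, const std::string& dissector) {
        const auto it = tables_.find(table);
        if (it == tables_.end()) return false;
        it->second.by_value[uint_key(value)] = dissector;
        return true;
    }

    // Returns the dissector name, or an empty string if nothing is registered.
    std::string lookup(const std::string& table, std::uint32_t value) const {
        const auto it = tables_.find(table);
        if (it == tables_.end()) return std::string();
        const auto hit = it->second.by_value.find(uint_key(value));
        return hit == it->second.by_value.end() ? std::string() : hit->second;
    }

private:
    std::map<std::string, ToyDissectorTable> tables_;
};
```

After `register_table("ip.proto", "IP protocol")` and `add_uint("ip.proto", 17, "udp")`, a call to `lookup("ip.proto", 17)` returns `"udp"`, while an unregistered value returns an empty string.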
packet.h:
typedef struct dissector_table *dissector_table_t;
packet-ip.c:
static dissector_table_t ip_dissector_table;
In the proto_register_ip function:
proto_ip = proto_register_protocol("Internet Protocol Version 4", "IPv4", "ip");
...
/* subdissector code */
ip_dissector_table = register_dissector_table("ip.proto", "IP protocol", FT_UINT8, BASE_DEC);
register_heur_dissector_list("ip", &heur_subdissector_list);
...
register_dissector("ip", dissect_ip, proto_ip);
register_init_routine(ip_defragment_init);
ip_tap = register_tap("ip");
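register_dissector_table is defined in packet.c; it creates the hash table named "ip.proto". After the ip protocol has been parsed, this table is queried to find the next dissector, and dissection of the remaining data is handed over to it.
packet-ip.c, inside the dissect_ip function: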
dissector_try_uint_new(ip_dissector_table, nxt, next_tvb, pinfo,
parent_tree, TRUE, iph)
packet.c:
/* Look for a given value in a given uint dissector table and, if found, call the dissector with the arguments supplied, and return TRUE, otherwise return FALSE. */
gboolean
dissector_try_uint_new(dissector_table_t sub_dissectors, const guint32 uint_val, tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree, const gboolean add_proto_name, void *data)
Inside dissector_try_uint_new, the dissector handle registered for the protocol number is looked up and then used to dissect the remaining data.
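First we need to know which functions to call. By debugging a Wireshark build I compiled myself, I found that a simple protocol dissection mainly needs only the following functions (source locations are relative to the top-level directory of the Wireshark source tree):
Function | Purpose | Source location |
epan_init | Initializes the protocol dissection library | epan/epan.h |
epan_cleanup | Cleans up the protocol dissection library | epan/epan.h |
epan_dissect_new | Creates the dissection data structure edt | epan/epan.h |
epan_dissect_run | Runs the dissection | epan/epan.h |
epan_dissect_free | Destroys the dissection data structure edt | epan/epan.h |
init_dissection | Initializes packet-level dissection | epan/packet.h |
cleanup_dissection | Cleans up packet-level dissection | epan/packet.h |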
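In addition, a few helper functions exported by the DLL are needed as well, such as register_all_protocols, register_all_protocol_handoffs, proto_item_fill_label, and so on.
Beyond that, you also need to be familiar with some of the data structures involved in dissection, mainly: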
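Data structure | Purpose | Source location |
epan_dissect_t | Dissection state; holds the protocol data and the protocol tree | epan/epan.h; epan/epan_dissect.h |
field_info | Information about a protocol field | epan/proto.h |
header_field_info | Information about a protocol header field | epan/proto.h |
proto_tree/proto_node | Protocol tree | epan/proto.h |
frame_data | Information about a single frame (packet) | epan/frame_data.h |
wtap_pseudo_header | wtap pseudo-header, mainly link-layer protocol information | wiretap/wtap.h |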
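Those are the main functions and data structures. A real dissection may involve more of them, which I will not go into here; see the Wireshark source for details. If a function or part of the dissection process is unclear, you can also build Wireshark yourself and step through it in a debugger.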
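The functions in the first table come in matched pairs (init/cleanup, new/free). One way to keep the pairs balanced is a small RAII helper, sketched below. `ScopedStage` and `run_lifecycle` are hypothetical names, and the lambdas only log the name of the Wireshark function that would be called at that point; no Wireshark code is invoked.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Runs `init` on construction and `cleanup` on destruction, so nested stages
// are torn down in the reverse order of their setup.
class ScopedStage {
public:
    ScopedStage(const std::function<void()>& init, std::function<void()> cleanup)
        : cleanup_(std::move(cleanup)) {
        init();
    }
    ~ScopedStage() {
        if (cleanup_) cleanup_();
    }
    ScopedStage(const ScopedStage&) = delete;
    ScopedStage& operator=(const ScopedStage&) = delete;

private:
    std::function<void()> cleanup_;
};

// Records the order in which the paired calls would happen.
std::vector<std::string> run_lifecycle() {
    std::vector<std::string> log;
    auto step = [&log](const char* name) {
        return [&log, name] { log.push_back(name); };
    };
    {
        ScopedStage library(step("epan_init"), step("epan_cleanup"));
        ScopedStage session(step("init_dissection"), step("cleanup_dissection"));
        ScopedStage edt(step("epan_dissect_new"), step("epan_dissect_free"));
        log.push_back("epan_dissect_run");  // the actual dissection happens here
    }
    return log;
}
```

`run_lifecycle()` returns the sequence epan_init, init_dissection, epan_dissect_new, epan_dissect_run, epan_dissect_free, cleanup_dissection, epan_cleanup.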
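Once the principle and the required functions are clear, we can write the code. Environment setup and other basics were covered in the previous article of this series.
I again use a Win32 Console project for this example. The example code has two parts:
1. the Wireshark exported functions and a thin wrapper around them;
2. the actual dissection code.
The first part is split into two files, wireshark.h and wireshark.cpp. The second part is dissector.cpp, which also contains the main function.
There is not much to explain about the first part: it is mainly the declarations of the required functions plus the dynamic loading code.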
wireshark.h:
/*
* Declarations of the exported Wireshark dissection functions, plus simple wrapper functions
*
* Copyright (c) 2013 赵子清, All rights reserved.
*
*/
#ifndef __WIRESHARK_H__
#define __WIRESHARK_H__
// see \wireshark-1.8.4\CMakeLists.txt, #481
#define WS_VAR_IMPORT __declspec(dllimport) extern
// see \wireshark-1.8.4\CMakeLists.txt, #482
#define WS_MSVC_NORETURN __declspec(noreturn)
#ifdef TRY
#undef TRY
#endif
#ifdef CATCH
#undef CATCH
#endif
#ifdef CATCH_ALL
#undef CATCH_ALL
#endif
#ifdef THROW
#undef THROW
#endif
// Wireshark source headers
#include "epan/epan.h"
#include "epan/epan_dissect.h"
#include "epan/proto.h"
#include "epan/packet_info.h"
#include "epan/frame_data.h"
#include "epan/packet.h"
#include <windows.h> // HINSTANCE, HMODULE, BOOL, TCHAR
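// Return FALSE from the enclosing function if x is false; wrapped so it behaves as one statement.
#define CHECK(x) do { if(!(x)) return FALSE; } while(0)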
/* \register.h -------------------------------------------------------------------------*/
typedef void (*register_cb) (register_action_e action, const char *message, gpointer client_data);
typedef void (*f_register_all_protocols) (register_cb cb, gpointer client_data);
typedef void (*f_register_all_protocol_handoffs) (register_cb cb, gpointer client_data);
typedef void (*f_register_all_tap_listeners)(void);
/*--------------------------------------------------------------------------------------*/
/* \epan\packet.h ----------------------------------------------------------------------*/
typedef void (*f_init_dissection) (void);
typedef void (*f_cleanup_dissection) (void);
/*--------------------------------------------------------------------------------------*/
/* \epan\epan.h -------------------------------------------------------------------------*/
typedef void (*f_epan_init) (void (*register_all_protocols)(register_cb cb, gpointer client_data),
void (*register_all_handoffs)(register_cb cb, gpointer client_data),
register_cb cb,
void *client_data,
void (*report_failure)(const char *, va_list),
void (*report_open_failure)(const char *, int, gboolean),
void (*report_read_failure)(const char *, int));
typedef void (*f_epan_cleanup) (void);
typedef epan_dissect_t* (*f_epan_dissect_new) (gboolean create_proto_tree,
gboolean proto_tree_visible);
typedef void (*f_epan_dissect_run) (epan_dissect_t *edt, void* pseudo_header,
const guint8* data, frame_data *fd, column_info *cinfo);
typedef void (*f_epan_dissect_free) (epan_dissect_t* edt);
typedef void (*f_epan_dissect_fill_in_columns) (epan_dissect_t *edt);
/*--------------------------------------------------------------------------------------*/
/* \epan\proto.h -----------------------------------------------------------------------*/
typedef void (*f_proto_item_fill_label) (field_info *fi, gchar *label_str);
/*--------------------------------------------------------------------------------------*/
extern f_epan_init ws_epan_init;
extern f_epan_cleanup ws_epan_cleanup;
extern f_register_all_protocols ws_register_all_protocols;
extern f_register_all_protocol_handoffs ws_register_all_protocol_handoffs;
extern f_init_dissection ws_init_dissection;
extern f_cleanup_dissection ws_cleanup_dissection;
extern f_epan_dissect_new ws_epan_dissect_new;
extern f_epan_dissect_run ws_epan_dissect_run;
extern f_epan_dissect_free ws_epan_dissect_free;
extern f_proto_item_fill_label ws_proto_item_fill_label;
HINSTANCE LoadWiresharkDLL(const TCHAR* szDLLPath);
BOOL FreeWiresharkDLL(HMODULE hModule);
BOOL GetWiresharkFunctions(HMODULE hDLL);
#endif /* __WIRESHARK_H__ */
wireshark.cpp:
/*
* Declarations of the exported Wireshark dissection functions, plus simple wrapper functions
*
* Copyright (c) 2013 赵子清, All rights reserved.
*
*/
#include "wireshark.h"
f_epan_init ws_epan_init;
f_epan_cleanup ws_epan_cleanup;
f_register_all_protocols ws_register_all_protocols;
f_register_all_protocol_handoffs ws_register_all_protocol_handoffs;
f_init_dissection ws_init_dissection;
f_cleanup_dissection ws_cleanup_dissection;
f_epan_dissect_new ws_epan_dissect_new;
f_epan_dissect_run ws_epan_dissect_run;
f_epan_dissect_free ws_epan_dissect_free;
f_proto_item_fill_label ws_proto_item_fill_label;
HINSTANCE LoadWiresharkDLL(const TCHAR* szDLLPath)
{
    return ::LoadLibrary(szDLLPath);
}
BOOL FreeWiresharkDLL(HMODULE hModule)
{
    return ::FreeLibrary(hModule);
}
// Resolve every required export; return FALSE as soon as one is missing.
BOOL GetWiresharkFunctions(HMODULE hDLL)
{
    CHECK(ws_epan_init = (f_epan_init)::GetProcAddress(hDLL, "epan_init"));
    CHECK(ws_epan_cleanup = (f_epan_cleanup)::GetProcAddress(hDLL, "epan_cleanup"));
    CHECK(ws_register_all_protocols = (f_register_all_protocols)
        ::GetProcAddress(hDLL, "register_all_protocols"));
    CHECK(ws_register_all_protocol_handoffs = (f_register_all_protocol_handoffs)
        ::GetProcAddress(hDLL, "register_all_protocol_handoffs"));
    CHECK(ws_init_dissection = (f_init_dissection)::GetProcAddress(hDLL, "init_dissection"));
    CHECK(ws_cleanup_dissection = (f_cleanup_dissection)::GetProcAddress(hDLL, "cleanup_dissection"));
    CHECK(ws_epan_dissect_new = (f_epan_dissect_new)::GetProcAddress(hDLL, "epan_dissect_new"));
    CHECK(ws_epan_dissect_run = (f_epan_dissect_run)::GetProcAddress(hDLL, "epan_dissect_run"));
    CHECK(ws_epan_dissect_free = (f_epan_dissect_free)::GetProcAddress(hDLL, "epan_dissect_free"));
    CHECK(ws_proto_item_fill_label = (f_proto_item_fill_label)::GetProcAddress(hDLL, "proto_item_fill_label"));
    return TRUE;
}
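The pattern in GetWiresharkFunctions (resolve each export by name, cast it to a typed function pointer, and give up as soon as one is missing) can also be written with a small template instead of a macro. The sketch below is portable and illustrative only: `Resolver`, `resolve`, `Api`, `load_api` and `fake_resolver` are hypothetical names, and the fake resolver stands in for a wrapper around `::GetProcAddress(hDLL, name)`.

```cpp
#include <functional>
#include <map>
#include <string>

// A generic function pointer, playing the role of FARPROC.
using GenericFn = void (*)();
// Looks up an export by name; returns nullptr when it is missing.
using Resolver = std::function<GenericFn(const char*)>;

// Resolve one export and store it with its real type; false if it is missing.
template <typename Fn>
bool resolve(const Resolver& r, const char* name, Fn& out) {
    out = reinterpret_cast<Fn>(r(name));
    return out != nullptr;
}

// The table of typed function pointers the program wants to fill.
struct Api {
    int (*add)(int, int) = nullptr;
    int (*neg)(int) = nullptr;
};

// All-or-nothing loading: && stops at the first missing export.
bool load_api(const Resolver& r, Api& api) {
    return resolve(r, "add", api.add) && resolve(r, "neg", api.neg);
}

// Stand-in for a DLL export table, so the sketch runs without any DLL.
static int add_impl(int a, int b) { return a + b; }
static int neg_impl(int a) { return -a; }

GenericFn fake_resolver(const char* name) {
    static const std::map<std::string, GenericFn> exports = {
        {"add", reinterpret_cast<GenericFn>(&add_impl)},
        {"neg", reinterpret_cast<GenericFn>(&neg_impl)},
    };
    const auto it = exports.find(name);
    return it == exports.end() ? nullptr : it->second;
}
```

With `Api api; load_api(fake_resolver, api);` the call succeeds and `api.add(2, 3)` returns 5; a resolver that lacks `neg` makes `load_api` return false.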
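The following code calls the Wireshark dissection library to dissect a block of data. As the comment says, the data is a packet I captured with Wireshark while browsing the web. After dissection, the result is printed to the console.
The main flow is:
dynamically load the required Wireshark functions -> initialize the dissection library -> dissect the data -> print the result to the console by protocol layer -> clean up the dissection library.
The dissection result is essentially a tree, so I wrote a recursive function, print_tree, to traverse it.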
/*
* Dissect data by calling the Wireshark dissection library
*
* Copyright (c) 2013 赵子清, All rights reserved.
*
*/
#include "wireshark.h"
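#include <stdio.h>  // printf, fprintf
#include <stdlib.h> // system
#include <string.h> // memset, strcpy_s
#include <tchar.h>  // _T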
#define DATA_LEN 73
#define WIRESHARK_DLL_PATH _T("E:\\dev\\wireshark-1.8.4\\release\\libwireshark.dll")
// Frame data, without the pcap file header and per-packet header
// The data is ethernet - ipv4 - udp - DNS, captured at random while browsing.
const guchar data[DATA_LEN] =
{
0x7E, 0x6D, 0x20, 0x00, 0x01, 0x00, 0x01, 0x00, 0x01, 0x00, 0x00, 0x00, 0x08, 0x00, 0x45, 0x00,
0x00, 0x3B, 0x5F, 0x15, 0x00, 0x00, 0x40, 0x11, 0xF1, 0x51, 0x73, 0xAB, 0x4F, 0x08, 0xDB, 0x8D,
0x8C, 0x0A, 0x9B, 0x90, 0x00, 0x35, 0x00, 0x27, 0xEF, 0x4D, 0x43, 0x07, 0x01, 0x00, 0x00, 0x01,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x74, 0x04, 0x73, 0x69, 0x6E, 0x61, 0x03, 0x63, 0x6F,
0x6D, 0x02, 0x63, 0x6E, 0x00, 0x00, 0x01, 0x00, 0x01
};
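void print_tree(proto_tree* tree, int level)
{
    if(tree == NULL)
        return;

    // Use the cached representation if there is one; otherwise build the label.
    gchar field_str[ITEM_LABEL_LENGTH + 1] = {0};
    if(tree->finfo->rep == NULL)
        ws_proto_item_fill_label(tree->finfo, field_str);
    else
        strcpy_s(field_str, tree->finfo->rep->representation);

    // Indent by depth and print, skipping hidden items.
    if(!PROTO_ITEM_IS_HIDDEN(tree))
    {
        for(int i=0; i<level; ++i)
            printf("    ");
        printf("%s\n", field_str);
    }

    print_tree(tree->first_child, level+1);
    print_tree(tree->next, level);
}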
void try_dissect()
{
    frame_data *fdata;
    epan_dissect_t *edt;
    union wtap_pseudo_header pseudo_header;
    pseudo_header.eth.fcs_len = -1;

    // Describe the single frame to be dissected.
    fdata = (frame_data*)g_new(frame_data, 1);
    memset(fdata, 0, sizeof(frame_data));
    fdata->pfd = NULL;
    fdata->num = 1;
    fdata->interface_id = 0;
    fdata->pkt_len = DATA_LEN;
    fdata->cap_len = DATA_LEN;
    fdata->cum_bytes = 0;
    fdata->file_off = 0;
    fdata->subnum = 0;
    fdata->lnk_t = WTAP_ENCAP_ETHERNET;
    fdata->flags.encoding = PACKET_CHAR_ENC_CHAR_ASCII;
    fdata->flags.visited = 0;
    fdata->flags.marked = 0;
    fdata->flags.ref_time = 0;
    fdata->color_filter = NULL;
    fdata->abs_ts.secs = 0;
    fdata->abs_ts.nsecs = 0;
    fdata->opt_comment = NULL;

    // Dissect, print the resulting protocol tree, then release everything.
    edt = ws_epan_dissect_new(TRUE, TRUE);
    ws_epan_dissect_run(edt, &pseudo_header, data, fdata, NULL);
    print_tree(edt->tree->first_child, 0);
    ws_epan_dissect_free(edt);
    g_free(fdata);
}
int main(int argc, char** argv)
{
    HINSTANCE hDLL = NULL;
    BOOL ret = FALSE;

    hDLL = LoadWiresharkDLL(WIRESHARK_DLL_PATH);
    if(hDLL)
    {
        ret = GetWiresharkFunctions(hDLL);
        if(ret)
        {
            ws_epan_init(ws_register_all_protocols, ws_register_all_protocol_handoffs,
                NULL, NULL, NULL, NULL, NULL);
            ws_init_dissection();
            try_dissect();
            ws_cleanup_dissection();
            ws_epan_cleanup();
        }
        else
            fprintf(stderr, "Failed to get some exported functions!\n");
        FreeWiresharkDLL(hDLL);
    }
    else
        fprintf(stderr, "Failed to load the DLL!\n");
    system("PAUSE");
    return 0;
}
Compiling and running the code above prints the following dissection result to the console:
Frame 1: 73 bytes on wire (584 bits), 73 bytes captured (584 bits)
WTAP_ENCAP: 1
Frame Number: 1
Frame Length: 73 bytes (584 bits)
Capture Length: 73 bytes (584 bits)
Frame is marked: False
Frame is ignored: False
Protocols in frame: eth:ip:udp:dns
Ethernet II, Src: 01:00:01:00:00:00 (01:00:01:00:00:00), Dst: 7e:6d:20:00:01:00 (7e:6d:20:00:01:00)
Destination: 7e:6d:20:00:01:00 (7e:6d:20:00:01:00)
Address: 7e:6d:20:00:01:00 (7e:6d:20:00:01:00)
.... ..1. .... .... .... .... = LG bit: Locally administered address (this is NOT the factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: 01:00:01:00:00:00 (01:00:01:00:00:00)
Expert Info (Warn/Protocol): Source MAC must not be a group address: IEEE 802.3-2002, Section 3.2.3(b)
Message: Source MAC must not be a group address: IEEE 802.3-2002, Section 3.2.3(b)
Severity level: Warn
Group: Protocol
Address: 01:00:01:00:00:00 (01:00:01:00:00:00)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...1 .... .... .... .... = IG bit: Group address (multicast/broadcast)
Type: IP (0x0800)
Internet Protocol Version 4, Src: 115.171.79.8 (115.171.79.8), Dst: 219.141.140.10 (219.141.140.10)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
0000 00.. = Differentiated Services Codepoint: Default (0x00)
.... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
Total Length: 59
Identification: 0x5f15 (24341)
Flags: 0x00
0... .... = Reserved bit: Not set
.0.. .... = Don't fragment: Not set
..0. .... = More fragments: Not set
Fragment offset: 0
Time to live: 64
Protocol: UDP (17)
Header checksum: 0xf151 [correct]
Good: True
Bad: False
Source: 115.171.79.8 (115.171.79.8)
Destination: 219.141.140.10 (219.141.140.10)
Source GeoIP: Unknown
Destination GeoIP: Unknown
User Datagram Protocol, Src Port: 39824 (39824), Dst Port: 53 (53)
Source port: 39824 (39824)
Destination port: 53 (53)
Length: 39
Checksum: 0xef4d [validation disabled]
Good Checksum: False
Bad Checksum: False
Domain Name System (query)
Transaction ID: 0x4307
Flags: 0x0100 Standard query
0... .... .... .... = Response: Message is a query
.000 0... .... .... = Opcode: Standard query (0)
.... ..0. .... .... = Truncated: Message is not truncated
.... ...1 .... .... = Recursion desired: Do query recursively
.... .... .0.. .... = Z: reserved (0)
.... .... ...0 .... = Non-authenticated data: Unacceptable
Questions: 1
Answer RRs: 0
Authority RRs: 0
Additional RRs: 0
Queries
t.sina.com.cn: type A, class IN
Name: t.sina.com.cn
Type: A (Host address)
Class: IN (0x0001)
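print_tree walks a tree in which every node points to its first child and to its next sibling. The traversal itself can be sketched without Wireshark as follows; `Node`, `render` and `render_sample` are hypothetical names, the `hidden` flag stands in for the PROTO_ITEM_IS_HIDDEN check, and the labels are copied from the output above.

```cpp
#include <cstddef>
#include <string>

// First-child / next-sibling tree node, a simplified stand-in for proto_node.
struct Node {
    std::string label;
    bool hidden;
    const Node* first_child;
    const Node* next;
};

// Appends one line per visible node, indented two spaces per level.
// Siblings are handled by the loop, children by recursion.
void render(const Node* node, int level, std::string& out) {
    for (; node != nullptr; node = node->next) {
        if (!node->hidden) {
            out.append(static_cast<std::size_t>(level) * 2, ' ');
            out += node->label;
            out += '\n';
        }
        render(node->first_child, level + 1, out);
    }
}

// Builds a tiny tree by hand and renders it.
std::string render_sample() {
    const Node type{"Type: IP (0x0800)", false, nullptr, nullptr};
    const Node hidden{"(hidden item)", true, nullptr, &type};
    const Node ip{"Internet Protocol Version 4", false, nullptr, nullptr};
    const Node eth{"Ethernet II", false, &hidden, &ip};
    const Node frame{"Frame 1", false, nullptr, &eth};
    std::string out;
    render(&frame, 0, out);
    return out;
}
```

`render_sample()` produces four lines: `Frame 1`, `Ethernet II`, an indented `Type: IP (0x0800)`, and `Internet Protocol Version 4`; the hidden item is skipped. Looping over siblings instead of recursing on `next` is a design choice that keeps the recursion depth equal to the tree depth, even when a node has many siblings.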
We can of course also show the dissection result to the user in our own GUI, using a TreeCtrl, like this (a screenshot of a tool I wrote):