[Translation] Improve Cuckoo’s Ability Of Analyzing Network Traffic

Original article:
https://github.com/cssaheel/dissectors/blob/master/documentation.pdf

1.1 Introduction

Cuckoo Sandbox is an automated malware analysis system developed by Claudio Guarnieri. It is a lightweight solution that performs automated dynamic analysis of the Windows binaries submitted to it, and it returns comprehensive reports on key API calls and network activity. This documentation introduces a library that processes network capture files (PCAP, or packet capture, files) and returns a report of the results. The library dissects packet fields and extracts as much information as possible from the network packets. It also performs TCP reassembly and, beyond that, can recover files downloaded over HTTP and FTP as well as emails sent over SMTP.

1.2 Description

This library depends on the Scapy library. The supported protocols are TCP, UDP, ICMP, DNS, SMB, HTTP, FTP, IRC, SIP, TELNET, SSH, SMTP, IMAP and POP. Although the first five protocols are already supported by Scapy, this library provides its own interface to them. The figure in the original document shows the structure of the library, which has the following parts:

1- The main component of the library is the dissector. It receives the path to a pcap file and returns a dictionary, keyed by supported protocol, that holds the dissected packets. This component also decides how the data is represented and is responsible for importing the Scapy classes and the library's own classes. In addition, it preprocesses the TCP sequence numbers and implements TCP reassembly.

2- The protocol files. Each file has one or more classes that are responsible for dissecting the packets of the corresponding protocol.

3- A set of Scapy classes used by this library: the Packet class, which is inherited by the "Protocols classes", and the Field class, which is inherited by the "Fields classes". The library also uses rdpcap, which takes the path to a pcap file and returns a list of packets.
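
The TCP reassembly done by the dissector component can be illustrated without Scapy. The following is a minimal sketch and not the library's actual code: the function name reassemble and the (flow, seq, payload) tuple layout are assumptions made for this example, and sequence number wraparound is ignored. It orders the segments of each flow by sequence number, ignores retransmissions, trims overlapping bytes and stops at the first gap.

```python
from collections import defaultdict


def reassemble(segments):
    """Rebuild the payload of each TCP flow from (flow, seq, payload) tuples.

    flow is any hashable key, for example (src, sport, dst, dport).
    This layout is hypothetical and only serves the example.
    """
    streams = defaultdict(dict)
    for flow, seq, payload in segments:
        if payload:
            # keep the first copy of a sequence number, so that
            # retransmitted segments are ignored
            streams[flow].setdefault(seq, payload)

    result = {}
    for flow, chunks in streams.items():
        base = min(chunks)  # treat the lowest sequence number as offset 0
        data = b""
        for seq in sorted(chunks):
            offset = seq - base
            if offset > len(data):
                break  # a segment is missing, stop at the gap
            # skip the bytes that overlap with data already collected
            data += chunks[seq][len(data) - offset:]
        result[flow] = data
    return result


flow = ("10.0.0.1", 1234, "10.0.0.2", 80)
segments = [
    (flow, 1004, b"/ind"),  # arrives out of order
    (flow, 1000, b"GET "),
    (flow, 1004, b"/ind"),  # retransmission
    (flow, 1008, b"ex"),
]
print(reassemble(segments)[flow])
```

With real captures the tuples would be built from the packets returned by rdpcap; that step is left out here to keep the sketch independent of Scapy.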

1.3 General Protocol File Structure

Future development does not require deep knowledge of Scapy, since this library does not use Scapy's advanced features. Below is the simplest (pseudo code) form of the protocol file structure I followed in this library, annotated with some comments:

from scapy.packet import Packet, bind_layers
from scapy.fields import StrField
from scapy.layers.inet import TCP

# FTPDataField, FTPResField, FTPResArgField and FTPReqField are custom
# "field classes"; FTPReqField is shown further below


class FTPData(Packet):
    """
    class for dissecting the ftp data
    @attention: it inherits Packet class from Scapy library
    """
    name = "ftp"
    fields_desc = [FTPDataField("data", "")]


class FTPResponse(Packet):
    """
    class for dissecting the ftp responses
    @attention: it inherits Packet class from Scapy library
    """
    name = "ftp"
    fields_desc = [FTPResField("command", "", "H"),
                   FTPResArgField("argument", "", "H")]


class FTPRequest(Packet):
    """
    class for dissecting the ftp requests
    @attention: it inherits Packet class from Scapy library
    """
    name = "ftp"
    fields_desc = [FTPReqField("command", "", "H"),
                   StrField("argument", "", "H")]

bind_layers(TCP, FTPResponse, sport=21)
bind_layers(TCP, FTPRequest, dport=21)
# the ftp data connection uses port 20 in either direction
bind_layers(TCP, FTPData, dport=20)
bind_layers(TCP, FTPData, sport=20)

Are we done with the protocol file? Not yet. As you can see in the previous code, fields_desc uses a class named FTPReqField. This is a "Field class", which means that it must inherit, directly or indirectly, from Scapy's Field class. The other class, StrField, also inherits from Field, but it is predefined by Scapy. Now let us have a look at the FTPReqField class.

class FTPReqField(StrField):
    holds_packets = 1
    name = "FTPReqField"

    def getfield(self, pkt, s):
        """
        takes the part of the data that belongs to this field and
        lets the rest go, so it returns two values: first the
        remaining data, which still needs to be dissected by the
        other "field classes", and second the value of this field.
        @param pkt: holds the whole packet
        @param s: holds only the remaining data which is not dissected yet.
        """
        ls = s.split()
        if not ls:
            # empty payload: nothing to dissect
            return "", ""
        value = ls[0]
        if value.lower() == "retr":
            # RETR requests a download: register the file name so that
            # the transferred file can be recovered later
            file = "".join(ls[1:])
            if len(file) > 0:
                add_file(file)
        # everything after the command is left for the next field
        remain = " ".join(ls[1:])
        return remain, value

    def __init__(self, name, default, fmt, remain=0):
        """
        class constructor for initializing the instance variables
        @param name: name of the field
        @param default: default value of the field. Scapy has several
        representations of the data (internal, human and machine);
        you may set this param to None.
        @param fmt: specifies the format, this has been set to "H"
        @param remain: specifies the size of the remaining data,
        so make it 0 to handle all of the data.
        """
        self.name = name
        StrField.__init__(self, name, default, fmt, remain)
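
The (remaining, value) contract of getfield can be tried out without Scapy. The sketch below is only an illustration under that assumption: split_command is a hypothetical helper, not part of the library, and it works on plain strings rather than on packet payloads. It returns the command as the field value and leaves the argument as the remaining data for the next field.

```python
def split_command(s):
    """Return (remaining, value) for one FTP request line.

    Mirrors the order of the values returned by getfield:
    the remaining data first, the value of this field second.
    """
    parts = s.split(None, 1)  # split on the first run of whitespace
    if not parts:
        return "", ""
    value = parts[0]
    remain = parts[1].strip() if len(parts) > 1 else ""
    return remain, value


for line in ("RETR report.pdf\r\n", "QUIT\r\n"):
    remain, value = split_command(line)
    print("command=%r argument=%r" % (value, remain))
```

In a fields_desc list such as the one of FTPRequest, the first field would consume the command and the following StrField would receive the remaining argument.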

1.4 Protocols Details and Notes

Different protocols have different properties, especially when you go into the details. This section lists the characteristics and features of the implemented protocols.

1.5 Requirements

This library has been tested with Python version 2.6.5 and Scapy version 2.1.0.

1.6 Usage

Here is a simple use of this library. Let our file usedissector.py be as follows:

from dissector import *

"""
this file is a test unit for a pcap library (mainly dissector.py
and its associated protocols classes). This library uses and
depends on Scapy library.
"""
# instance of dissector class
dissector = Dissector()
# sending the pcap file to be dissected
pkts = dissector.dissect_pkts("/root/Desktop/ssh.cap")
print(pkts)

The output will be similar to this:

{'ftp': [....], 'http': [....], ....}
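
Since the result is a dictionary keyed by protocol name, it can be summarized with ordinary dictionary operations. The sketch below assumes only that each value is a list of dissected packets; the sample dictionary and the layout of its entries are made up for the example and are not real library output.

```python
def count_by_protocol(pkts):
    """Map each protocol name to the number of dissected packets."""
    return {proto: len(items) for proto, items in pkts.items()}


# hypothetical stand-in for the value returned by dissect_pkts
sample = {"ftp": [{"command": "RETR"}], "http": [], "dns": [{}, {}]}

for proto, count in sorted(count_by_protocol(sample).items()):
    print("%-5s %d" % (proto, count))
```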

1.7 Downloaded Files Recovery

This section is dedicated to file recovery and explains how the feature works for HTTP, FTP and SMTP. All of these protocols create a directory named downloaded in the current working directory (CWD) to store the recovered files. If you want to store the recovered files in another directory instead, pass its path to change_dfolder, like this:

from dissector import *

# instance of dissector class
dissector = Dissector()
# now the downloaded files will be stored on the desktop
dissector.change_dfolder("/root/Desktop/")
# sending the pcap file to be dissected
pkts = dissector.dissect_pkts("/root/Desktop/ssh.cap")

For HTTP, the file name is taken from the start line of the HTTP request. If another file with the same name already exists in the specified directory, or if the name contains special characters, a random name is generated instead. The same applies to FTP, which takes the file name from the RETR command, whereas SMTP simply gives the file a random name.
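
The naming rule described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: choose_name is a hypothetical helper, the set of characters treated as safe is a guess at what "special characters" means, and using only the last path component of the requested name is also an assumption.

```python
import os
import re
import uuid

# assumption: anything outside this set counts as a special character
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def choose_name(folder, requested):
    """Return the path under which a recovered file would be stored."""
    name = os.path.basename(requested)
    unusable = (
        name in ("", ".", "..")
        or SAFE_NAME.fullmatch(name) is None
        or os.path.exists(os.path.join(folder, name))
    )
    if unusable:
        # name clash or special characters: fall back to a random name
        name = uuid.uuid4().hex
    return os.path.join(folder, name)


print(choose_name("downloaded", "/files/report.pdf"))
print(choose_name("downloaded", "/files/a b?.pdf"))
```

Under the same assumptions, SMTP attachments would skip the check and always take the random name.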

1.8 Source Code

The source code of this library is on GitHub:
$ git clone https://github.com/cssaheel/dissectors.git
