Introduction to the DARPA Datasets

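The DARPA 1999 evaluation data covers 58 typical attack types in five major categories (Probe, DoS, R2L, U2R, and Data), making it the most comprehensive attack test dataset currently available. As a benchmark that the research community recognizes and uses widely, it also makes it possible to compare newly proposed intrusion detection algorithms and techniques against existing ones. The DARPA 1999 evaluation data provides five weeks of simulated data. The first three weeks are training data supplied to the evaluation participants: weeks 1 and 3 are normal data containing no attacks, and week 2 contains 43 attack instances of 18 types. The last two weeks are used for the evaluation: weeks 4 and 5 contain 201 attack instances of 58 types, of which 40 types do not appear in the training data and are therefore new attack types.
Eighteen research IDS systems took part in the 1999 evaluation. The winner was the EMERALD system submitted by SRI International, which detected 85 of the 169 attack instances within its detection scope, a detection rate of about 50%. In addition, 21 of the 58 attack types, 77 attack instances in total, were classed as poorly detected; even the best participating system detected only 15 of those instances.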
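The detection rates implied by these figures can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above; the variable and function names are illustrative and not part of the evaluation.

```python
# Counts quoted in the text above for the 1999 evaluation.
detected_best = 85          # instances detected by the winning system
in_scope = 169              # instances within that system's detection scope
poorly_detected_total = 77  # instances in the 21 poorly detected attack types
poorly_detected_best = 15   # most of those instances found by any one system


def detection_rate(detected: int, total: int) -> float:
    """Return the fraction of attack instances that were detected."""
    if total <= 0:
        raise ValueError("total must be positive")
    return detected / total


best_rate = detection_rate(detected_best, in_scope)
poor_rate = detection_rate(poorly_detected_best, poorly_detected_total)

print(f"Best system, in-scope attacks: {best_rate:.1%}")
print(f"Best result on poorly detected attacks: {poor_rate:.1%}")
```

So the quoted 50% is a rounded 85/169, and the best result on the poorly detected attack types is under 20%.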

http://www.ll.mit.edu/IST/ideval/data/1999/1999_data_index.html

1999 DARPA Intrusion Detection Evaluation Data Set Overview

There were two parts to the 1999 DARPA Intrusion Detection Evaluation: an off-line evaluation and a realtime evaluation.

Intrusion detection systems were tested in the off-line evaluation using network traffic and audit logs collected on a simulation network. The systems processed this data in batch mode and attempted to identify attack sessions in the midst of normal activities.

Intrusion detection systems were delivered to AFRL for the realtime evaluation. These systems were inserted into the AFRL network testbed and attempted to identify attack sessions in the midst of normal activities, in realtime.

Intrusion detection systems were tested as part of the off-line evaluation, the realtime evaluation or both.


Training Data

Three weeks of training data were provided for the 1999 DARPA Intrusion Detection off-line evaluation.

The first and third weeks of the training data do not contain any attacks. This data was provided to facilitate the training of anomaly detection systems.

The second week of the training data contains a select subset of attacks from the 1998 evaluation in addition to several new attacks. The primary purpose in presenting these attacks was to provide examples of how to report attacks that are detected.

Note: In 1999, Intrusion detection systems were trained using the data from both the 1998 and the 1999 evaluations.

The following files are provided for each day in the training set:

  • Outside sniffing data ( Tcpdump format )

  • Inside sniffing data ( Tcpdump format )

  • BSM audit data ( From pascal )

  • NT audit data ( From hume )

  • Long listings of directory trees ( From pascal, marx, zeno, and hume )

  • Dumps of selected directories ( From pascal, marx, zeno, and hume )

  • A Report of file system inode information ( From pascal )

BSM Configuration [tar/gzip]
First Week of Training Data (Attack Free)
Second Week of Training Data (Contains Labeled Attacks)
Third Week of Training Data (Attack Free)
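As a starting point for working with the sniffing data, here is a minimal sketch of a packet reader written with the Python standard library only. It assumes the tcpdump files are uncompressed classic libpcap captures (24-byte global header, 16-byte per-packet record header); if a download is compressed, decompress it first. The function name `read_pcap` and the synthetic capture built for the demonstration are illustrative assumptions, not part of the dataset.

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple

PCAP_MAGIC = 0xA1B2C3D4  # magic number of the classic libpcap format


def read_pcap(stream: BinaryIO) -> Iterator[Tuple[float, bytes]]:
    """Yield (timestamp_seconds, raw_packet_bytes) from a classic pcap stream."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    # The magic number tells us which byte order the writer used.
    if struct.unpack("<I", header[:4])[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack(">I", header[:4])[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    while True:
        record = stream.read(16)
        if len(record) < 16:
            return  # end of file (or a truncated trailing record)
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            return  # truncated packet body
        yield ts_sec + ts_usec / 1_000_000, data


def make_demo_capture() -> bytes:
    """Build a tiny in-memory capture with one 14-byte dummy packet."""
    global_header = struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)
    packet = b"\x00" * 14
    record_header = struct.pack("<IIII", 1000, 500000, len(packet), len(packet))
    return global_header + record_header + packet


packets = list(read_pcap(io.BytesIO(make_demo_capture())))
print(f"{len(packets)} packet(s), first timestamp {packets[0][0]}")
```

To read a real capture, pass a file opened in binary mode, for example `open("inside.tcpdump", "rb")`, where the file name is a placeholder for whatever the downloaded file is called. Parsing the Ethernet, IP, and TCP layers of each packet is left to a dedicated library.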


Testing Data

Two weeks of network-based attacks in the midst of normal background data. The fourth and fifth weeks of data are the "Test Data" used in the 1999 Evaluation from 9/16/1999 to 10/1/1999. There are 201 instances of about 56 types of attacks distributed throughout these two weeks.

Further information about the attack instances, including where they are located in the week 4 and week 5 data, is found in the "1999 Attack Truth" available on the Documentation page.

Fourth Week of Test Data
Fifth Week of Test Data

 

http://www.ll.mit.edu/IST/ideval/data/1998/1998_data_index.html

1998 DARPA Intrusion Detection Evaluation Data Set Overview

There were two parts to the 1998 DARPA Intrusion Detection Evaluation: an off-line evaluation and a realtime evaluation.

Intrusion detection systems were tested in the off-line evaluation using network traffic and audit logs collected on a simulation network. The systems processed this data in batch mode and attempted to identify attack sessions in the midst of normal activities.

Intrusion detection systems were delivered to AFRL for the realtime evaluation. These systems were inserted into the AFRL network testbed and attempted to identify attack sessions in the midst of normal activities, in realtime.

Intrusion detection systems were tested as part of the off-line evaluation, the realtime evaluation or both.


Sample Data

A sample of the network traffic and audit logs that were used for evaluating systems. These data were first made available in February 1998.

  • README file

  • Sample Data Set [3,000 Kb tar/gzip]


Four-Hour Subset of Training Data

A somewhat larger sample of training data. These data were first made available in May 1998.

  • README file

  • Tcpdump data [38 MB gzip]

  • BSM data [5 MB gzip]

  • ASCII BSM data [6 MB gzip]

  • File system dump (ufsdump) - /root [40 MB gzip]

  • File system dump (ufsdump) - /usr [87 MB gzip]

  • File system dump (ufsdump) - /home [1 MB gzip]

  • File system dump (ufsdump) - /opt [93 MB gzip]


Training Data

Seven weeks of network-based attacks in the midst of normal background data. Listings of attacks and anomalies are available on the Documentation page.

  • First Week of Training Data

  • Second Week of Training Data

  • Third Week of Training Data

  • Fourth Week of Training Data

  • Fifth Week of Training Data

  • Sixth Week of Training Data

  • Seventh Week of Training Data


Testing Data

Two weeks of network-based attacks in the midst of normal background data.

  • First Week of Test Data

  • Second Week of Test Data

http://www.ll.mit.edu/IST/ideval/data/2000/2000_data_index.html

2000 DARPA Intrusion Detection Scenario Specific Data Sets

The content and labeling of data sets rely significantly on reports and feedback from consumers of this data. Please send feedback on this data set to Joshua W. Haines so that your ideas can be incorporated into future data sets. Thanks!

Overview

Off-line intrusion detection datasets were produced as per consensus from the Wisconsin Re-Think meeting and the July 2000 Hawaii PI meeting.

LLDOS 1.0 - Scenario One

This is the first attack scenario data set to be created for DARPA as a part of this effort. It includes a distributed denial of service attack run by a novice attacker. Future versions of this and other example scenarios will contain more stealthy attack versions.

This attack scenario is carried out over multiple network and audit sessions. These sessions have been grouped into 5 attack phases, over the course of which the attacker probes the network, breaks into a host by exploiting the Solaris sadmind vulnerability, installs trojan mstream DDoS software, and launches a DDoS attack at an off-site server from the compromised host.

  • ADVERSARY: Novice

  • ADVERSARY GOAL: Install components for and carry out a DDOS attack

  • DEFENDER: Naive

Data and labeling information are available for download.

LLDOS 2.0.2 - Scenario Two

This is the second attack scenario data set to be created for DARPA as a part of this effort. It includes a distributed denial of service attack run by an attacker who is more stealthy than the attacker in the first dataset. The attacker is still considered a Novice, because the attack, despite being a bit more stealthy, is mostly scripted and is still something that any attacker might be able to download and run.

This attack scenario is carried out over multiple network and audit sessions. These sessions have been grouped into 5 attack phases, over the course of which the attacker probes the network, breaks into a host by exploiting the Solaris sadmind vulnerability, installs trojan mstream DDoS software, and launches a DDoS attack at an off-site server from the compromised host.

  • ADVERSARY: Novice

  • ADVERSARY GOAL: Install components for and carry out a DDOS attack

  • DEFENDER: Naive

Data and labeling information are available for download.

Windows NT Attack Data Set

In January 2000, an experiment was run with a higher level of NT auditing than was used in the 1999 Evaluation. The data collected here are the traces from that run: one day of traffic and attacks directed at the NT machine. High-level labeling information for these traces is available now.

  • NT Event Log Audit Data

  • Outside Tcpdump Data

  • Inside Tcpdump Data

  • High-Level Attack Truth File [Word Document]

Note: This day contains data from 08:00 to 14:30 hours. The network sniffers collected data until 17:00.



