Abstract
Labeled data sets are necessary to train and evaluate anomaly-based network intrusion detection systems. This work provides a focused literature survey of data sets for network-based intrusion detection and describes the underlying packet- and flow-based network data in detail. The paper identifies 15 different properties to assess the suitability of individual data sets for specific evaluation scenarios. These properties cover a wide range of criteria and are grouped into five categories such as data volume or recording environment for offering a structured search. Based on these properties, a comprehensive overview of existing data sets is given. This overview also highlights the peculiarities of each data set. Furthermore, this work briefly touches upon other sources for network-based data such as traffic generators and data repositories. Finally, we discuss our observations and provide some recommendations for the use and the creation of network-based data sets.