Network Data Curation Toolkit: Cybersecurity Data Collection, Aided-Labeling, and Rule Generation
Jaime C. Acosta, Stephanie Medina, J. Ellis, Luisana Clarke, Veronica Rivas, Allison Newcomb
MILCOM 2021 - 2021 IEEE Military Communications Conference (MILCOM), published 2021-11-29
DOI: 10.1109/MILCOM52596.2021.9653049 (https://doi.org/10.1109/MILCOM52596.2021.9653049)
Citations: 0
Abstract
Cybersecurity network data curation is the collection, labeling, and packaging of datasets that contain artifacts important to the cybersecurity domain. These datasets are essential for cybersecurity research and are key to enabling defense technologies and systems to detect and respond to anomalies caused by adversaries. However, tools for data curation are lacking in all domains of cybersecurity, including enterprise and the military. Curation fuels empirical research and the validation of protection, detection, and prevention techniques. Closing this gap will require research-driven tools and technologies that facilitate and enforce not only collection and labeling, but also standardization and distribution. This paper describes a novel tool, the Network Data Curation Toolkit (NDCT), which simplifies the collection of network traffic, keystrokes, and mouse clicks; allows network packets to be labeled; automatically generates intrusion detection rules; and provides a visualization of results. The tool also has a built-in mechanism for exporting all data into a single distributable file. It is modular, which allows extension and eases its incorporation into existing workflows. We demonstrate the use of NDCT in two case studies. We first show how NDCT can augment cybersecurity exercises by having participants label their own network data. We then describe a separate system with NDCT embedded in it, which provides a workspace where users curate data across multiple sessions, including generating intrusion detection rules for malware.
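To make the label-to-rule-to-export flow described in the abstract concrete, the following is a minimal sketch of one way it could look. It is not NDCT's implementation: the `LabeledPacket` record, the function names, the file names inside the archive, and the choice of a zip archive are all assumptions made for illustration. The rules use the common Snort/Suricata-style alert syntax, with signature IDs starting at 1000000, the range conventionally used for locally written rules.

```python
import io
import json
import zipfile
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LabeledPacket:
    """Hypothetical record for one labeled packet (not NDCT's actual schema)."""
    proto: str      # "tcp" or "udp"
    src_ip: str
    dst_ip: str
    dst_port: int
    label: str      # analyst-supplied label, e.g. "malware-beacon"


def to_rule(pkt: LabeledPacket, sid: int) -> str:
    """Build a Snort/Suricata-style alert rule matching the packet's destination."""
    msg = pkt.label.replace('"', "'")  # keep the quoted msg option well-formed
    return (
        f"alert {pkt.proto} any any -> {pkt.dst_ip} {pkt.dst_port} "
        f'(msg:"{msg}"; sid:{sid}; rev:1;)'
    )


def generate_rules(packets, first_sid=1000000):
    """Emit one rule per distinct (proto, dst_ip, dst_port, label), in first-seen order."""
    seen = {}
    for pkt in packets:
        key = (pkt.proto, pkt.dst_ip, pkt.dst_port, pkt.label)
        seen.setdefault(key, pkt)
    return [to_rule(p, first_sid + i) for i, p in enumerate(seen.values())]


def export_bundle(packets, rules) -> bytes:
    """Package labels and rules into a single in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("labels.json", json.dumps([asdict(p) for p in packets], indent=2))
        zf.writestr("local.rules", "\n".join(rules) + "\n")
    return buf.getvalue()


# Illustrative data only; addresses come from documentation/private ranges.
packets = [
    LabeledPacket("tcp", "10.0.0.5", "203.0.113.7", 4444, "malware-beacon"),
    LabeledPacket("tcp", "10.0.0.5", "203.0.113.7", 4444, "malware-beacon"),
    LabeledPacket("udp", "10.0.0.8", "198.51.100.2", 53, "dns-exfil"),
]
rules = generate_rules(packets)
bundle = export_bundle(packets, rules)
```

Deduplicating on the destination and label keeps the rule set small when many packets share one label. A real toolkit would likely match on payload content or flow properties as well, which this sketch deliberately leaves out.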