Toward Realistic and Artifact-Free Insider-Threat Data

Kevin S. Killourhy, R. Maxion
DOI: 10.1109/ACSAC.2007.31
Published in: Twenty-Third Annual Computer Security Applications Conference (ACSAC 2007)
Publication date: 2007-12-10
Cited by: 12

Abstract

Progress in insider-threat detection is currently limited by a lack of realistic, publicly available, real-world data. For reasons of privacy and confidentiality, no one wants to expose their sensitive data to the research community. Data can be sanitized to mitigate privacy and confidentiality concerns, but the mere act of sanitizing the data may introduce artifacts that compromise its utility for research purposes. If sanitization artifacts change the results of insider-threat experiments, then those results could lead to conclusions which are not true in the real world. The goal of this work is to investigate the consequences of sanitization artifacts on insider-threat detection experiments. We assemble a suite of tools and present a methodology for collecting and sanitizing data. We use these tools and methods in an experimental evaluation of an insider-threat detection system. We compare the results of the evaluation using raw data to the results using each of three types of sanitized data, and we measure the effect of each sanitization strategy. We establish that two of the three sanitization strategies actually alter the results of the experiment. Since these two sanitization strategies are commonly used in practice, we must be concerned about the consequences of sanitization artifacts on insider-threat research. On the other hand, we demonstrate that the third sanitization strategy addresses these concerns, indicating that realistic, artifact-free data sets can be created with appropriate tools and methods.
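The kind of experiment the abstract describes can be sketched with a toy model. Nothing below comes from the paper itself: the frequency-based detector, the command-sequence data, and the three strategies (blanket redaction, inconsistent random renaming, consistent pseudonymization) are illustrative assumptions, chosen only to show how a sanitization artifact can change an evaluation's outcome.

```python
import random

def train_profile(sessions):
    """Count how often each command appears across the training sessions."""
    counts = {}
    for session in sessions:
        for cmd in session:
            counts[cmd] = counts.get(cmd, 0) + 1
    return counts

def anomaly_score(session, profile):
    """Fraction of commands in a session never seen during training."""
    return sum(1 for cmd in session if cmd not in profile) / len(session)

# --- Three hypothetical sanitization strategies (not the paper's own) ---

def redact(sessions):
    """Blanket redaction: every token becomes the same placeholder."""
    return [["REDACTED"] * len(s) for s in sessions]

def random_rename(sessions, seed=0):
    """Inconsistent renaming: each occurrence gets a fresh random token."""
    rng = random.Random(seed)
    return [[f"tok{rng.randrange(10**9)}" for _ in s] for s in sessions]

def consistent_pseudonyms(sessions):
    """Consistent renaming: one bijective map from commands to pseudonyms."""
    mapping = {}
    return [[mapping.setdefault(cmd, f"cmd{len(mapping)}") for cmd in s]
            for s in sessions]

def evaluate(train, test, sanitizer=None):
    """Score test sessions against a profile built from training sessions,
    optionally after sanitizing both with a shared mapping."""
    if sanitizer is not None:
        combined = sanitizer(train + test)
        train, test = combined[:len(train)], combined[len(train):]
    profile = train_profile(train)
    return [anomaly_score(s, profile) for s in test]

train = [["ls", "cd", "ls", "vi"], ["ls", "make", "gcc"]]
test = [["ls", "cd", "rm"],        # mostly familiar commands
        ["scp", "tar", "nc"]]      # entirely novel: the suspicious session

raw = evaluate(train, test)
for name, sanitizer in [("redact", redact),
                        ("random_rename", random_rename),
                        ("consistent_pseudonyms", consistent_pseudonyms)]:
    scores = evaluate(train, test, sanitizer)
    print(name, "preserves results:", scores == raw)
```

In this toy setting, redaction collapses every session into the same string (so nothing ever looks anomalous) and inconsistent renaming makes every test command look novel; only the bijective pseudonym mapping preserves the seen/unseen structure the detector relies on, so its scores match the raw data exactly.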