Toward Realistic and Artifact-Free Insider-Threat Data

Kevin S. Killourhy, R. Maxion
DOI: 10.1109/ACSAC.2007.31
Published in: Twenty-Third Annual Computer Security Applications Conference (ACSAC 2007)
Publication date: 2007-12-10
Cited by: 12

Abstract

Progress in insider-threat detection is currently limited by a lack of realistic, publicly available, real-world data. For reasons of privacy and confidentiality, no one wants to expose their sensitive data to the research community. Data can be sanitized to mitigate privacy and confidentiality concerns, but the mere act of sanitizing the data may introduce artifacts that compromise its utility for research purposes. If sanitization artifacts change the results of insider-threat experiments, then those results could lead to conclusions which are not true in the real world. The goal of this work is to investigate the consequences of sanitization artifacts on insider-threat detection experiments. We assemble a suite of tools and present a methodology for collecting and sanitizing data. We use these tools and methods in an experimental evaluation of an insider-threat detection system. We compare the results of the evaluation using raw data to the results using each of three types of sanitized data, and we measure the effect of each sanitization strategy. We establish that two of the three sanitization strategies actually alter the results of the experiment. Since these two sanitization strategies are commonly used in practice, we must be concerned about the consequences of sanitization artifacts on insider-threat research. On the other hand, we demonstrate that the third sanitization strategy addresses these concerns, indicating that realistic, artifact-free data sets can be created with appropriate tools and methods.
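The kind of experiment the abstract describes can be sketched with a toy model. Nothing below comes from the paper itself: the frequency-based detector, the command-sequence data, and the three strategies (blanket redaction, inconsistent random renaming, consistent pseudonymization) are illustrative assumptions, chosen only to show how a sanitization artifact can change an evaluation's outcome.

```python
import random

def train_profile(sessions):
    """Count how often each command appears across the training sessions."""
    counts = {}
    for session in sessions:
        for cmd in session:
            counts[cmd] = counts.get(cmd, 0) + 1
    return counts

def anomaly_score(session, profile):
    """Fraction of commands in a session never seen during training."""
    return sum(1 for cmd in session if cmd not in profile) / len(session)

# --- Three hypothetical sanitization strategies (not the paper's own) ---

def redact(sessions):
    """Blanket redaction: every token becomes the same placeholder."""
    return [["REDACTED"] * len(s) for s in sessions]

def random_rename(sessions, seed=0):
    """Inconsistent renaming: each occurrence gets a fresh random token."""
    rng = random.Random(seed)
    return [[f"tok{rng.randrange(10**9)}" for _ in s] for s in sessions]

def consistent_pseudonyms(sessions):
    """Consistent renaming: one bijective map from commands to pseudonyms."""
    mapping = {}
    return [[mapping.setdefault(cmd, f"cmd{len(mapping)}") for cmd in s]
            for s in sessions]

def evaluate(train, test, sanitizer=None):
    """Score test sessions against a profile built from training sessions,
    optionally after sanitizing both with a shared mapping."""
    if sanitizer is not None:
        combined = sanitizer(train + test)
        train, test = combined[:len(train)], combined[len(train):]
    profile = train_profile(train)
    return [anomaly_score(s, profile) for s in test]

train = [["ls", "cd", "ls", "vi"], ["ls", "make", "gcc"]]
test = [["ls", "cd", "rm"],        # mostly familiar commands
        ["scp", "tar", "nc"]]      # entirely novel: the suspicious session

raw = evaluate(train, test)
for name, sanitizer in [("redact", redact),
                        ("random_rename", random_rename),
                        ("consistent_pseudonyms", consistent_pseudonyms)]:
    scores = evaluate(train, test, sanitizer)
    print(name, "preserves results:", scores == raw)
```

In this toy setting, redaction collapses every session into the same string (so nothing ever looks anomalous) and inconsistent renaming makes every test command look novel; only the bijective pseudonym mapping preserves the seen/unseen structure the detector relies on, so its scores match the raw data exactly.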