Feature Cultivation in Privileged Information-augmented Detection

Proceedings of the 3rd ACM on International Workshop on Security And Privacy Analytics. Pub Date: 2017-03-24. DOI: 10.1145/3041008.3041018

Z. B. Celik, P. Mcdaniel, R. Izmailov


Citations: 6

Abstract

Modern detection systems use sensor outputs available in the deployment environment to probabilistically identify attacks. These systems are trained on past or synthetic feature vectors to create a model of anomalous or normal behavior. Thereafter, sensor outputs collected at run-time are compared to the model to identify attacks (or the lack of an attack). While this approach to detection has proven effective in many environments, it is limited to training only on features that can be reliably collected at detection time. Hence, such systems fail to leverage the often vast amount of ancillary information available from past forensic analysis and post-mortem data. In short, detection systems do not train on (and thus do not learn from) features that are unavailable or too costly to collect at run-time. Recent work proposed an alternate model construction approach that integrates forensic "privileged" information (features reliably available at training time, but not at run-time) to improve the accuracy and resilience of detection systems. In this paper, we further evaluate two of the proposed techniques for model training with privileged information: knowledge transfer and model influence. We explore the cultivation of privileged features, the efficiency of those processes, and their influence on detection accuracy. We observe that improved integration of privileged features makes the resulting detection models more accurate. Our evaluation shows that the use of privileged information leads to up to an 8.2% relative decrease in detection error for fast-flux bot detection over a system with no privileged information, and 5.5% for malware classification.
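The knowledge-transfer idea named in the abstract can be illustrated with a minimal sketch. It assumes one plausible reading: a mapping from run-time (standard) features to a privileged feature is learned at training time, and the detector then consumes the estimated privileged value at detection time. The k-nearest-neighbor estimator, the nearest-centroid detector, the toy data, and all function names below are hypothetical stand-ins chosen for brevity, not the authors' implementation.

```python
import math

def knn_estimate(x, train_std, train_priv, k=3):
    """Estimate a privileged feature for x as the mean over its k nearest
    training samples, measured in the standard-feature space."""
    order = sorted(range(len(train_std)), key=lambda i: math.dist(x, train_std[i]))
    nearest = order[:k]
    return sum(train_priv[i] for i in nearest) / len(nearest)

def centroids(rows, labels):
    """Per-class mean vector: a toy nearest-centroid detector (an assumption)."""
    sums, counts = {}, {}
    for row, y in zip(rows, labels):
        acc = sums.setdefault(y, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(model, row):
    """Return the label whose centroid is closest to row."""
    return min(model, key=lambda y: math.dist(row, model[y]))

# Hypothetical toy data. Standard features can be collected at run-time;
# the privileged feature exists only in training (e.g. forensic) data.
train_std = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.25],
             [0.9, 0.8], [0.8, 0.9], [0.85, 0.75]]
train_priv = [0.0, 0.1, 0.05, 1.0, 0.9, 0.95]
train_y = ["benign", "benign", "benign", "attack", "attack", "attack"]

# Training time: the detector sees standard + true privileged features.
model = centroids([s + [p] for s, p in zip(train_std, train_priv)], train_y)

def detect(x_std):
    """Run-time: the privileged value is unavailable, so estimate it first."""
    p_hat = knn_estimate(x_std, train_std, train_priv)
    return predict(model, x_std + [p_hat])
```

Calling `detect([0.12, 0.18])` or `detect([0.88, 0.82])` classifies a run-time sample using only its standard features. The point of the design is that the privileged feature shapes the detector during training without ever having to be collected in deployment.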


