Poisoning Attacks and Data Sanitization Mitigations for Machine Learning Models in Network Intrusion Detection Systems

S. Venkatesan, Harshvardhan Digvijay Sikka, R. Izmailov, R. Chadha, Alina Oprea, Michael J. de Lucia
{"title":"网络入侵检测系统中机器学习模型的中毒攻击和数据消毒缓解","authors":"S. Venkatesan, Harshvardhan Digvijay Sikka, R. Izmailov, R. Chadha, Alina Oprea, Michael J. de Lucia","doi":"10.1109/MILCOM52596.2021.9652916","DOIUrl":null,"url":null,"abstract":"Among many application domains of machine learning in real-world settings, cyber security can benefit from more automated techniques to combat sophisticated adversaries. Modern network intrusion detection systems leverage machine learning models on network logs to proactively detect cyber attacks. However, the risk of adversarial attacks against machine learning used in these cyber settings is not fully explored. In this paper, we investigate poisoning attacks at training time against machine learning models in constrained cyber environments such as network intrusion detection; we also explore mitigations of such attacks based on training data sanitization. We consider the setting of poisoning availability attacks, in which an attacker can insert a set of poisoned samples at training time with the goal of degrading the accuracy of the deployed model. We design a white-box, realizable poisoning attack that reduced the original model accuracy from 95% to less than 50 % by generating mislabeled samples in close vicinity of a selected subset of training points. We also propose a novel Nested Training method as a defense against these attacks. Our defense includes a diversified ensemble of classifiers, each trained on a different subset of the training set. We use the disagreement of the classifiers' predictions as a data sanitization method, and show that an ensemble of 10 SVM classifiers is resilient to a large fraction of poisoning samples, up to 30% of the training data.","PeriodicalId":187645,"journal":{"name":"MILCOM 2021 - 2021 IEEE Military Communications Conference (MILCOM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Poisoning Attacks and Data Sanitization Mitigations for Machine Learning Models in Network Intrusion Detection Systems\",\"authors\":\"S. Venkatesan, Harshvardhan Digvijay Sikka, R. Izmailov, R. Chadha, Alina Oprea, Michael J. de Lucia\",\"doi\":\"10.1109/MILCOM52596.2021.9652916\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Among many application domains of machine learning in real-world settings, cyber security can benefit from more automated techniques to combat sophisticated adversaries. Modern network intrusion detection systems leverage machine learning models on network logs to proactively detect cyber attacks. However, the risk of adversarial attacks against machine learning used in these cyber settings is not fully explored. In this paper, we investigate poisoning attacks at training time against machine learning models in constrained cyber environments such as network intrusion detection; we also explore mitigations of such attacks based on training data sanitization. We consider the setting of poisoning availability attacks, in which an attacker can insert a set of poisoned samples at training time with the goal of degrading the accuracy of the deployed model. We design a white-box, realizable poisoning attack that reduced the original model accuracy from 95% to less than 50 % by generating mislabeled samples in close vicinity of a selected subset of training points. We also propose a novel Nested Training method as a defense against these attacks. 
Our defense includes a diversified ensemble of classifiers, each trained on a different subset of the training set. We use the disagreement of the classifiers' predictions as a data sanitization method, and show that an ensemble of 10 SVM classifiers is resilient to a large fraction of poisoning samples, up to 30% of the training data.\",\"PeriodicalId\":187645,\"journal\":{\"name\":\"MILCOM 2021 - 2021 IEEE Military Communications Conference (MILCOM)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-11-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"MILCOM 2021 - 2021 IEEE Military Communications Conference (MILCOM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MILCOM52596.2021.9652916\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"MILCOM 2021 - 2021 IEEE Military Communications Conference (MILCOM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MILCOM52596.2021.9652916","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 6

Abstract

Among many application domains of machine learning in real-world settings, cyber security can benefit from more automated techniques to combat sophisticated adversaries. Modern network intrusion detection systems leverage machine learning models on network logs to proactively detect cyber attacks. However, the risk of adversarial attacks against machine learning used in these cyber settings is not fully explored. In this paper, we investigate poisoning attacks at training time against machine learning models in constrained cyber environments such as network intrusion detection; we also explore mitigations of such attacks based on training data sanitization. We consider the setting of poisoning availability attacks, in which an attacker can insert a set of poisoned samples at training time with the goal of degrading the accuracy of the deployed model. We design a white-box, realizable poisoning attack that reduced the original model accuracy from 95% to less than 50% by generating mislabeled samples in close vicinity of a selected subset of training points. We also propose a novel Nested Training method as a defense against these attacks. Our defense includes a diversified ensemble of classifiers, each trained on a different subset of the training set. We use the disagreement of the classifiers' predictions as a data sanitization method, and show that an ensemble of 10 SVM classifiers is resilient to a large fraction of poisoning samples, up to 30% of the training data.
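The abstract outlines both the attack (mislabeled samples placed near a selected subset of training points) and the Nested Training defense (a 10-member SVM ensemble whose prediction disagreement drives data sanitization). The sketch below is a minimal, hypothetical Python/scikit-learn illustration of that workflow on synthetic data; it is not the authors' implementation, and the dataset, noise scale, subset scheme, and agreement threshold are all illustrative assumptions.

```python
# Hypothetical sketch (not the paper's code): availability poisoning via
# mislabeled near-duplicates, followed by ensemble-disagreement sanitization
# in the spirit of the Nested Training defense described in the abstract.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for NIDS feature vectors (benign vs. attack flows).
X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Poisoning: flip labels on points placed close to a selected subset of
# training samples (a crude proxy for the white-box attack in the paper).
poison_frac = 0.3
n_poison = int(poison_frac * len(X_train))
idx = rng.choice(len(X_train), size=n_poison, replace=False)
X_poison = X_train[idx] + rng.normal(scale=0.05, size=X_train[idx].shape)
y_poison = 1 - y_train[idx]                      # intentionally mislabeled
X_dirty = np.vstack([X_train, X_poison])
y_dirty = np.concatenate([y_train, y_poison])

# Diversified ensemble: each member trained on a disjoint random subset.
n_members = 10
perm = rng.permutation(len(X_dirty))
subsets = np.array_split(perm, n_members)
ensemble = [SVC(kernel="rbf").fit(X_dirty[s], y_dirty[s]) for s in subsets]

# Sanitization: keep only points whose label the ensemble decisively agrees
# with; drop points where members disagree or contradict the given label.
votes = np.mean([m.predict(X_dirty) for m in ensemble], axis=0)
agree = (np.round(votes) == y_dirty) & (np.abs(votes - 0.5) > 0.2)
X_clean, y_clean = X_dirty[agree], y_dirty[agree]

final = SVC(kernel="rbf").fit(X_clean, y_clean)
print("accuracy after sanitization:", final.score(X_test, y_test))
```

The intuition, as we read the abstract, is that training each member on a different subset limits how many poisoned points any single classifier sees, so mislabeled points tend to be outvoted by the rest of the ensemble and can be filtered before training the final model.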