GLAMP: Generative Learning for Adversarially-Robust Malware Prediction

Impact Factor: 5.4 · CAS Tier 2 (Computer Science) · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS
Saurabh Kumar;Cristian Molinaro;Lirika Sola;V. S. Subrahmanian
{"title":"GLAMP: Generative Learning for Adversarially-Robust Malware Prediction","authors":"Saurabh Kumar;Cristian Molinaro;Lirika Sola;V. S. Subrahmanian","doi":"10.1109/TETC.2025.3583872","DOIUrl":null,"url":null,"abstract":"We propose a novel <i>Generative Malware Defense</i> strategy. When an antivirus company detects a malware sample <inline-formula><tex-math>$m$</tex-math></inline-formula>, they should: (i) generate a set <inline-formula><tex-math>${Var}(m)$</tex-math></inline-formula> of several variants of <inline-formula><tex-math>$m$</tex-math></inline-formula> and then (ii) train their malware classifiers on their usual training set augmented with <inline-formula><tex-math>${Var}(m)$</tex-math></inline-formula>. We believe this leads to a more proactive defense by making the classifiers more robust to future malware developed by the attacker. We formally define the malware generation problem as a non-traditional optimization problem. Our novel GLAMP (Generative Learning for Adversarially-robust Malware Prediction) framework analyzes the complexity of the malware generation problem and includes novel malware variant generation algorithms for (i) that leverage the complexity results. Our experiments show that a sufficiently large percentage of samples generated by GLAMP are able to evade both commercial anti-virus and machine learning classifiers with evasion rates up to 83.81% and 50.54%, respectively. GLAMP then proposes an adversarial training model as well. Our experiments show that GLAMP generates running malware that can evade 11 white boxclassifiers and 4 commercial (i.e., black box) detectors. Our experiments show GLAMP’s best adversarial training engine improves the recall by 16.1% and the F1 score by 2.4%-5.4% depending on the test set used.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1299-1315"},"PeriodicalIF":5.4000,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Emerging Topics in Computing","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11075921/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

We propose a novel Generative Malware Defense strategy. When an antivirus company detects a malware sample $m$, it should: (i) generate a set ${Var}(m)$ of several variants of $m$ and then (ii) train its malware classifiers on its usual training set augmented with ${Var}(m)$. We believe this leads to a more proactive defense by making the classifiers more robust to future malware developed by the attacker. We formally define the malware generation problem as a non-traditional optimization problem. Our novel GLAMP (Generative Learning for Adversarially-robust Malware Prediction) framework analyzes the complexity of the malware generation problem and includes novel malware variant generation algorithms for step (i) that leverage the complexity results. Our experiments show that a sufficiently large percentage of samples generated by GLAMP are able to evade both commercial anti-virus engines and machine learning classifiers, with evasion rates of up to 83.81% and 50.54%, respectively. GLAMP also includes an adversarial training model. Our experiments show that GLAMP generates running malware that can evade 11 white-box classifiers and 4 commercial (i.e., black-box) detectors, and that GLAMP's best adversarial training engine improves recall by 16.1% and the F1 score by 2.4%-5.4%, depending on the test set used.
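The defense loop described in the abstract (generate ${Var}(m)$ for each detected sample, then retrain on the augmented training set) can be sketched as follows. This is a minimal illustration of the augment-and-retrain idea only, not the paper's implementation: generate_variants, the feature-space perturbation it applies, the random-forest classifier, and the synthetic feature vectors are all hypothetical placeholders, whereas GLAMP itself generates real, running malware binaries and trains its own detectors.

```python
# Sketch of a "Generative Malware Defense" loop: augment the training set with
# generated variants of newly detected malware, then retrain the classifier.
# All components below are illustrative stand-ins, not GLAMP's actual algorithms.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def generate_variants(sample, k=10):
    """Hypothetical stand-in for the variant generator Var(m).

    Here we merely perturb a feature vector; the paper instead produces
    functioning malware variants at the binary level.
    """
    rng = np.random.default_rng(0)
    return [sample + rng.normal(scale=0.01, size=sample.shape) for _ in range(k)]

def adversarially_augment(X_train, y_train, detected_malware):
    """Add Var(m) for every newly detected malware sample m (label 1)."""
    X_aug, y_aug = list(X_train), list(y_train)
    for m in detected_malware:
        for v in generate_variants(m):
            X_aug.append(v)
            y_aug.append(1)  # variants keep the malicious label
    return np.array(X_aug), np.array(y_aug)

# Usage: retrain on the augmented set so the classifier has already seen
# samples resembling the attacker's likely future variants.
X_train = np.random.rand(100, 32)          # placeholder feature vectors
y_train = np.random.randint(0, 2, 100)     # placeholder labels (1 = malware)
new_malware_features = X_train[y_train == 1][:5]

X_aug, y_aug = adversarially_augment(X_train, y_train, new_malware_features)
clf = RandomForestClassifier(n_estimators=100).fit(X_aug, y_aug)
```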
Source journal
IEEE Transactions on Emerging Topics in Computing (Computer Science: Computer Science, miscellaneous)
CiteScore: 12.10
Self-citation rate: 5.10%
Articles published per year: 113
Journal description: IEEE Transactions on Emerging Topics in Computing publishes papers on emerging aspects of computer science, computing technology, and computing applications not currently covered by other IEEE Computer Society Transactions. Some examples of emerging topics in computing include: IT for Green, synthetic and organic computing structures and systems, advanced analytics, social/occupational computing, location-based/client computer systems, morphic computer design, electronic game systems, and health-care IT.