Neural Malware Control with Deep Reinforcement Learning

Yu Wang, J. W. Stokes, M. Marinescu
{"title":"基于深度强化学习的神经恶意软件控制","authors":"Yu Wang, J. W. Stokes, M. Marinescu","doi":"10.1109/MILCOM47813.2019.9020862","DOIUrl":null,"url":null,"abstract":"Antimalware products are a key component in detecting malware attacks, and their engines typically execute unknown programs in a sandbox prior to running them on the native operating system. Files cannot be scanned indefinitely so the engine employs heuristics to determine when to halt execution. Previous research has investigated analyzing the sequence of system calls generated during this emulation process to predict if an unknown file is malicious, but these models often require the emulation to be stopped after executing a fixed number of events from the beginning of the file. Also, these classifiers are not accurate enough to halt emulation in the middle of the file on their own. In this paper, we propose a novel algorithm which overcomes this limitation and learns the best time to halt the file's execution based on deep reinforcement learning (DRL). Because the new DRL-based system continues to emulate the unknown file until it can make a confident decision to stop, it prevents attackers from avoiding detection by initiating malicious activity after a fixed number of system calls. Results show that the proposed malware execution control model automatically halts emulation for 91.3% of the files earlier than heuristics employed by the engine. Furthermore, classifying the files at that time significantly improves the classifier's accuracy. This new model improves the true positive rate by 61.5%, at a false positive rate of 1%, compared to the best baseline classifier.","PeriodicalId":371812,"journal":{"name":"MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Neural Malware Control with Deep Reinforcement Learning\",\"authors\":\"Yu Wang, J. W. Stokes, M. Marinescu\",\"doi\":\"10.1109/MILCOM47813.2019.9020862\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Antimalware products are a key component in detecting malware attacks, and their engines typically execute unknown programs in a sandbox prior to running them on the native operating system. Files cannot be scanned indefinitely so the engine employs heuristics to determine when to halt execution. Previous research has investigated analyzing the sequence of system calls generated during this emulation process to predict if an unknown file is malicious, but these models often require the emulation to be stopped after executing a fixed number of events from the beginning of the file. Also, these classifiers are not accurate enough to halt emulation in the middle of the file on their own. In this paper, we propose a novel algorithm which overcomes this limitation and learns the best time to halt the file's execution based on deep reinforcement learning (DRL). Because the new DRL-based system continues to emulate the unknown file until it can make a confident decision to stop, it prevents attackers from avoiding detection by initiating malicious activity after a fixed number of system calls. Results show that the proposed malware execution control model automatically halts emulation for 91.3% of the files earlier than heuristics employed by the engine. Furthermore, classifying the files at that time significantly improves the classifier's accuracy. 
This new model improves the true positive rate by 61.5%, at a false positive rate of 1%, compared to the best baseline classifier.\",\"PeriodicalId\":371812,\"journal\":{\"name\":\"MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM)\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MILCOM47813.2019.9020862\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MILCOM47813.2019.9020862","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 9

Abstract

Antimalware products are a key component in detecting malware attacks, and their engines typically execute unknown programs in a sandbox prior to running them on the native operating system. Files cannot be scanned indefinitely, so the engine employs heuristics to determine when to halt execution. Previous research has investigated analyzing the sequence of system calls generated during this emulation process to predict whether an unknown file is malicious, but these models often require the emulation to be stopped after executing a fixed number of events from the beginning of the file. Also, these classifiers are not accurate enough to halt emulation in the middle of the file on their own. In this paper, we propose a novel algorithm which overcomes this limitation and learns the best time to halt the file's execution based on deep reinforcement learning (DRL). Because the new DRL-based system continues to emulate the unknown file until it can make a confident decision to stop, it prevents attackers from avoiding detection by initiating malicious activity after a fixed number of system calls. Results show that the proposed malware execution control model automatically halts emulation for 91.3% of the files earlier than heuristics employed by the engine. Furthermore, classifying the files at that time significantly improves the classifier's accuracy. This new model improves the true positive rate by 61.5%, at a false positive rate of 1%, compared to the best baseline classifier.
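
To make the halting idea concrete, the sketch below shows one possible shape of such an agent: an embedding plus a GRU summarizes the system-call events emulated so far, and a small Q-network scores two actions, CONTINUE or HALT. This is a minimal illustration assuming PyTorch; it is not the authors' implementation, and the vocabulary size, network dimensions, and epsilon-greedy episode loop are hypothetical placeholders.

# Minimal sketch of a DRL halting agent (hypothetical names and sizes).
# State: the prefix of system-call IDs emulated so far.
# Actions: CONTINUE emulation (0) or HALT and hand the file to a classifier (1).
import random
import torch
import torch.nn as nn

NUM_SYSCALLS = 114   # assumed size of the emulator's event vocabulary
CONTINUE, HALT = 0, 1

class HaltingQNetwork(nn.Module):
    def __init__(self, vocab_size=NUM_SYSCALLS, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.q_head = nn.Linear(hidden_dim, 2)  # Q(s, CONTINUE), Q(s, HALT)

    def forward(self, event_ids):
        # event_ids: (batch, seq_len) tensor of system-call IDs seen so far
        x = self.embed(event_ids)
        _, h = self.gru(x)                 # h: (1, batch, hidden_dim)
        return self.q_head(h.squeeze(0))   # (batch, 2)

def run_episode(q_net, event_stream, epsilon=0.1, max_events=1000):
    """Step through an emulation trace and decide when to halt."""
    history = []
    for t, event_id in enumerate(event_stream[:max_events]):
        history.append(event_id)
        state = torch.tensor([history], dtype=torch.long)
        with torch.no_grad():
            q_values = q_net(state)
        # Epsilon-greedy action selection (exploration during training).
        if random.random() < epsilon:
            action = random.randrange(2)
        else:
            action = int(q_values.argmax(dim=1))
        if action == HALT:
            return t + 1  # number of events emulated before halting
    return len(event_stream[:max_events])

if __name__ == "__main__":
    q_net = HaltingQNetwork()
    fake_trace = [random.randrange(NUM_SYSCALLS) for _ in range(200)]
    print("halted after", run_episode(q_net, fake_trace), "events")

In the paper's setting, the file would be passed to a separate malware classifier at the moment the agent chooses HALT; training would additionally require a reward signal and experience replay, which are omitted from this sketch.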