Neural Malware Control with Deep Reinforcement Learning

Yu Wang, J. W. Stokes, M. Marinescu
{"title":"基于深度强化学习的神经恶意软件控制","authors":"Yu Wang, J. W. Stokes, M. Marinescu","doi":"10.1109/MILCOM47813.2019.9020862","DOIUrl":null,"url":null,"abstract":"Antimalware products are a key component in detecting malware attacks, and their engines typically execute unknown programs in a sandbox prior to running them on the native operating system. Files cannot be scanned indefinitely so the engine employs heuristics to determine when to halt execution. Previous research has investigated analyzing the sequence of system calls generated during this emulation process to predict if an unknown file is malicious, but these models often require the emulation to be stopped after executing a fixed number of events from the beginning of the file. Also, these classifiers are not accurate enough to halt emulation in the middle of the file on their own. In this paper, we propose a novel algorithm which overcomes this limitation and learns the best time to halt the file's execution based on deep reinforcement learning (DRL). Because the new DRL-based system continues to emulate the unknown file until it can make a confident decision to stop, it prevents attackers from avoiding detection by initiating malicious activity after a fixed number of system calls. Results show that the proposed malware execution control model automatically halts emulation for 91.3% of the files earlier than heuristics employed by the engine. Furthermore, classifying the files at that time significantly improves the classifier's accuracy. This new model improves the true positive rate by 61.5%, at a false positive rate of 1%, compared to the best baseline classifier.","PeriodicalId":371812,"journal":{"name":"MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Neural Malware Control with Deep Reinforcement Learning\",\"authors\":\"Yu Wang, J. W. Stokes, M. Marinescu\",\"doi\":\"10.1109/MILCOM47813.2019.9020862\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Antimalware products are a key component in detecting malware attacks, and their engines typically execute unknown programs in a sandbox prior to running them on the native operating system. Files cannot be scanned indefinitely so the engine employs heuristics to determine when to halt execution. Previous research has investigated analyzing the sequence of system calls generated during this emulation process to predict if an unknown file is malicious, but these models often require the emulation to be stopped after executing a fixed number of events from the beginning of the file. Also, these classifiers are not accurate enough to halt emulation in the middle of the file on their own. In this paper, we propose a novel algorithm which overcomes this limitation and learns the best time to halt the file's execution based on deep reinforcement learning (DRL). Because the new DRL-based system continues to emulate the unknown file until it can make a confident decision to stop, it prevents attackers from avoiding detection by initiating malicious activity after a fixed number of system calls. Results show that the proposed malware execution control model automatically halts emulation for 91.3% of the files earlier than heuristics employed by the engine. Furthermore, classifying the files at that time significantly improves the classifier's accuracy. 
This new model improves the true positive rate by 61.5%, at a false positive rate of 1%, compared to the best baseline classifier.\",\"PeriodicalId\":371812,\"journal\":{\"name\":\"MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM)\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MILCOM47813.2019.9020862\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MILCOM47813.2019.9020862","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 9

Abstract

Antimalware products are a key component in detecting malware attacks, and their engines typically execute unknown programs in a sandbox prior to running them on the native operating system. Files cannot be scanned indefinitely, so the engine employs heuristics to determine when to halt execution. Previous research has investigated analyzing the sequence of system calls generated during this emulation process to predict whether an unknown file is malicious, but these models often require the emulation to be stopped after executing a fixed number of events from the beginning of the file. Also, these classifiers are not accurate enough to halt emulation in the middle of the file on their own. In this paper, we propose a novel algorithm which overcomes this limitation and learns the best time to halt the file's execution based on deep reinforcement learning (DRL). Because the new DRL-based system continues to emulate the unknown file until it can make a confident decision to stop, it prevents attackers from avoiding detection by initiating malicious activity after a fixed number of system calls. Results show that the proposed malware execution control model automatically halts emulation for 91.3% of the files earlier than heuristics employed by the engine. Furthermore, classifying the files at that time significantly improves the classifier's accuracy. This new model improves the true positive rate by 61.5%, at a false positive rate of 1%, compared to the best baseline classifier.
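
To make the halting idea concrete, the sketch below shows one possible shape of such an agent: an embedding plus a GRU summarizes the system-call events emulated so far, and a small Q-network scores two actions, CONTINUE or HALT. This is a minimal illustration assuming PyTorch; it is not the authors' implementation, and the vocabulary size, network dimensions, and epsilon-greedy episode loop are hypothetical placeholders.

# Minimal sketch of a DRL halting agent (hypothetical names and sizes).
# State: the prefix of system-call IDs emulated so far.
# Actions: CONTINUE emulation (0) or HALT and hand the file to a classifier (1).
import random
import torch
import torch.nn as nn

NUM_SYSCALLS = 114   # assumed size of the emulator's event vocabulary
CONTINUE, HALT = 0, 1

class HaltingQNetwork(nn.Module):
    def __init__(self, vocab_size=NUM_SYSCALLS, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.q_head = nn.Linear(hidden_dim, 2)  # Q(s, CONTINUE), Q(s, HALT)

    def forward(self, event_ids):
        # event_ids: (batch, seq_len) tensor of system-call IDs seen so far
        x = self.embed(event_ids)
        _, h = self.gru(x)                 # h: (1, batch, hidden_dim)
        return self.q_head(h.squeeze(0))   # (batch, 2)

def run_episode(q_net, event_stream, epsilon=0.1, max_events=1000):
    """Step through an emulation trace and decide when to halt."""
    history = []
    for t, event_id in enumerate(event_stream[:max_events]):
        history.append(event_id)
        state = torch.tensor([history], dtype=torch.long)
        with torch.no_grad():
            q_values = q_net(state)
        # Epsilon-greedy action selection (exploration during training).
        if random.random() < epsilon:
            action = random.randrange(2)
        else:
            action = int(q_values.argmax(dim=1))
        if action == HALT:
            return t + 1  # number of events emulated before halting
    return len(event_stream[:max_events])

if __name__ == "__main__":
    q_net = HaltingQNetwork()
    fake_trace = [random.randrange(NUM_SYSCALLS) for _ in range(200)]
    print("halted after", run_episode(q_net, fake_trace), "events")

In the paper's setting, the file would be passed to a separate malware classifier at the moment the agent chooses HALT; training would additionally require a reward signal and experience replay, which are omitted from this sketch.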