Reinforcement learning for an efficient and effective malware investigation during cyber incident response

Impact factor 3.0, JCR Q2 (Computer Science, Information Systems)

High-Confidence Computing, Pub Date: 2025-01-17, DOI: 10.1016/j.hcc.2025.100299

Dipo Dunsin, Mohamed Chahine Ghanem, Karim Ouazzane, Vassil Vassilev


Citations: 0


Reinforcement learning for an efficient and effective malware investigation during cyber incident response

The ever-escalating prevalence of malware is a serious cybersecurity threat, often requiring advanced post-incident forensic investigation techniques. This paper proposes a framework to enhance malware forensics by leveraging reinforcement learning (RL). The approach combines heuristic and signature-based methods, supported by RL through a unified MDP model, which breaks down malware analysis into distinct states and actions. This optimisation enhances the identification and classification of malware variants. The framework employs Q-learning and other techniques to boost the speed and accuracy of detecting new and unknown malware, outperforming traditional methods. We tested the experimental framework across multiple virtual environments infected with various malware types. The RL agent collected forensic evidence and improved its performance through Q-tables and temporal difference learning. The epsilon-greedy exploration strategy, in conjunction with Q-learning updates, effectively facilitated transitions. The learning rate depended on the complexity of the MDP environment: higher in simpler ones for quicker convergence and lower in more complex ones for stability. This RL-enhanced model significantly reduced the time required for post-incident malware investigations, achieving a high accuracy rate of 94% in identifying malware. These results indicate RL’s potential to revolutionise post-incident forensics investigations in cybersecurity. Future work will incorporate more advanced RL algorithms and large language models (LLMs) to further enhance the effectiveness of malware forensic analysis.
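The abstract names tabular Q-learning with an epsilon-greedy exploration strategy and temporal difference updates, but gives no implementation details. The following is only a minimal sketch of that general technique on a toy problem, not the authors' framework: the five-stage chain environment, the reward values, and the names `step`, `epsilon_greedy`, `train`, and `greedy_policy` are all assumptions made for illustration. The parameter `alpha` plays the role of the learning rate mentioned in the abstract.

```python
import random

# Hypothetical toy MDP (an assumption, not the paper's model): a linear chain of
# investigation stages 0..N_STATES-1; reaching the last stage ends the episode.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = stay at the current stage, 1 = advance to the next stage
GOAL = N_STATES - 1


def step(state, action):
    """Return (next_state, reward, done) for the toy chain environment."""
    next_state = state + 1 if action == 1 else state
    if next_state == GOAL:
        return next_state, 10.0, True  # illustrative terminal reward
    return next_state, -1.0, False     # illustrative per-step cost


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the best-valued action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0, max_steps=100):
    """Tabular Q-learning; alpha is the learning rate, gamma the discount factor."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]  # the Q-table
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Temporal difference target: bootstrap from the best next-state value
            # unless the episode has terminated.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action for every non-terminal state according to the Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this toy chain the learned greedy policy should choose "advance" at every stage, since staying only accumulates the per-step cost. Rerunning `train` with a smaller `alpha` slows each update, which gives a feel for the trade-off between quick convergence and stability that the abstract describes, although the paper's actual environments and settings are not reproduced here.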


Source journal

High-Confidence Computing

CiteScore

4.70

Self-citation rate

0.00%

Number of articles published