Leveraging Memory Forensic Features for Explainable Obfuscated Malware Detection with Isolated Family Distinction Paradigm

Impact factor 4.9 | Zone 3, Computer Science | JCR Q1: Computer Science, Hardware & Architecture

Computers & Electrical Engineering Pub Date : 2025-02-17 DOI:10.1016/j.compeleceng.2025.110107

S.P. Sharmila, Shubham Gupta, Aruna Tiwari, Narendra S. Chaudhari

Volume 123, Article 110107. Full text: https://www.sciencedirect.com/science/article/pii/S0045790625000503

Citations: 0

Abstract


Leveraging Memory Forensic Features for Explainable Obfuscated Malware Detection with Isolated Family Distinction Paradigm

In the era of IoT edge computing, the ubiquitous presence of the internet opens the door to numerous cyberattacks. Obfuscated malware makes complex modern attacks harder to detect by evading AI-enabled Next-Generation Anti-Virus (NGAV) scanners, and it breaches digital privacy. To tackle this problem, we propose the Augmented Sparse Projection Oblique Random Forest (AugSPORF), an explainable Oblique Random Forest (ORF) based on sparse projections, combined with an Isolated Family Distinction (IFD) paradigm, to detect obfuscated malware belonging to the Spyware, Ransomware, and Trojan families. Regardless of obfuscation, malware variants share behavior and traits with their families and leave traces in memory on execution. Building on this observation, we first handle the high dimensionality of memory forensic features with sparse random projections. Next, we perform feature-importance-aware training with the ORF to learn the inherent behavioral features of each malware family by isolating the target family and distinguishing it from the other families. We then assess the model's scalability by increasing the number of malware families. To explain the predictions, an Interpretable Machine Learning (IML) layer is interleaved to generate a report of explanations, thereby enhancing the interpretability of the model. The proposed approach yields average accuracies of 96.76%, 96.45%, and 97.33% in detecting sub-families of Spyware, Ransomware, and Trojan, respectively. Improved accuracy is also demonstrated by benchmarking AugSPORF on UCI repository datasets.
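The abstract names three building blocks: sparse random projections, oblique splits, and isolating one target family against the rest. The following is a minimal illustrative sketch of those generic ideas in plain Python, not the authors' AugSPORF implementation. All function names, the toy feature values, and the `density` and `seed` parameters are assumptions introduced here for illustration; the abstract does not give the paper's actual projection scheme, training procedure, or explanation layer.

```python
import math
import random


def sparse_projection_matrix(n_features, n_components, density, seed=0):
    """Build a sparse random projection matrix as a list of rows.

    Each entry is +v or -v with probability density/2 each and 0 otherwise,
    where v = sqrt(1 / (density * n_components)). (Hypothetical helper.)
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 / (density * n_components))
    matrix = []
    for _ in range(n_components):
        row = []
        for _ in range(n_features):
            u = rng.random()
            if u < density / 2:
                row.append(scale)
            elif u < density:
                row.append(-scale)
            else:
                row.append(0.0)
        matrix.append(row)
    return matrix


def project(sample, matrix):
    """Map one feature vector into the lower-dimensional projected space."""
    return [sum(w * x for w, x in zip(row, sample) if w != 0.0) for row in matrix]


def oblique_split(sample, weights, threshold):
    """Oblique split: threshold a linear combination of features,
    rather than a single feature as in an axis-aligned tree."""
    return sum(w * x for w, x in zip(weights, sample)) <= threshold


def isolate_family(labels, target):
    """One-vs-rest relabeling: 1 for the target family, 0 for all others."""
    return [1 if label == target else 0 for label in labels]


# Toy data standing in for memory forensic features (assumed values).
features = [[float(i + j) for j in range(8)] for i in range(4)]
labels = ["spyware", "ransomware", "trojan", "spyware"]

matrix = sparse_projection_matrix(n_features=8, n_components=3, density=0.5, seed=42)
projected = [project(row, matrix) for row in features]
binary_labels = isolate_family(labels, "spyware")
```

Each projected coordinate is a sparse linear combination of the original features, so thresholding it acts as an oblique split over only a few features. A full classifier would grow many trees over such splits and train one model per isolated family, which this sketch leaves out.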


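Source journal

Computers & Electrical Engineering (Engineering Technology: Engineering, Electrical & Electronic)

CiteScore: 9.20
Self-citation rate: 7.00%
Articles published: 661
Review time: 47 days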

About the journal: The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency. Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.