HExNet: Enhancing malware classification through hierarchical CNNs and multi-level feature attribution

Impact Factor 3.7 · Zone 2 (Computer Science) · JCR Q2 (Computer Science, Information Systems)
Muhammed Shafi K.P. , Vinod P. , Rafidha Rehiman K.A. , Alejandro Guerra-Manzanares
Journal of Information Security and Applications, vol. 94, Article 104207. Published 2025-10-07. Journal article; not open access.
DOI: 10.1016/j.jisa.2025.104207
https://www.sciencedirect.com/science/article/pii/S2214212625002443

Abstract

The ever-shifting landscape of malware presents a significant threat, as malware routinely circumvents traditional defenses. This paper presents HExNet, a Hierarchical Explainable Convolutional Neural Network (CNN) architecture designed to improve malware analysis and bolster security defenses. Recognizing the growing sophistication of malware, HExNet leverages a dual image representation, converting the assembly mnemonics and raw bytecode of malware into visual representations for in-depth pattern recognition. The architecture, optimized for performance and security relevance, integrates multi-level features to enhance detection accuracy. To increase trust and facilitate security audits, HExNet incorporates SHapley Additive exPlanations (SHAP), Class Activation Maps (CAM), and GIST descriptors, providing transparent insights into the model’s classification process. t-SNE visualizations further demonstrate HExNet’s ability to effectively separate malware families, aiding security intelligence. Evaluated on the Microsoft Malware Classification Challenge (BIG 2015) dataset, HExNet achieves an overall F1-score of 0.9890, with three malware families reaching a perfect F1-score of 1.0 and the remaining six families achieving near-optimal values. To evaluate its generalization capability, we further tested HExNet on a custom dataset consisting of 26,401 samples collected from VirusShare, where the proposed model achieved an F1-score of 0.9724, indicating that its performance generalizes across diverse malware datasets.
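The bytecode half of the dual image representation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how HExNet lays out its images, so everything here is an assumption: the function name `bytes_to_rows`, the fixed row width of 256, and the zero padding of the last row are all hypothetical, and the assembly-mnemonic image is not covered. The sketch only shows the general idea of treating each byte as one grayscale pixel (0 to 255).

```python
from typing import List


def bytes_to_rows(data: bytes, width: int = 256) -> List[List[int]]:
    """Lay raw bytes out as rows of grayscale pixel values (0-255).

    Hypothetical helper: the width and the zero padding are assumptions,
    not details taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Pad with zero bytes so the final row is complete.
    pad = (-len(data)) % width
    padded = data + bytes(pad)
    # Slice the padded stream into fixed-width rows, one int per pixel.
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Hypothetical byte stream standing in for a sample's raw bytes (770 bytes).
sample = bytes(range(256)) * 3 + b"\x90\x90"
img = bytes_to_rows(sample, width=256)
print(len(img), len(img[0]))
```

A 2-D array of this kind is what an image CNN would take as input after resizing or normalizing; how HExNet performs those steps is not described in the abstract.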
Source journal
Journal of Information Security and Applications (Computer Science: Computer Networks and Communications)
CiteScore: 10.90
Self-citation rate: 5.40%
Articles published: 206
Review time: 56 days
About the journal: Journal of Information Security and Applications (JISA) focuses on original research and practice-driven applications relevant to information security. JISA links the scientific and research community with industry professionals by offering a clear view of modern problems and challenges in information security and by identifying promising scientific and "best-practice" solutions. JISA issues balance original research work with innovative industrial approaches by internationally renowned information security experts and researchers.