A novel framework for cross-platform malware detection via AFSP and ADASYN-based balancing
Tong Anh Tuan, Pham Sy Nguyen, Pham Ngoc Van, Nguyen Duc Hai, Pham Duy Trung, Nguyen Thi Kim Son, Hoang Viet Long
Computers & Electrical Engineering, vol. 128, Article 110625, published 2025-08-31
DOI: 10.1016/j.compeleceng.2025.110625
https://www.sciencedirect.com/science/article/pii/S0045790625005683
Abstract
The rapid spread of malware and the growing complexity of attack methods demand accurate and scalable detection solutions, particularly classification techniques, in which both feature selection and model selection play a critical role. However, malware datasets are often high-dimensional and imbalanced, leading to biased models and suboptimal classification performance. This paper introduces CMF, a novel cross-platform malware detection framework that integrates Adaptive Feature Selection and Projection (AFSP) for dimensionality reduction, Adaptive Synthetic Sampling (ADASYN) for data balancing, and voting ensemble learning for classification. ADASYN consistently outperforms SMOTE by adaptively oversampling hard-to-learn boundary regions, which improves minority-class detection. AFSP preserves feature structure while reducing dimensionality, whereas PCA retains only the directions of maximal variance, making AFSP more effective for malware classification. Extensive experiments on four comprehensive available malware datasets demonstrate that CMF outperforms traditional and deep learning-based approaches, achieving superior accuracy and robustness. Notably, the largest improvement over the state of the art was close to 5%, on the CIC-MalMem-2022 (16 classes) dataset. The CMF framework is highly effective at detecting malware variants across multiple operating systems, such as Windows, Linux, and Android, and across heterogeneous cloud environments. This confirms the CMF framework as a scalable, high-performance solution for real-world malware detection across diverse environments.
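To make the phrase "adaptively oversampling hard-to-learn boundary regions" concrete, the following is a minimal, standard-library sketch of the general ADASYN idea. It is not the paper's implementation: the function names, the toy data, and the parameters `k` and `beta` are illustrative assumptions. Each minority sample is weighted by the share of majority-class points among its `k` nearest neighbours, and synthetic samples are allocated in proportion to that weight, whereas the original SMOTE formulation oversamples every minority sample at the same rate.

```python
import math
import random


def nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def adasyn_allocation(X, y, minority=1, k=3, beta=1.0):
    """Map each minority index to the number of synthetic samples it gets."""
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    if not minority_idx:
        return {}
    n_min = len(minority_idx)
    n_maj = len(y) - n_min
    total = round((n_maj - n_min) * beta)  # samples needed for the target balance
    # r_i: fraction of majority-class points among the k nearest neighbours.
    ratios = {
        i: sum(y[j] != minority for j in nearest(X, i, k)) / k
        for i in minority_idx
    }
    norm = sum(ratios.values())
    if norm == 0:  # no minority point sits near the boundary: spread evenly
        return {i: total // n_min for i in minority_idx}
    # Rounding means the counts may not sum exactly to `total` in general.
    return {i: round(total * r / norm) for i, r in ratios.items()}


def adasyn_oversample(X, y, minority=1, k=3, beta=1.0, seed=0):
    """Return copies of X and y extended with interpolated minority samples."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    min_points = [X[i] for i in minority_idx]
    alloc = adasyn_allocation(X, y, minority, k, beta)
    new_X, new_y = list(X), list(y)
    for pos, i in enumerate(minority_idx):
        # Assumption: interpolation partners are the k nearest minority points.
        partners = nearest(min_points, pos, k)
        if not partners:
            continue
        for _ in range(alloc[i]):
            other = min_points[rng.choice(partners)]
            lam = rng.random()  # interpolation weight in [0, 1)
            new_X.append(tuple(a + lam * (b - a) for a, b in zip(X[i], other)))
            new_y.append(minority)
    return new_X, new_y


# Toy data (hypothetical): 8 majority points, 1 minority point at the class
# boundary (index 8), and 4 minority points in a well-separated cluster.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (1, 2), (2, 1), (0, 2),
     (1.2, 1.2), (5, 5), (5.2, 5), (5, 5.2), (5.2, 5.2)]
y = [0] * 8 + [1] * 5

alloc = adasyn_allocation(X, y)
X_bal, y_bal = adasyn_oversample(X, y)
print(alloc)
print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

In this toy setting all three synthetic samples are allocated to the single boundary point, because the four points in the separated cluster have no majority-class neighbours. A production pipeline would more likely rely on a maintained implementation such as the `ADASYN` class in imbalanced-learn than on hand-written code like this.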
About the journal:
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.