EnFeSTDroid: Ensembled feature selection techniques based Android malware detection
Suruchi Jain, Hemant Goyal, Anshul Arora, Dhirendra Kumar
Computers & Electrical Engineering, Volume 129, Article 110763. Published 2025-10-18.
DOI: 10.1016/j.compeleceng.2025.110763
https://www.sciencedirect.com/science/article/pii/S0045790625007062
Journal impact factor: 4.9; JCR Q1 (Computer Science, Hardware & Architecture)
Abstract
Android smartphones have gained widespread popularity since 2008, making them frequent targets for malware. To address these threats, researchers have developed various detection models. Most existing techniques use either a single feature selection technique or a combination of only a few, which can cause important features to be overlooked. In this study, we propose a novel method that first extracts permissions from applications. It then applies six feature selection techniques, namely Information Gain, Extra Tree Classifier, Chi-Square, Mean Term Frequency (MTF), Inverse Document Frequency (IDF), and Mean Term Frequency–Inverse Document Frequency (MTF–IDF), to rank the permissions from most to least significant. Next, it applies Friedman’s test and the post hoc Nemenyi test to combine the rankings and identify the most relevant and distinguishing features for classifying malware. The results show that the proposed model correctly classifies 96.27% of the samples. The work is novel in combining six feature selection techniques so that the model can leverage the strengths of all of them rather than relying on one or a few. The proposed approach also outperforms several existing works in the literature.
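The rank-then-combine idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the toy permission matrix, the use of only three rankers (Information Gain, Chi-Square, and a hypothetical class-frequency gap standing in for the frequency-based scores, whose exact definitions the abstract does not give), the mean rank as the combined score, and the choice of permissions as Friedman treatments with rankers as blocks. The post hoc Nemenyi step is omitted, and the Friedman statistic is computed without tie correction.

```python
import math

# Toy data (hypothetical): rows are apps, columns are permissions, 1 = requested.
PERMS = ["SEND_SMS", "INTERNET", "READ_CONTACTS", "CAMERA"]
X = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 0],   # malware
     [0, 1, 1, 1], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]   # benign
y = [1, 1, 1, 1, 0, 0, 0, 0]

def counts(j):
    """2x2 counts for permission j: present/malware, present/benign,
    absent/malware, absent/benign."""
    a = sum(1 for r, lab in zip(X, y) if r[j] and lab)
    b = sum(1 for r, lab in zip(X, y) if r[j] and not lab)
    c = sum(1 for r, lab in zip(X, y) if not r[j] and lab)
    d = sum(1 for r, lab in zip(X, y) if not r[j] and not lab)
    return a, b, c, d

def entropy(pos, neg):
    """Binary Shannon entropy in bits; an empty split has entropy 0."""
    total = pos + neg
    return -sum(n / total * math.log2(n / total) for n in (pos, neg) if n)

def info_gain(j):
    a, b, c, d = counts(j)
    n = a + b + c + d
    return (entropy(a + c, b + d)
            - (a + b) / n * entropy(a, b)
            - (c + d) / n * entropy(c, d))

def chi_square(j):
    a, b, c, d = counts(j)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return (a + b + c + d) * (a * d - b * c) ** 2 / denom if denom else 0.0

def freq_gap(j):
    """Hypothetical stand-in scorer: gap in request rate between classes."""
    a, b, c, d = counts(j)
    return abs(a / (a + c) - b / (b + d))

def to_ranks(scores):
    """Rank 1 = highest score; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks, i = [0.0] * len(scores), 0
    while i < len(order):
        k = i
        while k + 1 < len(order) and scores[order[k + 1]] == scores[order[i]]:
            k += 1
        for m in range(i, k + 1):
            ranks[order[m]] = (i + k) / 2 + 1
        i = k + 1
    return ranks

def friedman_statistic(rank_rows):
    """Friedman chi-square (no tie correction): rows are blocks (rankers),
    columns are treatments (permissions)."""
    n, k = len(rank_rows), len(rank_rows[0])
    mean_ranks = [sum(row[j] for row in rank_rows) / n for j in range(k)]
    stat = 12 * n / (k * (k + 1)) * (
        sum(m * m for m in mean_ranks) - k * (k + 1) ** 2 / 4)
    return stat, mean_ranks

rank_rows = [to_ranks([score(j) for j in range(len(PERMS))])
             for score in (info_gain, chi_square, freq_gap)]
stat, mean_ranks = friedman_statistic(rank_rows)
combined = [name for _, name in sorted(zip(mean_ranks, PERMS))]
print("Friedman statistic:", stat)
print("Combined ranking (most to least significant):", combined)
```

A large Friedman statistic suggests that the rankers place the permissions consistently rather than at random, which is what makes an averaged ranking meaningful. In a fuller pipeline, a library routine such as `scipy.stats.friedmanchisquare` would supply the p-value, and a post hoc test would identify which permissions differ significantly.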
About the journal:
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.