An Ensemble Machine Learning Approach for Detecting and Classifying Malware Attacks on Mobile Devices
Authors: Eiman Alsharif, Maher Alharby
Journal: Arabian Journal for Science and Engineering, volume 50, issue 19, pages 15825-15841
Published: 2025-02-04
DOI: 10.1007/s13369-025-10011-5
URL: https://link.springer.com/article/10.1007/s13369-025-10011-5
Journal metrics: impact factor 2.9; JCR Q2 (Multidisciplinary Sciences)
The widespread use of mobile devices makes them targets for cybercriminals, especially with the rise of malware. Existing malware detection studies have limitations: they focus on subsets of datasets, rely on a single classification approach, and offer little usability in practical applications. This research develops a stacking ensemble method for detecting and classifying malware attacks on Android devices, employing supervised machine learning algorithms such as Random Forest, Decision Tree, Gaussian Naive Bayes, K-Nearest Neighbors, and Logistic Regression. Using the CIC-AndMal2017 dataset, we apply data preprocessing techniques to address missing data and class imbalance. We employ several feature selection methods, including Random Forest Importance, Principal Component Analysis, and Correlation-Based Selection, to reduce data dimensionality, and we use grid search for hyperparameter tuning. We assess model performance using accuracy, precision, recall, and F1 score, and we measure training and prediction times to gauge efficiency. For binary classification, the stacking technique achieved 99.86% on all four metrics (accuracy, precision, recall, and F1 score). For multi-class classification, it achieved 97.0% accuracy, 97.03% precision, 97.07% recall, and 97.03% F1 score. Finally, we develop a user-friendly web application that makes the proposed models accessible for detecting Android malware, supporting broader adoption and practical use.
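The pipeline described in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the abstract does not say which algorithm acts as the meta-learner (Logistic Regression is assumed here), the synthetic data from `make_classification` stands in for CIC-AndMal2017, the median importance threshold and all hyperparameters are arbitrary, metrics are weighted averages, and grid search is left out for brevity. The helper names `build_stacking_model` and `evaluate` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def build_stacking_model(random_state=0):
    """Feature selection by Random Forest importance, then a stacking ensemble."""
    # Base learners named in the abstract; hyperparameters are placeholders.
    base_learners = [
        ("rf", RandomForestClassifier(n_estimators=50, random_state=random_state)),
        ("dt", DecisionTreeClassifier(random_state=random_state)),
        ("gnb", GaussianNB()),
        # KNN is distance-based, so its inputs are standardized first.
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ]
    # Assumption: Logistic Regression combines the base learners' predictions.
    stack = StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,  # out-of-fold predictions are used to train the meta-learner
    )
    # Keep features whose Random Forest importance is at or above the median.
    selector = SelectFromModel(
        RandomForestClassifier(n_estimators=50, random_state=random_state),
        threshold="median",
    )
    return make_pipeline(selector, stack)


def evaluate(model, X_test, y_test):
    """Return accuracy plus weighted precision, recall, and F1."""
    y_pred = model.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Synthetic binary data as a stand-in for the real dataset (assumption).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_stacking_model()
model.fit(X_train, y_train)
scores = evaluate(model, X_test, y_test)
print(scores)
```

Scores on this synthetic data say nothing about the figures reported in the paper. To tune hyperparameters as the abstract describes, the pipeline returned by `build_stacking_model` could be wrapped in scikit-learn's `GridSearchCV`; the parameter grid would be a further assumption.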
About the journal:
King Fahd University of Petroleum & Minerals (KFUPM) partnered with Springer to publish the Arabian Journal for Science and Engineering (AJSE).
AJSE, which has been published by KFUPM since 1975, is a recognized national, regional and international journal that provides a great opportunity for the dissemination of research advances from the Kingdom of Saudi Arabia, MENA and the world.