Boosting classification accuracy using an efficient stochastic optimization technique for feature selection in high-dimensional data

Noureen Talpur, Shoaib-ul Hassan, Mohd Hafizul Afifi Abdullah, Abdulrahman Aminu Ghali, Ambreen Abdul Raheem, Shazia Khatoon, Norshakirah Aziz, Sivashankari Alaganandham

Swarm and Evolutionary Computation, Volume 97, Article 102025. Published 2025-06-05. DOI: 10.1016/j.swevo.2025.102025

Cited by: 0

Abstract
Many real-world problems involve a large number of features, many of which are irrelevant or redundant. This not only increases the dimensionality of the data but also degrades the classification performance of machine learning models. To address this issue, feature selection methods have been used extensively in the literature, either by applying existing algorithms or by developing new ones. However, many of these approaches suffer from limitations such as insufficient feature reduction caused by getting trapped in local minima within the large search space. Hence, this study proposes a recent stochastic optimization technique, the Osprey Optimization Algorithm (OOA). OOA balances exploration and exploitation effectively during the search process, making it well suited to high-dimensional optimization tasks. To validate the efficiency of the selected feature subsets, the study employs the k-nearest neighbor (k-NN) classifier. Comparative results between OOA and five state-of-the-art algorithms show that OOA achieves the highest average classification accuracy of 89.22%, while selecting the fewest features on average (70.63) and reducing the feature burden by 62.80%. Moreover, a non-parametric Wilcoxon signed-rank test on classification accuracy yields a p-value below 5.00E-02, confirming a statistically significant difference in performance among the six algorithms.
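The wrapper setup the abstract describes — a stochastic optimizer searching over binary feature masks, with k-NN classification accuracy as the fitness — can be sketched in miniature. This is an illustrative sketch only: the synthetic data, the penalty weight `alpha`, and the simple random bit-flip search (standing in for the osprey-style exploration/exploitation updates, which the abstract does not detail) are all assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 8 informative features plus 24 noise features (assumed, for illustration).
n, d_inf, d_noise = 200, 8, 24
X_inf = rng.normal(size=(n, d_inf))
y = (X_inf.sum(axis=1) > 0).astype(int)       # label depends only on informative features
X = np.hstack([X_inf, rng.normal(size=(n, d_noise))])

def knn_accuracy(Xs, ys, k=5):
    """Leave-one-out k-NN accuracy, used as the wrapper fitness signal."""
    dists = np.linalg.norm(Xs[:, None, :] - Xs[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)            # a point may not be its own neighbor
    idx = np.argsort(dists, axis=1)[:, :k]     # indices of the k nearest neighbors
    preds = (ys[idx].mean(axis=1) > 0.5).astype(int)
    return (preds == ys).mean()

def fitness(mask, alpha=0.99):
    """Reward accuracy, with a small penalty on the number of selected features."""
    if mask.sum() == 0:
        return 0.0
    acc = knn_accuracy(X[:, mask.astype(bool)], y)
    return alpha * acc + (1 - alpha) * (1 - mask.mean())

# Simple stochastic search over binary masks (random bit-flip hill climbing),
# a stand-in for the OOA update rules.
mask = rng.integers(0, 2, size=X.shape[1])
best = fitness(mask)
for _ in range(200):
    cand = mask.copy()
    cand[rng.integers(0, len(cand))] ^= 1      # flip one feature in or out
    f = fitness(cand)
    if f > best:
        mask, best = cand, f

print(f"selected {mask.sum()} of {len(mask)} features, fitness {best:.3f}")
```

Any population-based optimizer (including OOA) can be dropped into the search loop; only the mask-to-fitness mapping above is essential to the wrapper formulation.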
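The statistical comparison reported above can be reproduced in miniature with SciPy's paired Wilcoxon signed-rank test. The per-dataset accuracies below are hypothetical numbers chosen purely to show the mechanics, not results from the paper.

```python
from scipy.stats import wilcoxon

# Hypothetical paired accuracies of two algorithms across ten datasets (illustrative only).
ooa   = [0.91, 0.88, 0.93, 0.90, 0.87, 0.92, 0.89, 0.94, 0.90, 0.91]
rival = [0.86, 0.84, 0.90, 0.85, 0.83, 0.88, 0.86, 0.89, 0.87, 0.85]

# Two-sided test of whether the paired differences are centered at zero.
stat, p = wilcoxon(ooa, rival)
print(f"W = {stat}, p = {p:.4f}")
```

A p-value below 0.05, as in the abstract, rejects the null hypothesis that the two algorithms perform equivalently on the paired datasets.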
About the journal
Swarm and Evolutionary Computation is a peer-reviewed journal focused on the latest research and advances in nature-inspired intelligent computation using swarm and evolutionary algorithms. It covers theoretical, experimental, and practical aspects of these paradigms and their hybrids, and promotes interdisciplinary research. The journal prioritizes high-quality, original articles that push the boundaries of evolutionary computation and swarm intelligence, and also welcomes survey papers on current topics and novel applications. Topics of interest include, but are not limited to: Genetic Algorithms, Genetic Programming, Evolution Strategies, Evolutionary Programming, Differential Evolution, Artificial Immune Systems, Particle Swarms, Ant Colony, Bacterial Foraging, Artificial Bees, Firefly Algorithm, Harmony Search, Artificial Life, Digital Organisms, Estimation of Distribution Algorithms, Stochastic Diffusion Search, Quantum Computing, Nano Computing, Membrane Computing, Human-centric Computing, Hybridization of Algorithms, Memetic Computing, Autonomic Computing, Self-organizing Systems, and Combinatorial, Discrete, Binary, Constrained, Multi-objective, Multi-modal, Dynamic, and Large-scale Optimization.