Yelleti Vivek, Vadlamani Ravi, Ponnuthurai Nagaratnam Suganthan, P. Radha Krishna
{"title":"Parallel fractional dominance MOEAs for feature subset selection in big data","authors":"Yelleti Vivek , Vadlamani Ravi , Ponnuthurai Nagaratnam Suganthan , P. Radha Krishna","doi":"10.1016/j.swevo.2024.101687","DOIUrl":null,"url":null,"abstract":"<div><p>In this paper, we solve the feature subset selection (FSS) problem with three objective functions namely, cardinality, area under receiver operating characteristic curve (AUC) and Matthews correlation coefficient (MCC) using novel multi-objective evolutionary algorithms (MOEAs). MOEAs often encounter poor convergence due to the increase in non-dominated solutions and getting entrapped in the local optima. This situation worsens when dealing with large, voluminous big and high-dimensional datasets. To address these challenges, we propose parallel, fractional dominance-based MOEAs for FSS under Spark. Further, to improve the exploitation of MOEAs, we introduce a novel batch opposition-based learning (BOP) along with a cardinality constraint on the opposite solution. Accordingly, we propose two variants, namely, BOP1 and BOP2. In BOP1, a single neighbour is randomly chosen in the opposite solution space, whereas in BOP2, a group of randomly chosen neighbours in the opposite solution space. In either case, the opposite solutions are evaluated to improve the exploitation capability of the underlying MOEAs. We observe that in terms of mean optimal objective function values and across all datasets, the proposed BOP2 variant of parallel fractional dominance-based algorithms emerges as the top performer in obtaining efficient solutions. Further, we introduce a novel metric, namely the ratio of hypervolume (HV) and inverted generated distance (IGD), HV/IGD, that combines both diversity and convergence. 
With respect to the mean HV/IGD computed over 20 runs and Formula 1 racing, the BOP1 variants of fractional dominance-based MOEAs outperformed other algorithms.</p></div>","PeriodicalId":48682,"journal":{"name":"Swarm and Evolutionary Computation","volume":"91 ","pages":"Article 101687"},"PeriodicalIF":8.2000,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Swarm and Evolutionary Computation","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2210650224002256","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
In this paper, we solve the feature subset selection (FSS) problem with three objective functions, namely cardinality, area under the receiver operating characteristic curve (AUC), and Matthews correlation coefficient (MCC), using novel multi-objective evolutionary algorithms (MOEAs). MOEAs often converge poorly because the number of non-dominated solutions grows and the search becomes trapped in local optima. This situation worsens when dealing with voluminous, high-dimensional big datasets. To address these challenges, we propose parallel, fractional dominance-based MOEAs for FSS under Spark. Further, to improve the exploitation of MOEAs, we introduce a novel batch opposition-based learning (BOP) scheme along with a cardinality constraint on the opposite solution. Accordingly, we propose two variants, namely BOP1 and BOP2. In BOP1, a single neighbour is randomly chosen in the opposite solution space, whereas in BOP2, a group of neighbours is randomly chosen in the opposite solution space. In either case, the opposite solutions are evaluated to improve the exploitation capability of the underlying MOEAs. We observe that, in terms of mean optimal objective function values across all datasets, the proposed BOP2 variant of the parallel fractional dominance-based algorithms emerges as the top performer in obtaining efficient solutions. Further, we introduce a novel metric, namely the ratio of hypervolume (HV) to inverted generational distance (IGD), HV/IGD, that combines both diversity and convergence. With respect to the mean HV/IGD computed over 20 runs and Formula 1 racing, the BOP1 variants of fractional dominance-based MOEAs outperformed other algorithms.
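The abstract does not spell out how opposite-space neighbours are generated, so the following is only a minimal sketch of the BOP1/BOP2 idea under stated assumptions: a solution is a binary feature mask, its "opposite" is the bitwise complement, and the cardinality constraint is enforced by randomly keeping at most `max_features` selected features. The function names, the `max_features` and `batch_size` parameters, and the repair rule are hypothetical illustrations, not the authors' implementation.

```python
import random


def opposite_mask(mask):
    """Bitwise complement of a binary feature mask (assumed notion of 'opposite')."""
    return [1 - bit for bit in mask]


def constrained_opposite_neighbour(mask, max_features, rng):
    """Sample one opposite-space neighbour with at most max_features selected.

    Assumption: the cardinality constraint is met by randomly dropping
    surplus features from the complement of the current solution.
    """
    opp = opposite_mask(mask)
    selected = [i for i, bit in enumerate(opp) if bit == 1]
    if len(selected) > max_features:
        keep = set(rng.sample(selected, max_features))
        opp = [1 if i in keep else 0 for i in range(len(opp))]
    return opp


def bop1(mask, max_features, rng):
    """BOP1-style step: a single randomly chosen opposite neighbour."""
    return [constrained_opposite_neighbour(mask, max_features, rng)]


def bop2(mask, max_features, batch_size, rng):
    """BOP2-style step: a batch of randomly chosen opposite neighbours."""
    return [
        constrained_opposite_neighbour(mask, max_features, rng)
        for _ in range(batch_size)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    current = [1, 0, 0, 0, 0, 0]  # only feature 0 is selected
    print(bop1(current, max_features=3, rng=rng))
    print(bop2(current, max_features=2, batch_size=4, rng=rng))
```

In an actual MOEA, each returned candidate would then be scored on the three objectives (cardinality, AUC, MCC) and compared with the current population; that evaluation step is omitted here because the abstract does not describe it.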
About the journal:
Swarm and Evolutionary Computation is a pioneering peer-reviewed journal focused on the latest research and advancements in nature-inspired intelligent computation using swarm and evolutionary algorithms. It covers theoretical, experimental, and practical aspects of these paradigms and their hybrids, promoting interdisciplinary research. The journal prioritizes the publication of high-quality, original articles that push the boundaries of evolutionary computation and swarm intelligence. Additionally, it welcomes survey papers on current topics and novel applications. Topics of interest include but are not limited to: Genetic Algorithms and Genetic Programming, Evolution Strategies and Evolutionary Programming, Differential Evolution, Artificial Immune Systems, Particle Swarms, Ant Colony, Bacterial Foraging, Artificial Bees, Firefly Algorithm, Harmony Search, Artificial Life, Digital Organisms, Estimation of Distribution Algorithms, Stochastic Diffusion Search, Quantum Computing, Nano Computing, Membrane Computing, Human-centric Computing, Hybridization of Algorithms, Memetic Computing, Autonomic Computing, Self-organizing Systems, and Combinatorial, Discrete, Binary, Constrained, Multi-objective, Multi-modal, Dynamic, and Large-scale Optimization.