A particle swarm optimization based multiobjective memetic algorithm for high-dimensional feature selection

IF 3.3 | Zone 2, Computer Science | JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Memetic Computing Pub Date : 2022-01-29 DOI:10.1007/s12293-022-00354-z

Juanjuan Luo, Dongqing Zhou, Lingling Jiang, Huadong Ma

Citations: 10

Abstract

Feature selection, one of the dimension reduction methods, is a crucial processing step in dealing with high-dimensional data. It seeks a feature subset that represents the whole feature space, with the aim of reducing redundancy and increasing classification accuracy. Because these two objectives usually conflict with each other, feature selection is modeled as a multiobjective problem. However, the large search space and the discrete Pareto front make the problem difficult for existing evolutionary multiobjective algorithms. Classic evolutionary computation methods, which are often applied to feature selection directly, have gradually shown themselves to be inefficient in the search process. This paper therefore designs a particle swarm optimization based multiobjective memetic algorithm for high-dimensional feature selection to address these shortcomings. Its basic idea is to model feature selection as a multiobjective optimization problem that simultaneously optimizes the number of features and the classification accuracy in the supervised setting, with information entropy based initialization and adaptive local search designed to improve search efficiency. Moreover, a new particle velocity update rule that considers both the convergence and the diversity of solutions is designed to update particles, and a fast discrete nondominated sorting strategy is designed to rank the Pareto solutions. These strategies enable the proposed algorithm to perform better in both the quality and the size of the feature subset. The experimental results show that the proposed algorithm improves on the quality of the Pareto fronts evolved by state-of-the-art feature selection algorithms.
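To make the two-objective view in the abstract concrete, the following is a minimal sketch of Pareto dominance over candidate feature subsets, where each candidate is scored by (number of selected features, classification error) and both values are minimized. It is a plain quadratic-time filter written for illustration, not the paper's fast discrete nondominated sorting strategy; the names `dominates` and `nondominated` and the candidate scores are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumption: each candidate subset is summarized by two objectives to minimize:
# (number of selected features, classification error rate).
Objectives = Tuple[int, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def nondominated(points: Sequence[Objectives]) -> List[int]:
    """Return the indices of points that no other point dominates (naive O(n^2) filter)."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical scores for five candidate feature subsets (illustrative values only).
candidates: List[Objectives] = [(5, 0.10), (3, 0.12), (8, 0.10), (3, 0.20), (10, 0.05)]
front = nondominated(candidates)
print(front)
```

In this toy set, the subset with 8 features is dominated by the one with 5 features at the same error, and (3, 0.20) is dominated by (3, 0.12); the remaining candidates trade size against error and therefore form the first front. Because the feature count is an integer, many candidates share the same first objective value, which is one way the discreteness of the Pareto front mentioned in the abstract shows up.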

Source journal

Memetic Computing COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-OPERATIONS RESEARCH & MANAGEMENT SCIENCE

CiteScore

6.80

Self-citation rate

12.80%

Number of published articles

About the journal: Memes have been defined as basic units of transferable information that reside in the brain and are propagated across populations through the process of imitation. From an algorithmic point of view, memes have come to be regarded as building blocks of prior knowledge, expressed in arbitrary computational representations (e.g., local search heuristics, fuzzy rules, neural models, etc.), that have been acquired through experience by a human or machine, and can be imitated (i.e., reused) across problems.

The Memetic Computing journal welcomes papers incorporating the aforementioned socio-cultural notion of memes into artificial systems, with particular emphasis on enhancing the efficacy of computational and artificial intelligence techniques for search, optimization, and machine learning through explicit prior knowledge incorporation. The goal of the journal is thus to be an outlet for high-quality theoretical and applied research on hybrid, knowledge-driven computational approaches that may be characterized under any of the following categories of memetics:

Type 1: General-purpose algorithms integrated with human-crafted heuristics that capture some form of prior domain knowledge; e.g., traditional memetic algorithms hybridizing evolutionary global search with a problem-specific local search.

Type 2: Algorithms with the ability to automatically select, adapt, and reuse the most appropriate heuristics from a diverse pool of available choices; e.g., learning a mapping between global search operators and multiple local search schemes, given an optimization problem at hand.

Type 3: Algorithms that autonomously learn with experience, adaptively reusing data and/or machine learning models drawn from related problems as prior knowledge in new target tasks of interest; examples include, but are not limited to, transfer learning and optimization, multi-task learning and optimization, or any other multi-X evolutionary learning and optimization methodologies.