SMEM: A Subspace Merging Based Evolutionary Method for High-Dimensional Feature Selection

IF 5.3 | Zone 3, Computer Science | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Emerging Topics in Computational Intelligence Pub Date : 2024-09-04 DOI:10.1109/TETCI.2024.3451695

Kaixuan Li;Shibo Jiang;Rui Zhang;Jianfeng Qiu;Lei Zhang;Lixia Yang;Fan Cheng

Volume 9, Issue 2, pp. 1712-1727. https://ieeexplore.ieee.org/document/10665901/

Citations: 0

Abstract

Over the past decade, evolutionary algorithms (EAs) have shown promising performance on the problem of feature selection. Nevertheless, designing EAs for high-dimensional feature selection (HDFS) remains challenging, because the search space grows exponentially with the number of features, a problem known as the “curse of dimensionality”. To tackle this issue, this paper proposes a Subspace Merging based Evolutionary Method, termed SMEM. To avoid directly optimizing the large search space of HDFS, SMEM first divides the original feature space into several independent low-dimensional subspaces. In each subspace, a subpopulation is evolved to quickly obtain potentially good feature subsets. Then, to avoid missing some features, these low-dimensional subspaces are merged in pairs, and the search continues on the merged subspaces. While each merged subspace evolves, the good feature subsets obtained from the preceding pair of subspaces are fully utilized. This merging procedure repeats, gradually improving the performance of SMEM, until all the subspaces are merged into one final space. That final space is the original feature space of HDFS, which ensures that all the features in the data are considered. Experimental results on different high-dimensional datasets demonstrate the effectiveness and efficiency of the proposed SMEM compared with state-of-the-art methods.
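The divide, evolve, merge-in-pairs loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the random partition, the toy genetic operators, the seeding rule for merged subspaces, and every name here (`smem`, `evolve`, `toy_fitness`, `RELEVANT`) are assumptions made for illustration only. A real fitness function would presumably score a feature subset by classifier performance rather than by a hand-picked set of relevant features.

```python
import random

RELEVANT = frozenset({3, 11, 17, 29})  # hypothetical "truly useful" features

def toy_fitness(subset):
    """Stand-in objective (assumption): reward relevant features, penalize extras."""
    return len(subset & RELEVANT) - 0.1 * len(subset - RELEVANT)

def evolve(features, fitness, seeds=(), pop_size=20, generations=30, rng=None):
    """Toy GA over subsets of `features`; returns the population, best first."""
    rng = rng or random.Random(0)
    features = list(features)
    # Start from inherited good subsets, then fill up with random subsets.
    pop = [frozenset(s) for s in seeds][:pop_size]
    while len(pop) < pop_size:
        pop.append(frozenset(f for f in features if rng.random() < 0.5))
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            a, b = rng.sample(pop, 2)
            # Uniform crossover, restricted to the features of this subspace.
            child = {f for f in features if f in (a if rng.random() < 0.5 else b)}
            # Mutation: flip the membership of one randomly chosen feature.
            child ^= {rng.choice(features)}
            children.append(frozenset(child))
        # Elitist survival: keep the best pop_size of parents and children.
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
    return pop

def smem(n_features, fitness, n_subspaces=4, keep=5, rng=None):
    """Sketch of subspace merging: evolve small subspaces, then merge in pairs."""
    rng = rng or random.Random(0)
    n_subspaces = max(1, min(n_subspaces, n_features))
    order = list(range(n_features))
    rng.shuffle(order)  # assumption: features are partitioned at random
    spaces = [order[i::n_subspaces] for i in range(n_subspaces)]
    elites = [evolve(s, fitness, rng=rng)[:keep] for s in spaces]
    while len(spaces) > 1:
        next_spaces, next_elites = [], []
        for i in range(0, len(spaces) - 1, 2):
            merged = spaces[i] + spaces[i + 1]
            # Reuse good subsets from both parents, plus their pairwise unions.
            seeds = elites[i] + elites[i + 1]
            seeds += [a | b for a, b in zip(elites[i], elites[i + 1])]
            next_spaces.append(merged)
            next_elites.append(evolve(merged, fitness, seeds=seeds, rng=rng)[:keep])
        if len(spaces) % 2:  # an unpaired subspace is carried over unchanged
            next_spaces.append(spaces[-1])
            next_elites.append(elites[-1])
        spaces, elites = next_spaces, next_elites
    # Only one space remains, and it covers every original feature.
    return elites[0][0]

best = smem(40, toy_fitness, rng=random.Random(1))
print(sorted(best), toy_fitness(best))
```

With 40 features and 4 subspaces, each early search covers 2^10 candidate subsets instead of 2^40, and the last round of merging restores the full feature space, so no feature is excluded from the final search.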


Source journal

IEEE Transactions on Emerging Topics in Computational Intelligence (Mathematics: Control and Optimization)

CiteScore

10.30

Self-citation rate

7.50%

Articles published

147

About the journal: The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys. TETCI is an electronic-only publication with six issues per year. Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. Illustrative examples include glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.