A Multiform Framework for Multiobjective Feature Selection in Unbalanced Classification: Combining Oversampling and Cost-Sensitive Learning

IF 8.6 | Zone 1, Computer Science | Q1, AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Systems Man Cybernetics-Systems | Pub Date: 2025-06-10 | DOI: 10.1109/TSMC.2025.3573080

Jing Liang, Yu-Yang Zhang, Boyang Qu, Ke Chen, Kunjie Yu, Caitong Yue

Volume 55, Issue 8, Pages 5717-5729 | Journal Article | https://ieeexplore.ieee.org/document/11029247/

Citations: 0

Abstract

Unbalanced classification problems have attracted significant academic attention because they are widespread in the real world. Low recognition accuracy on minority-class samples and the “curse of dimensionality” are two major difficulties in these problems. Existing unbalanced classification methods risk losing original feature information and are prone to bias toward the majority class. Multiform optimization is known for capturing useful knowledge from alternative formulations of a task to help solve the original task. Motivated by this, this article introduces a multiform evolutionary framework for multiobjective feature selection in unbalanced classification. It uses the experience gained from selecting features on balanced datasets to assist the search for feature subsets that identify minority classes more accurately on the original dataset. Specifically, a knowledge transfer strategy draws on the search experience of an auxiliary task defined on the oversampled dataset to help the cost-sensitive learning task, defined on the original dataset, escape local optima. In addition, an offspring repairing mechanism filters redundant features by considering how frequently features are selected. Experimental results on 23 real-world benchmark datasets show that the proposed method selects fewer features and achieves better classification results than six state-of-the-art multiobjective feature selection algorithms and three classical oversampling algorithms. Furthermore, the performance differences among four base classifiers are investigated through a series of comparative experiments.
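
The abstract mentions filtering redundant features by the frequency with which features are selected, but does not give the actual rule. The following is only a minimal sketch of one plausible reading, not the paper's mechanism: the function names `selection_frequency` and `repair_offspring`, the binary-mask encoding, the `threshold` parameter, and the fallback that keeps one feature are all assumptions made for illustration.

```python
from typing import List, Sequence


def selection_frequency(population: Sequence[Sequence[int]]) -> List[float]:
    """Fraction of individuals (binary feature masks) that select each feature."""
    if not population:
        raise ValueError("population must not be empty")
    n_individuals = len(population)
    n_features = len(population[0])
    return [
        sum(individual[j] for individual in population) / n_individuals
        for j in range(n_features)
    ]


def repair_offspring(
    offspring: Sequence[int],
    frequency: Sequence[float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[int]:
    """Deselect features whose population-wide selection frequency is below threshold.

    If that would empty a non-empty subset, keep the most frequently
    selected feature of the offspring (an assumed safeguard).
    """
    repaired = [
        bit if frequency[j] >= threshold else 0
        for j, bit in enumerate(offspring)
    ]
    if any(offspring) and not any(repaired):
        best = max(
            (j for j, bit in enumerate(offspring) if bit),
            key=lambda j: frequency[j],
        )
        repaired[best] = 1
    return repaired


# Toy population of four individuals over four features.
population = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
]
freq = selection_frequency(population)
child = [1, 1, 1, 1]
repaired = repair_offspring(child, freq, threshold=0.5)
print(freq, repaired)
```

In this toy run, the two features selected by only one of the four individuals are dropped from the offspring, while the features selected by at least half of the population are kept. A real implementation would likely combine frequency with classification quality rather than use frequency alone.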


Source journal

IEEE Transactions on Systems Man Cybernetics-Systems AUTOMATION & CONTROL SYSTEMS-COMPUTER SCIENCE, CYBERNETICS

CiteScore: 18.50

Self-citation rate: 11.50%

Articles published: 812

Review time: 6 months

About the journal: The IEEE Transactions on Systems, Man, and Cybernetics: Systems encompasses the fields of systems engineering, covering issue formulation, analysis, and modeling throughout the systems engineering lifecycle phases. It addresses decision-making, issue interpretation, systems management, processes, and various methods such as optimization, modeling, and simulation in the development and deployment of large systems.