Title: A Multiform Framework for Multiobjective Feature Selection in Unbalanced Classification: Combining Oversampling and Cost-Sensitive Learning
Authors: Jing Liang; Yu-Yang Zhang; Boyang Qu; Ke Chen; Kunjie Yu; Caitong Yue
Journal: IEEE Transactions on Systems Man Cybernetics-Systems, vol. 55, no. 8, pp. 5717-5729
Publication date: 2025-06-10
Publication type: Journal Article
DOI: 10.1109/TSMC.2025.3573080
URL: https://ieeexplore.ieee.org/document/11029247/
Impact factor: 8.6 (JCR Q1, Automation & Control Systems)
Subject category: Computer Science
Open access: No
Citations: 0
Abstract
Unbalanced classification problems have attracted significant academic attention because they are widespread in real-world applications. Two major difficulties in these problems are the low recognition accuracy on minority-class samples and the “curse of dimensionality.” Existing unbalanced classification methods risk losing original feature information and tend to be biased toward the majority class. Multiform optimization is known for capturing useful knowledge from alternative formulations of a problem to help solve the original task. Motivated by this, this article introduces a multiform evolutionary framework for multiobjective feature selection in unbalanced classification. It aims to use the experience gained from selecting features on balanced datasets to assist the search for feature subsets that identify minority classes more accurately on the original dataset. Specifically, a knowledge transfer strategy is proposed that draws on the search experience of an auxiliary task, defined on the oversampled dataset, to help the cost-sensitive learning task, defined on the original dataset, escape local optima. In addition, an offspring repairing mechanism is proposed to filter redundant features by considering how frequently features are selected. Experimental results on 23 real-world benchmark datasets demonstrate that the proposed method selects fewer features and achieves better classification results than six state-of-the-art multiobjective feature selection algorithms and three classical oversampling algorithms. Furthermore, the performance differences among four base classifiers are investigated through a series of comparative experiments.
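The three ingredients named in the abstract (an oversampled auxiliary dataset, a cost-sensitive objective on the original dataset, and a frequency-based offspring repair) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the abstract does not give the paper's actual formulas, so the function names, the inverse-frequency cost weights, and the `min_freq` threshold are hypothetical and should not be read as the authors' method.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until every class has as
    many rows as the largest class (plain random oversampling, assumed
    here as a stand-in for whatever oversampler the paper uses)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def cost_sensitive_error(y_true, y_pred):
    """Misclassification cost with each class weighted by the inverse of
    its frequency (hypothetical weighting), so an error on a
    minority-class sample costs more than one on a majority-class sample."""
    counts = Counter(y_true)
    n_classes = len(counts)
    cost = 0.0
    for t, p in zip(y_true, y_pred):
        if t != p:
            cost += 1.0 / (counts[t] * n_classes)
    return cost


def repair_offspring(child, population, min_freq=0.5):
    """Clear selected bits of a binary feature mask when that feature is
    selected by fewer than `min_freq` of the population (hypothetical
    rule). If nothing survives, keep the single most frequent feature."""
    n = len(population)
    freq = [sum(ind[j] for ind in population) / n for j in range(len(child))]
    repaired = [bit if freq[j] >= min_freq else 0 for j, bit in enumerate(child)]
    if not any(repaired):
        best = max(range(len(child)), key=lambda j: freq[j])
        repaired[best] = 1
    return repaired


if __name__ == "__main__":
    X = [[1], [2], [3], [4]]
    y = [0, 0, 0, 1]
    X_bal, y_bal = random_oversample(X, y)
    print(Counter(y_bal))

    # Predicting the majority class everywhere is penalised heavily.
    print(cost_sensitive_error(y, [0, 0, 0, 0]))

    population = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 0, 1]]
    print(repair_offspring([1, 1, 1], population))
```

In a multiform setup of the kind the abstract describes, the same binary feature mask would be scored twice: once on the oversampled data for the auxiliary task and once with the cost-sensitive objective on the original data. How solutions are actually exchanged between the two tasks is specified in the paper, not in this sketch.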
About the journal:
The IEEE Transactions on Systems, Man, and Cybernetics: Systems encompasses the fields of systems engineering, covering issue formulation, analysis, and modeling throughout the systems engineering lifecycle phases. It addresses decision-making, issue interpretation, systems management, processes, and various methods such as optimization, modeling, and simulation in the development and deployment of large systems.