Classification Hardness Based Adaptive Sampling Ensemble for Imbalanced Data Classification
Authors: Zenghao Cui, Ziyi Gao, Shuaibing Yue, Rui Wang, Haiyan Zhu
Journal: Tsinghua Science and Technology, vol. 30, no. 6, pp. 2419-2433 (impact factor 3.5, JCR Q1)
DOI: 10.26599/TST.2024.9010149
Published: 2025-07-04
URL: https://ieeexplore.ieee.org/document/11072117/
Citations: 0
Abstract
Class imbalance can substantially degrade the performance of traditional classifiers, especially when identifying minority-class instances. Other challenges besides class imbalance can also hinder accurate classification. Researchers have explored various approaches to mitigate the effects of class imbalance, but most studies process correlations only within a single class of samples. This paper introduces an ensemble framework called Inter- and Intra-Class Overlapping Ensemble (IICOE), which incorporates two sampling methods. The first, an undersampling method based on classification hardness, targets majority-class samples: it uses simple samples as the foundation for classification and improves performance by focusing on samples near classification boundaries. The second addresses the overfitting of minority-class samples that arises in undersampling and ensemble learning: an adaptive augment hybrid sampling method is proposed, which enhances the classification boundary of samples and reduces overfitting. Experiments on 15 public datasets indicate that the IICOE framework outperforms other ensemble learning algorithms in classifying imbalanced data.
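The abstract does not give the details of the hardness-based undersampling step, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes binary labels (0 = majority, 1 = minority), assumes hardness is the absolute gap between a label and a model's predicted minority probability, and assumes a fixed split between "simple" samples and boundary samples. The function name `hardness_undersample` and the parameters `n_keep`, `easy_frac`, and `seed` are hypothetical.

```python
import random


def hardness_undersample(y, p_minority, n_keep, easy_frac=0.5, seed=0):
    """Pick all minority rows plus n_keep majority rows; return sorted indices.

    y          : list of labels, 0 (majority) or 1 (minority)
    p_minority : predicted probability of class 1 for each row, in [0, 1]
    Assumption of this sketch: hardness = |label - predicted probability|.
    """
    rng = random.Random(seed)
    majority = [i for i, label in enumerate(y) if label == 0]
    minority = [i for i, label in enumerate(y) if label == 1]
    n_keep = min(n_keep, len(majority))

    # For a majority row (label 0) the hardness equals its predicted
    # minority probability: near 0 is easy, near 0.5 is on the boundary.
    hardness = {i: abs(y[i] - p_minority[i]) for i in majority}

    # Part 1: keep the easiest majority rows as the "simple" foundation.
    by_hardness = sorted(majority, key=lambda i: hardness[i])
    n_easy = round(easy_frac * n_keep)
    easy, rest = by_hardness[:n_easy], by_hardness[n_easy:]

    # Part 2: weighted sampling without replacement from the remaining rows,
    # favouring hardness close to 0.5 (random keys u ** (1 / weight),
    # keep the largest keys).
    def sampling_key(i):
        weight = 1.0 - 2.0 * abs(hardness[i] - 0.5) + 1e-9
        return rng.random() ** (1.0 / weight)

    boundary = sorted(rest, key=sampling_key, reverse=True)[: n_keep - n_easy]
    return sorted(minority + easy + boundary)


if __name__ == "__main__":
    labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
    probs = [0.05, 0.1, 0.45, 0.5, 0.55, 0.9, 0.2, 0.3, 0.8, 0.6]
    print(hardness_undersample(labels, probs, n_keep=4))
```

In an ensemble setting, a step like this would typically be repeated for each base learner with a different seed, using the current ensemble's predictions as `p_minority`; that loop is likewise an assumption here rather than something stated in the abstract.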
About the journal:
Tsinghua Science and Technology (Tsinghua Sci Technol) started publication in 1996. It is an international academic journal sponsored by Tsinghua University and published bimonthly. The journal aims to present up-to-date scientific achievements in computer science, electronic engineering, and other IT fields. Contributions from all over the world are welcome.