An improved ensemble approach for imbalanced classification problems
B. Krawczyk, G. Schaefer
2013 IEEE 8th International Symposium on Applied Computational Intelligence and Informatics (SACI), 2013-05-23
DOI: https://doi.org/10.1109/SACI.2013.6609011
Citation count: 22
Abstract
Classification of imbalanced data is a challenging task in machine learning, as most classification approaches tend to be biased towards the majority class, even though the minority class is often the more important one. Consequently, methods capable of boosting classification accuracy on the minority class are sought after. In this paper, we propose an improved ensemble approach for imbalanced classification. Our algorithm undersamples the majority class to create balanced object subspaces, on which individual classifiers are trained. As not all generated classifiers will be useful for ensemble construction, we carry out a pruning procedure to discard irrelevant models. This classifier selection uses a diversity measure to identify mutually complementary classifiers. The remaining predictors are combined using a trained fuser based on discriminants. Extensive experimental results on several benchmark datasets demonstrate that our proposed method adequately addresses class imbalance and statistically outperforms several state-of-the-art classifier ensembles dedicated to imbalanced classification.
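The pipeline outlined in the abstract (balanced undersampling, per-subset training, diversity-based pruning, fusion) can be illustrated with a minimal Python sketch. It is not the authors' implementation, since the abstract does not name the diversity measure or specify the trained fuser. As stand-ins, the sketch assumes mean pairwise disagreement as the diversity measure and an untrained majority vote in place of the discriminant-based fuser. All names here (`balanced_subsets`, `CentroidClassifier`, `prune_by_diversity`, `majority_vote`) are hypothetical, labels are assumed to be 0 (majority) and 1 (minority), and the base learner is a toy nearest-centroid rule.

```python
import random
from itertools import combinations


def balanced_subsets(X, y, n_subsets, minority_label=1, seed=0):
    """Undersample the majority class to build balanced training subsets.

    Every subset keeps all minority objects plus an equally sized random
    draw (without replacement) from the majority class.
    """
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]
    subsets = []
    for _ in range(n_subsets):
        idx = minority + rng.sample(majority, len(minority))
        subsets.append(([X[i] for i in idx], [y[i] for i in idx]))
    return subsets


class CentroidClassifier:
    """Toy base learner (assumption): predict the class with the nearest mean."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))
            for x in X
        ]


def disagreement(pred_a, pred_b):
    """Fraction of objects on which two classifiers predict different labels."""
    return sum(a != b for a, b in zip(pred_a, pred_b)) / len(pred_a)


def prune_by_diversity(predictions, keep):
    """Return indices of the `keep` members with the highest summed disagreement.

    Assumed stand-in for the paper's diversity-based classifier selection.
    """
    n = len(predictions)
    score = [0.0] * n
    for i, j in combinations(range(n), 2):
        d = disagreement(predictions[i], predictions[j])
        score[i] += d
        score[j] += d
    ranked = sorted(range(n), key=lambda i: score[i], reverse=True)
    return sorted(ranked[:keep])


def majority_vote(predictions):
    """Stand-in fuser (untrained): output 1 if at least half the members vote 1."""
    return [int(2 * sum(votes) >= len(votes)) for votes in zip(*predictions)]
```

A typical use would fit one `CentroidClassifier` per subset returned by `balanced_subsets`, collect each member's predictions on a validation set, pass them to `prune_by_diversity`, and fuse the surviving members' predictions with `majority_vote`. Ranking by diversity alone can retain inaccurate members, so a practical selection step would likely also account for individual accuracy.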