多类失衡问题:分析与可能的解决方案。

Shuo Wang, Xin Yao
{"title":"多类失衡问题:分析与可能的解决方案。","authors":"Shuo Wang,&nbsp;Xin Yao","doi":"10.1109/TSMCB.2012.2187280","DOIUrl":null,"url":null,"abstract":"<p><p>Class imbalance problems have drawn growing interest recently because of their classification difficulty caused by the imbalanced class distributions. In particular, many ensemble methods have been proposed to deal with such imbalance. However, most efforts so far are only focused on two-class imbalance problems. There are unsolved issues in multiclass imbalance problems, which exist in real-world applications. This paper studies the challenges posed by the multiclass imbalance problems and investigates the generalization ability of some ensemble solutions, including our recently proposed algorithm AdaBoost.NC, with the aim of handling multiclass and imbalance effectively and directly. We first study the impact of multiminority and multimajority on the performance of two basic resampling techniques. They both present strong negative effects. \"Multimajority\" tends to be more harmful to the generalization performance. Motivated by the results, we then apply AdaBoost.NC to several real-world multiclass imbalance tasks and compare it to other popular ensemble methods. AdaBoost.NC is shown to be better at recognizing minority class examples and balancing the performance among classes in terms of G-mean without using any class decomposition. </p>","PeriodicalId":55006,"journal":{"name":"IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics","volume":" ","pages":"1119-30"},"PeriodicalIF":0.0000,"publicationDate":"2012-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TSMCB.2012.2187280","citationCount":"440","resultStr":"{\"title\":\"Multiclass Imbalance Problems: Analysis and Potential Solutions.\",\"authors\":\"Shuo Wang,&nbsp;Xin Yao\",\"doi\":\"10.1109/TSMCB.2012.2187280\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Class imbalance problems have drawn growing interest recently because of their classification difficulty caused by the imbalanced class distributions. In particular, many ensemble methods have been proposed to deal with such imbalance. However, most efforts so far are only focused on two-class imbalance problems. There are unsolved issues in multiclass imbalance problems, which exist in real-world applications. This paper studies the challenges posed by the multiclass imbalance problems and investigates the generalization ability of some ensemble solutions, including our recently proposed algorithm AdaBoost.NC, with the aim of handling multiclass and imbalance effectively and directly. We first study the impact of multiminority and multimajority on the performance of two basic resampling techniques. They both present strong negative effects. \\\"Multimajority\\\" tends to be more harmful to the generalization performance. Motivated by the results, we then apply AdaBoost.NC to several real-world multiclass imbalance tasks and compare it to other popular ensemble methods. AdaBoost.NC is shown to be better at recognizing minority class examples and balancing the performance among classes in terms of G-mean without using any class decomposition. </p>\",\"PeriodicalId\":55006,\"journal\":{\"name\":\"IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics\",\"volume\":\" \",\"pages\":\"1119-30\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1109/TSMCB.2012.2187280\",\"citationCount\":\"440\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/TSMCB.2012.2187280\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2012/3/16 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TSMCB.2012.2187280","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2012/3/16 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 440

摘要

由于班级分布不平衡导致班级分类困难,班级不平衡问题近年来引起了人们越来越多的关注。特别是,许多集成方法被提出来处理这种不平衡。然而,到目前为止,大多数努力只集中在两级失衡问题上。在实际应用中存在的多类不平衡问题中,有一些尚未解决的问题。本文研究了多类不平衡问题带来的挑战,并研究了一些集成解的泛化能力,包括我们最近提出的算法AdaBoost。数控,目的是有效、直接地处理多类和不平衡。我们首先研究了多少数和多多数对两种基本重采样技术性能的影响。它们都呈现出强烈的负面影响。“多多数”倾向于对泛化性能更有害。受到结果的激励,我们然后应用AdaBoost。NC应用于几个现实世界的多类不平衡任务,并将其与其他流行的集成方法进行比较。演算法。NC在识别少数类示例方面表现得更好,并且在不使用任何类分解的情况下,根据g均值平衡类之间的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Multiclass Imbalance Problems: Analysis and Potential Solutions.

Class imbalance problems have drawn growing interest recently because of their classification difficulty caused by the imbalanced class distributions. In particular, many ensemble methods have been proposed to deal with such imbalance. However, most efforts so far are only focused on two-class imbalance problems. There are unsolved issues in multiclass imbalance problems, which exist in real-world applications. This paper studies the challenges posed by the multiclass imbalance problems and investigates the generalization ability of some ensemble solutions, including our recently proposed algorithm AdaBoost.NC, with the aim of handling multiclass and imbalance effectively and directly. We first study the impact of multiminority and multimajority on the performance of two basic resampling techniques. They both present strong negative effects. "Multimajority" tends to be more harmful to the generalization performance. Motivated by the results, we then apply AdaBoost.NC to several real-world multiclass imbalance tasks and compare it to other popular ensemble methods. AdaBoost.NC is shown to be better at recognizing minority class examples and balancing the performance among classes in terms of G-mean without using any class decomposition.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
审稿时长
6.0 months
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信