SAMME.C2 algorithm for imbalanced multi-class classification

IF 3.1 | Zone 3, Computer Science | JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Banghee So, Emiliano A. Valdez
Published: 2024-07-24
DOI: 10.1007/s00500-024-09847-0 (https://doi.org/10.1007/s00500-024-09847-0)
Citations: 0

Abstract

Classification predictive modeling involves the accurate assignment of observations in a dataset to target classes or categories. Real-world classification problems with severely imbalanced class distributions have increased substantially in recent years. In such cases, significantly fewer observations are available for minority classes to learn from than for majority classes. Despite this sparsity, the minority class is often considered as the more interesting class, yet the development of a scientific learning algorithm that is suitable for these observations presents numerous challenges. In this study, we further explore the merits of an effective multi-class classification algorithm known as SAMME.C2 that is specialized for handling severely imbalanced classes. This innovative method blends the flexible mechanics of the boosting techniques from the SAMME algorithm, which is a multi-class classifier, and the Ada.C2 algorithm, which is a cost-sensitive binary classifier that is designed to address highly imbalanced classes. We establish a scientific and statistical formulation of the SAMME.C2 algorithm, together with providing and explaining the resulting procedure. We demonstrate the consistently superior performance of this algorithm through numerical experiments as well as empirical studies.
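
The blend described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the usual SAMME learner weight (a log-odds term plus log(K - 1) for K classes) and an Ada.C2-style update in which a per-class cost multiplies each observation weight inside the exponential reweighting. The function names, the decision-stump weak learner, the toy data, and the cost values are all hypothetical and chosen only for illustration.

```python
import math


def fit_stump(X, y, w, classes):
    """Weighted decision stump: one feature, one threshold, one class per side."""
    total = sum(w)
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = {c: 0.0 for c in classes}
            right = {c: 0.0 for c in classes}
            for row, label, wi in zip(X, y, w):
                side = left if row[j] <= t else right
                side[label] += wi
            cl = max(classes, key=lambda c: left[c])   # heaviest class on the left
            cr = max(classes, key=lambda c: right[c])  # heaviest class on the right
            err = total - left[cl] - right[cr]         # weighted misclassification
            if best is None or err < best[0]:
                best = (err, j, t, cl, cr)
    _, j, t, cl, cr = best
    return lambda row: cl if row[j] <= t else cr


def fit_cost_sensitive_samme(X, y, cost, n_rounds=10):
    """Assumed update rule: SAMME-style alpha with cost-weighted sums (Ada.C2 style)."""
    classes = sorted(set(y))
    n_classes = len(classes)
    w = [1.0 / len(y)] * len(y)
    c = [cost[label] for label in y]  # hypothetical per-class misclassification costs
    ensemble = []
    for _ in range(n_rounds):
        h = fit_stump(X, y, w, classes)
        miss = [h(row) != label for row, label in zip(X, y)]
        good = sum(ci * wi for ci, wi, m in zip(c, w, miss) if not m)
        bad = sum(ci * wi for ci, wi, m in zip(c, w, miss) if m)
        if good <= 0:
            break                      # learner is wrong everywhere; stop
        if bad == 0:
            ensemble.append((1.0, h))  # perfect learner; keep it and stop
            break
        alpha = math.log(good / bad) + math.log(n_classes - 1)
        if alpha <= 0:
            break                      # no better than cost-weighted guessing
        ensemble.append((alpha, h))
        # Costs multiply every weight; misclassified points are also scaled by exp(alpha).
        w = [ci * wi * math.exp(alpha * m) for ci, wi, m in zip(c, w, miss)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble, classes


def predict(ensemble, classes, row):
    """Weighted vote of the weak learners."""
    votes = {k: 0.0 for k in classes}
    for alpha, h in ensemble:
        votes[h(row)] += alpha
    return max(classes, key=lambda k: votes[k])


# Toy imbalanced data: 10, 3, and 2 observations for classes 0, 1, and 2.
X = [[float(v)] for v in range(15)]
y = [0] * 10 + [1] * 3 + [2] * 2
cost = {0: 1.0, 1: 2.0, 2: 4.0}  # higher cost for rarer classes (assumed values)
ensemble, classes = fit_cost_sensitive_samme(X, y, cost, n_rounds=4)
preds = [predict(ensemble, classes, row) for row in X]
print(preds)
```

Because the cost enters the update in every round, observations from costly (minority) classes keep a larger share of the weight than they would under plain SAMME, which pushes later weak learners toward classifying them correctly. How the costs should be chosen is outside the scope of this sketch.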

Source journal: Soft Computing (Engineering & Technology; Computer Science: Interdisciplinary Applications)
CiteScore: 8.10
Self-citation rate: 9.80%
Articles published: 927
Review time: 7.3 months
Journal description: Soft Computing is dedicated to system solutions based on soft computing techniques. It provides rapid dissemination of important results in soft computing technologies, a fusion of research in evolutionary algorithms and genetic programming, neural science and neural net systems, fuzzy set theory and fuzzy systems, and chaos theory and chaotic systems. Soft Computing encourages the integration of soft computing techniques and tools into both everyday and advanced applications. By linking the ideas and techniques of soft computing with other disciplines, the journal serves as a unifying platform that fosters comparisons, extensions, and new applications. As a result, the journal is an international forum for all scientists and engineers engaged in research and development in this fast-growing field.