Long-tailed classification based on dynamic class average loss

IF 7.5 · CAS Region 1 (Computer Science) · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Do Ryun Lee, Chang Ouk Kim
DOI: 10.1016/j.eswa.2025.128292
Journal: Expert Systems with Applications, Volume 288, Article 128292
Publication date: 2025-05-23 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0957417425019116
Citations: 0

Abstract

In real-world data distributions, class imbalance is a common issue. When training deep learning models on class-imbalanced data, the performance of classes with fewer samples tends to deteriorate. Numerous studies have addressed this problem, focusing on loss reweighting techniques based on the number of training samples per class. However, because some classes are inherently easier or harder to classify, having a larger number of samples in a particular class does not necessarily ensure lower loss or better learning for that class. Additionally, if the ratio of loss magnitudes differs substantially from the ratio of the number of training samples per class, reweighting based solely on sample size may be inappropriate. This study proposes a method to reweight losses based on dynamic class average losses rather than the number of training samples per class to address these issues. Specifically, this method evaluates the class average losses for each mini-batch, applies a nonlinear transformation to these values, and dynamically adjusts the class-wise loss weights within the loss function during training to better mitigate class imbalance. Experimental results from various types of datasets, including image and tabular data, demonstrate that the proposed method improves performance by 1%–8% across various datasets compared to existing methods.
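The reweighting scheme described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the choice of power transform, the `gamma` hyperparameter, and computing the class averages from the current mini-batch alone (rather than a running average maintained across training) are all assumptions made here for clarity.

```python
def class_average_losses(losses, labels, num_classes):
    """Mean per-sample loss for each class in the mini-batch (0.0 if absent)."""
    sums = [0.0] * num_classes
    counts = [0] * num_classes
    for loss, y in zip(losses, labels):
        sums[y] += loss
        counts[y] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def dynamic_weights(avg_losses, gamma=0.5, eps=1e-8):
    """Nonlinear transform of the class average losses (a power transform,
    chosen here for illustration), normalized so the weights average to 1."""
    transformed = [(l + eps) ** gamma for l in avg_losses]
    mean_t = sum(transformed) / len(transformed)
    return [t / mean_t for t in transformed]

def reweighted_batch_loss(losses, labels, num_classes, gamma=0.5):
    """Batch loss with each sample's loss scaled by its class's dynamic weight,
    so classes that are currently harder (higher average loss) count for more."""
    weights = dynamic_weights(class_average_losses(losses, labels, num_classes),
                              gamma)
    return sum(weights[y] * loss for loss, y in zip(losses, labels)) / len(losses)
```

In the method as described, these weights would be recomputed every mini-batch, so a class's influence on the gradient tracks how poorly it is currently being learned rather than how many training samples it has.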
Source journal

Expert Systems with Applications (Engineering Technology - Electrical and Electronic Engineering)
CiteScore: 13.80
Self-citation rate: 10.60%
Articles per year: 2045
Review time: 8.7 months
Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.