Bi-granularity balance learning for long-tailed image classification

IF 3.5 3区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computer Vision and Image Understanding Pub Date : 2025-08-20 DOI:10.1016/j.cviu.2025.104469

Ning Ren , Xiaosong Li , Yanxia Wu , Yan Fu

{"title":"Bi-granularity balance learning for long-tailed image classification","authors":"Ning Ren , Xiaosong Li , Yanxia Wu , Yan Fu","doi":"10.1016/j.cviu.2025.104469","DOIUrl":null,"url":null,"abstract":"<div><div>In long-tailed datasets, the training of deep neural network-based models faces challenges, where the model may become biased towards the head classes with abundant training data, resulting in poor performance on tail classes with limited samples. Most current methods employ contrastive learning to learn more balanced representations by finding the class center. However, these methods use class centers to address local imbalance within a mini-batch, they overlook the global imbalance between batches throughout an epoch, caused by the long-tailed distribution of the dataset. In this paper, we propose <strong>bi-granularity balance</strong> learning to address the two-layer imbalance. We decouple the attraction–repulsion term in contrastive loss into two independent components: global and local balance. The global balance component focuses on capturing semantic information from different perspectives of the image and shifting learning attention from the head classes to the tail classes in the global perspective. The local balance component aims to learn inter-class separability from the local perspective. The proposed method efficiently learns the intra-class compactness and inter-class separability in long-tailed model training and improves the performance of the long-tailed model. Experimental results show that the proposed method achieves competitive performance on long-tailed benchmarks such as CIFAR-10/100-LT, TinyImageNet-LT, and iNaturalist 2018.</div></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":"261 ","pages":"Article 104469"},"PeriodicalIF":3.5000,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Vision and Image Understanding","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1077314225001924","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

In long-tailed datasets, the training of deep neural network-based models faces challenges, where the model may become biased towards the head classes with abundant training data, resulting in poor performance on tail classes with limited samples. Most current methods employ contrastive learning to learn more balanced representations by finding the class center. However, these methods use class centers to address local imbalance within a mini-batch, they overlook the global imbalance between batches throughout an epoch, caused by the long-tailed distribution of the dataset. In this paper, we propose bi-granularity balance learning to address the two-layer imbalance. We decouple the attraction–repulsion term in contrastive loss into two independent components: global and local balance. The global balance component focuses on capturing semantic information from different perspectives of the image and shifting learning attention from the head classes to the tail classes in the global perspective. The local balance component aims to learn inter-class separability from the local perspective. The proposed method efficiently learns the intra-class compactness and inter-class separability in long-tailed model training and improves the performance of the long-tailed model. Experimental results show that the proposed method achieves competitive performance on long-tailed benchmarks such as CIFAR-10/100-LT, TinyImageNet-LT, and iNaturalist 2018.

查看原文本刊更多论文

基于双粒度平衡学习的长尾图像分类

在长尾数据集中，基于深度神经网络的模型训练面临挑战，模型可能会偏向训练数据丰富的头部类，导致样本有限的尾部类训练效果不佳。目前大多数方法采用对比学习，通过寻找类中心来学习更平衡的表征。然而，这些方法使用类中心来解决小批内的局部不平衡，它们忽略了整个epoch中批次之间的全局不平衡，这是由数据集的长尾分布引起的。在本文中，我们提出了双粒度平衡学习来解决两层不平衡。我们将对比损失中的吸引-排斥项解耦为两个独立的分量：全局平衡和局部平衡。全局平衡组件侧重于从图像的不同角度捕获语义信息，并在全局视角下将学习注意力从头部类转移到尾部类。局部平衡组件旨在从局部角度学习类间可分离性。该方法有效地学习了长尾模型训练中的类内紧密性和类间可分离性，提高了长尾模型的性能。实验结果表明，该方法在CIFAR-10/100-LT、TinyImageNet-LT和iNaturalist 2018等长尾基准测试上取得了具有竞争力的性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Computer Vision and Image Understanding 工程技术-工程：电子与电气

CiteScore

7.80

自引率

4.40%

发文量

112

审稿时长

79 days

期刊介绍： The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views. Research Areas Include: • Theory • Early vision • Data structures and representations • Shape • Range • Motion • Matching and recognition • Architecture and languages • Vision systems