Incremental few-shot instance segmentation via feature enhancement and prototype calibration
Weixiang Gao, Caijuan Shi, Rui Wang, Ao Cai, Changyu Duan, Meiqin Liu
Computer Vision and Image Understanding, Volume 253, Article 104317. Published 2025-02-12.
DOI: 10.1016/j.cviu.2025.104317
https://www.sciencedirect.com/science/article/pii/S1077314225000402
Abstract
Incremental few-shot instance segmentation (iFSIS) aims to detect and segment instances of novel classes with only a few training samples, while maintaining performance on base classes without revisiting base class data. iMTFA, a representative iFSIS method, offers a flexible approach for adding novel classes. Its key mechanism involves generating novel class weights by normalizing and averaging embeddings obtained from K-shot novel instances. However, relying on such a small sample size often leads to insufficient representation of the real class distribution, which in turn results in biased weights for the novel classes. Furthermore, due to the absence of novel fine-tuning, iMTFA tends to predict potential novel class foregrounds as background, which exacerbates the bias in the generated novel class weights. To overcome these limitations, we propose a simple but effective iFSIS method, named Enhancement and Calibration-based iMTFA (EC-iMTFA). Specifically, we first design an embedding enhancement and aggregation (EEA) module, which enhances the feature diversity of each novel instance embedding before generating novel class weights. We then design a novel prototype calibration (NPC) module that leverages the well-calibrated base class and background weights in the classifier to enhance the discriminability of novel class prototypes. In addition, a simple weight preprocessing (WP) mechanism is designed based on NPC to improve the calibration process further. Extensive experiments on COCO and VOC datasets demonstrate that EC-iMTFA outperforms iMTFA in terms of iFSIS and iFSOD performance, stability, and efficiency without requiring novel fine-tuning. Moreover, EC-iMTFA achieves competitive results compared to recent state-of-the-art methods.
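The weight-generation step that the abstract attributes to iMTFA (normalize each K-shot embedding, then average) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy 2-D embeddings, and the cosine-similarity scoring rule are assumptions made for illustration, and the EEA, NPC, and WP modules are not modeled because the abstract does not specify them.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def class_weight_from_shots(embeddings):
    """Normalize each of the K instance embeddings, then average them."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    normalized = [l2_normalize(e) for e in embeddings]
    k = len(normalized)
    dim = len(normalized[0])
    return [sum(e[i] for e in normalized) / k for i in range(dim)]


def cosine_score(embedding, weight):
    """Cosine similarity between a query embedding and a class weight.

    Assumption: a cosine classifier is used here only to show how the
    generated weight could be applied; the abstract does not state this.
    """
    a, b = l2_normalize(embedding), l2_normalize(weight)
    return sum(x * y for x, y in zip(a, b))


# Hypothetical embeddings for K = 2 shots of one novel class.
shots = [[3.0, 4.0], [0.0, 2.0]]
weight = class_weight_from_shots(shots)
```

With only K embeddings, a single atypical shot shifts the averaged weight noticeably, which is the bias from small sample size that the abstract describes.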
About the journal:
The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views.
Research Areas Include:
• Theory
• Early vision
• Data structures and representations
• Shape
• Range
• Motion
• Matching and recognition
• Architecture and languages
• Vision systems