Prototype-wise self-knowledge distillation for few-shot segmentation

IF 3.4, Zone 3 (Engineering Technology), Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

Signal Processing-Image Communication Pub Date : 2024-08-21 DOI:10.1016/j.image.2024.117186

Yadang Chen, Xinyu Xu, Chenchen Wei, Chuhan Lu

Volume 129, Article 117186. Full text: https://www.sciencedirect.com/science/article/pii/S0923596524000870

Citations: 0

Abstract

Few-shot segmentation aims to segment an image containing an unseen class by referring to only a few labeled samples. However, because so few samples are available, many few-shot segmentation models generalize poorly. Prototypical network-based few-shot segmentation in particular still suffers from spatial inconsistency and prototype bias. Because the target class looks different in each image, some image-specific features in the prototypes generated from the support image and its mask do not accurately reflect the generalized features of the target class. To address this support prototype consistency issue, we put forward two modules: Data Augmentation Self-knowledge Distillation (DASKD) and Prototype-wise Regularization (PWR). The DASKD module enhances spatial consistency through data augmentation and self-knowledge distillation; self-knowledge distillation helps the model acquire generalized features of the target class and learn hidden knowledge from the support images. The PWR module applies a prototype-level loss so that the support prototypes lie closer to the category center and are therefore more representative. Extensive experiments on PASCAL-$5^{i}$ and COCO-$20^{i}$ demonstrate that our model outperforms prior work on few-shot segmentation, surpassing the state of the art by 7.5% on PASCAL-$5^{i}$ and 4.2% on COCO-$20^{i}$.
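The abstract names the ingredients (a prototype built from the support image and its mask, a prototype-level loss, and self-knowledge distillation over augmented data) but gives no formulas. The following is a minimal sketch of how such pieces are commonly wired together, not the authors' implementation. It assumes masked average pooling for the prototype, cosine distance to a class-center proxy for the prototype-level loss, and a temperature-softened KL divergence between predictions on an original and an augmented view for the distillation term. All function names, the choice of teacher and student views, and the toy numbers are hypothetical.

```python
import math


def masked_average_prototype(features, mask):
    """Average the feature vectors at positions where the mask is 1."""
    channels = len(features[0])
    total = [0.0] * channels
    count = 0
    for vector, keep in zip(features, mask):
        if keep:
            count += 1
            for c in range(channels):
                total[c] += vector[c]
    if count == 0:
        raise ValueError("mask selects no foreground positions")
    return [value / count for value in total]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def prototype_regularization(support_prototype, center_prototype):
    """Assumed prototype-level loss: cosine distance to a class-center proxy."""
    return 1.0 - cosine_similarity(support_prototype, center_prototype)


def softmax(logits, temperature):
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def self_distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Assumed distillation loss: mean per-position KL(teacher || student)."""
    eps = 1e-12
    loss = 0.0
    for t_logits, s_logits in zip(teacher_logits, student_logits):
        p = softmax(t_logits, temperature)
        q = softmax(s_logits, temperature)
        loss += sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    return loss / len(teacher_logits)


# Toy data: four spatial positions, two feature channels.
features = [[1.0, 0.0], [3.0, 2.0], [0.0, 5.0], [5.0, 4.0]]
mask = [1, 1, 0, 0]  # the first two positions belong to the target class
prototype = masked_average_prototype(features, mask)  # [2.0, 1.0]

# Hypothetical class-center proxy, e.g. an average over several support prototypes.
center = [2.0, 2.0]
pwr_loss = prototype_regularization(prototype, center)

# Per-position [background, foreground] logits for an original and an augmented view.
original_view = [[0.2, 1.5], [0.1, 2.0]]
augmented_view = [[0.4, 1.1], [0.3, 1.6]]
kd_loss = self_distillation_loss(original_view, augmented_view)

print(prototype, pwr_loss, kd_loss)
```

In this sketch both loss terms shrink toward zero as the support prototype aligns with the center proxy and as the two views agree, which mirrors the roles the abstract assigns to PWR and DASKD respectively.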


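Source journal

Signal Processing-Image Communication (Engineering Technology - Engineering: Electrical & Electronic)

CiteScore: 8.40

Self-citation rate: 2.90%

Articles published: 138

Review time: 5.2 months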

Journal description: Signal Processing: Image Communication is an international journal for the development of the theory and practice of image communication. Its primary objectives are to present a forum for the advancement of theory and practice of image communication; to stimulate cross-fertilization between areas similar in nature which have traditionally been separated, for example, various aspects of visual communications and information systems; and to contribute to a rapid information exchange between the industrial and academic environments. The editorial policy and the technical content of the journal are the responsibility of the Editor-in-Chief, the Area Editors and the Advisory Editors. The journal is self-supporting from subscription income and contains a minimum amount of advertisements. Advertisements are subject to the prior approval of the Editor-in-Chief. The journal welcomes contributions from every country in the world.

Signal Processing: Image Communication publishes articles relating to aspects of the design, implementation and use of image communication systems. The journal features original research work, tutorial and review articles, and accounts of practical developments. Subjects of interest include image/video coding, 3D video representations and compression, 3D graphics and animation compression, HDTV and 3DTV systems, video adaptation, video over IP, peer-to-peer video networking, interactive visual communication, multi-user video conferencing, wireless video broadcasting and communication, visual surveillance, 2D and 3D image/video quality measures, pre/post processing, video restoration and super-resolution, multi-camera video analysis, motion analysis, content-based image/video indexing and retrieval, face and gesture processing, video synthesis, 2D and 3D image/video acquisition and display technologies, and architectures for image/video processing and communication.