Few-shot object detection via data augmentation and distribution calibration

IF 2.4 4区计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Vision and Applications Pub Date : 2023-12-08 DOI:10.1007/s00138-023-01486-z

Songhao Zhu, Kai Zhang

{"title":"Few-shot object detection via data augmentation and distribution calibration","authors":"Songhao Zhu, Kai Zhang","doi":"10.1007/s00138-023-01486-z","DOIUrl":null,"url":null,"abstract":"<p>General object detection has been widely developed and studied over the past few years, while few-shot object detection is still in the exploratory stage. Learning effective knowledge from a limited number of samples is challenging, as the trained model is prone to over-fitting due to biased feature distributions in a few training samples. There exist two significant challenges in traditional few-shot object detection methods: (1) The scarcity of extreme samples aggravates the proposal distribution bias, hindering the evolution of regions of interest (ROI) heads toward new categories; (2) Due to the scarce of the samples in novel categories, the region proposal network (RPN) is identified as a key source of classification errors, resulting in a significant decrease in detection performance on novel categories. To overcome these challenges, an effective knowledge transfer method based on distributed calibration and data augmentation is proposed. Firstly, the biased novel category distributions are calibrated with the basic category distributions; secondly, a drift compensation strategy is utilized to reduce the negative impact on new categories classifications during the fine-tuning process; thirdly, synthetic features are obtained from calibrated distributions of novel categories and added to the subsequent training process. Furthermore, the domain-aware data augmentation is utilized to alleviate the issue of data scarcity by exploiting the cross-image foreground—background mixture to increase the diversity and rationality of augmented data. Experimental results demonstrate the effectiveness and applicability of the proposed method.</p>","PeriodicalId":51116,"journal":{"name":"Machine Vision and Applications","volume":"195 1","pages":""},"PeriodicalIF":2.4000,"publicationDate":"2023-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Vision and Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00138-023-01486-z","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

General object detection has been widely developed and studied over the past few years, while few-shot object detection is still in the exploratory stage. Learning effective knowledge from a limited number of samples is challenging, as the trained model is prone to over-fitting due to biased feature distributions in a few training samples. There exist two significant challenges in traditional few-shot object detection methods: (1) The scarcity of extreme samples aggravates the proposal distribution bias, hindering the evolution of regions of interest (ROI) heads toward new categories; (2) Due to the scarce of the samples in novel categories, the region proposal network (RPN) is identified as a key source of classification errors, resulting in a significant decrease in detection performance on novel categories. To overcome these challenges, an effective knowledge transfer method based on distributed calibration and data augmentation is proposed. Firstly, the biased novel category distributions are calibrated with the basic category distributions; secondly, a drift compensation strategy is utilized to reduce the negative impact on new categories classifications during the fine-tuning process; thirdly, synthetic features are obtained from calibrated distributions of novel categories and added to the subsequent training process. Furthermore, the domain-aware data augmentation is utilized to alleviate the issue of data scarcity by exploiting the cross-image foreground—background mixture to increase the diversity and rationality of augmented data. Experimental results demonstrate the effectiveness and applicability of the proposed method.

Abstract Image

查看原文本刊更多论文

通过数据扩增和分布校准进行少量物体检测

在过去的几年中，一般物体检测技术得到了广泛的发展和研究，而少镜头物体检测技术仍处于探索阶段。从有限的样本中学习有效的知识具有挑战性，因为在少数训练样本中，由于特征分布存在偏差，训练后的模型容易出现过拟合。传统的少镜头物体检测方法存在两个重大挑战：（1）极端样本的稀缺加剧了提议分布偏差，阻碍了感兴趣区域（ROI）头部向新类别的演化；（2）由于新类别样本的稀缺，区域提议网络（RPN）被认为是分类错误的关键来源，导致新类别的检测性能显著下降。为了克服这些挑战，我们提出了一种基于分布式校准和数据增强的有效知识转移方法。首先，将有偏差的新类别分布与基本类别分布进行校准；其次，利用漂移补偿策略减少微调过程中对新类别分类的负面影响；第三，从校准后的新类别分布中获取合成特征，并将其添加到后续训练过程中。此外，利用跨图像的前景-背景混合来增加增强数据的多样性和合理性，从而利用领域感知数据增强来缓解数据稀缺的问题。实验结果证明了所提方法的有效性和适用性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Machine Vision and Applications 工程技术-工程：电子与电气

CiteScore

6.30

自引率

3.00%

发文量

审稿时长

8.7 months

期刊介绍： Machine Vision and Applications publishes high-quality technical contributions in machine vision research and development. Specifically, the editors encourage submittals in all applications and engineering aspects of image-related computing. In particular, original contributions dealing with scientific, commercial, industrial, military, and biomedical applications of machine vision, are all within the scope of the journal. Particular emphasis is placed on engineering and technology aspects of image processing and computer vision. The following aspects of machine vision applications are of interest: algorithms, architectures, VLSI implementations, AI techniques and expert systems for machine vision, front-end sensing, multidimensional and multisensor machine vision, real-time techniques, image databases, virtual reality and visualization. Papers must include a significant experimental validation component.