Few-shot object detection via data augmentation and distribution calibration
Authors: Songhao Zhu, Kai Zhang
Journal: Machine Vision and Applications (JCR Q3, Computer Science, Artificial Intelligence; impact factor 2.4)
Published: 2023-12-08
DOI: 10.1007/s00138-023-01486-z (https://doi.org/10.1007/s00138-023-01486-z)
Citation count: 0
Abstract
General object detection has been studied extensively over the past few years, whereas few-shot object detection is still at an exploratory stage. Learning effective knowledge from a limited number of samples is challenging because the trained model is prone to over-fitting on the biased feature distributions of a few training samples. Traditional few-shot object detection methods face two significant challenges: (1) the extreme scarcity of samples aggravates the proposal distribution bias, hindering the adaptation of region of interest (RoI) heads to novel categories; (2) because samples of novel categories are scarce, the region proposal network (RPN) becomes a key source of classification errors, which significantly degrades detection performance on novel categories. To overcome these challenges, an effective knowledge transfer method based on distribution calibration and data augmentation is proposed. First, the biased novel-category distributions are calibrated with the base-category distributions; second, a drift compensation strategy is used to reduce the negative impact on novel-category classification during fine-tuning; third, synthetic features are sampled from the calibrated novel-category distributions and added to the subsequent training process. Furthermore, domain-aware data augmentation is used to alleviate data scarcity by exploiting cross-image foreground-background mixing to increase the diversity and plausibility of the augmented data. Experimental results demonstrate the effectiveness and applicability of the proposed method.
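The abstract does not give the calibration formula, so the following is only a minimal sketch of one common way to calibrate a few-shot class distribution with base-class statistics and then sample synthetic features from it. It is not the authors' implementation. The function names, the hyperparameters `k` and `alpha`, the diagonal-variance simplification, and the toy feature values are all assumptions made for illustration.

```python
import math
import random
from statistics import fmean


def calibrate(novel_feats, base_stats, k=2, alpha=0.1):
    """Calibrate a novel class's per-dimension mean/variance using base classes.

    novel_feats: list of feature vectors (the few shots) for one novel class.
    base_stats:  dict name -> (mean_vector, variance_vector) for base classes.
    k, alpha:    hypothetical hyperparameters (neighbours used, variance slack).
    """
    dim = len(novel_feats[0])
    novel_mean = [fmean(f[d] for f in novel_feats) for d in range(dim)]

    # Keep the k base classes whose means are closest to the novel mean.
    nearest = sorted(
        base_stats.values(),
        key=lambda stat: math.dist(stat[0], novel_mean),
    )[:k]

    # Calibrated mean: average of the novel mean and the selected base means.
    mean = [
        (novel_mean[d] + sum(m[d] for m, _ in nearest)) / (len(nearest) + 1)
        for d in range(dim)
    ]
    # Calibrated variance: average base variance plus a constant slack term.
    var = [fmean(v[d] for _, v in nearest) + alpha for d in range(dim)]
    return mean, var


def sample_features(mean, var, n, rng):
    """Draw n synthetic feature vectors from the calibrated diagonal Gaussian."""
    return [
        [rng.gauss(mu, math.sqrt(s2)) for mu, s2 in zip(mean, var)]
        for _ in range(n)
    ]


# Toy base-class statistics and two shots of a novel class (made-up numbers).
base_stats = {
    "cat": ([1.0, 0.0], [0.2, 0.1]),
    "dog": ([0.8, 0.2], [0.3, 0.2]),
    "car": ([-3.0, 4.0], [1.0, 1.0]),
}
novel_feats = [[0.9, 0.3], [1.1, 0.1]]

mean, var = calibrate(novel_feats, base_stats, k=2, alpha=0.1)
synthetic = sample_features(mean, var, n=5, rng=random.Random(0))
```

In a detector, the synthetic vectors would be mixed with the real RoI features of the novel class during fine-tuning. A full covariance matrix could replace the diagonal variance used here at the cost of more statistics to estimate.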
About the journal:
Machine Vision and Applications publishes high-quality technical contributions in machine vision research and development. Specifically, the editors encourage submissions in all applications and engineering aspects of image-related computing. In particular, original contributions dealing with scientific, commercial, industrial, military, and biomedical applications of machine vision are all within the scope of the journal.
Particular emphasis is placed on engineering and technology aspects of image processing and computer vision.
The following aspects of machine vision applications are of interest: algorithms, architectures, VLSI implementations, AI techniques and expert systems for machine vision, front-end sensing, multidimensional and multisensor machine vision, real-time techniques, image databases, virtual reality and visualization. Papers must include a significant experimental validation component.