Towards Few-Shot Object Detection Through Dual Calibration

IF 14.3; Zone 1 (Engineering and Technology); Q1 Computer Science, Artificial Intelligence

IEEE Transactions on Intelligent Vehicles; Pub Date: 2024-09-16; DOI: 10.1109/TIV.2024.3461742

Ding Sheng Ong; Yi Liu; Jungong Han

Volume 10, Issue 6, pp. 3670-3683; Journal Article; not open access
https://ieeexplore.ieee.org/document/10681243/

Citations: 0

Abstract

Object detection is crucial in traffic scenes for accurately identifying multiple objects within complex environments. Traditional systems rely on deep learning models trained on large-scale datasets, but this approach can be expensive and impractical. Few-shot object detection (FSOD) offers a potential solution by addressing limited data availability. However, object detectors trained with FSOD frameworks often generalize poorly on classes with limited samples. Most existing methods alleviate this problem by calibrating either the feature maps or the prediction heads of the object detector; none has proposed, as this work does, a unified dual calibration strategy that operates in both the latent feature space and the prediction probability space of the object detector. Specifically, we propose to improve representation precision by reducing the variance of feature vectors using highly adaptive centroids learned from ensembles of training features in the latent space. These centroids are employed to calibrate the features and reveal the underlying structure of the latent feature space. Moreover, we exploit the association between the query and support features to calibrate inaccurate predictions that result from overfitting or underfitting when the detector is fine-tuned with few training samples and few training iterations. Through visualization, we demonstrate that our method produces more discriminative high-level features, ultimately improving the precision of an object detector's predictions. To validate the effectiveness of our approach, we conduct comprehensive experiments on well-known benchmarks, including PASCAL VOC and MS-COCO, showing considerable performance gains compared to existing works.
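
The abstract does not give the calibration formulas, so the following is only an illustrative sketch of the two ideas it names, not the authors' method. It assumes per-class mean vectors as a stand-in for the learned centroids, cosine similarity as the query-support association, and hypothetical mixing parameters `alpha`, `beta`, and `temperature`. The first function pulls a feature toward similarity-weighted centroids (reducing within-class spread); the second blends classifier probabilities with similarity to the support prototypes.

```python
import math


def mean_vector(vectors):
    """Per-dimension mean; a stand-in for a learned centroid (assumption)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def calibrate_feature(feature, centroids, temperature=0.1, alpha=0.5):
    """Feature-space step: move the feature toward a similarity-weighted
    mixture of centroids. alpha and temperature are hypothetical knobs."""
    weights = softmax([cosine(feature, c) / temperature for c in centroids])
    mix = [sum(w * c[i] for w, c in zip(weights, centroids))
           for i in range(len(feature))]
    return [(1 - alpha) * f + alpha * m for f, m in zip(feature, mix)]


def calibrate_probs(probs, query, prototypes, temperature=0.1, beta=0.5):
    """Probability-space step: blend classifier probabilities with a
    distribution derived from query-to-support similarity (assumption)."""
    sim = softmax([cosine(query, p) / temperature for p in prototypes])
    return [(1 - beta) * p + beta * s for p, s in zip(probs, sim)]


# Toy 2-D support features for two novel classes (made-up numbers).
support = {"car": [[1.0, 0.1], [0.9, 0.0]], "bike": [[0.0, 1.0], [0.1, 0.9]]}
classes = sorted(support)  # ['bike', 'car']
centroids = [mean_vector(support[c]) for c in classes]

query = [0.8, 0.3]             # query feature that resembles 'car'
raw_probs = [0.55, 0.45]       # classifier output that wrongly favours 'bike'

calibrated = calibrate_feature(query, centroids)
probs = calibrate_probs(raw_probs, calibrated, centroids)

car = centroids[classes.index("car")]
dist_before = math.dist(query, car)
dist_after = math.dist(calibrated, car)
print(classes, probs, dist_before, dist_after)
```

In this toy setting the calibrated feature ends up closer to the `car` centroid than the raw query, and the blended probabilities favour `car` even though the raw classifier output did not. Because both inputs to the blend are probability distributions, the result still sums to one.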


Source journal

IEEE Transactions on Intelligent Vehicles (Mathematics: Control and Optimization)

CiteScore: 12.10
Self-citation rate: 13.40%
Articles published: 177

Journal description: The IEEE Transactions on Intelligent Vehicles (T-IV) is a premier platform for publishing peer-reviewed articles that present innovative research concepts, application results, significant theoretical findings, and application case studies in the field of intelligent vehicles. With a particular emphasis on automated vehicles within roadway environments, T-IV aims to raise awareness of pressing research and application challenges. Our focus is on providing critical information to the intelligent vehicle community, serving as a dissemination vehicle for IEEE ITS Society members and others interested in learning about the state-of-the-art developments and progress in research and applications related to intelligent vehicles. Join us in advancing knowledge and innovation in this dynamic field.