One Shot Object Detection Via Hierarchical Adaptive Alignment

2022 IEEE International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2022-12-13. DOI: 10.1109/VCIP56404.2022.10008884

Enquan Zhang, Cheolkon Jung

Abstract

Deep learning based object detectors perform well when abundant labeled data are available. However, data labeling is often expensive and time-consuming in practice, which motivates introducing one-shot learning into object detection. In this paper, we propose one-shot object detection based on hierarchical adaptive alignment to address the limited information that a single shot provides for feature representation. We present a multi-adaptive alignment framework based on Faster R-CNN that extracts features from the query patch and the target image using siamese convolutional feature extraction, then generates a fused feature map by aggregating the query and target features. The fused feature map is used for object classification and localization. The proposed framework adaptively adjusts the feature representation through hierarchical and aggregated alignment so that it can learn the correlation between the target image and the query patch. Experimental results show that the proposed method improves unseen-class object detection on the MS-COCO dataset from 24.3 AP50 to 26.2 AP50, a gain of 1.9 points.
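
The siamese extraction and query-target fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the backbone or the aggregation operator, so the toy per-channel "backbone" and the fusion by concatenation plus a cosine-similarity channel are assumptions, and every function name below is hypothetical.

```python
import math

def shared_extractor(image, weights):
    """Toy stand-in for a siamese backbone: the SAME per-channel weights
    are applied to whichever H x W x C input is passed in (assumption)."""
    return [[[v * w for v, w in zip(pixel, weights)] for pixel in row]
            for row in image]

def global_average_pool(feature):
    """Collapse an H x W x C feature map into one C-dimensional vector."""
    pixels = [pixel for row in feature for pixel in row]
    channels = len(pixels[0])
    return [sum(p[c] for p in pixels) / len(pixels) for c in range(channels)]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def fuse(target_feature, query_vector):
    """Assumed aggregation: at each target location, concatenate the target
    feature, the pooled query vector, and their cosine similarity."""
    return [[pixel + query_vector + [cosine(pixel, query_vector)]
             for pixel in row]
            for row in target_feature]

# Hypothetical inputs: a 1x1 query patch and a 1x2 target image, 2 channels.
weights = [1.0, 0.5]
query_patch = [[[1.0, 0.0]]]
target_image = [[[1.0, 0.0], [0.0, 2.0]]]

query_vector = global_average_pool(shared_extractor(query_patch, weights))
fused = fuse(shared_extractor(target_image, weights), query_vector)
```

In this toy case the first target location matches the query direction and the second is orthogonal to it, so the similarity channel separates them. In a Faster R-CNN style detector, a fused map of this kind would then feed the region proposal and classification/localization heads.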
