Improved Few-Shot Object Detection Method Based on Faster R-CNN

Impact factor: 2.0 | Region 4 (Computer Science) | JCR Q3: Computer Science, Artificial Intelligence
YangJie Wei, Shangwei Long, Yutong Wang
IET Image Processing, vol. 19, no. 1. Published 2025-03-20. DOI: 10.1049/ipr2.70038
https://onlinelibrary.wiley.com/doi/10.1049/ipr2.70038
Citations: 0


Improved Few-Shot Object Detection Method Based on Faster R-CNN

Uneven distribution of object features and insufficient feature learning significantly reduce the accuracy and generalizability of existing detection methods. This paper proposes an improved two-stage few-shot object detection method that builds on the Faster R-CNN (faster region-based convolutional neural network) framework to improve detection when training data are limited. First, a modified data augmentation method for optical images is introduced, and a Gaussian optimization module for the sample feature distribution is constructed to enhance the model's generalizability. Second, a parameter-free 3D spatial attention module is added to enhance the spatial features of a sample; it uses a measurement of neuron linear separability and a feature optimization module based on mathematical operations to adjust the feature distribution and reduce data distribution bias. Finally, a class feature vector extractor based on meta-learning reconstructs the feature map by overlaying a class feature vector from the target domain onto the query image. Together, these steps improve accuracy and generalization performance: multiple experiments on the PASCAL VOC dataset show that the proposed method achieves higher detection accuracy and stronger generalizability than other methods. In particular, an experiment using practical images in complicated environments indicates its potential effectiveness in real-world scenarios.
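The abstract does not give formulas for the attention module or the class-vector overlay, so the following is only a rough sketch under stated assumptions. It assumes the parameter-free 3D attention behaves like a SimAM-style energy weighting (each neuron is scored by how far it deviates from its channel mean, then gated by a sigmoid), and that "overlaying" a class feature vector means channel-wise multiplication of the query feature map by a globally average-pooled support feature map. The function names, the `lam` regularizer, and the list-of-lists layout (channels x rows x columns) are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z):
    """Logistic function; z is always >= 0.5 in this sketch, so exp cannot overflow."""
    return 1.0 / (1.0 + math.exp(-z))


def parameter_free_attention(feature_map, lam=1e-4):
    """SimAM-style weighting of a C x H x W feature map (nested lists).

    Assumption: a neuron that deviates strongly from its channel mean is
    treated as more linearly separable and receives a larger weight.
    No learnable parameters are involved; lam is a fixed regularizer.
    """
    out = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        # Squared deviation of every neuron from the channel mean.
        sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
        # Channel variance estimated over the other (n - 1) neurons.
        var = sum(d for row in sq_dev for d in row) / max(len(values) - 1, 1)
        out.append([
            [v * sigmoid(d / (4.0 * (var + lam)) + 0.5) for v, d in zip(row, d_row)]
            for row, d_row in zip(channel, sq_dev)
        ])
    return out


def class_feature_vector(support_map):
    """Global average pooling: one scalar per channel of a support feature map."""
    return [
        sum(v for row in channel for v in row) / sum(len(row) for row in channel)
        for channel in support_map
    ]


def overlay_class_vector(query_map, class_vec):
    """Assumed overlay: scale each query channel by the matching class-vector entry."""
    return [
        [[v * w for v in row] for row in channel]
        for channel, w in zip(query_map, class_vec)
    ]


if __name__ == "__main__":
    query = [[[1.0, 1.0], [1.0, 5.0]]]      # 1 channel, 2 x 2
    support = [[[2.0, 4.0], [2.0, 4.0]]]    # 1 channel, 2 x 2
    attended = parameter_free_attention(query)
    reweighted = overlay_class_vector(attended, class_feature_vector(support))
    print(reweighted)
```

In this sketch the outlier neuron (5.0) keeps a larger fraction of its activation than the neurons near the channel mean, which is the intended effect of a separability-based weighting; the real module in the paper may differ in both the energy function and how the class vector is combined with the query features.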

Source journal
IET Image Processing (Engineering Technology: Engineering, Electrical & Electronic)
CiteScore: 5.40
Self-citation rate: 8.70%
Articles published: 282
Review time: 6 months
Journal description: The IET Image Processing journal encompasses research areas related to the generation, processing and communication of visual information. The focus of the journal is the coverage of the latest research results in image and video processing, including image generation and display, enhancement and restoration, segmentation, colour and texture analysis, coding and communication, implementations and architectures as well as innovative applications. Principal topics include:

Generation and Display - Imaging sensors and acquisition systems, illumination, sampling and scanning, quantization, colour reproduction, image rendering, display and printing systems, evaluation of image quality.
Processing and Analysis - Image enhancement, restoration, segmentation, registration, multispectral, colour and texture processing, multiresolution processing and wavelets, morphological operations, stereoscopic and 3-D processing, motion detection and estimation, video and image sequence processing.
Implementations and Architectures - Image and video processing hardware and software, design and construction, architectures and software, neural, adaptive, and fuzzy processing.
Coding and Transmission - Image and video compression and coding, compression standards, noise modelling, visual information networks, streamed video.
Retrieval and Multimedia - Storage of images and video, database design, image retrieval, video annotation and editing, mixed media incorporating visual information, multimedia systems and applications, image and video watermarking, steganography.
Applications - Innovative application of image and video processing technologies to any field, including life sciences, earth sciences, astronomy, document processing and security.

Current Special Issue Call for Papers:
Evolutionary Computation for Image Processing - https://digital-library.theiet.org/files/IET_IPR_CFP_EC.pdf
AI-Powered 3D Vision - https://digital-library.theiet.org/files/IET_IPR_CFP_AIPV.pdf
Multidisciplinary advancement of Imaging Technologies: From Medical Diagnostics and Genomics to Cognitive Machine Vision, and Artificial Intelligence - https://digital-library.theiet.org/files/IET_IPR_CFP_IST.pdf
Deep Learning for 3D Reconstruction - https://digital-library.theiet.org/files/IET_IPR_CFP_DLR.pdf