A path aggregation network with deformable convolution for visual object detection

IF 2.5 | Zone 4, Computer Science | JCR Q2: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

PeerJ Computer Science | Pub Date: 2025-08-18 | eCollection Date: 2025-01-01 | DOI: 10.7717/peerj-cs.3083

Chengming Rao, Zunhao Hu, QiMing Zhao, Min Shan, Li Mao

PeerJ Computer Science, volume 11, article e3083 (Journal Article). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453868/pdf/

Citations: 0

Abstract



One of the main challenges in visual object detection is handling objects at multiple scales, and many approaches have been proposed to address it. In this article, we propose a novel neck that fuses multi-scale features effectively for a single-stage object detector. The neck, named the deformable convolution and path aggregation network (DePAN), integrates a path aggregation network with a deformable convolution block added to the feature fusion branch, which makes the sampling of feature points more flexible. The deformable convolution block is built by repeatedly stacking a deformable convolution cell. The DePAN neck is plug-and-play and can be applied easily to various object detection models. We apply the proposed neck to the YOLOv6-N and YOLOv6-T baseline models and test the improved models on the COCO2017 and PASCAL VOC2012 datasets, as well as on a medical image dataset. The experimental results verify the effectiveness of the proposed neck and its applicability to real-world object detection.
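The abstract does not specify the internals of the deformable convolution cell, so the following is only a minimal pure-Python sketch of the general idea behind deformable convolution: each tap of a 3x3 kernel is shifted by a fractional offset and read with bilinear interpolation. The names `bilinear`, `deform_conv_at`, and `stacked_cells` are hypothetical, the offsets are fixed by hand here (in a real deformable convolution they are predicted per location by a learned layer), and the stacking loop is merely an assumption about what "repeated stacking of a cell" could look like, not the authors' design.

```python
import math


def bilinear(feat, y, x):
    """Sample a 2D list `feat` at fractional (y, x); out-of-bounds reads count as 0."""
    h, w = len(feat), len(feat[0])
    y0, x0 = math.floor(y), math.floor(x)
    dy, dx = y - y0, x - x0
    total = 0.0
    # Weighted sum over the four integer neighbours of (y, x).
    for yy, wy in ((y0, 1 - dy), (y0 + 1, dy)):
        for xx, wx in ((x0, 1 - dx), (x0 + 1, dx)):
            if 0 <= yy < h and 0 <= xx < w:
                total += wy * wx * feat[yy][xx]
    return total


def deform_conv_at(feat, kernel, offsets, cy, cx):
    """3x3 deformable convolution response at centre (cy, cx).

    offsets[k] = (dy, dx) shifts the k-th kernel tap (row-major order).
    With all offsets zero this reduces to an ordinary 3x3 convolution.
    """
    out = 0.0
    k = 0
    for ky in (-1, 0, 1):
        for kx in (-1, 0, 1):
            oy, ox = offsets[k]
            out += kernel[ky + 1][kx + 1] * bilinear(feat, cy + ky + oy, cx + kx + ox)
            k += 1
    return out


def stacked_cells(feat, kernel, offsets, depth):
    """Assumed 'block': apply the same illustrative cell `depth` times to the whole map."""
    for _ in range(depth):
        feat = [
            [deform_conv_at(feat, kernel, offsets, y, x) for x in range(len(feat[0]))]
            for y in range(len(feat))
        ]
    return feat


if __name__ == "__main__":
    feat = [[float(4 * y + x) for x in range(4)] for y in range(4)]
    mean_kernel = [[1 / 9] * 3 for _ in range(3)]
    no_shift = [(0.0, 0.0)] * 9
    half_right = [(0.0, 0.5)] * 9  # every tap moved half a pixel to the right
    print(deform_conv_at(feat, mean_kernel, no_shift, 1, 1))
    print(deform_conv_at(feat, mean_kernel, half_right, 1, 1))
```

On this toy map, whose values grow linearly along x, shifting every tap half a pixel to the right raises the averaged response from about 5.0 to about 5.5, which illustrates how offsets let the kernel sample away from the fixed grid.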


Source journal: PeerJ Computer Science (Computer Science: General Computer Science)
CiteScore: 6.10
Self-citation rate: 5.30%
Articles published: 332
Review time: 10 weeks
About the journal: PeerJ Computer Science is an open access journal covering all subject areas in computer science, with the backing of an advisory board and more than 300 academic editors.