Chengming Rao, Zunhao Hu, QiMing Zhao, Min Shan, Li Mao
{"title":"一种用于视觉目标检测的可变形卷积路径聚合网络。","authors":"Chengming Rao, Zunhao Hu, QiMing Zhao, Min Shan, Li Mao","doi":"10.7717/peerj-cs.3083","DOIUrl":null,"url":null,"abstract":"<p><p>One of the main challenges encountered in visual object detection is the multi-scale issue. Many approaches have been proposed to tackle this issue. In this article, we propose a novel neck that can perform effective fusion of multi-scale features for a single-stage object detector. This neck, named the deformable convolution and path aggregation network (DePAN), is an integration of a path aggregation network with a deformable convolution block added to the feature fusion branch to improve the flexibility of feature point sampling. The deformable convolution block is implemented by repeated stacking of a deformable convolution cell. The DePAN neck can be plugged in and easily applied to various models for object detection. We apply the proposed neck to the baseline models of Yolov6-N and YOLOV6-T, and test the improved models on COCO2017 and PASCAL VOC2012 datasets, as well as a medical image dataset. The experimental results verify the effectiveness and applicability in real-world object detection.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"11 ","pages":"e3083"},"PeriodicalIF":2.5000,"publicationDate":"2025-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453868/pdf/","citationCount":"0","resultStr":"{\"title\":\"A path aggregation network with deformable convolution for visual object detection.\",\"authors\":\"Chengming Rao, Zunhao Hu, QiMing Zhao, Min Shan, Li Mao\",\"doi\":\"10.7717/peerj-cs.3083\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>One of the main challenges encountered in visual object detection is the multi-scale issue. Many approaches have been proposed to tackle this issue. In this article, we propose a novel neck that can perform effective fusion of multi-scale features for a single-stage object detector. This neck, named the deformable convolution and path aggregation network (DePAN), is an integration of a path aggregation network with a deformable convolution block added to the feature fusion branch to improve the flexibility of feature point sampling. The deformable convolution block is implemented by repeated stacking of a deformable convolution cell. The DePAN neck can be plugged in and easily applied to various models for object detection. We apply the proposed neck to the baseline models of Yolov6-N and YOLOV6-T, and test the improved models on COCO2017 and PASCAL VOC2012 datasets, as well as a medical image dataset. The experimental results verify the effectiveness and applicability in real-world object detection.</p>\",\"PeriodicalId\":54224,\"journal\":{\"name\":\"PeerJ Computer Science\",\"volume\":\"11 \",\"pages\":\"e3083\"},\"PeriodicalIF\":2.5000,\"publicationDate\":\"2025-08-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453868/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"PeerJ Computer Science\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.7717/peerj-cs.3083\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"PeerJ Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.7717/peerj-cs.3083","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
A path aggregation network with deformable convolution for visual object detection.
One of the main challenges encountered in visual object detection is the multi-scale issue. Many approaches have been proposed to tackle this issue. In this article, we propose a novel neck that can perform effective fusion of multi-scale features for a single-stage object detector. This neck, named the deformable convolution and path aggregation network (DePAN), is an integration of a path aggregation network with a deformable convolution block added to the feature fusion branch to improve the flexibility of feature point sampling. The deformable convolution block is implemented by repeated stacking of a deformable convolution cell. The DePAN neck can be plugged in and easily applied to various models for object detection. We apply the proposed neck to the baseline models of Yolov6-N and YOLOV6-T, and test the improved models on COCO2017 and PASCAL VOC2012 datasets, as well as a medical image dataset. The experimental results verify the effectiveness and applicability in real-world object detection.
期刊介绍:
PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.