Title: SPAFPN: Wear a multi-scale feature fusion scarf around neck for real-time object detector
Authors: Zhentian Bian, Bin Yao, Qing Li
Journal: Information Fusion, volume 119, Article 103034 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 14.7)
DOI: 10.1016/j.inffus.2025.103034
Publication date: 2025-02-25
URL: https://www.sciencedirect.com/science/article/pii/S1566253525001071
Citations: 0
Abstract
The feature-extraction modules of backbone networks for real-time object detection have developed rapidly, while multi-scale design for the neck has iterated slowly. The traditional PAFPN propagates features across scales inefficiently. Many newly proposed multi-scale fusion methods address this drawback, but their complex fusion modules and training-unfriendly designs make them difficult to apply widely. In this paper, we propose the Scarf Path Aggregation Feature Pyramid Network (SPAFPN), an advanced multi-scale fusion neck structure for real-time object detection. SPAFPN follows the decentralized multi-scale fusion idea of "Light Fusion, Heavy Decouple" while retaining a versatile and portable design. It consists mainly of Pyramid Fusion and Multi-Concat modules, promotes low-loss cross-scale transfer of features, and improves model performance. Experimental results on the MS COCO dataset show that SPAFPN-HG-N/S/M achieve mAP improvements of 5.2%/3.2%/1.1% over YOLOv8, respectively. We also adopt several recent backbone networks to verify the scalability of SPAFPN: combined with either C2f or GELAN as the backbone network module, SPAFPN performs well. In addition, SPAFPN maintains good performance when applied to a DETR-based real-time object detector. The code is available at https://github.com/ztbian-bzt/SPAFPN.
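The abstract names the Pyramid Fusion and Multi-Concat modules but does not describe their internals, so the following is only a generic illustration of what cross-scale fusion in a neck can look like: three pyramid levels are resized to a common resolution and concatenated along the channel axis. It is not the SPAFPN implementation. The function names (`upsample_nearest`, `avg_pool`, `fuse_to_middle`), the choice of average pooling and nearest-neighbour upsampling, and the toy shapes are all assumptions made for this sketch.

```python
# Hypothetical sketch of generic resize-and-concatenate multi-scale fusion.
# This is NOT the Pyramid Fusion or Multi-Concat module from the paper.
# A feature level is a list of channels; each channel is a 2D list of floats.


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of one 2D map by an integer factor."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def avg_pool(fmap, factor):
    """Non-overlapping average pooling; height and width must divide by factor."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(0, h, factor):
        row = []
        for j in range(0, w, factor):
            block = [fmap[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def fuse_to_middle(high_res, mid_res, low_res):
    """Bring three levels to the middle resolution and concatenate channels."""
    down = [avg_pool(c, 2) for c in high_res]        # e.g. stride 8 -> 16
    up = [upsample_nearest(c, 2) for c in low_res]   # e.g. stride 32 -> 16
    return down + list(mid_res) + up


def shape(level):
    """Return (channels, height, width) of a feature level."""
    return (len(level), len(level[0]), len(level[0][0]))


# Toy inputs: 2 channels at 8x8, 3 channels at 4x4, 4 channels at 2x2.
p3 = [[[1.0] * 8 for _ in range(8)] for _ in range(2)]
p4 = [[[2.0] * 4 for _ in range(4)] for _ in range(3)]
p5 = [[[3.0] * 2 for _ in range(2)] for _ in range(4)]

fused = fuse_to_middle(p3, p4, p5)
print(shape(fused))  # (9, 4, 4)
```

In a real detector, a learned layer such as a convolution would typically follow the concatenation to mix the channels. The released code at the repository linked in the abstract is the authoritative reference for how SPAFPN actually fuses features.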
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.