FM-RTDETR: Small Object Detection Algorithm Based on Enhanced Feature Fusion With Mamba
Authors: Yuchuan Yang; Jiahui Dai; Yong Wang; Yafei Chen
Journal: IEEE Signal Processing Letters, vol. 32, pp. 1570-1574
Publication date: 2025-03-21
Publication type: Journal Article
DOI: 10.1109/LSP.2025.3553426
URL: https://ieeexplore.ieee.org/document/10935299/
Impact factor: 3.2 (JCR Q2, Engineering, Electrical & Electronic)
Citation count: 0
Abstract
Traditional real-time object detection networks deployed in autonomous aerial vehicles (AAVs) struggle to extract features from small objects in complex backgrounds with occlusions and overlapping objects. To address this challenge, we propose FM-RTDETR, a real-time object detection algorithm optimized for small object detection. We redesign the encoder of RT-DETRv2 by integrating the Feature Aggregation and Diffusion Network (FADN), improving the algorithm's ability to capture contextual information. We then introduce the Parallel Atrous Mamba Feature Fusion Module (PAMFFM), which combines shallow and deep semantic information to better capture small object features. Furthermore, we propose the Cross-stage Enhanced Feature Fusion Module (CEFFM), which merges features for small objects to provide richer and more detailed information. Finally, we propose STIoU Loss, which incorporates a penalty term to adjust the scaling of the loss function, improving detection granularity for small objects. FM-RTDETR achieves AP$_{50}$ scores of 54.0% and 56.3% on the VisDrone2019-DET and AI-TOD datasets, respectively. Compared with other state-of-the-art methods, our method shows great potential in small object detection from drones.
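For readers unfamiliar with the AP$_{50}$ figure: under the common convention, a detection counts as a true positive when its intersection over union (IoU) with a ground-truth box is at least 0.5. The sketch below is not the paper's STIoU Loss, whose form is not given in the abstract. It only computes plain IoU, and it illustrates why small objects are hard: the same localization error costs a small box far more IoU than a large one. The helper names `iou` and `shifted`, the box sizes, and the 3-pixel offset are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def shifted(box, dx):
    """Hypothetical helper: the same box moved dx pixels to the right."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1, x2 + dx, y2)


# Illustrative boxes (assumed sizes): an 8x8 object and a 64x64 object.
small = (0.0, 0.0, 8.0, 8.0)
large = (0.0, 0.0, 64.0, 64.0)

# Apply the same 3-pixel localization error to both predictions.
iou_small = iou(small, shifted(small, 3.0))
iou_large = iou(large, shifted(large, 3.0))

print(f"small object IoU: {iou_small:.3f}")
print(f"large object IoU: {iou_large:.3f}")
```

With these assumed numbers, the shifted 8x8 prediction falls below the 0.5 threshold while the shifted 64x64 prediction stays well above it, so the same pixel error turns a hit into a miss only for the small object.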
About the journal:
The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language, and audio processing. Papers published in the Letters can be presented, within one year of their appearance, at signal processing conferences such as ICASSP, GlobalSIP, and ICIP, as well as at several workshops organized by the Signal Processing Society.