Freq-DETR: Frequency-aware transformer for real-time small object detection in unmanned aerial vehicle imagery
Authors: Jiayi Chen, Ningzhong Liu, Han Sun, Yu Wang
Journal: Expert Systems with Applications, volume 298, Article 129710 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 7.5)
Published: 2025-09-16
DOI: 10.1016/j.eswa.2025.129710
URL: https://www.sciencedirect.com/science/article/pii/S0957417425033251
Recent advances in unmanned aerial vehicle (UAV) and remote sensing technologies have propelled UAV object detection to the forefront of computer vision research. Despite significant progress in deep learning-based detection algorithms, critical challenges persist in small object detection, including high-frequency information loss and inadequate multiscale feature representation. To address these limitations, this paper proposes Freq-DETR, a frequency-aware real-time transformer detection framework that leverages frequency-domain analysis to enhance edge detail preservation and global contextual modeling through three innovations. First, the frequency-enhanced convolution module (FECM) integrates spatial and frequency features via dual-branch processing. Second, the decoupled intra-feature scale interaction module (DSC-Clo block) facilitates the integration of high-frequency local and low-frequency global information. Finally, the attention-guided selective feature pyramid network (AGS-FPN) employs context-aware attention for high-level screening feature fusion. Extensive evaluations on the VisDrone2019 benchmark demonstrate that Freq-DETR outperforms the baseline RT-DETR by 4.9 % in mAP@50 while maintaining computational efficiency. Remarkable improvements are also reported on the UAVDT and HIT-UAV datasets. Ablation studies and visual interpretability analyses further confirm the complementary benefits of its frequency-domain components and the framework’s robustness in complex aerial scenarios.
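The abstract does not specify how any of the modules are implemented, so the following is only a minimal sketch of the general idea it refers to: separating an image into low-frequency (global) and high-frequency (edge detail) components. It uses an ideal FFT mask in NumPy; the function name `split_frequency_bands` and the `cutoff` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def split_frequency_bands(image, cutoff=0.25):
    """Split a 2-D array into low- and high-frequency parts.

    `cutoff` (hypothetical parameter) is the radius of the low-pass
    disc in cycles per sample; the Nyquist limit is 0.5.
    """
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies, shape (h, 1)
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies, shape (1, w)
    low_mask = np.sqrt(fx ** 2 + fy ** 2) <= cutoff

    spectrum = np.fft.fft2(image)
    # The mask is radially symmetric, so each band stays real-valued
    # up to rounding error; .real discards that residue.
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high


# Toy "small object": a 3x3 bright patch on a dark 32x32 background.
image = np.zeros((32, 32))
image[14:17, 14:17] = 1.0

low, high = split_frequency_bands(image)
share = np.sum(high ** 2) / np.sum(image ** 2)
print(f"share of energy in the high-frequency band: {share:.3f}")
```

Because the two masks are complementary, the two bands sum back to the original image. For a small object like the patch above, part of the signal lies above the cutoff, which is the kind of information the abstract describes as being lost in small object detection.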
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.