Air-to-Ground Target Detection and Tracking Based on Dual-Stream Fusion of Unmanned Aerial Vehicle

IF 5.2 · CAS Tier 2 (Computer Science) · JCR Q2 (Robotics)
Chuanyun Wang, Jianqi Yang, Dongdong Sun, Qian Gao, Qiong Liu, Tian Wang, Anqi Hu, Linlin Wang
DOI: 10.1002/rob.22592
Journal of Field Robotics, Volume 42, Issue 7, pp. 3582-3599
Published: 2025-05-18 · Citations: 0

Abstract

Visible and infrared images are both important sources of battlefield intelligence, and air-to-ground reconnaissance by unmanned aerial vehicles (UAVs) is a key means of obtaining it. Ground target detection and tracking, however, remain highly challenging, especially in complex battlefield environments. To address the insufficient detection accuracy of any single sensor type in such environments, this paper proposes ReconnaissanceFusion-YOLO (RF-YOLO), a target detection method that fuses visible and infrared images; with the help of infrared imagery, it maintains detection accuracy under poor illumination. Detection performance is improved by two key modules: a dual feature fusion (DFF) module and a feature fusion corrector (FFC) module. The DFF module enhances multi-channel feature fusion through a novel concatenation and channel-wise attention mechanism, while the FFC module corrects features between the parallel streams using spatial and channel-wise weights, mitigating noise and uncertainty across the two modalities. Both modules are integrated on top of a dual-stream YOLO architecture, enabling effective fusion of visible and infrared information.

RF-YOLO was trained and evaluated on the FLIR data set, which contains 5142 pairs of strictly aligned visible and infrared images. Results show that RF-YOLO significantly outperforms the benchmark networks in robustness. Specifically, the large RF-YOLO model achieves an mAP of 0.831, compared with 0.739 for the YOLOv5l infrared benchmark, a relative improvement of over 12% in detection accuracy. RF-YOLO also offers a Nano version that balances accuracy and speed: it achieves an mAP of 0.765 with a model size of only 11.5 MB, making it suitable for deployment on resource-constrained UAV edge computing devices. To validate practical applicability, target detection and tracking were implemented on a real UAV's edge computing device using the ROS system and SiameseRPN combined with the proposed RF-YOLO. Real-world flight tests conducted on an internal playground demonstrate the method's effectiveness in actual UAV applications: the system processes 640 × 640 frames at approximately 10 fps on an NVIDIA TX2 edge computing device, confirming real-time capability in practical scenarios. This study enhances UAV-based battlefield reconnaissance by improving the accuracy and robustness of target detection and tracking in complex environments. The proposed RF-YOLO method, together with its implementation on a real UAV platform, offers a promising solution for advanced military intelligence gathering and decision support.
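The abstract describes the DFF module only at a high level (concatenation followed by channel-wise attention); the paper's actual implementation is not reproduced here. As a rough illustration of that idea, the sketch below concatenates the channels of two streams and rescales each channel by a squeeze-and-excitation-style weight. It is pure Python with nested lists standing in for tensors; the function name, shapes, and gating choice are all hypothetical, not the authors' code.

```python
import math

def channel_attention_fuse(vis_feats, inf_feats):
    """Illustrative DFF-style fusion: concatenate visible and infrared
    channels, then reweight each channel by a sigmoid of its global
    average (a squeeze-and-excitation-style gate).

    Feature maps are plain nested lists: channels x H x W.
    """
    fused = vis_feats + inf_feats  # channel-wise concatenation

    weights = []
    for ch in fused:
        # global average pooling over the spatial dimensions ("squeeze")
        total = sum(sum(row) for row in ch)
        count = sum(len(row) for row in ch)
        gap = total / count
        # sigmoid gate turns the pooled response into a [0, 1] weight
        weights.append(1.0 / (1.0 + math.exp(-gap)))

    # rescale every channel by its attention weight ("excite")
    return [[[w * v for v in row] for row in ch]
            for ch, w in zip(fused, weights)]

# toy example: one 2x2 channel per modality
vis = [[[1.0, 2.0], [3.0, 4.0]]]
inf = [[[0.0, 0.0], [0.0, 0.0]]]
out = channel_attention_fuse(vis, inf)
```

In a real network the gate would be a small learned MLP over the pooled vector rather than a raw sigmoid, but the data flow (concatenate, squeeze, gate, rescale) is the same shape of computation the abstract attributes to DFF.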
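The "over 12%" figure is a relative gain and can be checked directly from the two mAP values the abstract reports:

```python
# Relative mAP improvement of the large RF-YOLO model over the
# YOLOv5l infrared baseline, using the numbers from the abstract.
baseline_map = 0.739
rf_yolo_map = 0.831

relative_gain = (rf_yolo_map - baseline_map) / baseline_map
print(f"{relative_gain:.1%}")  # about 12.4%, consistent with "over 12%"
```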


Source Journal: Journal of Field Robotics (Engineering Technology, Robotics)
CiteScore: 15.00 · Self-citation rate: 3.60% · Articles per year: 80 · Review time: 6 months
Journal description: The Journal of Field Robotics seeks to promote scholarly publications dealing with the fundamentals of robotics in unstructured and dynamic environments. The Journal focuses on experimental robotics and encourages publication of work that has both theoretical and practical significance.