{"title":"Full-Scale Feature Aggregation and Grouping Feature Reconstruction-Based UAV Image Target Detection","authors":"Yunzuo Zhang;Cunyu Wu;Tian Zhang;Yuxin Zheng","doi":"10.1109/TGRS.2024.3392794","DOIUrl":null,"url":null,"abstract":"Unmanned aerial vehicle (UAV) image target detection holds significant value for a wide range of applications in modern society. However, due to the variable flight altitude of UAV, the captured images often exhibit significant differences at the target scale and contain a large number of small targets. The existing methods are difficult to adapt to these changes, resulting in a decrease in detection accuracy. To address this issue, this article proposes a new method for UAV image object detection based on full-scale feature aggregation (FFA) and grouped feature reconstruction FFAGRNet. First, existing feature fusion methods are hindered by the layer-by-layer transfer structure, which limits effective information exchange between feature maps of different scales. In response, we propose the FFA module, which performs scale adaptation and information aggregation across multiple sets of feature maps, producing high-quality aggregated feature maps. Second, to further refine aggregation features and eliminate redundancy, we introduce the grouping feature reconstruction (GFR) module. This module subdivides aggregation features into multiple sublevel features, allowing them to autonomously learn channel and spatial layouts of target features. Finally, we present the parallel super-resolution semantic enhancement (PSSE) module to reconstruct deep feature maps and incorporate spatial contextual information, effectively increasing the proportion of semantic information and enhancing the model’s ability to classify ambiguous targets. To validate the effectiveness of our proposed method, extensive experiments were conducted on the VisDrone2021 and UAVDT datasets. The results demonstrate that compared with the baseline, our method achieves a significant improvement in mAP50, with increases of 7.6% and 4.6%, respectively, showcasing excellent performance compared with existing methods.","PeriodicalId":13213,"journal":{"name":"IEEE Transactions on Geoscience and Remote Sensing","volume":"62 ","pages":"1-11"},"PeriodicalIF":8.6000,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Geoscience and Remote Sensing","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10507058/","RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0
Abstract
Unmanned aerial vehicle (UAV) image target detection holds significant value for a wide range of applications in modern society. However, because UAV flight altitude varies, the captured images often exhibit large differences in target scale and contain a large number of small targets. Existing methods struggle to adapt to these variations, which degrades detection accuracy. To address this issue, this article proposes a new UAV image object detection method based on full-scale feature aggregation (FFA) and grouping feature reconstruction, termed FFAGRNet. First, existing feature fusion methods are hindered by their layer-by-layer transfer structure, which limits effective information exchange between feature maps of different scales. In response, we propose the FFA module, which performs scale adaptation and information aggregation across multiple sets of feature maps, producing high-quality aggregated feature maps. Second, to further refine the aggregated features and eliminate redundancy, we introduce the grouping feature reconstruction (GFR) module. This module subdivides the aggregated features into multiple sublevel features, allowing them to autonomously learn the channel and spatial layouts of target features. Finally, we present the parallel super-resolution semantic enhancement (PSSE) module to reconstruct deep feature maps and incorporate spatial contextual information, effectively increasing the proportion of semantic information and enhancing the model's ability to classify ambiguous targets. To validate the effectiveness of the proposed method, extensive experiments were conducted on the VisDrone2021 and UAVDT datasets. The results demonstrate that, compared with the baseline, our method improves mAP50 by 7.6% and 4.6%, respectively, showing excellent performance relative to existing methods.
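To make the FFA and GFR ideas from the abstract concrete, below is a minimal PyTorch sketch of the general pattern they describe: feature maps from several backbone scales are adapted and resized to a common resolution, aggregated into one fused map, then split into groups that each learn their own channel weighting. The class names, channel counts, and the exact fusion/attention choices here are assumptions for illustration only, not the authors' implementation.

```python
# Hedged sketch: full-scale aggregation + grouped feature refinement.
# All module names and design details are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FullScaleAggregation(nn.Module):
    """Aggregate feature maps of different scales into one fused map (sketch)."""

    def __init__(self, in_channels_list, out_channels):
        super().__init__()
        # 1x1 convs adapt each scale to a shared channel width before fusion.
        self.adapters = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels_list
        )
        self.fuse = nn.Conv2d(out_channels * len(in_channels_list), out_channels, 3, padding=1)

    def forward(self, features, target_size):
        # Resize every adapted scale to the same spatial size, then fuse by concatenation.
        resized = [
            F.interpolate(adapter(f), size=target_size, mode="bilinear", align_corners=False)
            for adapter, f in zip(self.adapters, features)
        ]
        return self.fuse(torch.cat(resized, dim=1))


class GroupedReconstruction(nn.Module):
    """Split an aggregated map into groups and reweight each group independently (sketch)."""

    def __init__(self, channels, groups=4):
        super().__init__()
        assert channels % groups == 0
        self.groups = groups
        # Each group gets its own channel-attention branch.
        self.channel_gates = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(channels // groups, channels // groups, kernel_size=1),
                nn.Sigmoid(),
            )
            for _ in range(groups)
        )

    def forward(self, x):
        chunks = torch.chunk(x, self.groups, dim=1)
        refined = [c * gate(c) for c, gate in zip(chunks, self.channel_gates)]
        return torch.cat(refined, dim=1)


if __name__ == "__main__":
    # Three toy backbone scales (e.g., strides 8/16/32) fused to a 64x64 map.
    feats = [torch.randn(1, c, s, s) for c, s in [(128, 64), (256, 32), (512, 16)]]
    ffa = FullScaleAggregation([128, 256, 512], out_channels=128)
    gfr = GroupedReconstruction(128, groups=4)
    fused = ffa(feats, target_size=(64, 64))
    print(gfr(fused).shape)  # torch.Size([1, 128, 64, 64])
```

The grouping step mirrors the abstract's claim that sublevel features "autonomously learn channel and spatial layouts"; a faithful reproduction would also include a spatial branch and the PSSE super-resolution path, which are omitted here for brevity.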
Journal Introduction:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.