Teaching in adverse scenes: a statistically feedback-driven threshold and mask adjustment teacher-student framework for object detection in UAV images under adverse scenes

IF 12.2 | Zone 1, Earth Sciences | Q1 GEOGRAPHY, PHYSICAL

ISPRS Journal of Photogrammetry and Remote Sensing Pub Date : 2025-06-24 DOI:10.1016/j.isprsjprs.2025.06.009

Hongyu Chen , Jiping Liu , Yong Wang , Jun Zhu , Dejun Feng , Yakun Xie

Volume 227, Pages 332-348. Full text: https://www.sciencedirect.com/science/article/pii/S0924271625002357

Citations: 0

Abstract

Unmanned Aerial Vehicles (UAVs) have become a key platform for aerial object detection, but their performance in real-world scenarios is often severely degraded by adverse environmental conditions such as fog and haze. Achieving robust UAV object detection under these conditions is crucial for enhancing the all-weather situational awareness of UAVs, especially in applications such as rapid disaster response and information interpretation, which demand reliable visual perception around the clock. Unsupervised Domain Adaptation (UDA) has shown promise in alleviating the performance degradation caused by gaps between source and target domains, and it can potentially be generalized to UAV object detection in adverse scenes. However, existing UDA studies are based on natural images or clear UAV imagery, and research on UAV imagery in adverse conditions is still in its infancy. Moreover, due to the unique perspective of UAVs and the interference from adverse conditions, these methods often fail to align features accurately and are affected by limited or noisy pseudo-labels. To address this, we propose the first benchmark for UAV object detection in adverse scenes, the Statistical Feedback-Driven Threshold and Mask Adjustment Teacher-Student Framework (SF-TMAT). Specifically, SF-TMAT introduces the Dynamic Step Feedback Mask Adjustment Autoencoder (DSFMA), which dynamically adjusts the mask ratio and reconstructs feature maps by integrating training progress and loss feedback. This shifts the learning focus across training stages to meet the model's need to learn features at varying levels of granularity. Additionally, we propose a Variance Feedback Smoothing Threshold (VFST) strategy, which statistically computes the mean confidence of each class and dynamically adjusts the selection threshold by incorporating a variance penalty term. This strategy improves the quality of pseudo-labels and uncovers potentially valid labels, thus mitigating domain bias. Extensive experiments demonstrate the superiority and generalization capability of the proposed SF-TMAT in UAV object detection under adverse scene conditions. The code is released at https://github.com/ChenHuyoo.
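
The abstract describes DSFMA and VFST only in words, so the following is a minimal sketch of how such mechanisms might look, not the authors' implementation. Everything in it is an assumption made for illustration: the function names `mask_ratio` and `class_thresholds`, the linear progress schedule, the direction of the loss feedback, the subtraction of the variance penalty, the exponential smoothing, and all default values (`low`, `high`, `gain`, `lam`, `momentum`, `floor`). The released code should be consulted for the actual formulas.

```python
from statistics import mean, pvariance


def mask_ratio(step, total_steps, loss, prev_loss, low=0.3, high=0.7, gain=0.1):
    """Hypothetical DSFMA-style schedule (assumed form, not from the paper).

    The ratio grows linearly with training progress and is nudged by loss
    feedback: a rising loss eases the task, a falling loss hardens it.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    ratio = low + (high - low) * progress
    if loss > prev_loss:
        ratio -= gain
    elif loss < prev_loss:
        ratio += gain
    # Keep the ratio inside the assumed [low, high] range.
    return min(max(ratio, low), high)


def class_thresholds(scores_by_class, prev=None, lam=1.0, momentum=0.9, floor=0.05):
    """Hypothetical VFST-style rule (assumed form, not from the paper).

    Per class: threshold = mean confidence - lam * variance, clipped at
    `floor`, then smoothed against the previous threshold with an EMA.
    """
    prev = prev or {}
    thresholds = {}
    for cls, scores in scores_by_class.items():
        if not scores:
            # No detections for this class: carry the old threshold forward.
            if cls in prev:
                thresholds[cls] = prev[cls]
            continue
        raw = max(mean(scores) - lam * pvariance(scores), floor)
        if cls in prev:
            thresholds[cls] = momentum * prev[cls] + (1 - momentum) * raw
        else:
            thresholds[cls] = raw
    return thresholds
```

For example, `class_thresholds({"car": [0.8, 0.6]})` gives a threshold of roughly 0.69 for `car` (mean 0.7 minus variance 0.01). Subtracting the penalty lowers the threshold for classes with scattered confidences, which is one way to read "uncovers potentially valid labels"; the paper may apply the penalty differently.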


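Source journal: ISPRS Journal of Photogrammetry and Remote Sensing (Engineering & Technology: Imaging Science & Photographic Technology)

CiteScore: 21.00

Self-citation rate: 6.30%

Articles published: 273

Review time: 40 days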

Journal description: The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) serves as the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It acts as a platform for scientists and professionals worldwide who are involved in various disciplines that utilize photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advancements in these disciplines, while also acting as a comprehensive source of reference and archive. P&RS endeavors to publish high-quality, peer-reviewed research papers that are preferably original and have not been published before. These papers can cover scientific/research, technological development, or application/practical aspects. Additionally, the journal welcomes papers that are based on presentations from ISPRS meetings, as long as they are considered significant contributions to the aforementioned fields. In particular, P&RS encourages the submission of papers that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. It is preferred that theoretical papers include practical applications, while papers focusing on systems and applications should include a theoretical background.