Shadow-Aware Moving Target Detection in ViSAR: A Multiscale CNN–Transformer Hybrid Detection Framework

IF 5.3 | Zone 2, Earth Sciences | JCR Q1, ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing | Pub Date: 2025-09-17 | DOI: 10.1109/JSTARS.2025.3611080

Shangqu Yan;Yaowen Fu;Ruofeng Yu;Chenyang Luo;Wenpeng Zhang;Wei Yang

Volume 18, pages 24801-24815. Full text: https://ieeexplore.ieee.org/document/11168256/

Citations: 0

Abstract

Building on the all-weather imaging capability of synthetic aperture radar (SAR), video synthetic aperture radar (ViSAR) enables dynamic ground target monitoring through high-frame-rate continuous observation. Because moving targets’ shadows remain stable and are unaffected by defocusing in ViSAR images, they are widely used for ViSAR moving target detection. However, ViSAR moving target detection faces several challenges: low contrast in moving targets’ shadows, complex background clutter interference, and the difficulty of simultaneously detecting small and medium-sized shadow targets. Existing deep learning-based detection methods for ViSAR struggle to address these challenges effectively. To overcome these limitations, we propose a multiscale convolutional neural network (CNN)–Transformer hybrid detection framework, which consists of three key components. First, in the preprocessing stage of ViSAR images, an improved low-rank representation algorithm is used to suppress background clutter and stationary targets’ shadows and to enhance the contrast of moving targets’ shadows. Second, the CNN’s feature representation capability for small and medium-sized shadow targets is enhanced by improving the backbone and designing a novel feature pyramid network structure, which addresses the challenge of detecting small and medium-sized shadow targets simultaneously. Finally, within the Transformer architecture, a novel proposal generation module and a contrastive denoising training strategy are integrated into the Deformable Transformer to mitigate ambiguous semantic encoding and accelerate convergence in DETR-like detectors. Experimental results on a real-world ViSAR dataset demonstrate that the proposed framework achieves state-of-the-art performance.
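The preprocessing idea in the first component can be illustrated with a generic low-rank background split. The sketch below is not the authors' improved low-rank representation algorithm, which the abstract does not specify; it swaps in a plain truncated SVD as a stand-in. The function names, the synthetic frame stack, and the choice of rank 1 are all assumptions made for illustration. It stacks frames as columns of a matrix, treats the low-rank part as the static background (clutter and stationary shadows), and keeps the negative residual, since a moving shadow is darker than the background it passes over.

```python
import numpy as np


def lowrank_background_split(frames, rank=1):
    """Split a (T, H, W) frame stack into a low-rank background and a residual.

    Generic truncated-SVD stand-in (an assumption), not the paper's algorithm.
    """
    t, h, w = frames.shape
    # Each column of d is one vectorised frame: shape (pixels, frames).
    d = frames.reshape(t, h * w).T.astype(float)
    u, s, vt = np.linalg.svd(d, full_matrices=False)
    # Best rank-`rank` approximation captures what is shared across frames.
    low = (u[:, :rank] * s[:rank]) @ vt[:rank]
    background = low.T.reshape(t, h, w)
    residual = frames - background
    return background, residual


def make_demo_stack(t=8, h=16, w=16, seed=0):
    """Hypothetical test scene: static clutter plus one dark 2x2 moving patch."""
    rng = np.random.default_rng(seed)
    scene = rng.uniform(0.5, 1.0, size=(h, w))
    frames = np.repeat(scene[None], t, axis=0)
    for k in range(t):
        # The "shadow" shifts one pixel to the right in every frame.
        frames[k, 4:6, k:k + 2] = 0.05
    return frames


frames = make_demo_stack()
background, residual = lowrank_background_split(frames, rank=1)
# Shadows are darker than the background, so keep only negative residuals.
shadow_map = np.clip(-residual, 0.0, None)

for k in range(frames.shape[0]):
    row, col = np.unravel_index(np.argmax(shadow_map[k]), shadow_map[k].shape)
    print(f"frame {k}: strongest shadow response at row {row}, col {col}")
```

In this toy setting the static scene is absorbed by the rank-1 term, so the enhanced map responds mainly where the dark patch currently sits. Real ViSAR frames would additionally need registration and a more robust decomposition, which is presumably where the paper's improvements come in.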

Source journal

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Geosciences: Imaging Science & Photographic Technology)

CiteScore: 9.30
Self-citation rate: 10.90%
Articles published: 563
Review time: 4.7 months

Journal description: The IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing addresses the growing field of applications in Earth observations and remote sensing, and also provides a venue for the rapidly expanding special issues sponsored by the IEEE Geoscience and Remote Sensing Society. The journal draws upon the experience of the highly successful “IEEE Transactions on Geoscience and Remote Sensing” and provides a complementary medium for the wide range of topics in applied Earth observations. The “Applications” areas encompass the societal benefit areas of the Global Earth Observation System of Systems (GEOSS) program. Through deliberations over two years, ministers from 50 countries agreed to identify nine areas where Earth observation could positively impact the quality of life and health of their respective countries. Some of these areas, including biodiversity, health, and climate, are not traditionally addressed in the IEEE context. Yet it is the skill sets of IEEE members, in areas such as observations, communications, computers, signal processing, standards, and ocean engineering, that form the technical underpinnings of GEOSS. Thus, the Journal attracts a broad range of interests, serving present members in new ways and expanding IEEE's visibility into new areas.