Shadow-Aware Moving Target Detection in ViSAR: A Multiscale CNN–Transformer Hybrid Detection Framework
Authors: Shangqu Yan, Yaowen Fu, Ruofeng Yu, Chenyang Luo, Wenpeng Zhang, Wei Yang
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 24801-24815
Publication date: 2025-09-17
DOI: 10.1109/JSTARS.2025.3611080
URL: https://ieeexplore.ieee.org/document/11168256/
Impact factor: 5.3 (JCR Q1, Engineering, Electrical & Electronic)
Citations: 0
Abstract
Based on the all-weather imaging capability of synthetic aperture radar (SAR), video synthetic aperture radar (ViSAR) enables dynamic ground target monitoring through high-frame-rate continuous observation. Since moving targets’ shadows remain stable and unaffected by defocusing in ViSAR images, this property is widely leveraged for ViSAR moving target detection. However, ViSAR moving target detection faces challenges including low contrast in moving targets’ shadows, complex background clutter interference, and the difficulty of simultaneously detecting small and medium-sized shadow targets. Existing deep learning-based detection methods for ViSAR struggle to address these challenges effectively. To overcome these limitations, we propose a multiscale convolutional neural network (CNN)–Transformer hybrid detection framework, which consists of three key components. First, in the preprocessing stage of ViSAR images, an improved low-rank representation algorithm is used to suppress background clutter and stationary targets’ shadows, and enhance the contrast of moving targets’ shadows. Second, the CNN’s feature representation capability for small and medium-sized shadow targets is enhanced by improving the backbone and designing a novel feature pyramid network structure. This addresses the challenge of detecting small and medium-sized shadow targets simultaneously. Finally, within the Transformer architecture, a novel proposal generation module and a contrastive denoising training strategy are integrated into the Deformable Transformer to mitigate ambiguous semantic encoding and accelerate convergence in DETR-like detectors. Experimental results on the real-world ViSAR dataset demonstrate that the proposed framework achieves state-of-the-art performance.
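The abstract does not specify the improved low-rank representation algorithm, so the following is only a minimal sketch of the general idea it builds on, not the authors' method. It assumes a stack of co-registered frames in which the static scene (including stationary shadows) is well approximated by a rank-1 matrix, and treats what is left over as the moving-shadow signal. The function names, the toy one-dimensional "frames", and the plain alternating-least-squares rank-1 fit are all hypothetical choices made for illustration.

```python
def rank1_background(frames, iters=50):
    """Fit frames[t][p] ~ gain[t] * background[p] by alternating least squares.

    Assumption: the static scene is (close to) rank 1 across frames.
    This is a plain rank-1 factorization, not the paper's improved
    low-rank representation algorithm.
    """
    n_frames, n_pixels = len(frames), len(frames[0])
    gain = [1.0] * n_frames
    background = [0.0] * n_pixels
    for _ in range(iters):
        # Least-squares background for fixed per-frame gains.
        g2 = sum(g * g for g in gain)
        background = [
            sum(gain[t] * frames[t][p] for t in range(n_frames)) / g2
            for p in range(n_pixels)
        ]
        # Least-squares gain for fixed background.
        b2 = sum(b * b for b in background)
        gain = [
            sum(background[p] * frames[t][p] for p in range(n_pixels)) / b2
            for t in range(n_frames)
        ]
    return gain, background


def shadow_residual(frames, iters=50):
    """Subtract the rank-1 background; moving shadows remain as negative dips."""
    gain, background = rank1_background(frames, iters)
    return [
        [frames[t][p] - gain[t] * background[p] for p in range(len(background))]
        for t in range(len(frames))
    ]


def make_toy_sequence(n_frames=8, n_pixels=16):
    """Hypothetical 1-D 'frames': textured clutter, one stationary and one moving shadow."""
    scene = [1.0 + 0.1 * (p % 4) for p in range(n_pixels)]
    scene[3] = 0.4  # stationary shadow: identical in every frame
    frames = []
    for t in range(n_frames):
        frame = list(scene)
        frame[5 + t] -= 0.5  # moving shadow: shifts one pixel per frame
        frames.append(frame)
    return frames


frames = make_toy_sequence()
residual = shadow_residual(frames)
# Index of the most negative residual in each frame.
detected = [min(range(len(row)), key=row.__getitem__) for row in residual]
print(detected)
```

In this toy sequence the stationary shadow at pixel 3 is absorbed into the background estimate, so its residual stays near zero, while the most negative residual in frame t falls at pixel 5 + t, where the moving shadow was placed. Real ViSAR data would need image registration and a more robust decomposition than this rank-1 fit.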
About the journal:
The IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing addresses the growing field of applications in Earth observations and remote sensing, and also provides a venue for the rapidly expanding special issues sponsored by the IEEE Geoscience and Remote Sensing Society. The journal draws upon the experience of the highly successful “IEEE Transactions on Geoscience and Remote Sensing” and provides a complementary medium for the wide range of topics in applied Earth observations. The ‘Applications’ areas encompass the societal benefit areas of the Global Earth Observation System of Systems (GEOSS) program. Through deliberations over two years, ministers from 50 countries agreed to identify nine areas where Earth observation could positively impact the quality of life and health of their respective countries. Some of these areas, including biodiversity, health, and climate, are not traditionally addressed in the IEEE context. Yet it is the skill sets of IEEE members, in areas such as observations, communications, computers, signal processing, standards, and ocean engineering, that form the technical underpinnings of GEOSS. Thus, the journal attracts a broad range of interests, serving present members in new ways and expanding IEEE visibility into new areas.