{"title":"PVSSNet: Progressive Feature Interaction Visual State-Space Network for Multispectral Pansharpening","authors":"Guoxia Xu;Zhenwei Xu;Lizhen Deng;Hu Zhu","doi":"10.1109/LGRS.2025.3576291","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3576291","url":null,"abstract":"Pansharpening involves extracting spectral information from multispectral images (MSs) and structural details from panchromatic images (PAN), then fusing them to produce high-resolution multispectral (HRMS) remote sensing images. However, high-resolution MSs often suffer from spectral or structural information loss. In this letter, we introduce a pansharpening algorithm based on a progressive feature interaction visual state-space network. It enables interaction between local and global features of multispectral and PAN and facilitates the injection of spectral and spatial details through distinct attention modules. This approach effectively preserves both spectral characteristics and spatial structure through interbranch information interaction and complementation. Additionally, by integrating a visual state-space network, the proposed model achieves deep reconstruction of multiscale global information, enhancing robustness and generalization. Extensive experimental results demonstrate that the proposed network achieves highly competitive performance in both visual assessments and objective metric evaluations.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144331616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"U-Shaped Feature Extraction and Fusion Network for Object Detection in Low-Altitude UAV Images","authors":"Lingjie Jiang;Yu Gu;Dongliang Peng","doi":"10.1109/LGRS.2025.3575169","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3575169","url":null,"abstract":"In the past decade, object detection technology has developed rapidly. However, in the field of unmanned aerial vehicle (UAV) image object detection, challenges such as complex environments, numerous and dense small objects, and weak features make object detection from the UAV perspective a highly challenging task. To address these issues, this letter proposes a U-shaped feature extraction and fusion network (U-ShapeNet). Specifically: first, to enhance the network’s feature extraction capability and improve the perception of small objects, we design a novel U-shaped feature extraction network (U-SFEN) and introduce a tiny object detection head. Second, a large kernel feature selection module (LKFSM) is constructed to strengthen the network’s contextual information learning ability and effectively distinguish small objects from complex background noise. Third, a same-scale feature enhancement module (SFEM) is proposed to mitigate information decay by reusing same-scale feature maps. Experiments on the VisDrone2019 and HazyDet datasets demonstrate that U-ShapeNet outperforms current mainstream object detectors, achieving state-of-the-art performance.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2025-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144299057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LS-DETR: Lightweight Transformer for Object Detection in Forward-Looking Sonar Images","authors":"Junzhe Wang;Xinke Chen;Anbang Dai;Yan Liu;Guanying Huo","doi":"10.1109/LGRS.2025.3575615","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3575615","url":null,"abstract":"A transformer-based end-to-end detection method lightweight sonar detection transformer (LS-DETR) is proposed, which is specifically tailored for enhancing detection accuracy in forward-looking sonar images while significantly reducing the computational load. Despite the challenges posed by the complexity of underwater environments that have led to suboptimal detection performance and the lack of lightweight optimization for underwater devices, LS-DETR addresses these issues effectively. In LS-DETR, the backbone employs a newly proposed lightweight-gated attention block (LGABlock), which reduces computational redundancy through low-complexity convolutions and gated attention. A lightweight hybrid encoder (LHE) is designed to facilitate scale-internal feature interaction and optimize the feature fusion approach. Furthermore, wise complete IoU (WCIoU)-aware query selection is proposed and integrated with NWDLoss in the decoder, enabling the scores to integrate classification and positional information while focusing on the small targets. Results demonstrate that on the multibeam forward-looking sonar dataset UATD, LS-DETR achieved a 2.8% increase in accuracy and a 31.5% reduction in parameter count, proving the effectiveness and superiority.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2025-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144255612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scribble-Guided Structural Regression Fusion for Multimodal Remote Sensing Change Detection","authors":"Yongjie Zheng;Sicong Liu;Lorenzo Bruzzone","doi":"10.1109/LGRS.2025.3575620","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3575620","url":null,"abstract":"Accurate change detection (CD) in multitemporal multimodal remote sensing images is crucial for numerous applications. However, existing unsupervised CD methods often face challenges in suppressing background noise, preserving fine-grained boundaries, and maintaining spatial coherence of target regions. To overcome these limitations, this study proposes a novel scribble-guided structural regression fusion (SG-SRF) framework, which integrates sparse scribble annotations as lightweight priors into a dynamic regression mechanism. Specifically, the framework employs a scribble distance map to refine hypergraph Laplacian matrices, thereby optimizing feature representation for critical targets while suppressing irrelevant backgrounds. The experimental results demonstrate that the proposed method significantly outperforms traditional unsupervised methods in detecting complete and accurate change objects with minimal scribble input. Notably, the scribble guidance offers an efficient and cost-effective solution to the inherent limitations of unsupervised approaches, enabling more precise CD without extensive labeled datasets. This work aims to bridge the gap between unsupervised adaptability and supervised accuracy, offering significant potential for practical CD applications. The source code will be made publicly available at <uri>https://github.com/MissYongjie/SG-SRF</uri>","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2025-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144255613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Deep-Learning-Based Targeted Interpolation Method for Seismic Data: A Consecutively Missing Trace VSP Case","authors":"Wen Yang;Qianggong Song;Le Li;Xiaobin Li;Zhonglin Cao;Pengfei Duan","doi":"10.1109/LGRS.2025.3565742","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3565742","url":null,"abstract":"Seismic data interpolation is an important processing method for improving the quality of seismic data. Traditional interpolation methods often face limitations due to their dependence on prior information and their challenges in processing continuous missing data. Vertical seismic profiling (VSP) data, owing to its unique acquisition approach, generally do not suffer from missing receivers but can have missing shots, with the locations of these missing shots being known. To address this specific issue of missing shots, a specialized interpolation technique has been proposed for targeted missing data. This technique involves creating datasets from the original complete data that are tailored to fixed missing shot scenarios, allowing for a more effective application of the trained network to field data. In addition, we have optimized the network structure based on UNet to meet the specific requirements for handling consecutive gaps. Both synthetic and field data demonstrate the effectiveness of this targeted interpolation method.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144072855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Grained Guided Diffusion for Quantity-Controlled Remote Sensing Object Generation","authors":"Zhiping Yu;Chenyang Liu;Chuyu Zhong;Zhengxia Zou;Zhenwei Shi","doi":"10.1109/LGRS.2025.3565817","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3565817","url":null,"abstract":"Accurate object counts represent essential semantical information in remote sensing imagery, significantly impacting applications such as traffic monitoring and urban planning. Despite the recent advances in text-to-image (T2I) generation in remote sensing, existing methods still face challenges in precisely controlling the number of object instances in generated images. To address this challenge, we propose a novel method, multi-grained guided diffusion (MGDiff). During training, unlike previous methods that relied solely on latent-space noise constraints, MGDiff imposes constraints at three distinct granularities: latent pixel, global counting, and spatial distribution. The multi-grained guidance mechanism matches the quantity prompts with object spatial layouts in the feature space, enabling our model to achieve precise control over object quantities. To benchmark this new task, we present Levir-QCG, a dataset comprising 10504 remote sensing images across five object categories, annotated with precise object counts and segmentation masks. We conducted extensive experiments to benchmark our method against previous methods on the Levir-QCG dataset. Compared to previous models, the MGDiff achieves an approximately +40% improvement in counting accuracy while maintaining higher visual fidelity and strong zero-shot generalization. To the best of our knowledge, this is the first work to research accurate object quantity control in remote sensing T2I generation. The dataset and code will be publicly available at <uri>https://github.com/YZPioneer/MGDiff</uri>","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Establishing Nuanced Multimodal Attention for Weakly Supervised Semantic Segmentation of Remote Sensing Scenes","authors":"Qiming Zhang;Junjie Zhang;Huaxi Huang;Fangyu Wu;Hongwen Yu","doi":"10.1109/LGRS.2025.3565710","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3565710","url":null,"abstract":"Weakly supervised semantic segmentation (WSSS) with image-level labels reduces reliance on pixel-level annotations for remote sensing (RS) imagery. However, in natural scenes, WSSS frequently faces challenges such as imprecise localization, extraneous activations, and class ambiguity. These challenges are particularly pronounced in RS images, characterized by complex backgrounds, substantial scale variations, and dense small-object distributions, complicating the distinction between intraclass variations and interclass similarities. To tackle these challenges, we introduce a class-constrained multimodal attention framework aimed at enhancing the localization accuracy of class activation maps (CAMs). Specifically, we design class-specific tokens to capture the visual characteristics of each target class. As these tokens initially lack explicit constraints, we integrate the textual branch of the RemoteCLIP model to leverage class-related linguistic priors, which collaborate with visual features to encode the specific semantics of diverse objects. Furthermore, the multimodal collaborative optimization module dynamically establishes tailored attention mechanisms for both global and regional features, thereby improving class discriminability among targets to mitigate challenges such as interclass similarity and dense small-object distributions. By refining class-specific attention, textual semantic attention, and patch-level pairwise affinity weights, the quality of generated pseudomasks is markedly enhanced. Concurrently, to ensure domain-invariant feature learning, we align the backbone features with the CLIP visual embedding by minimizing the distribution disparity between the two in the latent space, and semantic consistency is, therefore, preserved. The experimental results validate the effectiveness and robustness of our proposed method, achieving significant performance improvements on two representative RS WSSS datasets.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143943919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Advanced Approach for Understory Terrain Extraction Utilizing TomoSAR and MCSF Algorithm","authors":"Xi Bin;Zhang Yu;Li Wenmei;Zhao Lei;Xu Kunpeng;Ma Yunmei;He Yuhong","doi":"10.1109/LGRS.2025.3565785","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3565785","url":null,"abstract":"The understory terrain is an essential component of forest vertical structure and ecosystem health, providing crucial insights for resource assessment and forestry surveys. This letter proposes a novel method for extracting understory terrain through forest backscattering power profiles and the modified cloth simulation filtering (MCSF) algorithm. It innovatively reconstructs synthetic aperture radar (SAR) signals into a 3-D point cloud, eliminating sidelobe signals to reduce noise while only retaining the mainlobe signals. The MCSF algorithm is subsequently utilized to extract ground and nonground points based on the vertical distribution of the mainlobe signals. The extracted ground points offer a more precise representation of actual terrain conditions. The feasibility of the method was validated utilizing airborne P-band multi baseline SAR data obtained from the Saihanba test site in Hebei Province. The outcomes clearly indicate that our approach exhibits superior correlation (0.999) and a smaller root mean square error (RMSE) (3.07 m) in comparison to conventional methods when compared with the reference digital elevation model (DEM).","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143938007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Covariance Matrix Estimation via Geometric Median in Highly Heterogeneous PolSAR Images","authors":"Dehbia Hanis;Luca Pallotta;Karima Hadj-Rabah;Azzedine Bouaraba;Aichouche Belhadj-Aissa","doi":"10.1109/LGRS.2025.3565808","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3565808","url":null,"abstract":"The Wishart distribution is a well-established statistical model for characterizing the density of random variables in polarimetric synthetic aperture radar (PolSAR) data, particularly within homogeneous regions where Gaussian assumptions hold. However, as PolSAR applications expand into heterogeneous environments, alternative statistical models have been developed to better capture the complexity of such areas, playing an important role in tasks such as classification. In this study, we examine the effectiveness of covariance matrix estimation using the median matrix, a technique grounded in optimal transport theory and validated in prior research for its effectiveness. Building on this foundation, we propose the application of a statistical model tailored for heterogeneous regions, i.e., following the <inline-formula> <tex-math>$mathcal {G}^{0}_{P}$ </tex-math></inline-formula> distribution, addressing the limitations of traditional assumptions. This method is particularly suitable for high-resolution PolSAR datasets, where the homogeneity hypothesis often does not hold. The experimental results obtained using L-band PolSAR images acquired over Foulum in Denmark demonstrate the robustness of our proposed variant.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144073101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BiMAConv: Bimodal Adaptive Convolution for Multispectral Point Cloud Segmentation","authors":"Zheng Zhang;Tingfa Xu;Peng Lou;Peng Lv;Tiehong Tian;Jianan Li","doi":"10.1109/LGRS.2025.3565739","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3565739","url":null,"abstract":"Multispectral point cloud segmentation, leveraging both spatial and spectral information to classify individual points, is crucial for applications such as remote sensing, autonomous driving, and urban planning. However, existing methods primarily focus on spatial information and merge it with spectral data without fully considering their differences, limiting the effective use of spectral information. In this letter, we introduce a novel approach, bimodal adaptive convolution (BiMAConv), which fully exploits information from different modalities, based on the divide-and-conquer philosophy. Specifically, BiMAConv leverages the spectral features provided by the spectral information divergence (SID) and the weight information provided by the modal-weight block (MW-Block) module. The SID highlights slight differences in spectral information, providing detailed differential feature information. The MW-Block module utilizes an attention mechanism to combine generated features with the original point cloud, thereby generating weights to maintain learning balance sharply. In addition, we reconstruct a large-scale urban point cloud dataset GRSS_DFC_2018_3D based on dataset GRSS_DFC_2018 to advance the field of multispectral remote sensing point cloud, with a greater number of categories, more precise annotations, and registered multispectral channels. BiMAConv is fundamentally plug-and-play and supports different shared-multilayer perceptron (MLP) methods with almost no architectural changes. Extensive experiments on GRSS_DFC_2018_3D and Toronto-3D benchmarks demonstrate that our method significantly boosts the performance of popular detectors.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}