ISPRS Journal of Photogrammetry and Remote Sensing: Latest Publications

URSimulator: Human-perception-driven prompt tuning for enhanced virtual urban renewal via diffusion models
IF 10.6 | CAS Tier 1, Earth Science
ISPRS Journal of Photogrammetry and Remote Sensing | Pub Date: 2025-07-24 | DOI: 10.1016/j.isprsjprs.2025.07.016
Chuanbo Hu, Shan Jia, Xin Li
{"title":"URSimulator: Human-perception-driven prompt tuning for enhanced virtual urban renewal via diffusion models","authors":"Chuanbo Hu ,&nbsp;Shan Jia ,&nbsp;Xin Li","doi":"10.1016/j.isprsjprs.2025.07.016","DOIUrl":"10.1016/j.isprsjprs.2025.07.016","url":null,"abstract":"<div><div>Tackling Urban Physical Disorder (UPD) – such as abandoned buildings, litter, messy vegetation, and graffiti – is essential, as it negatively impacts the safety, well-being, and psychological state of communities. Urban Renewal is the process of revitalizing these neglected and decayed areas within a city to improve their physical environment and quality of life for residents. Effective urban renewal efforts can transform these environments, enhancing their appeal and livability. However, current research lacks simulation tools that can quantitatively assess and visualize the impacts of urban renewal efforts, often relying on subjective judgments. Such simulation tools are essential for planning and implementing effective renewal strategies by providing a clear visualization of potential changes and their impacts. This paper presents a novel framework that addresses this gap by using human perception feedback to simulate the enhancement of street environment. We develop a prompt tuning approach that integrates text-driven Stable Diffusion with human perception feedback. This method iteratively edits local areas of street view images, aligning them more closely with human perceptions of beauty, liveliness, and safety. Our experiments show that this framework significantly improves people’s perceptions of urban environments, with increases of 17.60% in safety, 31.15% in beauty, and 28.82% in liveliness. In comparison, other advanced text-driven image editing methods like DiffEdit only achieve improvements of 2.31% in safety, 11.87% in beauty, and 15.84% in liveliness. We applied this framework across various virtual scenarios, including neighborhood improvement, building redevelopment, green space expansion, and community garden creation. The results demonstrate its effectiveness in simulating urban renewal, offering valuable insights for real-world urban planning and policy-making. This method not only enhances the visual appeal of neglected urban areas but also serves as a powerful tool for city planners and policymakers, ultimately improving urban landscapes and the quality of life for residents.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"228 ","pages":"Pages 356-369"},"PeriodicalIF":10.6,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144695047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
HATFormer: Height-aware Transformer for multimodal 3D change detection
IF 10.6 | CAS Tier 1, Earth Science
ISPRS Journal of Photogrammetry and Remote Sensing | Pub Date: 2025-07-23 | DOI: 10.1016/j.isprsjprs.2025.06.022
Biyuan Liu, Zhou Huang, Yanxi Li, Rongrong Gao, Huai-Xin Chen, Tian-Zhu Xiang
{"title":"HATFormer: Height-aware Transformer for multimodal 3D change detection","authors":"Biyuan Liu ,&nbsp;Zhou Huang ,&nbsp;Yanxi Li ,&nbsp;Rongrong Gao ,&nbsp;Huai-Xin Chen ,&nbsp;Tian-Zhu Xiang","doi":"10.1016/j.isprsjprs.2025.06.022","DOIUrl":"10.1016/j.isprsjprs.2025.06.022","url":null,"abstract":"<div><div>Understanding the three-dimensional dynamics of the Earth’s surface is essential for urban planning and environmental monitoring. In the absence of consistent bitemporal 3D data, recent advancements in change detection have increasingly turned to combining multimodal data sources, including digital surface models (DSMs) and optical remote sensing imagery. However, significant inter-modal differences and intra-class variance — particularly with imbalances between foreground and background classes — continue to pose major challenges for achieving accurate change detection. To address these challenges, we propose a height-aware Transformer network, termed HATFormer, for multimodal semantic and height change detection, which explicitly correlates features across different modalities to reduce modality gaps and incorporates additional background supervision to mitigate foreground-to-background imbalances. Specifically, we first introduce a Background Height Estimation (BHE) module that incorporates height-awareness learning within the background to predict height information directly from lateral image features. This module enhances discriminative background feature learning and reduces the modality gap between monocular images and DSM data. To alleviate the interference of noisy background heights, a Height Uncertainty Suppression (HUS) module is designed to suppress the regions with height uncertainty. Secondly, we propose a Foreground Mask Estimation (FME) module to identify foreground change regions from DSM features, guided by discriminative background features. This module also acts as a regularizer, supporting more effective feature learning within the BHE module. Finally, an Auxiliary Feature Aggregation (AFA) module is designed to integrate features from the FME and BHE modules, which are then decoded by a multi-task decoder to generate precise change predictions. Extensive experiments on the Hi-BCD Plus and SMARS datasets demonstrate that our proposed method outperforms eight state-of-the-art methods, achieving superior performance in semantic and height change detection from multimodal bitemporal data. The code and dataset will be publicly available at: <span><span>https://github.com/HATFormer/HATFormer</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"228 ","pages":"Pages 340-355"},"PeriodicalIF":10.6,"publicationDate":"2025-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144686373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SAR ship detection across different spaceborne platforms with confusion-corrected self-training and region-aware alignment framework
IF 10.6 | CAS Tier 1, Earth Science
ISPRS Journal of Photogrammetry and Remote Sensing | Pub Date: 2025-07-22 | DOI: 10.1016/j.isprsjprs.2025.07.017
Shuang Liu, Dong Li, Haibo Song, Caizhi Fan, Ke Li, Jun Wan, Ruining Liu
{"title":"SAR ship detection across different spaceborne platforms with confusion-corrected self-training and region-aware alignment framework","authors":"Shuang Liu ,&nbsp;Dong Li ,&nbsp;Haibo Song ,&nbsp;Caizhi Fan ,&nbsp;Ke Li ,&nbsp;Jun Wan ,&nbsp;Ruining Liu","doi":"10.1016/j.isprsjprs.2025.07.017","DOIUrl":"10.1016/j.isprsjprs.2025.07.017","url":null,"abstract":"<div><div>Synthetic Aperture Radar (SAR) ship detection is a vital technology for transforming reconnaissance data into actionable intelligence. As spaceborne SAR platforms increase, significant distribution shifts arise among SAR data from different platforms due to diverse imaging conditions and technical parameters. Traditional deep learning detectors, typically optimized for single-platform data, struggle with such shifts and annotation scarcity, limiting cross-platform applicability. Mainstream methods employ unsupervised domain adaptation (UDA) techniques to transfer detectors from a labeled source domain (existing platform data) to a novel unlabeled target domain (new platform data). However, the inherent complexity of SAR images, particularly strong background scattering regions, causes high confusion between ships and non-target regions, making these methods vulnerable to background interference and reducing their effectiveness in cross-platform detection. To alleviate this, we propose a <u>C</u>onfusion-Corrected <u>S</u>elf-Training with <u>R</u>egion-Aware <u>F</u>eature <u>A</u>lignment (CSRFA) framework for cross-platform SAR ship detection. First, a Confusion-corrected Self-training Mechanism (CSM) refines and corrects misclassified proposals to suppress background interference and enhance pseudo-label reliability on unlabeled target domains. Then, a Foreground Guidance Mechanism (FGM) further improves proposal quality by exploiting the consistency between region proposal classification and localization. Finally, a Region-Aware Feature Alignment (RAFA) module aligns ship regions based on RPN-generated foreground probabilities, enabling fine-grained, target-aware domain adaptation. Experiments on GF-3, SEN-1, and HRSID datasets show that CSRFA consistently outperforms existing UDA methods, achieving an average AP improvement of 2% across six cross-platform tasks compared to the second-best approach, demonstrating its robustness and adaptability for practical deployment.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"228 ","pages":"Pages 305-322"},"PeriodicalIF":10.6,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144680105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Colour-informed ecoregion analysis highlights a satellite capability gap for spatially and temporally consistent freshwater cyanobacteria monitoring
IF 10.6 | CAS Tier 1, Earth Science
ISPRS Journal of Photogrammetry and Remote Sensing | Pub Date: 2025-07-22 | DOI: 10.1016/j.isprsjprs.2025.07.030
Davide Lomeo, Stefan G.H. Simis, Nick Selmes, Anne D. Jungblut, Emma J. Tebbs
{"title":"Colour-informed ecoregion analysis highlights a satellite capability gap for spatially and temporally consistent freshwater cyanobacteria monitoring","authors":"Davide Lomeo ,&nbsp;Stefan G.H. Simis ,&nbsp;Nick Selmes ,&nbsp;Anne D. Jungblut ,&nbsp;Emma J. Tebbs","doi":"10.1016/j.isprsjprs.2025.07.030","DOIUrl":"10.1016/j.isprsjprs.2025.07.030","url":null,"abstract":"<div><div>Cyanobacteria blooms pose significant risks to water quality in freshwater ecosystems worldwide, with implications for human and animal health. Constructing consistent records of cyanobacteria dynamics in complex inland waters from satellite imagery remains challenged by discontinuous sensor capabilities, particularly with regard to spectral coverage. Comparing 11 satellite sensors, we show that the number and positioning of wavebands fundamentally alter bloom detection capability, with wavebands centred at 412, 620, 709, 754 and 779 nm proving most critical for capturing cyanobacteria dynamics. Specifically, analysis of observations from the Medium Resolution Imaging Spectrometer (MERIS) and Ocean and Land Colour Instrument (OLCI), coincident with the Moderate Resolution Imaging Spectroradiometer (MODIS) demonstrates how the spectral band configuration of the latter affects bloom detection. Using an Optical Water Types (OWT) library understood to capture cyanobacterial biomass through varying vertical mixing states, this analysis shows that MODIS can identify optically distinct conditions like surface accumulations but fails to resolve initial bloom evolution in well-mixed conditions, particularly in optically complex regions. Investigation of coherent ecoregions formed using Self-organising Maps trained on OWT membership scores confirm that MODIS captures broad spatial patterns seen with more capable sensors but compresses optical gradients into fewer optical types. These constraints have significant implications for interpreting spatial–temporal dynamics of cyanobacteria in large waterbodies, particularly during 2012–2016 when MERIS and OLCI sensors were absent, and small waterbodies, where high spatial resolution sensors not originally design to study water are used. In addition, these findings underscore the importance of key wavebands in future sensor design and the development of approaches to maintain consistent long-term records across evolving satellite capabilities. Our findings suggest that attempts at quantitatively harmonising cyanobacteria bloom detection across sensors may not be ecologically appropriate unless these observation biases are addressed. For example, analysing the frequency and intensity of surfacing blooms, while considering the meteorological factors that may drive these phenomena, could be considered over decadal timescales, whereas trend analysis of mixed-column biomass should only concern appropriate sensor observation periods.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"228 ","pages":"Pages 323-339"},"PeriodicalIF":10.6,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144680018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
STFMamba: Spatiotemporal satellite image fusion network based on visual state space model
IF 10.6 | CAS Tier 1, Earth Science
ISPRS Journal of Photogrammetry and Remote Sensing | Pub Date: 2025-07-22 | DOI: 10.1016/j.isprsjprs.2025.07.011
Min Zhao, Xiaolu Jiang, Bo Huang
{"title":"STFMamba: Spatiotemporal satellite image fusion network based on visual state space model","authors":"Min Zhao ,&nbsp;Xiaolu Jiang ,&nbsp;Bo Huang","doi":"10.1016/j.isprsjprs.2025.07.011","DOIUrl":"10.1016/j.isprsjprs.2025.07.011","url":null,"abstract":"<div><div>Remote sensing images provide extensive information about Earth’s surface, supporting a wide range of applications. Individual sensors often encounter a trade-off between spatial and temporal resolutions, spatiotemporal fusion (STF) aims to overcome this shortcoming by combining multisource data. Existing deep learning-based STF methods struggle with capturing long-range dependencies (CNN-based) or incur high computational cost (Transformer-based). To overcome these limitations, we propose STFMamba, a two-step state space model that effectively captures global information while maintaining linear complexity. Specifically, a super-resolution (SR) network is firstly utilized to mitigate sensor heterogeneity of multisource data, then a dual U-Net is designed to fully leverage spatio-temporal correlations and capture temporal variations. Our STFMamba contains the following three key components: 1) the multidimensional scanning mechanism for global relationship modeling to eliminate information loss, 2) a spatio-spectral–temporal fusion scanning strategy to integrate multiscale contextual features, and 3) a multi-head cross-attention module for adaptive selection and fusion. Additionally, we develop a lightweight version of STFMamba for deployment on resource-constrained devices, incorporating a knowledge distillation strategy to align its features with the base model and enhance performance. Extensive experiments on three benchmark datasets demonstrate the superiority of the proposed method. Specifically, our method outperforms compared methods, including FSDAF, FVSDF, EDCSTFN, GANSTFM, SwinSTFM, and DDPMSTF, with average RMSE reductions of 24.25%, 25.94%, 18.15%, 14.36%, 9.63%, and 12.82%, respectively. Our code is available at: <span><span>https://github.com/zhaomin0101/STFMamba</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"228 ","pages":"Pages 288-304"},"PeriodicalIF":10.6,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144680104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Mind the modality gap: Towards a remote sensing vision-language model via cross-modal alignment
IF 10.6 | CAS Tier 1, Earth Science
ISPRS Journal of Photogrammetry and Remote Sensing | Pub Date: 2025-07-21 | DOI: 10.1016/j.isprsjprs.2025.06.019
Angelos Zavras, Dimitrios Michail, Begüm Demir, Ioannis Papoutsis
{"title":"Mind the modality gap: Towards a remote sensing vision-language model via cross-modal alignment","authors":"Angelos Zavras ,&nbsp;Dimitrios Michail ,&nbsp;Begüm Demir ,&nbsp;Ioannis Papoutsis","doi":"10.1016/j.isprsjprs.2025.06.019","DOIUrl":"10.1016/j.isprsjprs.2025.06.019","url":null,"abstract":"&lt;div&gt;&lt;div&gt;Deep Learning (DL) is undergoing a paradigm shift with the emergence of foundation models. In this work, we focus on Contrastive Language-Image Pre-training (CLIP), a Vision-Language foundation model that achieves high accuracy across various image classification tasks and often rivals fully supervised baselines, despite not being explicitly trained for those tasks. Nevertheless, there are still domains where zero-shot CLIP performance is far from optimal, such as Remote Sensing (RS) and medical imagery. These domains do not only exhibit fundamentally different distributions compared to natural images, but also commonly rely on complementary modalities, beyond RGB, to derive meaningful insights. To this end, we propose a methodology to align distinct RS image modalities with the visual and textual modalities of CLIP. Our two-stage procedure addresses the aforementioned distribution shift, extends the zero-shot capabilities of CLIP and enriches CLIP’s shared embedding space with domain-specific knowledge. Initially, we robustly fine-tune CLIP according to the PAINT (Ilharco et al., 2022) patching protocol, in order to deal with the distribution shift. Building upon this foundation, we facilitate the cross-modal alignment of a RS modality encoder by distilling knowledge from the CLIP visual and textual encoders. We empirically show that both patching and cross-modal alignment translate to significant performance gains, across several RS imagery classification and cross-modal retrieval benchmark datasets. Patching dramatically improves RS imagery (RGB) classification (BigEarthNet-5: +39.76% mAP, BigEarthNet-19: +56.86% mAP, BigEarthNet-43: +28.43% mAP, SEN12MS: +20.61% mAP, EuroSAT: +5.98% Acc), while it maintains performance on the representative supported task (ImageNet), and most critically it outperforms existing RS-specialized CLIP variants such as RemoteCLIP (Liu et al., 2023a) and SkyCLIP (Wang et al., 2024). Cross-modal alignment extends zero-shot capabilities to multi-spectral data, surpassing our patched CLIP classification performance and establishing strong cross-modal retrieval baselines. Linear probing further confirms the quality of learned representations of our aligned multi-spectral encoder, outperforming existing RS foundation models such as SatMAE (Cong et al., 2022). Notably, these enhancements are achieved without the reliance on textual descriptions, without introducing any task-specific parameters, without training from scratch and without catastrophic forgetting. Our work highlights the potential of leveraging existing VLMs’ large-scale pre-training and extending their zero-shot capabilities to specialized fields, paving the way for resource efficient establishment of in-domain multi-modal foundation models in RS and beyond. 
We make our code implementation and weights for all experiments publicly available on our project’s GitHub repository &lt;span&gt;&lt;span&gt;https://github.com/Orion-AI-Lab/MindTheModalityGap&lt;/span&gt;&lt;svg&gt;&lt;","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"228 ","pages":"Pages 270-287"},"PeriodicalIF":10.6,"publicationDate":"2025-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144670817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
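Cross-modal alignment by distillation can be sketched as an InfoNCE objective that pulls a multi-spectral embedding toward the frozen CLIP embedding of its paired RGB view. This is a generic formulation, not necessarily the authors' exact loss.

```python
# Generic InfoNCE alignment between a trainable modality encoder's embeddings
# and frozen CLIP embeddings of paired views; a sketch, not the paper's code.
import torch
import torch.nn.functional as F

def alignment_loss(ms_embed: torch.Tensor, clip_embed: torch.Tensor,
                   temperature: float = 0.07) -> torch.Tensor:
    ms = F.normalize(ms_embed, dim=-1)
    ref = F.normalize(clip_embed.detach(), dim=-1)    # teacher side stays frozen
    logits = ms @ ref.t() / temperature
    labels = torch.arange(len(ms), device=ms.device)  # matched pairs on the diagonal
    return F.cross_entropy(logits, labels)
```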
Citations: 0
Three-dimensional reconstruction of shallow seabed topographic surface based on fusion of side-scan sonar and echo sounding data
IF 10.6 | CAS Tier 1, Earth Science
ISPRS Journal of Photogrammetry and Remote Sensing | Pub Date: 2025-07-21 | DOI: 10.1016/j.isprsjprs.2025.07.018
Chunqing Ran, Luotao Zhang, Shuo Han, Xiaobo Zhang, Shengli Wang, Xinghua Zhou
{"title":"Three-dimensional reconstruction of shallow seabed topographic surface based on fusion of side-scan sonar and echo sounding data","authors":"Chunqing Ran ,&nbsp;Luotao Zhang ,&nbsp;Shuo Han ,&nbsp;Xiaobo Zhang ,&nbsp;Shengli Wang ,&nbsp;Xinghua Zhou","doi":"10.1016/j.isprsjprs.2025.07.018","DOIUrl":"10.1016/j.isprsjprs.2025.07.018","url":null,"abstract":"<div><div>High-precision topographic mapping of offshore shallow seabed has great significance in a number of fields, including shipping navigation, disaster warning, environmental monitoring and resource management. However, conventional side-scan sonar (SSS) techniques are difficult to obtain seabed elevation data, which limits their application in the field of three-dimensional (3D) topographic reconstruction. Meanwhile, although single-beam echo sounder (SBES) can provide accurate depth information, it is difficult to capture the details of complex terrain due to sparse spatial coverage. In order to overcome the limitations of a single technique in 3D seafloor topographic reconstruction applications, this study fuses SSS and SBES data, and proposes the Multi-Scale Gradient Fusion Shape From Shading (MSGF-SFS) algorithm. This algorithm extracts and fuses surface gradient information by analyzing the intensity variations in SSS images at multiple scales. This enables the construction of a 3D discrete elevation model from two-dimensional (2D) SSS data. In order to reduce the inherent elevation error of SSS, the topographic feature extraction and least squares optimization for multi-source data alignment and correction algorithm is introduced, which combines terrain feature extraction and least squares optimization to fuse the SBES depth data with the 3D discrete elevation model for calibration. The quality of the 3D discrete elevation model was then optimized by the data filtering based on quadtree domain partitioning and least squares function. Finally, a high-resolution 3D continuous seabed model was constructed on the basis of the filtered data using implicit function based on Undirected Distance Function (IF-UDF) deep learning algorithm. Based on the above methods, this study realized the 3D seabed topography reconstruction of an offshore area in the Yellow Sea of China and conducted comparative experiments. The findings demonstrate that a series of methods in this paper can effectively reconstruct a fine 3D seabed model, and the obtained model is better than the existing 3D reconstruction techniques in terms of normal consistency and continuity, and shows stronger robustness and higher accuracy than the traditional algorithms. 
This method provides a systematic and practical solution for high-resolution offshore topographic mapping, especially for high-precision requirements in complex environments, and can effectively serve as an alternative to multibeam systems in the field of offshore topography mapping.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"228 ","pages":"Pages 249-269"},"PeriodicalIF":10.6,"publicationDate":"2025-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144670816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
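The multi-scale gradient-fusion step at the heart of MSGF-SFS can be sketched as extracting intensity gradients at several Gaussian smoothing scales and averaging them; the scales and the simple mean fusion are assumptions for illustration.

```python
# Illustrative multi-scale gradient extraction from a sonar intensity image;
# scales and averaging are assumptions, not the paper's exact fusion rule.
import numpy as np
from scipy.ndimage import gaussian_filter

def multiscale_gradients(intensity: np.ndarray, scales=(1, 2, 4)):
    gx = np.zeros_like(intensity, dtype=float)
    gy = np.zeros_like(intensity, dtype=float)
    for s in scales:
        smooth = gaussian_filter(intensity.astype(float), sigma=s)
        dgy, dgx = np.gradient(smooth)   # per-scale gradients (rows, cols)
        gx += dgx / len(scales)          # fuse by simple averaging
        gy += dgy / len(scales)
    return gx, gy                        # fused surface-gradient fields
```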
Citations: 0
Cross-scenario damaged building extraction network: Methodology, application, and efficiency using single-temporal HRRS imagery
IF 10.6 | CAS Tier 1, Earth Science
ISPRS Journal of Photogrammetry and Remote Sensing | Pub Date: 2025-07-19 | DOI: 10.1016/j.isprsjprs.2025.06.028
Haifeng Wang, Wei He, Zhuohong Li, Naoto Yokoya
{"title":"Cross-scenario damaged building extraction network: Methodology, application, and efficiency using single-temporal HRRS imagery","authors":"Haifeng Wang ,&nbsp;Wei He ,&nbsp;Zhuohong Li ,&nbsp;Naoto Yokoya","doi":"10.1016/j.isprsjprs.2025.06.028","DOIUrl":"10.1016/j.isprsjprs.2025.06.028","url":null,"abstract":"<div><div>The extraction of damaged buildings is of significant importance in various fields, such as disaster assessment and resource allocation. Although multi-temporal-based methods exhibit remarkable advantages in detecting damaged buildings, single-temporal extraction remains crucial in real-world emergency responses due to its immediate usability. However, single-temporal cross-scenario extraction at high-resolution remote sensing (HRRS) encounters the following challenges: (i) morphological heterogeneity of building damage which causes by the interplay of unknown disaster types with unpredictable geographic contexts, and (ii) scarcity of fine-grained annotated datasets for unseen disaster scenarios which limits the accuracy of rapid damage mapping. Confronted with these challenges, our main idea is to decompose complex features of damaged building into five attribute-features, which can be trained using historical disaster data to enable the independent learning of both building styles and damage features. Consequently, we propose a novel Correlation Feature Decomposition Network (CFDNet) along with a coarse-to-fine training strategy for the cross-scenario damaged building extraction. In detail, at the coarse training stage, the CFDNet is trained to decompose the damaged building segmentation task into the extraction of multiple attribute-features. At the fine training stage, specific attribute-features, such as building feature and damage feature, are trained using auxiliary datasets. We have evaluated CFDNet on several datasets that cover different types of disasters and have demonstrated its superiority and robustness compared with state-of-the-art methods. Finally, we also apply the proposed model for the damaged building extraction in areas historically affected by major disasters, namely, the Turkey–Syria earthquakes on 6 February 2023, Cyclone Mocha in the Bay of Bengal on 23 May 2023, and Hurricane Ian in Florida, USA in September 2022. Results from practical applications also emphasize the significant advantages of our proposed CFDNet.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"228 ","pages":"Pages 228-248"},"PeriodicalIF":10.6,"publicationDate":"2025-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144663363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
CSW-SAM: a cross-scale algorithm for very-high-resolution water body segmentation based on segment anything model 2
IF 10.6 | CAS Tier 1, Earth Science
ISPRS Journal of Photogrammetry and Remote Sensing | Pub Date: 2025-07-19 | DOI: 10.1016/j.isprsjprs.2025.07.008
Tianyi Zhang, Yi Ren, Weibin Li, Chenhao Qin, Licheng Jiao, Hua Su
{"title":"CSW-SAM: a cross-scale algorithm for very-high-resolution water body segmentation based on segment anything model 2","authors":"Tianyi Zhang ,&nbsp;Yi Ren ,&nbsp;Weibin Li ,&nbsp;Chenhao Qin ,&nbsp;Licheng Jiao ,&nbsp;Hua Su","doi":"10.1016/j.isprsjprs.2025.07.008","DOIUrl":"10.1016/j.isprsjprs.2025.07.008","url":null,"abstract":"<div><div>Large-scale high-resolution water body (WB) extraction is one of the research hotspots in remote sensing image processing. However, accurate training labels for various WBs at Very-High-Resolution (VHR) are extremely scarce. Considering that low-resolution (LR) images and labels are more easily accessible, the challenge lies in fully leveraging LR data to guide high-precision WB extraction from VHR images. To address this issue, we propose a novel cross-scale CSW-SAM algorithm based on SAM2, which learns spectral information of WBs from easily accessible 10 m resolution LR images and maps it to 0.3 m resolution VHR remote sensing images for high-precision WB segmentation. In addition to fine-tuning the decoder, we enhance the encoder’s ability to effectively learn the mapping relationship between images of different resolutions by Adapter Tuning. We have designed the Automated Clustering Layer (ACL) based on the principle of feature similarity and local structure information clustering, to enhance the performance of SAM-based methods in cross-scale WB segmentation. To validate the robustness and generalization ability of the proposed CSW-SAM, we conducted extensive experiments on both a self-constructed cross-scale WB dataset and the publicly available GLH-Water dataset. The results confirm that CSW-SAM achieves strong performance across datasets with diverse WB conditions, demonstrating its potential for scalable and low-cost VHR WB mapping. Additionally, the model can be generalized with minimal cost, making it highly promising for large-scale global VHR WB mapping.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"228 ","pages":"Pages 208-227"},"PeriodicalIF":10.6,"publicationDate":"2025-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144663364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FTG-Net: A facade topology-aware graph network for class imbalance structural segmentation of building facades
IF 10.6 | CAS Tier 1, Earth Science
ISPRS Journal of Photogrammetry and Remote Sensing | Pub Date: 2025-07-17 | DOI: 10.1016/j.isprsjprs.2025.07.014
Yufu Zang, Liu Xu, Zhen Cui, Xiongwu Xiao, Haiyan Guan, Bisheng Yang
{"title":"FTG-Net: A facade topology-aware graph network for class imbalance structural segmentation of building facades","authors":"Yufu Zang ,&nbsp;Liu Xu ,&nbsp;Zhen Cui ,&nbsp;Xiongwu Xiao ,&nbsp;Haiyan Guan ,&nbsp;Bisheng Yang","doi":"10.1016/j.isprsjprs.2025.07.014","DOIUrl":"10.1016/j.isprsjprs.2025.07.014","url":null,"abstract":"<div><div>Digital twin city and realistic 3D scene have triggered an ever-increasing demand for high-precision building models. As an important component for urban models, façade segmentation based on point clouds has gained significant attention. However, most existed networks suffer from class imbalance of façade elements and inherent limitations in point clouds (e.g., various occlusions, significant noise or outliers, varying point densities). To address these issues, we propose a novel FTG-Net (Façade Topology-aware Graph Network) combining the façade topology and hierarchical geometric features for robust segmentation. Our framework comprises three key modules: (1) A Façade Topology Extraction (FTE) module that encodes object-level spatial relationships via a 2D manifold grid and topology-aware graph convolutions; (2) A Sampling-enhanced Geometry Extraction (SGE) module leveraging adaptive reweighted sampling and strip pooling to enhance rare-class feature learning; (3) A Dual-feature Attentive Fusion (DAF) module that adaptively fuses topology and geometric features. To validate the performance of FTG-Net, we annotated two building façade datasets (NUIST Façade dataset and Commercial Street dataset) and selected two benchmark datasets (ArCH and ZAHA datasets) for evaluation. Extensive experiments on annotated datasets demonstrate state-of-the-art performance, achieving 98.33 % overall accuracy (OA) and 96.08 % mIoU. Evaluations on benchmark datasets show mIoU improvements of 1.05 ∼ 6.7 % over existing methods, with improvements focused on rare-class categories. Ablation studies confirm the critical role of our topology-aware design in capturing spatial regularities (e.g., repetitive arranged balconies) and the adaptive sampling strategy in mitigating class imbalance. These demonstrate the effectiveness of our FTG-Net for diverse architectural styles and the applicability in digital twin city modeling. Code and datasets are publicly available: <span><span>https://github.com/zangyufus/FTG-Net</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"228 ","pages":"Pages 179-207"},"PeriodicalIF":10.6,"publicationDate":"2025-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144656656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0