{"title":"B3-CDG: A pseudo-sample diffusion generator for bi-temporal building binary change detection","authors":"Peng Chen , Peixian Li , Bing Wang , Sihai Zhao , Yongliang Zhang , Tao Zhang , Xingcheng Ding","doi":"10.1016/j.isprsjprs.2024.10.021","DOIUrl":"10.1016/j.isprsjprs.2024.10.021","url":null,"abstract":"<div><div>Building change detection (CD) plays a crucial role in urban planning, land resource management, and disaster monitoring. Currently, deep learning has become a key approach in building CD, but challenges persist. Obtaining large-scale, accurately registered bi-temporal images is difficult, and annotation is time-consuming. Therefore, we propose B<sup>3</sup>-CDG, a bi-temporal building binary CD pseudo-sample generator based on the principle of latent diffusion. This generator treats building change processes as local semantic state transformations. It utilizes textual instructions and mask prompts to generate specific class changes in designated regions of single-temporal images, creating different temporal images with clear semantic transitions. B<sup>3</sup>-CDG is driven by large-scale pretrained models and utilizes external adapters to guide the model in learning remote sensing image distributions. To generate seamless building boundaries, B<sup>3</sup>-CDG adopts a simple and effective approach—dilation masks—to compel the model to learn boundary details. In addition, B<sup>3</sup>-CDG incorporates diffusion guidance and data augmentation to enhance image realism. In the generation experiments, B<sup>3</sup>-CDG achieved the best performance with the lowest FID (26.40) and the highest IS (4.60) compared to previous baseline methods (such as Inpaint and IAug). This method effectively addresses challenges such as boundary continuity, shadow generation, and vegetation occlusion while ensuring that the generated building roof structures and colors are realistic and diverse. 
In the application experiments, B<sup>3</sup>-CDG improved the IOU of the validation model (SFFNet) by 6.34 % and 7.10 % on the LEVIR and WHUCD datasets, respectively. When the real data is extremely limited (using only 5 % of the original data), the improvement further reaches 33.68 % and 32.40 %. Moreover, B<sup>3</sup>-CDG can enhance the baseline performance of advanced CD models, such as SNUNet and ChangeFormer. Ablation studies further confirm the effectiveness of the B<sup>3</sup>-CDG design. This study introduces a novel research paradigm for building CD, potentially advancing the field. Source code and datasets will be available at <span><span>https://github.com/ABCnutter/B3-CDG</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 408-429"},"PeriodicalIF":10.6,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142658276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mesh refinement method for multi-view stereo with unary operations","authors":"Jianchen Liu, Shuang Han, Jin Li","doi":"10.1016/j.isprsjprs.2024.10.023","DOIUrl":"10.1016/j.isprsjprs.2024.10.023","url":null,"abstract":"<div><div>3D reconstruction is an important part of the digital city, and high-accuracy 3D modeling methods have been widely studied as an important pathway to visualizing 3D city scenes. However, problems of image resolution, noise, and occlusion result in low quality and overly smooth features in the mesh model. Therefore, the model needs to be refined to improve the mesh quality and enhance the visual effect. This paper proposes a mesh refinement algorithm that fine-tunes the vertices of the mesh and constrains their evolution direction to the normal vector, reducing their degrees of freedom to one. The evolution of a vertex then involves only one motion distance parameter along the normal vector, which simplifies the derivation of the energy function. Meanwhile, Gaussian curvature is used as a regularization term, which is anisotropic and preserves edge features during the reconstruction process. The mesh refinement algorithm with unary operations fully utilizes the original image information and effectively enriches the local detail features of the mesh model. This paper utilizes five public datasets to conduct comparative experiments, and the experimental results show that the proposed algorithm restores the detailed features of the model better and achieves a better refinement effect in the same number of iterations than the OpenMVS library refinement algorithm. 
At the same time, in the comparison of refinement results with fewer iterations, the algorithm in this paper can achieve more desirable results.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 361-375"},"PeriodicalIF":10.6,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142658336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast and accurate SAR geocoding with a plane approximation","authors":"Shaokun Guo , Jie Dong , Yian Wang , Mingsheng Liao","doi":"10.1016/j.isprsjprs.2024.10.031","DOIUrl":"10.1016/j.isprsjprs.2024.10.031","url":null,"abstract":"<div><div>Geocoding is the procedure of finding the mapping between the Synthetic Aperture Radar (SAR) image and the imaged scene. The inverse form of the Range-Doppler (RD) model has been adopted to approximate the geocoding results. However, with advances in SAR imaging geodesy, its imprecise nature becomes more perceptible. The forward RD model gives reliable solutions but is time-consuming and unable to detect geometric distortions. This study proposes a highly optimized forward geocoding method to find the precise ground position of each image sample with a Digital Elevation Model (DEM). By following the intersection of the terrain and the so-called solution surface of an azimuth line, which can be locally approximated by a plane, it produces geo-location results almost identical to the analytical solutions of the RD model. At the same time, the non-unique geocoding solutions and the geometric distortions are determined. Deviations from the employed approximations are assessed, showing that they are highly predictable and lead to negligible range/azimuth residuals. The general robustness is verified by experiments on SAR images of different resolutions covering diversified terrains in the native or zero Doppler geometry. Comparisons with other forward algorithms demonstrate that its accuracy and efficiency are comparable, with the extra ability to detect geometric distortions. For a Sentinel-1 IW burst of high topographic relief, the algorithm finishes in 3 s using 16 parallel cores, with an average residual smaller than one millimeter. 
Its impressive blend of efficiency, accuracy, and geometric distortion detection capabilities makes it ideal for large-scale remote sensing applications.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 344-360"},"PeriodicalIF":10.6,"publicationDate":"2024-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142658333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D point cloud regularization method for uniform mesh generation of mining excavations","authors":"Przemysław Dąbek, Jacek Wodecki, Paulina Kujawa, Adam Wróblewski, Arkadiusz Macek, Radosław Zimroz","doi":"10.1016/j.isprsjprs.2024.10.024","DOIUrl":"10.1016/j.isprsjprs.2024.10.024","url":null,"abstract":"<div><div>Mine excavation systems are usually dozens of kilometers long, with geometry varying on a small scale (roughness and shape of the walls) and on a large scale (varying widths of the tunnels, turns, and crossings). In this article, the authors address the problem of analyzing laser scanning data from large mining structures that can be used for various purposes, with a focus on ventilation simulations. Together with the quality of the measurement data (diverse point-cloud density, missing samples, holes induced by obstructions in the field of view, measurement noise), this creates problems that require multi-stage processing of the obtained data. The authors propose a robust methodology to process a single segmented section of the mining system. The presented approach focuses on obtaining a point cloud ready for application in computational fluid dynamics (CFD) analysis of airflow, with minimal need for additional manual corrections on the generated mesh model. This requires the point cloud to have evenly distributed points and reduced noise (together with removal of objects inside) while keeping the unique geometrical properties and shape of the scanned tunnels. The proposed methodology uses the trajectory of the excavation, either obtained during the measurements or by the skeletonization process explained in the article. Cross-sections obtained on planes perpendicular to the trajectory are processed to equalize the point distribution and to remove measurement noise, holes in the point cloud, and objects inside the excavation. 
The effects of the proposed algorithm are validated by comparing the processed cloud with the original cloud and by testing within the CFD environment. The algorithm proved highly effective in improving the skewness rate of the obtained mesh and the geometry mapping accuracy (standard deviation below 5 centimeters in the cloud-to-mesh comparison).</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 324-343"},"PeriodicalIF":10.6,"publicationDate":"2024-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142658335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalization in deep learning-based aircraft classification for SAR imagery","authors":"Andrea Pulella , Francescopaolo Sica , Carlos Villamil Lopez , Harald Anglberger , Ronny Hänsch","doi":"10.1016/j.isprsjprs.2024.10.030","DOIUrl":"10.1016/j.isprsjprs.2024.10.030","url":null,"abstract":"<div><div>Automatic Target Recognition (ATR) from Synthetic Aperture Radar (SAR) data covers a wide range of applications. SAR ATR helps to detect and track vehicles and other objects, e.g. in disaster relief and surveillance operations. Aircraft classification covers a significant part of this research area, which differs from other SAR-based ATR tasks, such as ship and ground vehicle detection and classification, in that aircraft are usually static targets, often remaining at the same location and in a given orientation for longer time frames. Today, there is a significant mismatch between the abundance of deep learning-based aircraft classification models and the availability of corresponding datasets. This mismatch has led to models with improved classification performance on specific datasets, but the challenge of generalizing to conditions not present in the training data (which are expected to occur under operational conditions) has not yet been satisfactorily analyzed. This paper aims to evaluate how the classification performance and generalization capabilities of deep learning models are influenced by the diversity of the training dataset. Our goal is to understand the model’s competence and the conditions under which it can achieve proficiency in aircraft classification tasks for high-resolution SAR images while demonstrating generalization capabilities when confronted with novel data that include different geographic locations, environmental conditions, and geometric variations. 
We address this gap by using manually annotated high-resolution SAR data from TerraSAR-X and TanDEM-X and show how the classification performance changes for different application scenarios requiring different training and evaluation setups. We find that, as expected, the type of aircraft plays a crucial role in the classification problem, since it will vary in shape and dimension. However, these aspects are secondary to how the SAR image is acquired, with the acquisition geometry playing the primary role. Therefore, we find that the characteristics of the acquisition are much more relevant for generalization than the complex geometry of the target. We show this for various models selected among the standard classification algorithms.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 312-323"},"PeriodicalIF":10.6,"publicationDate":"2024-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142658334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advancing mangrove species mapping: An innovative approach using Google Earth images and a U-shaped network for individual-level Sonneratia apetala detection","authors":"Chuanpeng Zhao , Yubin Li , Mingming Jia , Chengbin Wu , Rong Zhang , Chunying Ren , Zongming Wang","doi":"10.1016/j.isprsjprs.2024.10.016","DOIUrl":"10.1016/j.isprsjprs.2024.10.016","url":null,"abstract":"<div><div>The exotic mangrove species <em>Sonneratia apetala</em> has been colonizing coastal China for several decades, sparking attention and debate from the public and policy-makers about its reproduction, dispersal, and spread. Existing local-scale studies have relied on fine but expensive data sources to map mangrove species, limiting their applicability for detecting <em>S. apetala</em> over large areas due to cost constraints. A previous study utilized freely available Sentinel-2 images to construct a 10-m-resolution <em>S. apetala</em> map of China but did not capture small clusters of <em>S. apetala</em> due to resolution limitations. To precisely detect <em>S. apetala</em> in coastal China, we proposed an approach that integrates freely accessible submeter-resolution Google Earth images to control expenses, a 10-m-resolution <em>S. apetala</em> map to retrieve well-distributed samples, and several U-shaped networks to capture <em>S. apetala</em> in the form of clusters and individuals. Comparisons revealed that the lite U-squared network was the most suitable of the five U-shaped networks for detecting <em>S. apetala</em>. The resulting map achieved an overall accuracy of 98.2 % using testing samples and an accuracy of 91.0 % using field sample plots. Statistics indicated that the total area covered by <em>S. apetala</em> in China was 4000.4 ha in 2022, which was 33.4 % greater than that of the 10-m-resolution map. The excess area suggested the presence of a large number of small clusters beyond the discrimination capacity of medium-resolution images. 
Furthermore, the mechanism of the approach was interpreted using an example-based method that altered image color, shape, orientation, and textures. Comparisons showed that textures were the key feature for identifying <em>S. apetala</em> based on submeter-resolution Google Earth images. The detection accuracy rapidly decreased with the blurring of textures, and images at zoom levels of 20, 19, and 18 were applicable to the trained network. Utilizing the first individual-level map, we estimated the number of mature <em>S. apetala</em> trees to be approximately 2.35 million with a 95 % confidence interval between 2.30 and 2.40 million, providing a basis for managing this exotic mangrove species. This study deepens existing research on <em>S. apetala</em> by providing an approach with a clear mechanism, an individual-level distribution with a much larger area, and an estimation of the number of mature trees. This study advances mangrove species mapping by combining the advantages of freely accessible medium- and high-resolution images: the former provides abundant spectral information to integrate discrete local-scale maps to generate a large-scale map, while the latter offers textural information from submeter-resolution Google Earth images to detect mangrove species in detail.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 276-293"},"PeriodicalIF":10.6,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142658331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HDRSA-Net: Hybrid dynamic residual self-attention network for SAR-assisted optical image cloud and shadow removal","authors":"Jun Pan , Jiangong Xu , Xiaoyu Yu , Guo Ye , Mi Wang , Yumin Chen , Jianshen Ma","doi":"10.1016/j.isprsjprs.2024.10.026","DOIUrl":"10.1016/j.isprsjprs.2024.10.026","url":null,"abstract":"<div><div>Clouds and shadows often contaminate optical remote sensing images, resulting in missing information. Consequently, continuous spatiotemporal monitoring of the Earth’s surface requires the efficient removal of clouds and shadows. Unlike optical satellites, synthetic aperture radar (SAR) has active imaging capabilities in all weather conditions, supplying valuable supplementary information for reconstructing missing regions. Nevertheless, the reconstruction of high-fidelity cloud-free images based on SAR-optical data fusion remains challenging due to differences in imaging mechanisms and the considerable contamination from speckle noise inherent in SAR imagery. To solve the aforementioned challenges, this paper presents a novel hybrid dynamic residual self-attention network (HDRSA-Net), aiming to fully exploit the potential of SAR images in reconstructing missing regions. The proposed HDRSA-Net comprises multiple dynamic interaction residual (DIR) groups organized into an end-to-end trainable deep hierarchical stacked architecture. Specifically, the omni-dimensional dynamic local exploration (ODDLE) module and the sparse global context aggregation (SGCA) module are used to perform adaptive local–global feature extraction and implicit enhancement. A multi-task cooperative optimization loss function is designed to ensure that the results exhibit high spectral fidelity and coherent spatial structures. 
Additionally, this paper releases a large dataset that enables comprehensive evaluation of reconstruction quality under different cloud coverages and various types of ground cover, providing a solid foundation for restoring visually satisfactory results with reliable semantic value. In comparison with current representative algorithms, the presented approach reconstructs missing regions effectively and stably. The project is accessible at: <span><span>https://github.com/RSIIPAC/LuojiaSET-OSFCR</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 258-275"},"PeriodicalIF":10.6,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142658330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multi-view graph neural network for building age prediction","authors":"Yi Wang , Yizhi Zhang , Quanhua Dong , Hao Guo , Yingchun Tao , Fan Zhang","doi":"10.1016/j.isprsjprs.2024.10.011","DOIUrl":"10.1016/j.isprsjprs.2024.10.011","url":null,"abstract":"<div><div>Building age is crucial for inferring building energy consumption and understanding the interactions between human behavior and urban infrastructure. Because surveys are challenging to conduct, machine learning methods have been utilized to predict and fill in missing building age data using building footprints. However, the existing methods lack explicit modeling of spatial effects and semantic relationships between buildings. To alleviate these challenges, we propose a novel multi-view graph neural network called the Building Age Prediction Network (BAPN). The features of spatial autocorrelation, spatial heterogeneity and semantic similarity were extracted and integrated using multiple graph convolutional networks. Inspired by the spatial regime model, a heterogeneity-aware graph convolutional network (HGCN) based on spatial grouping is designed to capture the spatial heterogeneity. Systematic experiments on three large-scale building footprint datasets demonstrate that BAPN outperforms existing machine learning and graph learning models, achieving high accuracy ranging from 61% to 80%. Moreover, missing building age data within the Fifth Ring Road of Beijing were filled, validating the feasibility of BAPN. 
This research offers new insights for filling the intra-city building age gaps and understanding multiple spatial effects essential for sustainable urban planning.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 294-311"},"PeriodicalIF":10.6,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142658332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating synthetic datasets with CLIP semantic insights for single image localization advancements","authors":"Dansheng Yao , Mengqi Zhu , Hehua Zhu , Wuqiang Cai , Long Zhou","doi":"10.1016/j.isprsjprs.2024.10.027","DOIUrl":"10.1016/j.isprsjprs.2024.10.027","url":null,"abstract":"<div><div>Accurate localization of pedestrians and mobile robots is critical for navigation, emergency response, and autonomous driving. Traditional localization methods, such as satellite signals, often prove ineffective in certain environments, and acquiring sufficient positional data can be challenging. Single image localization techniques have been developed to address these issues. However, current deep learning frameworks for single image localization that rely on domain adaptation fail to effectively utilize semantically rich high-level features obtained from large-scale pretraining. This paper introduces a novel framework that leverages the Contrastive Language-Image Pre-training model and prompts to enhance feature extraction and domain adaptation through semantic information. The proposed framework generates an integrated score map from scene-specific prompts to guide feature extraction and employs adversarial components to facilitate domain adaptation. Furthermore, a reslink component is incorporated to mitigate the precision loss in high-level features compared to the original data. Experimental results demonstrate that the use of prompts reduces localization errors by 26.4 % in indoor environments and 24.3 % in outdoor settings. The model achieves localization errors as low as 0.75 m and 8.09 degrees indoors, and 4.56 m and 7.68 degrees outdoors. Analysis of prompts from labeled datasets confirms the model’s capability to effectively interpret scene information. The weights of the integrated score map enhance the model’s transparency, thereby improving interpretability. 
This study underscores the efficacy of integrating semantic information into image localization tasks.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 198-213"},"PeriodicalIF":10.6,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142592828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Selective weighted least square and piecewise bilinear transformation for accurate satellite DSM generation","authors":"Nazila Mohammadi, Amin Sedaghat","doi":"10.1016/j.isprsjprs.2024.11.001","DOIUrl":"10.1016/j.isprsjprs.2024.11.001","url":null,"abstract":"<div><div>One of the main products of multi-view stereo (MVS) high-resolution satellite (HRS) images in photogrammetry and remote sensing is the digital surface model (DSM). Producing DSMs from MVS HRS images still faces serious challenges due to the complexity of the imaging geometry and exterior orientation model of HRS images, as well as their large dimensions and various geometric and illumination variations. The main motivation for this research is to provide a novel and efficient method that enhances the accuracy and completeness of DSM extraction from HRS images compared with existing recent methods. The proposed method, called Sat-DSM, consists of five main stages. Initially, a very dense set of tie-points is extracted from the images using a tile-based matching method, phase congruency-based feature detectors and descriptors, and a local geometric consistency correspondence method. Then, Rational Polynomial Coefficients (RPC) block adjustment is performed to compensate for the RPC bias errors. After that, a dense matching process is performed to generate 3D point clouds for each pair of input HRS images using a new geometric transformation called PWB (piecewise bilinear) and an accurate area-based matching method called SWLSM (selective weighted least square matching). The key innovations of this research are the SWLSM and PWB methods for an accurate dense matching process. The PWB is a novel and simple piecewise geometric transformation model based on superpixel over-segmentation, proposed for the accurate registration of each pair of HRS images. 
The SWLSM matching method is based on a phase congruency measure and a selection strategy to improve the performance of the well-known LSM (least square matching). After the dense matching process, the final stage is spatial intersection to generate 3D point clouds, followed by elevation interpolation to produce the DSM. To evaluate the Sat-DSM method, 12 sets of MVS-HRS data from IRS-P5, ZY3-1, ZY3-2, and Worldview-3 sensors were selected from areas with different landscapes, such as urban, mountainous, and agricultural areas. The results indicate the superiority of the proposed Sat-DSM method over four other methods, namely CATALYST, SGM (Semi-global matching), SS-DSM (structural similarity based DSM extraction), and Sat-MVSF, in terms of completeness, RMSE, and MEE. The demo code is available at <span><span>https://www.researchgate.net/publication/377721674_SatDSM</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 214-230"},"PeriodicalIF":10.6,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142592830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}