Title: 3LATNet: Attention based deep learning model for global Chlorophyll-a retrieval from GCOM-C satellite
Authors: Muhammad Salah, Salem Ibrahim Salem, Nobuyuki Utsumi, Hiroto Higa, Joji Ishizaka, Kazuo Oki
DOI: 10.1016/j.isprsjprs.2024.12.019
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 220 (February 2025), Pages 490–508

Abstract: Chlorophyll-a (Chla) retrieval from satellite observations is crucial for assessing water quality and the health of aquatic ecosystems. Satellite data, while invaluable, pose several challenges: inherent sensor biases, the need for precise atmospheric correction (AC), and the complexity of water bodies, all of which complicate establishing a reliable relationship between remote sensing reflectance (Rrs) and Chla concentration. The Global Change Observation Mission - Climate (GCOM-C) satellite, operated by the Japan Aerospace Exploration Agency (JAXA), has brought a significant leap forward in ocean color monitoring, featuring 250 m spatial resolution and a 380 nm band that enhances detection capabilities in aquatic environments. JAXA's standard Chla product is grounded in empirical algorithms, and research on the impact of AC on Rrs products remains limited; both factors call for further analysis. This study introduces the three bidirectional Long Short-Term Memory and ATtention mechanism Network (3LATNet), trained on a large dataset of 5610 in-situ Rrs measurements and corresponding Chla concentrations collected from locations worldwide to cover a broad range of trophic states. The Rrs spectra were resampled to the bands of the Second-Generation Global Imager (SGLI) aboard GCOM-C. The model was also trained on satellite matchup data, aiming for a generalized deep learning model. 3LATNet was evaluated against conventional Chla algorithms and machine learning (ML) algorithms, including JAXA's standard Chla product. Our findings reveal a remarkable reduction in Chla estimation error: a 42.5% reduction in mean absolute error (MAE, from 17 to 9.77 mg/m³) and a 57.3% reduction in root mean square error (RMSE, from 43.12 to 18.43 mg/m³) relative to JAXA's standard Chla algorithm on in-situ data, and nearly a twofold improvement in absolute errors when evaluated on matchup SGLI Rrs. We further assessed the impact of AC on model performance: SeaDAS predominantly produced invalid reflectance values at the 412 nm band, OC-SMART displayed greater variability in percentage errors, and JAXA's AC proved the most precise in retrieving Rrs. We also comprehensively evaluated the spatial consistency of the Chla models under clear conditions and during harmful algal bloom events. 3LATNet effectively captured Chla patterns across various ranges, whereas the random forest (RF) algorithm frequently overestimated Chla in the low to mid range and JAXA's Chla algorithm consistently underestimated Chla, a tendency particularly pronounced in high-Chla areas and during harmful algal bloom events. These outcomes underscore the potential of our innovative approach for enhancing g…
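The abstract names the building blocks (stacked bidirectional LSTMs plus an attention mechanism over the Rrs spectrum) without giving the full architecture. Below is a minimal sketch of that model family in PyTorch; the 11-band input, layer sizes, and log-Chla regression target are illustrative assumptions, not the published 3LATNet configuration.

```python
# Minimal sketch of a BiLSTM + attention regressor for Chla retrieval.
# Assumptions: the Rrs spectrum resampled to SGLI bands is fed as a
# length-N sequence of scalars; all dimensions are illustrative.
import torch
import torch.nn as nn

class BiLSTMAttnRegressor(nn.Module):
    def __init__(self, n_bands=11, hidden=64):
        super().__init__()
        # Three stacked bidirectional LSTM layers (one reading of "3L").
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden,
                            num_layers=3, bidirectional=True,
                            batch_first=True)
        self.attn = nn.Linear(2 * hidden, 1)   # per-band attention score
        self.head = nn.Linear(2 * hidden, 1)   # regress log10(Chla)

    def forward(self, rrs):                    # rrs: (batch, n_bands)
        h, _ = self.lstm(rrs.unsqueeze(-1))    # (batch, n_bands, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1) # attention weights over bands
        ctx = (w * h).sum(dim=1)               # weighted context vector
        return self.head(ctx).squeeze(-1)

model = BiLSTMAttnRegressor()
rrs = torch.rand(8, 11)                        # fake batch of spectra
print(model(rrs).shape)                        # torch.Size([8])
```

Treating the spectrum as a sequence lets the attention weights emphasize whichever bands (for instance the 380 nm channel) carry the most information for a given water type.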
Title: PolSAR2PolSAR: A semi-supervised despeckling algorithm for polarimetric SAR images
Authors: Cristiano Ulondu Mendes, Emanuele Dalsasso, Yi Zhang, Loïc Denis, Florence Tupin
DOI: 10.1016/j.isprsjprs.2025.01.008
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 220 (February 2025), Pages 783–798

Abstract: Polarimetric Synthetic Aperture Radar (PolSAR) imagery is a valuable tool for Earth observation, with wide application in agriculture, forestry, geology, and disaster monitoring. However, due to the inherent presence of speckle noise, filtering is often necessary to improve the interpretability and reliability of PolSAR data. The effectiveness of a speckle filter is measured by its ability to attenuate fluctuations without introducing artifacts or degrading spatial and polarimetric information. Recent advances in this domain leverage deep learning, but adopt a supervised learning strategy that requires a large amount of speckle-free images, which are costly to produce. In contrast, this paper presents PolSAR2PolSAR, a semi-supervised learning strategy that only requires, from the sensor under consideration, pairs of noisy images of the same location acquired in the same configuration (same incidence angle and mode during the satellite's revisit on its orbit). Our approach applies to a wide range of sensors. Experiments on RADARSAT-2 and RADARSAT Constellation Mission (RCM) data demonstrate the capacity of the proposed method to effectively reduce speckle noise and retrieve fine details. The code and trained models are freely available at https://gitlab.telecom-paris.fr/ring/polsar2polsar. The repository additionally contains a model fine-tuned on SLC PolSAR images from NASA's UAVSAR sensor.
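The key idea, training only on pairs of co-registered noisy acquisitions of the same scene, is a noise2noise-style objective: the network sees one speckled image and predicts a second, independently speckled image of the same scene. A minimal sketch of that objective, assuming single-channel intensity images and a toy CNN (the paper's actual network and its PolSAR covariance handling are not reproduced here):

```python
# Noise2noise-style despeckling step: predict one speckled look from
# another look of the same scene. Network and shapes are illustrative.
import torch
import torch.nn as nn

net = nn.Sequential(                 # toy denoiser, not the paper's model
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-4)

clean = torch.rand(4, 1, 64, 64) + 0.5             # fake reflectivity
speckle = lambda x: x * torch.distributions.Exponential(1.0).sample(x.shape)
noisy_a, noisy_b = speckle(clean), speckle(clean)  # two looks, same scene

pred = net(torch.log(noisy_a))       # log-intensity stabilizes speckle
loss = nn.functional.mse_loss(pred, torch.log(noisy_b))
loss.backward()
opt.step()
print(float(loss))
```

Because the speckle in the two acquisitions is independent, the network cannot predict it and converges toward the underlying reflectivity, which is what makes speckle-free ground truth unnecessary.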
Title: Scattering mechanism-guided zero-shot PolSAR target recognition
Authors: Feng Li, Xiaojing Yang, Liang Zhang, Yanhua Wang, Yuqi Han, Xin Zhang, Yang Li
DOI: 10.1016/j.isprsjprs.2024.12.022
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 220 (February 2025), Pages 428–439

Abstract: In response to the difficulty of obtaining polarimetric synthetic aperture radar (PolSAR) data for certain specific categories of targets, we present a zero-shot target recognition method for PolSAR images. Built on a generative model, the method leverages the unique characteristics of PolSAR imagery and incorporates two key modules: the scattering characteristics-guided semantic embedding generation module (SE) and the polarization characteristics-guided distributional correction module (DC). The former ensures the stability of synthetic features for unseen classes by controlling scattering characteristics, while the latter enhances the quality of synthetic features by utilizing polarimetric features, thereby improving zero-shot recognition accuracy. The proposed method is evaluated on the GOTCHA dataset to assess its performance in recognizing unseen classes. The experimental results demonstrate that the proposed method achieves state-of-the-art performance in zero-shot PolSAR target recognition, e.g., improving the recognition accuracy of unseen categories by nearly 20%. Our code is available at https://github.com/chuyihuan/Zero-shot-PolSAR-target-recognition.
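The SE and DC modules are specific to this paper, but the generative zero-shot pipeline they plug into can be sketched generically: a conditional generator (trained on seen classes, typically adversarially) synthesizes features for unseen classes from their semantic embeddings, and an ordinary classifier is then trained on those synthetic features. A minimal skeleton, with random stand-ins for the scattering-derived embeddings and an untrained generator:

```python
# Generic generative zero-shot recognition skeleton (not the paper's
# SE/DC modules): synthesize features for unseen classes from semantic
# embeddings, then train a plain classifier on the synthetic features.
import torch
import torch.nn as nn

sem_dim, feat_dim, n_unseen = 16, 64, 3
gen = nn.Sequential(nn.Linear(sem_dim + 8, 128), nn.ReLU(),
                    nn.Linear(128, feat_dim))      # conditional generator
clf = nn.Linear(feat_dim, n_unseen)                # unseen-class classifier

unseen_sem = torch.rand(n_unseen, sem_dim)         # stand-in embeddings
labels = torch.arange(n_unseen).repeat(100)        # 100 fakes per class
z = torch.randn(len(labels), 8)                    # noise input
fake_feats = gen(torch.cat([unseen_sem[labels], z], dim=1))

loss = nn.functional.cross_entropy(clf(fake_feats.detach()), labels)
loss.backward()                                    # train clf on fakes only
print(float(loss))
```

In a full pipeline the generator would first be fit on seen-class feature/embedding pairs; the paper's contribution is guiding that generation with scattering and polarization characteristics so the unseen-class features are stable and well distributed.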
Title: Application of SAR-Optical fusion to extract shoreline position from Cloud-Contaminated satellite images
Authors: Yongjing Mao, Kristen D. Splinter
DOI: 10.1016/j.isprsjprs.2025.01.013
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 220 (February 2025), Pages 563–579

Abstract: Shorelines derived from optical satellite images are increasingly used for regional- to global-scale analysis of sandy coastline dynamics. The optical satellite record, however, is contaminated by cloud cover, which can substantially reduce the temporal resolution of images available for shoreline analysis. Meanwhile, with the development of deep learning methods, optical images are increasingly fused with Synthetic Aperture Radar (SAR) images, which are unaffected by clouds, to reconstruct cloud-contaminated pixels. Such SAR-Optical fusion methods have proven successful for various land surface applications, but the unique characteristics of coastal areas leave their applicability to these dynamic zones unknown.

Herein we apply a deep internal learning (DIL) method to reconstruct cloud-contaminated optical images and explore its applicability for retrieving shorelines obscured by clouds. Our approach uses a mixed sequence of SAR and Gaussian noise images as the prior and the cloudy Modified Normalized Difference Water Index (MNDWI) as the target. The DIL encodes the target with the priors and synthesizes plausible pixels under cloud cover. A unique aspect of our workflow is the inclusion of Gaussian noise in the prior sequence for MNDWI images when no SAR image collected within a 1-day temporal lag is available. A novel loss function for the DIL model is also introduced to optimize image reconstruction near the shoreline. These developments contribute significantly to the model's accuracy.

The DIL method is tested at four sites with varying tide, wave, and shoreline dynamics. Shorelines derived from the reconstructed and true MNDWI images are compared to quantify the internal accuracy of shoreline reconstruction. For microtidal environments with a mean spring tidal range of less than 2 m, the mean absolute error (MAE) of shoreline reconstruction is less than 7.5 m with a coefficient of determination (R²) above 0.78, regardless of shoreline and wave dynamics. The method is less skilful in macro- and mesotidal environments due to the larger water-level difference between the paired optical and SAR images, resulting in an MAE of 12.59 m and an R² of 0.43. The proposed SAR-Optical fusion method demonstrates substantially better accuracy in retrieving cloud-obscured shoreline positions than interpolation methods relying solely on optical images. Results from our work highlight the great potential of SAR-Optical fusion to derive shorelines even under the cloudiest conditions, thus increasing the temporal resolution of shoreline datasets.
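The MNDWI target used above is the standard index MNDWI = (Green - SWIR) / (Green + SWIR), with the shoreline at its zero crossing. The sketch below computes it and evaluates a cloud-masked reconstruction loss that up-weights pixels near the shoreline; the simple threshold-based weighting is our illustrative stand-in, not the paper's actual loss function.

```python
# MNDWI plus a cloud-masked, shoreline-weighted reconstruction loss.
# The index formula is standard; the weighting rule is an assumption.
import numpy as np

def mndwi(green, swir, eps=1e-6):
    return (green - swir) / (green + swir + eps)

rng = np.random.default_rng(0)
green, swir = rng.random((64, 64)), rng.random((64, 64))
target = mndwi(green, swir)
pred = target + 0.05 * rng.standard_normal(target.shape)  # fake DIL output

cloud_free = rng.random((64, 64)) > 0.3      # observed (non-cloud) pixels
near_shore = np.abs(target) < 0.1            # pixels near the zero crossing
weight = np.where(near_shore, 5.0, 1.0)      # emphasize the shoreline

loss = np.mean(weight[cloud_free] * (pred - target)[cloud_free] ** 2)
print(loss)
```

Restricting the loss to cloud-free pixels is what lets an internal-learning model train on a single partially observed scene, while the shoreline weighting concentrates capacity where positional accuracy matters.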
Title: Refined change detection in heterogeneous low-resolution remote sensing images for disaster emergency response
Authors: Di Wang, Guorui Ma, Haiming Zhang, Xiao Wang, Yongxian Zhang
DOI: 10.1016/j.isprsjprs.2024.12.010
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 220 (February 2025), Pages 139–155

Abstract: Heterogeneous Remote Sensing Image Change Detection (HRSICD) is a significant challenge in remote sensing image processing, with substantial application value in rapid natural disaster response. However, significant differences in imaging modalities often result in poor comparability between image features, degrading recognition accuracy. To address this issue, we propose a novel HRSICD method based on image structure relationships and semantic information. First, a Multi-scale Pyramid Convolution Encoder efficiently extracts multi-scale and detailed features. Next, a Cross-domain Feature Alignment Module aligns the structural relationships and semantic features of the heterogeneous images, enhancing the comparability of their features. Finally, a Multi-level Decoder fuses the structural and semantic features to achieve refined identification of changed areas. We validated the proposed method on five publicly available HRSICD datasets, and additionally conducted zero-shot generalization experiments and real-world applications to assess its generalization capability. Our method achieved favorable results in all experiments, demonstrating its effectiveness. The code will be made available at https://github.com/Lucky-DW/HRSICD.
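One plausible reading of "aligning structural relationships" across modalities is to compare features through their self-similarity matrices, which describe how positions in a scene relate to each other and are therefore modality-agnostic. The sketch below pulls the self-similarity matrices of an optical branch and a SAR branch together; encoders and shapes are illustrative, not the paper's architecture.

```python
# Cross-domain structural alignment via self-similarity matrices:
# each position's cosine similarity to every other position is a
# modality-agnostic description of scene structure.
import torch
import torch.nn.functional as F

def self_similarity(feat):                    # feat: (batch, C, H, W)
    f = feat.flatten(2)                       # (batch, C, H*W)
    f = F.normalize(f, dim=1)                 # unit-norm channel vectors
    return torch.bmm(f.transpose(1, 2), f)    # (batch, H*W, H*W)

feat_opt = torch.randn(2, 32, 16, 16)         # optical-branch features
feat_sar = torch.randn(2, 32, 16, 16)         # SAR-branch features

align_loss = F.l1_loss(self_similarity(feat_opt), self_similarity(feat_sar))
print(float(align_loss))
```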
Title: PylonModeler: A hybrid-driven 3D reconstruction method for power transmission pylons from LiDAR point clouds
Authors: Shaolong Wu, Chi Chen, Bisheng Yang, Zhengfei Yan, Zhiye Wang, Shangzhe Sun, Qin Zou, Jing Fu
DOI: 10.1016/j.isprsjprs.2024.12.003
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 220 (February 2025), Pages 100–124

Abstract: As the power grid is an indispensable foundation of modern society, creating a digital twin of the grid is of great importance. Pylons are key components of transmission corridors, and their precise 3D reconstruction is essential for the safe operation of power grids. However, 3D pylon reconstruction from LiDAR point clouds presents numerous challenges arising from data quality and the diversity and complexity of pylon structures. To address these challenges, we introduce PylonModeler, a hybrid-driven method for 3D pylon reconstruction from airborne LiDAR point clouds that enables accurate, robust, real-time reconstruction. Different strategies are employed to reconstruct and assemble the various structures independently. We propose Pylon Former, a lightweight transformer network, for real-time pylon recognition and decomposition, and then apply a data-driven approach to the pylon body: exploiting its structural characteristics, fitting and clustering algorithms reconstruct both the external and internal structures. The pylon head is reconstructed with a hybrid approach. A pre-built library of parametric pylon head models defines different pylon types through a series of parameters; the coherent point drift (CPD) algorithm establishes the topological relationships between pylon head structures and sets initial model parameters, which are then refined by optimization for accurate head reconstruction. Finally, the body and head models are combined to complete the reconstruction. We collected an airborne LiDAR dataset comprising 3398 pylons across eight types, covering transmission lines at voltage levels such as 110 kV, 220 kV, and 500 kV, and validated PylonModeler on it. The average reconstruction time per pylon is 1.10 s, with an average reconstruction accuracy of 0.216 m. We also evaluated PylonModeler on public airborne LiDAR data from Luxembourg, improving reconstruction accuracy by approximately 26.28% over previous state-of-the-art methods. With this performance, PylonModeler is tens of times faster than current model-driven methods, enabling real-time pylon reconstruction.
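The paper fits its parametric head models to the scan with coherent point drift (CPD), a probabilistic registration method. As a simpler stand-in that shows the same model-to-scan alignment step, the sketch below runs SVD-based rigid ICP iterations (nearest neighbours, then a Kabsch/Procrustes solve); this is plain ICP, not CPD.

```python
# Rigid ICP as a simplified stand-in for CPD: align model vertices to a
# LiDAR scan by alternating nearest-neighbour matching and the optimal
# rigid transform (Kabsch algorithm).
import numpy as np
from scipy.spatial import cKDTree

def icp_step(model_pts, scan_pts):
    """One rigid alignment step: returns the transformed model points."""
    idx = cKDTree(scan_pts).query(model_pts)[1]   # nearest scan point
    tgt = scan_pts[idx]
    mu_m, mu_t = model_pts.mean(0), tgt.mean(0)
    H = (model_pts - mu_m).T @ (tgt - mu_t)       # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    if np.linalg.det(Vt.T @ U.T) < 0:             # avoid reflections
        Vt[-1] *= -1
    R = Vt.T @ U.T                                # optimal rotation
    t = mu_t - R @ mu_m
    return model_pts @ R.T + t

rng = np.random.default_rng(1)
scan = rng.random((500, 3))                        # fake LiDAR points
model = scan[:200] + np.array([0.1, -0.05, 0.02])  # offset model vertices
for _ in range(10):
    model = icp_step(model, scan)
print(np.abs(model - scan[:200]).mean())           # residual after alignment
```

CPD replaces the hard nearest-neighbour assignment with soft probabilistic correspondences, which is what makes it robust to the noise and missing structure typical of airborne pylon scans.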
Title: Unwrapping error and fading signal correction on multi-looked InSAR data
Authors: Zhangfeng Ma, Nanxin Wang, Yingbao Yang, Yosuke Aoki, Shengji Wei
DOI: 10.1016/j.isprsjprs.2024.12.006
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 220 (February 2025), Pages 51–63

Abstract: Multi-looking, which reduces data size and improves the signal-to-noise ratio, is indispensable for large-scale InSAR data processing. However, the resulting "fading signal" introduced by multi-looking breaks phase consistency among triplet interferograms and biases the estimated displacements. This inconsistency challenges the assumption that triplet phase closure contains only unwrapping errors, so untangling unwrapping errors from fading signals in the triplet phase closure is critical for more precise InSAR measurements. To address this challenge, we propose a new method that mitigates both effects in two key steps. The first is triplet phase closure-based stacking, which directly estimates the fading signal in each interferogram. The second is Basis Pursuit Denoising-based unwrapping error correction, which casts unwrapping error correction as sparse signal recovery. Through these two procedures, the new method integrates seamlessly into the traditional InSAR workflow, and the estimated fading signal can be used to derive soil moisture as a by-product. Experimental results for the San Francisco Bay area demonstrate that the new method reduces velocity estimation errors by approximately 9%–19%, effectively addressing both unwrapping errors and fading signals and outperforming the ILP and Lasso methods, which account only for unwrapping errors in the triplet closure. The derived soil moisture by-product shows strong consistency with most external soil moisture products.
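The sparse-recovery step can be illustrated on toy data. Triplet closure of unwrapped phases, phi_ij + phi_jk - phi_ik, should vanish up to fading and noise, so residual closures reveal per-interferogram unwrapping errors, which are integer multiples of 2*pi and sparse. The sketch below builds the closure design matrix and recovers a single injected error with scikit-learn's Lasso, the Lagrangian form of Basis Pursuit Denoising; dates, pairs, and noise level are toy assumptions.

```python
# Unwrapping-error recovery from triplet phase closure posed as a
# sparse problem (BPDN in its Lasso form). All data here are synthetic.
import numpy as np
from sklearn.linear_model import Lasso

pairs = [(0, 1), (1, 2), (0, 2), (2, 3), (1, 3), (0, 3)]
triplets = [(0, 1, 2), (1, 2, 3), (0, 1, 3), (0, 2, 3)]

A = np.zeros((len(triplets), len(pairs)))      # closure design matrix
for r, (i, j, k) in enumerate(triplets):
    A[r, pairs.index((i, j))] += 1             # + phi_ij
    A[r, pairs.index((j, k))] += 1             # + phi_jk
    A[r, pairs.index((i, k))] -= 1             # - phi_ik

true_err = np.zeros(len(pairs))
true_err[3] = 2 * np.pi                        # one 2*pi unwrapping error
noise = 0.05 * np.random.default_rng(0).standard_normal(len(triplets))
closure = A @ true_err + noise                 # observed triplet closures

est = Lasso(alpha=0.1, fit_intercept=False).fit(A, closure).coef_
print(np.round(est / (2 * np.pi)))             # recovered integer jumps
```

The l1 penalty prefers explanations with as few corrupted interferograms as possible, which is exactly the physical prior on unwrapping errors; in the paper the fading signal is estimated and removed first so it does not leak into this sparse solve.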
Title: Accurate spaceborne waveform simulation in heterogeneous forests using small-footprint airborne LiDAR point clouds
Authors: Yi Li, Guangjian Yan, Weihua Li, Donghui Xie, Hailan Jiang, Linyuan Li, Jianbo Qi, Ronghai Hu, Xihan Mu, Xiao Chen, Shanshan Wei, Hao Tang
DOI: 10.1016/j.isprsjprs.2024.11.020
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 220 (February 2025), Pages 246–263

Abstract: Spaceborne light detection and ranging (LiDAR) waveform sensors require accurate signal simulations for prelaunch calibration, postlaunch validation, and the development of land surface data products. However, accurately simulating spaceborne LiDAR waveforms over heterogeneous forests remains challenging: data-driven methods do not account for the complicated pulse transport within heterogeneous canopies, whereas analytical radiative transfer models rely heavily on assumptions about canopy structure and distribution. A comprehensive simulation method is therefore needed that accounts for both the complexity of pulse transport within canopies and the structural heterogeneity of forests. In this study, we propose a framework for spaceborne LiDAR waveform simulation that integrates a new radiative transfer model, the canopy voxel radiative transfer (CVRT) model, with three-dimensional (3D) voxel forest scenes reconstructed from small-footprint airborne LiDAR (ALS) point clouds. The CVRT model describes radiative transfer within canopy voxels and uses fractional crown cover to account for within-voxel heterogeneity, minimizing assumptions about canopy shape and distribution and significantly reducing the number of input parameters. All parameters for scene construction and model input are obtained from the ALS point clouds. The framework was assessed against simulated LiDAR waveforms from DART, Global Ecosystem Dynamics Investigation (GEDI) data over heterogeneous forest stands, and Land, Vegetation, and Ice Sensor (LVIS) data from the National Ecological Observatory Network (NEON) site. Compared with existing models, the new framework with the CVRT model agreed better with both simulated and measured data, with an average R² improvement of approximately 2% to 5% and an average RMSE reduction of approximately 0.5% to 3%, and it remained highly adaptive and robust to variations in model configuration, input data quality, and environmental attributes. This work extends current research on accurate and robust large-footprint LiDAR waveform simulation over heterogeneous forest canopies and could help refine product development for emerging spaceborne LiDAR missions.
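The CVRT model itself is beyond a short sketch, but the generic final step of large-footprint waveform simulation is simple: accumulate the energy returned from each height into a vertical profile, then convolve it with the transmit pulse. The sketch below does this with a fake two-layer point cloud; footprint contents, per-return energies, and the Gaussian pulse width are illustrative assumptions.

```python
# Generic last step of large-footprint waveform simulation (not CVRT):
# bin returned energy by height, then convolve with a Gaussian pulse.
import numpy as np

rng = np.random.default_rng(2)
pts_z = np.concatenate([rng.normal(20, 3, 800),    # canopy returns (m)
                        rng.normal(0, 0.2, 200)])  # ground returns (m)
energy = np.concatenate([np.full(800, 0.6), np.full(200, 1.0)])

dz = 0.15                                          # range-bin size (m)
bins = np.arange(-5, 40, dz)
profile, _ = np.histogram(pts_z, bins=bins, weights=energy)

sigma = 1.0                                        # pulse std dev (m)
t = np.arange(-4 * sigma, 4 * sigma + dz, dz)
pulse = np.exp(-0.5 * (t / sigma) ** 2)
waveform = np.convolve(profile, pulse / pulse.sum(), mode="same")
print(bins[np.argmax(waveform)])                   # height of strongest return
```

What separates simulators is how `profile` is computed: the CVRT model derives it from radiative transfer through ALS-reconstructed voxels with fractional crown cover, rather than from the raw point histogram used in this toy example.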
Title: Underwater image captioning: Challenges, models, and datasets
Authors: Huanyu Li, Hao Wang, Ying Zhang, Li Li, Peng Ren
DOI: 10.1016/j.isprsjprs.2024.12.002
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 220 (February 2025), Pages 440–453

Abstract: We examine the nascent field of underwater image captioning from three perspectives: challenges, models, and datasets. One challenge arises from the disparities between natural and underwater images, which hinder using the former to train models for the latter. Another lies in the limited feature extraction capabilities of current image captioning models, which impede the generation of accurate underwater captions. The final, no less significant challenge is the scarcity of data for underwater image captioning, which complicates both model training and performance evaluation. To address these challenges, we make three novel contributions. First, we employ a physics-based degradation technique to transform natural images into degraded images that closely resemble realistic underwater images, and on this basis develop a meta-learning strategy tailored to underwater tasks. Second, we develop an underwater image captioning model based on scene-object feature fusion: it fuses underwater scene features extracted by ResNeXt with object features localized by YOLOv8, yielding comprehensive features for underwater image captioning. Third, we construct an underwater image captioning dataset covering various underwater scenes, with each image annotated with five accurate captions for comprehensive training and validation. Experimental results on the new dataset validate the effectiveness of our models. The code and datasets are released at https://gitee.com/LHY-CODE/UICM-SOFF.
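The scene-object fusion idea can be sketched independently of the specific backbones: a global scene vector (e.g., from ResNeXt) attends over a variable number of object vectors (e.g., from detector crops), and the pooled object context is concatenated back onto the scene feature. Dimensions and the downstream captioner are illustrative assumptions, not the paper's exact design.

```python
# Sketch of scene-object feature fusion via attention pooling: the scene
# vector queries the object vectors, and the fused feature would feed a
# caption decoder. All dimensions are illustrative.
import torch
import torch.nn as nn

class SceneObjectFusion(nn.Module):
    def __init__(self, scene_dim=2048, obj_dim=256, out_dim=512):
        super().__init__()
        self.q = nn.Linear(scene_dim, obj_dim)     # scene queries objects
        self.out = nn.Linear(scene_dim + obj_dim, out_dim)

    def forward(self, scene, objects):             # (B, Ds), (B, N, Do)
        scores = torch.bmm(objects, self.q(scene).unsqueeze(-1))  # (B, N, 1)
        attn = torch.softmax(scores, dim=1)        # weight each detection
        obj_ctx = (attn * objects).sum(dim=1)      # pooled object context
        return self.out(torch.cat([scene, obj_ctx], dim=-1))

fusion = SceneObjectFusion()
fused = fusion(torch.randn(2, 2048), torch.randn(2, 7, 256))
print(fused.shape)                                 # torch.Size([2, 512])
```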
Title: A novel airborne TomoSAR 3-D focusing method for accurate ice thickness and glacier volume estimation
Authors: Ke Wang, Yue Wu, Xiaolan Qiu, Jinbiao Zhu, Donghai Zheng, Songtao Shangguan, Jie Pan, Yuquan Liu, Liming Jiang, Xin Li
DOI: 10.1016/j.isprsjprs.2025.01.011
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 220 (February 2025), Pages 593–607

Abstract: High-altitude mountain glaciers are highly responsive to environmental change, but their remote locations limit the applicability of traditional mapping methods, such as probing and ground-penetrating radar (GPR), for tracking changes in ice thickness and glacier volume. Over the past two decades, airborne Tomographic Synthetic Aperture Radar (TomoSAR) has shown promise for mapping the internal structure of mountain glaciers. Yet its 3D mapping capability is limited by the radar signal's relatively shallow penetration depth, with bedrock echoes rarely detected beyond 60 m. Additionally, most TomoSAR studies have ignored air-ice refraction during image focusing, reducing 3D focusing accuracy for deeper subsurface targets. In this study, we developed a novel algorithm that integrates refraction path calculations into SAR image focusing. We also introduced a new method that constructs the 3D TomoSAR cube by stacking InSAR phase coherence images, enabling the retrieval of deep bedrock signals even at low signal-to-noise ratios.

We tested our algorithms on 14 P-band SAR images acquired on April 8, 2023, over Bayi Glacier in the Qilian Mountains on the Qinghai-Tibet Plateau. For the first time, we successfully mapped ice thickness across an entire mountain glacier using the airborne TomoSAR technique, detecting bedrock signals at depths of up to 120 m. Our ice thickness estimates agreed closely with in situ measurements from three GPR transects totaling 3.8 km in length, with root-mean-square errors (RMSE) ranging from 3.18 to 4.66 m. For comparison, the state-of-the-art 3D focusing algorithm used in the AlpTomoSAR campaign yielded RMSE values between 5.67 and 5.81 m; our method reduced the RMSE by 18% to 44%. From these measurements we calculated a total ice volume of 0.121 km³, a decline of approximately 20.92% since the last reported volume in 2009, which was estimated from sparse GPR data. These results demonstrate that the proposed algorithm can effectively map ice thickness, providing a cost-efficient solution for large-scale glacier surveys in high-mountain regions.
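The refraction-path calculation folded into focusing can be illustrated with Fermat's principle: a ray from the antenna to a subsurface target bends at the ice surface so as to minimize the optical path length, and the bend point must satisfy Snell's law. The sketch below solves for that point numerically; the geometry and the ice refractive index n ≈ 1.78 (from a relative permittivity of roughly 3.17 for glacier ice) are illustrative assumptions, not values from the paper.

```python
# Fermat-principle sketch of air-ice refraction: find the surface point
# that minimizes the two-medium optical path, then verify Snell's law.
import numpy as np
from scipy.optimize import minimize_scalar

n_ice = 1.78                         # assumed refractive index of ice
ant = np.array([0.0, 3000.0])        # antenna: (x, height above surface) m
tgt = np.array([800.0, -100.0])      # target: 100 m below the ice surface

def optical_path(x):                 # ray bends at surface point (x, 0)
    air = np.hypot(x - ant[0], ant[1])
    ice = np.hypot(tgt[0] - x, tgt[1])
    return air + n_ice * ice         # geometric length weighted by index

res = minimize_scalar(optical_path, bounds=(0.0, 800.0), method="bounded")
x0 = res.x                           # refraction point on the surface
sin_i = (x0 - ant[0]) / np.hypot(x0 - ant[0], ant[1])   # incidence angle
sin_t = (tgt[0] - x0) / np.hypot(tgt[0] - x0, tgt[1])   # transmission angle
print(x0, sin_i / sin_t)             # the ratio approaches n_ice
```

Ignoring this bend places deep scatterers at the wrong range and angle, which is why accounting for refraction matters most for the deepest bedrock returns the paper targets.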