{"title":"Accurate spaceborne waveform simulation in heterogeneous forests using small-footprint airborne LiDAR point clouds","authors":"Yi Li , Guangjian Yan , Weihua Li , Donghui Xie , Hailan Jiang , Linyuan Li , Jianbo Qi , Ronghai Hu , Xihan Mu , Xiao Chen , Shanshan Wei , Hao Tang","doi":"10.1016/j.isprsjprs.2024.11.020","DOIUrl":"10.1016/j.isprsjprs.2024.11.020","url":null,"abstract":"<div><div>Spaceborne light detection and ranging (LiDAR) waveform sensors require accurate signal simulations to facilitate prelaunch calibration, postlaunch validation, and the development of land surface data products. However, accurately simulating spaceborne LiDAR waveforms over heterogeneous forests remains challenging because data-driven methods do not account for complicated pulse transport within heterogeneous canopies, whereas analytical radiative transfer models overly rely on assumptions about canopy structure and distribution. Thus, a comprehensive simulation method is needed to account for both the complexity of pulse transport within canopies and the structural heterogeneity of forests. In this study, we propose a framework for spaceborne LiDAR waveform simulation by integrating a new radiative transfer model – the canopy voxel radiative transfer (CVRT) model – with reconstructed three-dimensional (3D) voxel forest scenes from small-footprint airborne LiDAR (ALS) point clouds. The CVRT model describes the radiative transfer process within canopy voxels and uses fractional crown cover to account for within-voxel heterogeneity, minimizing the need for assumptions about canopy shape and distribution and significantly reducing the number of input parameters. All the parameters for scene construction and model inputs can be obtained from the ALS point clouds. The performance of the proposed framework was assessed by comparing the results to the simulated LiDAR waveforms from DART, Global Ecosystem Dynamics Investigation (GEDI) data over heterogeneous forest stands, and Land, Vegetation, and Ice Sensor (LVIS) data from the National Ecological Observatory Network (NEON) site. The results suggest that compared with existing models, the new framework with the CVRT model achieved improved agreement with both simulated and measured data, with an average R<sup>2</sup> improvement of approximately 2% to 5% and an average RMSE reduction of approximately 0.5% to 3%. The proposed framework was also highly adaptive and robust to variations in model configurations, input data quality, and environmental attributes. In summary, this work extends current research on accurate and robust large-footprint LiDAR waveform simulations over heterogeneous forest canopies and could help refine product development for emerging spaceborne LiDAR missions.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 246-263"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142874574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Underwater image captioning: Challenges, models, and datasets","authors":"Huanyu Li , Hao Wang , Ying Zhang , Li Li , Peng Ren","doi":"10.1016/j.isprsjprs.2024.12.002","DOIUrl":"10.1016/j.isprsjprs.2024.12.002","url":null,"abstract":"<div><div>We delve into the nascent field of underwater image captioning from three perspectives: challenges, models, and datasets. One challenge arises from the disparities between natural images and underwater images, which hinder the use of the former to train models for the latter. Another challenge exists in the limited feature extraction capabilities of current image captioning models, impeding the generation of accurate underwater image captions. The final challenge, albeit not the least significant, revolves around the insufficiency of data available for underwater image captioning. This insufficiency not only complicates the training of models but also poses challenges for evaluating their performance effectively. To address these challenges, we make three novel contributions. First, we employ a physics-based degradation technique to transform natural images into degraded images that closely resemble realistic underwater images. Based on the degraded images, we develop a meta-learning strategy specifically tailored for underwater tasks. Second, we develop an underwater image captioning model based on scene-object feature fusion. It fuses underwater scene features extracted by ResNeXt and object features localized by YOLOv8, yielding comprehensive features for underwater image captioning. Last but not least, we construct an underwater image captioning dataset covering various underwater scenes, with each underwater image annotated with five accurate captions for the purpose of comprehensive training and validation. Experimental results on the new dataset validate the effectiveness of our novel models. The code and datasets are released at <span><span>https://gitee.com/LHY-CODE/UICM-SOFF</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 440-453"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142925268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel airborne TomoSAR 3-D focusing method for accurate ice thickness and glacier volume estimation","authors":"Ke Wang , Yue Wu , Xiaolan Qiu , Jinbiao Zhu , Donghai Zheng , Songtao Shangguan , Jie Pan , Yuquan Liu , Liming Jiang , Xin Li","doi":"10.1016/j.isprsjprs.2025.01.011","DOIUrl":"10.1016/j.isprsjprs.2025.01.011","url":null,"abstract":"<div><div>High-altitude mountain glaciers are highly responsive to environmental changes. However, their remote locations limit the applicability of traditional mapping methods, such as probing and Ground Penetrating Radar (GPR), in tracking changes in ice thickness and glacier volume. Over the past two decades, airborne Tomographic Synthetic Aperture Radar (TomoSAR) has shown promise for mapping the internal structures of mountain glaciers. Yet, its 3D mapping capabilities are limited by the radar signal’s relatively shallow penetration depth, with bedrock echoes rarely detected beyond 60 meters. Additionally, most TomoSAR studies ignored the air-ice refraction during the image-focusing step, reducing the 3D focusing accuracy for deeper subsurface targets. In this study, we developed a novel algorithm that integrates refraction path calculations into SAR image focusing. We also introduced a new method to construct the 3D TomoSAR cube by stacking InSAR phase coherence images, enabling the retrieval of deep bedrock signals even at low signal-to-noise ratios.</div><div>We tested our algorithms on 14 P-band SAR images acquired on April 8, 2023, over Bayi Glacier in the Qilian Mountains, located on the Qinghai-Tibet Plateau. For the first time, we successfully mapped the ice thickness across an entire mountain glacier using the airborne TomoSAR technique, detecting bedrock signals at depths reaching up to 120 m. Our ice thickness estimates showed strong agreement with in situ measurements from three GPR transects totaling 3.8 km in length, with root-mean-square errors (RMSE) ranging from 3.18 to 4.66 m. For comparison, we applied the state-of-the-art 3D focusing algorithm used in the AlpTomoSAR campaign for ice thickness estimation, which resulted in RMSE values between 5.67 and 5.81 m. Our proposed method reduced the RMSE by 18% to 44% relative to the AlpTomoSAR algorithm. Based on these measurements, we calculated a total ice volume of 0.121 km<span><math><msup><mrow></mrow><mrow><mn>3</mn></mrow></msup></math></span>, reflecting a decline of approximately 20.92% since the last reported volume in 2009, which was estimated from sparse GPR data. These results demonstrate that the proposed algorithm can effectively map ice thickness, providing a cost-efficient solution for large-scale glacier surveys in high-mountain regions.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 593-607"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142989646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An interactive fusion attention-guided network for ground surface hot spring fluids segmentation in dual-spectrum UAV images","authors":"Shi Yi , Mengting Chen , Xuesong Yuan , Si Guo , Jiashuai Wang","doi":"10.1016/j.isprsjprs.2025.01.022","DOIUrl":"10.1016/j.isprsjprs.2025.01.022","url":null,"abstract":"<div><div>Investigating the distribution of ground surface hot spring fluids is crucial for the exploitation and utilization of geothermal resources. The detailed information provided by dual-spectrum images captured by unmanned aerial vehicles (UAVs) flew at low altitudes is beneficial to accurately segment ground surface hot spring fluids. However, existing image segmentation methods face significant challenges of hot spring fluids segmentation due to the frequent and irregular variations in fluid boundaries, meanwhile the presence of substances within such fluids lead to segmentation uncertainties. In addition, there is currently no benchmark dataset dedicated to ground surface hot spring fluid segmentation in dual-spectrum UAV images. To this end, in this study, a benchmark dataset called the dual-spectrum hot spring fluid segmentation (DHFS) dataset was constructed for segmenting ground surface hot spring fluids in dual-spectrum UAV images. Additionally, a novel interactive fusion attention-guided RGB-Thermal (RGB-T) semantic segmentation network named IFAGNet was proposed in this study for accurately segmenting ground surface hot spring fluids in dual-spectrum UAV images. The proposed IFAGNet consists of two sub-networks that leverage two feature fusion architectures and the two-stage feature fusion module is designed to achieve optimal intermediate feature fusion. Furthermore, IFAGNet utilizes an interactive fusion attention-guided architecture to guide the two sub-networks further process the extracted features through complementary information exchange, resulting in a significant boost in hot spring fluid segmentation accuracy. Additionally, two down-up full scale feature pyramid network (FPN) decoders are developed for each sub-network to fully utilize multi-stage fused features and improve the preservation of detailed information during hot spring fluid segmentation. Moreover, a hybrid consistency learning strategy is implemented to train the IFAGNet, which combines fully supervised learning with consistency learning between each sub-network and their fusion results to further optimize the segmentation accuracy of hot spring fluid in RGB-T UAV images. The optimal model of the IFAGNet was tested on the proposed DHFS dataset, and the experimental results demonstrated that the IFAGNet outperforms existing image segmentation frameworks in terms of segmentation accuracy for hot spring fluids segmentation in dual-spectrum UAV images which achieved Pixel Accuracy (PA) of 96.1%, Precision of 93.2%, Recall of 85.9%, Intersection over Union (IoU) of 78.3%, and F1-score (F1) of 89.4%, respectively. And overcomes segmentation uncertainties to a great extent, while maintaining competitive computational efficiency. The ablation studies have confirmed the effectiveness of each main innovation in IFAGNet for improving the accuracy of hot spring fluid segmentation. 
Therefore, the proposed DHFS dataset and IFAGNet lay the foundation for segmentation of ","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 661-691"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143035286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
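A hedged sketch of one cross-modal interaction step of the kind the abstract describes: each modality derives channel-attention weights that re-calibrate the other stream before fusion. This shows only the flavor of attention-guided information exchange between RGB and thermal features; layer sizes and the gating form are assumptions, not IFAGNet's actual modules.

```python
import torch
import torch.nn as nn

class CrossGuidedFusion(nn.Module):
    """Illustrative RGB-T interaction: thermal context gates RGB channels
    and vice versa, then the re-calibrated streams are fused (assumed design)."""
    def __init__(self, ch=64):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)                 # global context
        self.rgb_gate = nn.Sequential(nn.Conv2d(ch, ch, 1), nn.Sigmoid())
        self.thm_gate = nn.Sequential(nn.Conv2d(ch, ch, 1), nn.Sigmoid())
        self.fuse = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, f_rgb, f_thm):
        # Complementary information exchange between the two streams
        f_rgb2 = f_rgb * self.rgb_gate(self.pool(f_thm))    # thermal guides RGB
        f_thm2 = f_thm * self.thm_gate(self.pool(f_rgb))    # RGB guides thermal
        return self.fuse(torch.cat([f_rgb2, f_thm2], dim=1))

out = CrossGuidedFusion()(torch.randn(1, 64, 128, 128), torch.randn(1, 64, 128, 128))
```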
{"title":"Plug-and-play DISep: Separating dense instances for scene-to-pixel weakly-supervised change detection in high-resolution remote sensing images","authors":"Zhenghui Zhao , Chen Wu , Lixiang Ru , Di Wang , Hongruixuan Chen , Cuiqun Chen","doi":"10.1016/j.isprsjprs.2025.01.007","DOIUrl":"10.1016/j.isprsjprs.2025.01.007","url":null,"abstract":"<div><div>Change Detection (CD) focuses on identifying specific pixel-level landscape changes in multi-temporal remote sensing images. The process of obtaining pixel-level annotations for CD is generally both time-consuming and labor-intensive. Faced with this annotation challenge, there has been a growing interest in research on Weakly-Supervised Change Detection (WSCD). WSCD aims to detect pixel-level changes using only scene-level (i.e., image-level) change labels, thereby offering a more cost-effective approach. Despite considerable efforts to precisely locate changed regions, existing WSCD methods often encounter the problem of “instance lumping” under scene-level supervision, particularly in scenarios with a dense distribution of changed instances (i.e., changed objects). In these scenarios, unchanged pixels between changed instances are also mistakenly identified as changed, causing multiple changes to be mistakenly viewed as one. In practical applications, this issue prevents the accurate quantification of the number of changes. To address this issue, we propose a Dense Instance Separation (DISep) method as a plug-and-play solution, refining pixel features from a unified instance perspective under scene-level supervision. Specifically, our DISep comprises a three-step iterative training process: (1) Instance Localization: We locate instance candidate regions for changed pixels using high-pass class activation maps. (2) Instance Retrieval: We identify and group these changed pixels into different instance IDs through connectivity searching. Then, based on the assigned instance IDs, we extract corresponding pixel-level features on a per-instance basis. (3) Instance Separation: We introduce a separation loss to enforce intra-instance pixel consistency in the embedding space, thereby ensuring separable instance feature representations. The proposed DISep adds only minimal training cost and no inference cost. It can be seamlessly integrated to enhance existing WSCD methods. We achieve state-of-the-art performance by enhancing three Transformer-based and four ConvNet-based methods on the LEVIR-CD, WHU-CD, DSIFN-CD, SYSU-CD, and CDD datasets. Additionally, our DISep can be used to improve fully-supervised change detection methods. Code is available at <span><span>https://github.com/zhenghuizhao/Plug-and-Play-DISep-for-Change-Detection</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 770-782"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143072523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FO-Net: An advanced deep learning network for individual tree identification using UAV high-resolution images","authors":"Jian Zeng, Xin Shen, Kai Zhou, Lin Cao","doi":"10.1016/j.isprsjprs.2024.12.020","DOIUrl":"10.1016/j.isprsjprs.2024.12.020","url":null,"abstract":"<div><div>The identification of individual trees can reveal the competitive and symbiotic relationships among trees within forest stands, which is fundamental understand biodiversity and forest ecosystems. Highly precise identification of individual trees can significantly improve the efficiency of forest resource inventory, and is valuable for biomass measurement and forest carbon storage assessment. In previous studies through deep learning approaches for identifying individual tree, feature extraction is usually difficult to adapt to the variation of tree crown architecture, and the loss of feature information in the multi-scale fusion process is also a marked challenge for extracting trees by remote sensing images. Based on the one-stage deep learning network structure, this study improves and optimizes the three stages of feature extraction, feature fusion and feature identification in deep learning methods, and constructs a novel feature-oriented individual tree identification network (FO-Net) suitable for UAV high-resolution images. Firstly, an adaptive feature extraction algorithm based on variable position drift convolution was proposed, which improved the feature extraction ability for the individual tree with various crown size and shape in UAV images. Secondly, to enhance the network’s ability to fuse multiscale forest features, a feature fusion algorithm based on the “gather-and-distribute” mechanism is proposed in the feature pyramid network, which realizes the lossless cross-layer transmission of feature map information. Finally, in the stage of individual tree identification, a unified self-attention identification head is introduced to enhanced FO-Net’s perception ability to identify the trees with small crown diameters. FO-Net achieved the best performance in quantitative analysis experiments on self-constructed datasets, with mAP50, F1-score, Precision, and Recall of 90.7%, 0.85, 85.8%, and 82.8%, respectively, realizing a relatively high accuracy for individual tree identification compared to the traditional deep learning methods. The proposed feature extraction and fusion algorithms have improved the accuracy of individual tree identification by 1.1% and 2.7% respectively. The qualitative experiments based on Grad-CAM heat maps also demonstrate that FO-Net can focus more on the contours of an individual tree in high-resolution images, and reduce the influence of background factors during feature extraction and individual tree identification. 
FO-Net deep learning network improves the accuracy of individual trees identification in UAV high-resolution images without significantly increasing the parameters of the network, which provides a reliable method to support various tasks in fine-scale precision forestry.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 323-338"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142889390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
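For intuition, crown-adaptive feature extraction can be sketched with standard deformable convolution, where a small conv predicts per-position sampling offsets so the kernel drifts with crown size and shape. This is a stand-in illustrating the idea behind variable position drift convolution, not FO-Net's exact operator; the class name is hypothetical.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class AdaptiveCrownConv(nn.Module):
    """Deformable convolution as a proxy for crown-adaptive extraction."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        # 2 offsets (dx, dy) are predicted for each of the k*k kernel taps
        self.offset = nn.Conv2d(in_ch, 2 * k * k, k, padding=k // 2)
        self.conv = DeformConv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x):
        # Sampling positions shift per pixel, following local crown geometry
        return self.conv(x, self.offset(x))

feat = AdaptiveCrownConv(3, 32)(torch.randn(1, 3, 256, 256))  # (1, 32, 256, 256)
```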
{"title":"A deep data fusion-based reconstruction of water index time series for intermittent rivers and ephemeral streams monitoring","authors":"Junyuan Fei , Xuan Zhang , Chong Li , Fanghua Hao , Yahui Guo , Yongshuo Fu","doi":"10.1016/j.isprsjprs.2024.12.015","DOIUrl":"10.1016/j.isprsjprs.2024.12.015","url":null,"abstract":"<div><div>Intermittent Rivers and Ephemeral Streams (IRES) are the major sources of flowing water on Earth. Yet, their dynamics are challenging for optical and radar satellites to monitor due to the heavy cloud cover and narrow water surfaces. The significant backscattering mechanism change and image mismatch further hinder the joint use of optical-SAR images in IRES monitoring. Here, a <strong>D</strong>eep data fusion-based <strong>R</strong>econstruction of the wide-accepted Modified Normalized Difference Water Index (MNDWI) time series is conducted for <strong>I</strong>RES <strong>M</strong>onitoring (DRIM). The study utilizes 3 categories of explanatory variables, i.e., the cross-orbits Sentinel-1 SAR for the continuous IRES observation, anchor data for the implicit co-registration, and auxiliary data that reflects the dynamics of IRES. A tight-coupled CNN-RNN architecture is designed to achieve pixel-level SAR-to-optical reconstruction under significant backscattering mechanism changes. The 10 m MNDWI time series with a 12-day interval is effectively regressed, <span><math><mrow><msup><mrow><mi>R</mi></mrow><mn>2</mn></msup></mrow></math></span> > 0.80, on the experimental catchment. The comparison with the RF, RNN, and CNN methods affirms the advantage of the tight-coupled CNN-RNN system in the SAR-to-optical regression with the <span><math><mrow><msup><mrow><mi>R</mi></mrow><mn>2</mn></msup></mrow></math></span> increasing by 0.68 at least. The ablation test highlights the contributions of the Sentinel-1 to the precise MNDWI time series reconstruction, and the anchor and auxiliary data to the effective multi-source data fusion, respectively. The reconstructions highly match the observations of IRES with river widths ranging from 2 m to 300 m. Furthermore, the DRIM method shows excellent applicability, i.e., average <span><math><mrow><msup><mrow><mi>R</mi></mrow><mn>2</mn></msup></mrow></math></span> of 0.77, in IRES under polar, temperate, tropical, and arid climates. In conclusion, the proposed method is powerful in reconstructing the MNDWI time series of sub-pixel to multi-pixel scale IRES under the problem of backscattering mechanism change and image mismatch. The reconstructed MNDWI time series are essential for exploring the hydrological processes of IRES dynamics and optimizing water resource management at the basin scale.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 339-353"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142901851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Corrigendum to “Comparison of detectability of ship wake components between C-Band and X-Band synthetic aperture radar sensors operating under different slant ranges” [ISPRS J. Photogramm. Remote Sens. 196 (2023) 306-324]","authors":"Björn Tings, Andrey Pleskachevsky, Stefan Wiehle","doi":"10.1016/j.isprsjprs.2025.01.026","DOIUrl":"10.1016/j.isprsjprs.2025.01.026","url":null,"abstract":"","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Page 740"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143072526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Target-aware attentional network for rare class segmentation in large-scale LiDAR point clouds","authors":"Xinlong Zhang , Dong Lin , Uwe Soergel","doi":"10.1016/j.isprsjprs.2024.11.012","DOIUrl":"10.1016/j.isprsjprs.2024.11.012","url":null,"abstract":"<div><div>Semantic interpretation of 3D scenes poses a formidable challenge in point cloud processing, which also stands as a requisite undertaking across various fields of application involving point clouds. Although a number of point cloud segmentation methods have achieved leading performance, 3D rare class segmentation continues to be a challenge owing to the imbalanced distribution of fine-grained classes and the complexity of large scenes. In this paper, we present target-aware attentional network (TaaNet), a novel mask-constrained attention framework to address 3D semantic segmentation of imbalanced classes in large-scale point clouds. Adapting the self-attention mechanism, a hierarchical aggregation strategy is first applied to enhance the learning of point-wise features across various scales, which leverages both global and local perspectives to guarantee presence of fine-grained patterns in the case of scenes with high complexity. Subsequently, rare target masks are imposed by a contextual module on the hierarchical features. Specifically, a target-aware aggregator is proposed to boost discriminative features of rare classes, which constrains hierarchical features with learnable adaptive weights and simultaneously embeds confidence constraints of rare classes. Furthermore, a target pseudo-labeling strategy based on strong contour cues of rare classes is designed, which effectively delivers instance-level supervisory signals restricted to rare targets only. We conducted thorough experiments on four multi-platform LiDAR benchmarks, i.e., airborne, mobile and terrestrial platforms, to assess the performance of our framework. Results demonstrate that compared to other commonly used advanced segmentation methods, our method can obtain not only high segmentation accuracy but also remarkable F1-scores in rare classes. In a submission to the official ranking page of Hessigheim 3D benchmark, our approach achieves a state-of-the-art mean F1-score of 83.84% and an outstanding overall accuracy (OA) of 90.45%. In particular, the F1-scores of rare classes namely vehicles and chimneys notably exceed the average of other published methods by a wide margin, boosting by 32.00% and 32.46%, respectively. Additionally, extensive experimental analysis on benchmarks collected from multiple platforms, Paris-Lille-3D, Semantic3D and WHU-Urban3D, validates the robustness and effectiveness of the proposed method.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 32-50"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142789962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"National scale sub-meter mangrove mapping using an augmented border training sample method","authors":"Jinyan Tian , Le Wang , Chunyuan Diao , Yameng Zhang , Mingming Jia , Lin Zhu , Meng Xu , Xiaojuan Li , Huili Gong","doi":"10.1016/j.isprsjprs.2024.12.009","DOIUrl":"10.1016/j.isprsjprs.2024.12.009","url":null,"abstract":"<div><div>This study presents the development of China’s first national-scale sub-meter mangrove map, addressing the need for high-resolution mapping to accurately delineate mangrove boundaries and identify fragmented patches. To overcome the current limitation of 10-m resolution, we developed a novel Semi-automatic Sub-meter Mapping Method (SSMM). The SSMM enhances the spectral separability of mangroves from other land covers by selecting nine critical features from both Sentinel-2 and Google Earth imagery. We also developed an innovative automated sample collection method to ensure ample and precise training samples, increasing sample density in areas susceptible to misclassification and reducing it in uniform regions. This method surpasses traditional uniform sampling in representing the national-scale study area. The classification is performed using a random forest classifier and is manually refined, culminating in the production of the pioneering Large-scale Sub-meter Mangrove Map (LSMM).</div><div>Our study showcases the LSMM’s superior performance over the established High-resolution Global Mangrove Forest (HGMF) map. The LSMM demonstrates enhanced classification accuracy, improved spatial delineation, and more precise area calculations, along with a robust framework of spatial analysis. Notably, compared to the HGMF, the LSMM achieves a 22.0 % increase in overall accuracy and a 0.27 improvement in the F1 score. In terms of mangrove coverage within China, the LSMM estimates a reduction of 4,345 ha (15.4 %), decreasing from 32,598 ha in the HGMF to 28,253 ha. This reduction is further underscored by a significant 61.7 % discrepancy in spatial distribution areas when compared to the HGMF, indicative of both commission and omission errors associated with the 10-m HGMF. Additionally, the LSMM identifies a fivefold increase in the number of mangrove patches, totaling 40,035, compared to the HGMF’s 7,784. These findings underscore the substantial improvements offered by sub-meter resolution products over those with a 10-m resolution. The LSMM and its automated mapping methodology establish new benchmarks for comprehensive, long-term mangrove mapping at sub-meter scales, as well as for the detailed mapping of extensive land cover types. Our study is expected to catalyze a shift toward high-resolution mangrove mapping on a large scale.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 156-171"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142823149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}