Test-time adaptation for geospatial point cloud semantic segmentation with distinct domain shifts

IF 12.2 · CAS Zone 1 (Earth Sciences) · JCR Q1 (Geography, Physical)

ISPRS Journal of Photogrammetry and Remote Sensing · Pub Date: 2025-09-09 · DOI: 10.1016/j.isprsjprs.2025.08.022

Puzuo Wang, Wei Yao, Jie Shao, Zhiyi He

Volume 229, Pages 422-435 · Journal Article · https://www.sciencedirect.com/science/article/pii/S0924271625003338

Citations: 0

Abstract

Domain adaptation (DA) techniques aim to close the gap between source and target domains, enabling deep learning models to generalize across different data shift paradigms for point cloud semantic segmentation (PCSS). Among emerging DA schemes, test-time adaptation (TTA) adapts a pre-trained model directly to unlabeled data during inference, without access to source-domain data and without an additional training process, which mitigates data privacy concerns and removes the need for substantial computational power. To fill the gap in applying TTA to geospatial PCSS, we introduce three typical domain shift paradigms for geospatial point clouds and construct three practical adaptation benchmarks: photogrammetric point clouds to airborne LiDAR, airborne LiDAR to mobile LiDAR, and synthetic to mobile LiDAR. We then propose a TTA method that exploits the domain-specific knowledge embedded in the batch normalization (BN) layers. Given the pre-trained model, the BN statistics are progressively updated by fusing in the statistics of each test batch. Furthermore, we develop a self-supervised module to optimize the learnable BN affine parameters. Information maximization is used to generate confident and category-specific predictions, and reliability-constrained pseudo-labeling is further incorporated to create supervisory signals. Extensive experiments demonstrate that the proposed method improves mIoU by up to 20% compared with direct inference, outperforming other popular counterparts while maintaining high efficiency and avoiding retraining. When adapting from photogrammetric point clouds (SensatUrban) to airborne LiDAR (Hessigheim 3D), our method achieves an mIoU of 59.46% and an OA of 85.97%.
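The three ingredients named in the abstract (fusing BN statistics batch by batch, information maximization, and reliability-constrained pseudo-labeling) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not give the fusion rule, the loss weighting, or the reliability criterion, so the momentum-style blend, the `momentum` value, the confidence `threshold`, and all function names below are assumptions chosen for illustration.

```python
import math


def fuse_bn_stats(running_mean, running_var, batch, momentum=0.1):
    """Blend stored per-channel BN statistics with those of one test batch.

    `batch` is a list of per-point feature vectors. The exponential blend
    and the `momentum` value are assumptions, not the paper's exact rule.
    """
    n = len(batch)
    new_mean, new_var = [], []
    for c in range(len(running_mean)):
        values = [row[c] for row in batch]
        mu = sum(values) / n
        var = sum((v - mu) ** 2 for v in values) / n  # biased batch variance
        new_mean.append((1 - momentum) * running_mean[c] + momentum * mu)
        new_var.append((1 - momentum) * running_var[c] + momentum * var)
    return new_mean, new_var


def softmax(logits):
    """Numerically stable softmax over one point's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(p, eps=1e-12):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(q * math.log(q + eps) for q in p)


def information_maximization_loss(probs):
    """Mean per-point entropy minus the entropy of the mean prediction.

    Minimizing it favors confident individual predictions (low first term)
    that are spread across classes overall (high second term).
    """
    n, k = len(probs), len(probs[0])
    mean_point_entropy = sum(entropy(p) for p in probs) / n
    mean_prediction = [sum(p[j] for p in probs) / n for j in range(k)]
    return mean_point_entropy - entropy(mean_prediction)


def reliable_pseudo_labels(probs, threshold=0.9):
    """Keep the argmax label only where confidence reaches `threshold`.

    Points below the (assumed) threshold get None and would be excluded
    from the pseudo-label loss.
    """
    labels = []
    for p in probs:
        confidence = max(p)
        labels.append(p.index(confidence) if confidence >= threshold else None)
    return labels
```

In a TTA loop of this kind, `fuse_bn_stats` would run once per incoming test batch, and the two losses would drive gradient updates of the BN affine parameters only, leaving all other weights frozen.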


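Source journal

ISPRS Journal of Photogrammetry and Remote Sensing (Engineering & Technology: Imaging Science & Photographic Technology)

CiteScore: 21.00

Self-citation rate: 6.30%

Articles published: 273

Review time: 40 days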

About the journal: The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) is the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It is a platform for scientists and professionals worldwide working in disciplines that use photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advances in these disciplines, while also acting as a comprehensive reference source and archive. P&RS publishes high-quality, peer-reviewed research papers that are preferably original and previously unpublished. These papers can cover scientific research, technological development, or practical applications. The journal also welcomes papers based on presentations at ISPRS meetings, provided they are significant contributions to these fields. In particular, P&RS encourages submissions that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. Theoretical papers should preferably include practical applications, while papers focusing on systems and applications should include a theoretical background.