High-Quality Pseudo-Labeling for Point Cloud Segmentation With Scene-Level Annotation

Authors: Lunhao Duan; Shanshan Zhao; Xingxing Weng; Jing Zhang; Gui-Song Xia
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 10, pp. 9360-9366 (Impact Factor: 18.6)
DOI: 10.1109/TPAMI.2025.3583071
Published: 2025-06-25

Abstract

This paper investigates indoor point cloud semantic segmentation under scene-level annotation, a setting that is less explored than methods relying on sparse point-level labels. In the absence of precise point-level labels, current methods first generate point-level pseudo-labels, which are then used to train segmentation models. However, generating accurate pseudo-labels for each point solely from scene-level annotations is a considerable challenge and substantially affects segmentation performance. To improve pseudo-label accuracy, this paper proposes a high-quality pseudo-label generation framework that exploits multi-modal information and region-point semantic consistency. Specifically, through a cross-modal feature guidance module, our method utilizes 2D-3D correspondences to align point cloud features with the features of their corresponding 2D image pixels, thereby assisting point cloud feature learning. To further alleviate the challenge posed by scene-level annotation, we introduce a region-point semantic consistency module. It produces regional semantics through a region-voting strategy over point-level semantics, and these regional semantics are in turn used to guide the point-level semantic predictions. With these two modules, our method can rectify inaccurate point-level semantic predictions during training and obtain high-quality pseudo-labels. Significant improvements over previous works on the ScanNet v2 and S3DIS datasets under scene-level annotation demonstrate the effectiveness of our approach, and comprehensive ablation studies validate the contributions of its individual components.
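The cross-modal feature guidance module described above aligns each 3D point's feature with the image feature at its projected 2D pixel. A minimal sketch of such an alignment objective, assuming the 2D-3D correspondences have already been computed from camera projection (the cosine-similarity form and all names here are illustrative assumptions, not the paper's exact loss):

```python
import numpy as np

def cross_modal_guidance_loss(point_feats, pixel_feats):
    """Align 3D point features with corresponding 2D pixel features.

    Assumes row i of both arrays is a matched 2D-3D pair, i.e. the
    correspondence lookup (camera projection) has already been done.

    point_feats : (N, D) features of points that project into the image.
    pixel_feats : (N, D) image features sampled at the projected locations.
    Returns the mean (1 - cosine similarity) over all correspondences,
    which is 0 when the paired features are perfectly aligned.
    """
    def l2norm(x):
        # Normalize each row to unit length so the dot product is a cosine.
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    cos = (l2norm(point_feats) * l2norm(pixel_feats)).sum(axis=1)
    return float((1.0 - cos).mean())
```

In a full pipeline the pixel features would come from a 2D network (and gradients could be stopped on the image branch so guidance flows only into the point cloud branch), but the alignment term itself reduces to a per-correspondence similarity as above.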
Code is available at https://github.com/LHDuan/WSegPC.
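The region-voting strategy of the region-point semantic consistency module can be sketched as follows: average the point-level class probabilities within each region (e.g., from an oversegmentation), restrict candidates to the classes named by the scene-level annotation, and broadcast each region's winning class back to its points as pseudo-labels. This is a minimal illustrative sketch under those assumptions, not the paper's exact implementation:

```python
import numpy as np

def region_voting_pseudo_labels(point_logits, region_ids, scene_classes):
    """Produce per-point pseudo-labels via region voting.

    point_logits : (N, C) raw per-point class scores.
    region_ids   : (N,)  region index per point (e.g., oversegmentation).
    scene_classes: iterable of class indices present in the scene
                   (the scene-level annotation).
    """
    n_points, n_classes = point_logits.shape

    # Softmax over classes for per-point probabilities.
    exp = np.exp(point_logits - point_logits.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)

    # Scene-level annotation restricts candidates: zero out absent classes.
    mask = np.zeros(n_classes)
    mask[list(scene_classes)] = 1.0
    probs = probs * mask

    # Region voting: average point probabilities within each region,
    # then assign the region's argmax class to all of its points.
    pseudo = np.empty(n_points, dtype=np.int64)
    for r in np.unique(region_ids):
        member = region_ids == r
        region_dist = probs[member].mean(axis=0)
        pseudo[member] = region_dist.argmax()
    return pseudo
```

Because every point in a region receives the same label, a confidently predicted majority can correct noisy point-level predictions inside the region, which is the consistency effect the module relies on.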