Title: Structured Light Image Planar-Topography Feature Decomposition for Generalizable 3D Shape Measurement
Authors: Mingyang Lei; Jingfan Fan; Long Shao; Hong Song; Deqiang Xiao; Danni Ai; Tianyu Fu; Yucong Lin; Ying Gu; Jian Yang
DOI: 10.1109/TCSVT.2025.3558732
Journal: IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 9, pp. 9517-9529
Published: 2025-04-08
URL: https://ieeexplore.ieee.org/document/10955736/
Citations: 0
Abstract
The application of structured light (SL) techniques has achieved remarkable success in three-dimensional (3D) measurements. Traditional methods generally calculate SL information pixel by pixel to obtain the measurement results. Recently, the rise of deep learning (DL) has led to significant developments in this task. However, existing DL-based methods generally learn all features within the image in an end-to-end manner, ignoring the distinction between SL and non-SL information. Therefore, these methods may encounter difficulties in focusing on subtle variations in SL patterns across different scenes, thereby degrading measurement precision. To overcome this challenge, we propose a novel SL Image Planar-Topography Feature Decomposition Network (SIDNet). To fully utilize the information from different SL modality images (fringe and speckle), we decompose different modalities into topography features (modality-specific) and planar features (modality-shared). A physics-driven decomposition loss is proposed to make the topography/planar features dissimilar/similar, which guides the network to distinguish between SL and non-SL information. Moreover, to obtain modality-fused features with global overview and local detail information, we propose a wrapped phase-driven feature fusion module. Specifically, a novel Tri-modality Mamba block is designed to integrate different sources with the guidance of the wrapped phase features. Extensive experiments demonstrate the superiority of our SIDNet in multiple simulated 3D measurement scenes. Moreover, our method shows better generalization ability than other DL models and can be directly applied to unseen real-world scenes.
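The abstract names two computable ingredients: a decomposition loss that makes topography (modality-specific) features dissimilar and planar (modality-shared) features similar, and wrapped phase features that guide fusion. The sketch below illustrates both under stated assumptions: the cosine-similarity form of the loss and all function names are hypothetical (the abstract does not specify the exact formulation), while `wrapped_phase` is the standard N-step phase-shifting profilometry formula, not anything specific to SIDNet.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def decomposition_loss(topo_fringe, topo_speckle, planar_fringe, planar_speckle):
    """Hypothetical decomposition loss: push the modality-specific
    (topography) features of the fringe and speckle branches apart,
    and pull the modality-shared (planar) features together."""
    # Topography features should be dissimilar across modalities:
    # penalize residual similarity (squared so the term is non-negative).
    l_topo = cosine_similarity(topo_fringe, topo_speckle) ** 2
    # Planar features should be similar: penalize low similarity.
    l_planar = 1.0 - cosine_similarity(planar_fringe, planar_speckle)
    return l_topo + l_planar

def wrapped_phase(intensities):
    """Wrapped phase at one pixel from N equally shifted fringe
    intensities I_k = A + B*cos(phi + 2*pi*k/N), N >= 3 (standard
    N-step phase-shifting profilometry)."""
    n = len(intensities)
    s = sum(i * math.sin(2 * math.pi * k / n) for k, i in enumerate(intensities))
    c = sum(i * math.cos(2 * math.pi * k / n) for k, i in enumerate(intensities))
    return math.atan2(-s, c)  # wrapped into (-pi, pi]
```

A loss of zero is reached exactly when the two topography vectors are orthogonal and the two planar vectors point in the same direction, which matches the dissimilar/similar objective stated in the abstract.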
About the journal
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.