Automatic three-dimensional reconstruction of transparent objects with multiple optimization strategies under limited constraints

IF 4.2 | Zone 3, Computer Science | JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Image and Vision Computing | Pub Date: 2025-05-24 | DOI: 10.1016/j.imavis.2025.105580

Xiaopeng Sha, Xiaopeng Si, Yujie Zhu, Shuyu Wang, Yuliang Zhao

Volume 160, Article 105580 | https://www.sciencedirect.com/science/article/pii/S0262885625001684

Citations: 0

Abstract

Reconstructing transparent objects with limited constraints has long been considered a highly challenging problem. Because transparent objects interact with light through intricate refraction and reflection, traditional three-dimensional (3D) reconstruction methods perform poorly on them. To address this issue, this study proposes a 3D reconstruction method specifically designed for transparent objects. Incorporating multiple optimization strategies, the method works under limited constraints to achieve the automatic reconstruction of transparent objects from only a few images of the object in any known environment, without the need for specific data collection devices or environments. The proposed method makes use of automatic image segmentation and modifies the network interface and structure of the PointNeXt algorithm to introduce the TransNeXt network, which enhances normal features, optimizes weight decay, and employs a cosine annealing learning rate with warmup. The complete 3D shape of a transparent object is reconstructed in three steps. First, we initialize the transparent shape with a visual hull reconstructed from the contours obtained by TOM-Net. Then, we construct a normal reconstruction network to estimate the surface normals. Finally, we reconstruct the complete 3D shape using the TransNeXt network. Multiple experiments show that the TransNeXt network exhibits superior reconstruction performance to other networks and can effectively perform the automatic reconstruction of transparent objects even under limited constraints.
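
The abstract does not give the form or hyperparameters of the learning-rate schedule. As a minimal sketch, assuming the common definition of cosine annealing preceded by a linear warmup phase, the schedule could look like the following. The function name `warmup_cosine_lr` and all default values (`base_lr`, `warmup_steps`, `min_lr`) are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def warmup_cosine_lr(step, total_steps, base_lr=1e-3, warmup_steps=100, min_lr=1e-6):
    """Learning rate at a given step: linear warmup, then cosine annealing.

    All default values are illustrative assumptions, not the paper's settings.
    """
    if total_steps <= warmup_steps:
        raise ValueError("total_steps must exceed warmup_steps")
    if step < warmup_steps:
        # Linear ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the annealing phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / (total_steps - warmup_steps))
    # Half-cosine decay from base_lr down to min_lr.
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the schedule at a few representative steps.
    for s in (0, 50, 99, 100, 550, 1000):
        print(s, f"{warmup_cosine_lr(s, total_steps=1000):.6g}")
```

The warmup phase keeps early updates small while the network's statistics settle, and the cosine phase then lowers the rate smoothly toward `min_lr`. In a training loop, the returned value would be assigned to the optimizer's learning rate at every step.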

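Source journal

Image and Vision Computing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 8.50
Self-citation rate: 8.50%
Articles published: 143
Review time: 7.8 months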

About the journal: Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.