High-precision 3D teeth reconstruction based on five-view intra-oral photos
Yanqi Wang, Xinyue Sun, Jun Jia, Zuolin Jin, Yanning Ma
Displays, vol. 87, Article 102988. Published 2025-02-03.
DOI: 10.1016/j.displa.2025.102988
URL: https://www.sciencedirect.com/science/article/pii/S0141938225000253
Impact factor: 3.7. JCR: Q1 (Computer Science, Hardware & Architecture).
Citations: 0
Abstract
Reconstructing a 3D dental model from multi-view intra-oral photos plays an important role in orthodontic treatment. Compared with cone-beam computed tomography (CBCT) or an intra-oral scanner (IOS), photo-based 3D reconstruction offers a low-cost way to monitor teeth that requires neither professional devices nor trained operators. This paper introduces an enhanced, fully automated framework for 3D tooth reconstruction from five-view intra-oral photos, capable of generating the shapes, alignments, and occlusal relationships of both upper and lower teeth. The proposed framework has three phases. First, a parametric dental model based on a statistical shape model is built to represent the shape and position of each tooth. Next, in the feature extraction stage, the Segment Anything Model (SAM) is used to detect tooth boundaries in the intra-oral photos, and the single-view depth estimation approach Depth Anything is used to obtain depth information. In addition, the photos are converted to grayscale and normalized so that luminance information can be extracted separately, which addresses reflections on the tooth surface. Finally, a two-stage iterative reconstruction is performed: the first stage alternates between searching for point correspondences and optimizing a composite loss function to align the parameterized tooth model with the predicted tooth contours; the second stage uses image depth and lightness information for additional refinement. Extensive experiments are conducted to validate the proposed method. Compared with existing methods, it performs better qualitatively in cases with misaligned teeth, missing teeth, or complex occlusion, and it also achieves good quantitative results in terms of RMSD and Dice.
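The abstract names three quantities without defining them: the normalized luminance extracted from the photos, and the RMSD and Dice scores used for evaluation. The sketch below uses common textbook definitions to make them concrete. It is not the authors' implementation. The function names, the BT.601 grayscale weights, the min-max normalization, and the assumption that RMSD is taken over already-paired 3D points are all illustrative choices that the abstract does not confirm.

```python
import math


def luminance(rgb_rows):
    """Grayscale-convert rows of (r, g, b) pixels and min-max normalize to [0, 1].

    Assumption: ITU-R BT.601 luma weights; the paper may use another conversion.
    """
    gray = [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_rows]
    flat = [v for row in gray for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant image carries no contrast; return all zeros.
        return [[0.0 for _ in row] for row in gray]
    return [[(v - lo) / (hi - lo) for v in row] for row in gray]


def dice(mask_a, mask_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks of equal size."""
    a = [bool(v) for row in mask_a for v in row]
    b = [bool(v) for row in mask_b for v in row]
    if len(a) != len(b):
        raise ValueError("masks must have the same number of pixels")
    total = sum(a) + sum(b)
    if total == 0:
        # Both masks empty: treat as perfect agreement.
        return 1.0
    overlap = sum(1 for x, y in zip(a, b) if x and y)
    return 2.0 * overlap / total


def rmsd(points_a, points_b):
    """Root-mean-square distance between two equally long lists of paired points.

    Assumption: correspondences are given; surface-to-surface variants differ.
    """
    if len(points_a) != len(points_b) or not points_a:
        raise ValueError("need two non-empty point lists of equal length")
    squared = [sum((p - q) ** 2 for p, q in zip(pa, pb))
               for pa, pb in zip(points_a, points_b)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    print(luminance([[(0, 0, 0), (255, 255, 255)]]))
    print(dice([[1, 1], [0, 0]], [[1, 0], [0, 0]]))
    print(rmsd([(0, 0, 0), (0, 0, 0)], [(3, 4, 0), (0, 0, 0)]))
```

Under these definitions a higher Dice (maximum 1) means better overlap between predicted and reference tooth masks, while a lower RMSD means the reconstructed geometry lies closer to the reference.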
About the journal:
Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface.
Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, will also occasionally be featured.