SfMDiffusion: self-supervised monocular depth estimation in endoscopy based on diffusion models
Yu Li, Da Chang, Die Luo, Jin Huang, Lan Dong, Du Wang, Liye Mei, Cheng Lei
International Journal of Computer Assisted Radiology and Surgery, published 2025-02-24
DOI: 10.1007/s11548-025-03333-0 (https://doi.org/10.1007/s11548-025-03333-0)
Abstract
Purpose: In laparoscopic surgery, accurate 3D reconstruction from endoscopic video is crucial for effective image-guided techniques. Current methods for monocular depth estimation (MDE) face challenges in complex surgical scenes, including limited training data, specular reflections, and varying illumination conditions.
Methods: We propose SfMDiffusion, a novel diffusion-based self-supervised framework for MDE. Our approach combines: (1) a denoising diffusion process guided by pseudo-ground-truth depth maps, (2) knowledge distillation from a pre-trained teacher model, and (3) discriminative priors to enhance estimation robustness. Our design enables accurate depth estimation without requiring ground-truth depth data during training.
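As a rough illustration of the "denoising diffusion process" mentioned above, the sketch below shows the forward (noising) step of a generic DDPM applied to a normalized depth map. It is not taken from the SfMDiffusion code: the linear beta schedule, the function names `linear_alpha_bars` and `noise_depth`, and the flat-list depth representation are all assumptions made for this example. A denoising network would be trained to predict `eps` from the noisy depth; how the paper conditions that network on pseudo-ground-truth depth is not specified in the abstract.

```python
import math
import random

def linear_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule.

    Assumed schedule (generic DDPM default); requires num_steps >= 2.
    """
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def noise_depth(depth, t, alpha_bars, rng):
    """Sample x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps.

    `depth` is a flat list of normalized depth values (hypothetical layout).
    Returns the noisy depth and the Gaussian noise that was added.
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in depth]
    noisy = [math.sqrt(a) * d + math.sqrt(1.0 - a) * e
             for d, e in zip(depth, eps)]
    return noisy, eps

rng = random.Random(0)
alpha_bars = linear_alpha_bars(1000)
pseudo_depth = [0.2, 0.5, 0.8]  # stand-in for a pseudo-ground-truth depth map
noisy_depth, eps = noise_depth(pseudo_depth, 500, alpha_bars, rng)
```

Because the noise is returned alongside the noisy sample, the clean depth can be recovered exactly as `(x_t - sqrt(1 - abar_t) * eps) / sqrt(abar_t)`, which is the identity a noise-prediction objective relies on.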
Results: Experiments on the SCARED and Hamlyn datasets demonstrate that SfMDiffusion achieves superior performance: an absolute relative error (Abs Rel) of 0.049, a squared relative error (Sq Rel) of 0.366, and a root mean square error (RMSE) of 4.305 on the SCARED dataset, and an Abs Rel of 0.067, a Sq Rel of 0.800, and an RMSE of 7.465 on the Hamlyn dataset.
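The abstract does not define the three reported metrics, so the following sketch assumes the definitions commonly used in monocular depth estimation benchmarks: Abs Rel is the mean of |d - d*| / d*, Sq Rel is the mean of (d - d*)^2 / d*, and RMSE is the square root of the mean squared error, all taken over pixels with valid ground truth. The function name `depth_metrics` and the toy values are hypothetical, and any scale alignment or depth capping the paper applies before evaluation is not reproduced here.

```python
import math

def depth_metrics(pred, gt):
    """Return (abs_rel, sq_rel, rmse) over pixels with positive ground truth.

    `pred` and `gt` are flat sequences of depth values in the same unit.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    n = len(pairs)
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    sq_rel = sum((p - g) ** 2 / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    return abs_rel, sq_rel, rmse

pred = [10.0, 22.0, 27.0, 0.5]
gt = [10.0, 20.0, 30.0, 0.0]  # last pixel has no valid depth and is skipped
abs_rel, sq_rel, rmse = depth_metrics(pred, gt)
```

Note that Abs Rel is dimensionless, while Sq Rel and RMSE carry the unit of the depth values, so the latter two are only comparable across datasets that use the same unit.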
Conclusion: SfMDiffusion provides an innovative approach for 3D reconstruction in image-guided surgical techniques. Future work will focus on computational optimization and validation across diverse surgical scenarios. Our code is available at https://github.com/Skylanding/SfM-Diffusion .
About the journal:
The International Journal of Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines and encourages interdisciplinary research and development activities in an international environment.