Deep Learning for Autonomous Surgical Guidance Using 3-Dimensional Images From Forward-Viewing Endoscopic Optical Coherence Tomography.

Sinaro Ly, Adrien Badré, Parker Brandt, Chen Wang, Paul Calle, Justin Reynolds, Qinghao Zhang, Kar-Ming Fung, Haoyang Cui, Zhongxin Yu, Sanjay G Patel, Yunlong Liu, Nathan A Bradley, Qinggong Tang, Chongle Pan
Journal: Journal of biophotonics, e202500181
DOI: 10.1002/jbio.202500181
Published: 2025-07-25 (Journal Article)
Citations: 0

Abstract

A three-dimensional convolutional neural network (3D-CNN) was developed for the analysis of volumetric optical coherence tomography (OCT) images to enhance endoscopic guidance during percutaneous nephrostomy. The model was performance-benchmarked using a 10-fold nested cross-validation procedure and achieved an average test accuracy of 90.57% across a dataset of 10 porcine kidneys. This performance significantly exceeded that of 2D-CNN models, which attained average test accuracies ranging from 85.63% to 88.22% using 1, 10, or 100 radial sections extracted from the 3D OCT volumes. The 3D-CNN (~12 million parameters) was benchmarked against three state-of-the-art volumetric architectures: the 3D Vision Transformer (3D-ViT, ~45 million parameters), 3D-DenseNet121 (~12 million parameters), and the Multi-plane and Multi-slice Transformer (M3T, ~29 million parameters). While these models achieved comparable inference accuracy, the 3D-CNN exhibited lower inference latency (33 ms) than 3D-ViT (86 ms), 3D-DenseNet121 (58 ms), and M3T (93 ms), representing a critical advantage for real-time surgical guidance applications. These results demonstrate the 3D-CNN's capability as a powerful and practical tool for computer-aided diagnosis in OCT-guided surgical interventions.
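The abstract reports a 10-fold nested cross-validation over 10 porcine kidneys. As a minimal sketch of how such a split might be organized (the specimen IDs, fold counts, and function names here are illustrative assumptions, not the authors' implementation), each kidney can be held out once as the outer test fold, with the remaining specimens rotated through inner validation folds for model selection:

```python
def nested_cv_splits(specimen_ids):
    """Yield (train, val, test) specimen splits for nested cross-validation.

    Outer loop: each specimen is held out once as the test set.
    Inner loop: each remaining specimen serves once as the validation
    set for model selection; the rest form the training set.
    """
    for test_id in specimen_ids:
        rest = [s for s in specimen_ids if s != test_id]
        for val_id in rest:
            train = [s for s in rest if s != val_id]
            yield train, [val_id], [test_id]

# Hypothetical specimen labels for the 10 porcine kidneys.
kidneys = [f"kidney_{i}" for i in range(1, 11)]
splits = list(nested_cv_splits(kidneys))
# 10 outer folds x 9 inner folds = 90 (train, val, test) combinations,
# and no specimen ever appears in both its training and test sets.
```

Splitting at the specimen level, rather than at the image level, keeps slices from the same kidney out of both the training and test sets, which is what makes the reported test accuracy a fair estimate of generalization to unseen organs.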
