A novel spatial-temporal image fusion method for augmented reality-based endoscopic surgery

IF 10.7 · CAS Tier 1 (Medicine) · JCR Q1, Computer Science, Artificial Intelligence
Haochen Shi, Jiangchang Xu, Haitao Li, Shuanglin Jiang, Chaoyu Lei, Huifang Zhou, Yinwei Li, Xiaojun Chen
DOI: 10.1016/j.media.2025.103609
Journal: Medical Image Analysis, Volume 103, Article 103609
Published: 2025-05-01 (Journal Article)
Full text: https://www.sciencedirect.com/science/article/pii/S1361841525001562
Citations: 0

Abstract

Augmented reality (AR) has significant potential to enhance the identification of critical locations during endoscopic surgeries, where accurate endoscope calibration is essential for ensuring the quality of augmented images. In optical-based surgical navigation systems, asynchrony between the optical tracker and the endoscope can cause the augmented scene to diverge from reality during rapid movements, potentially misleading the surgeon—a challenge that remains unresolved. In this paper, we propose a novel spatial–temporal endoscope calibration method that simultaneously determines the spatial transformation from the image to the optical marker and the temporal latency between the tracking and image acquisition systems. To estimate temporal latency, we utilize a Monte Carlo method to estimate the intrinsic parameters of the endoscope’s imaging system, leveraging a dataset of thousands of calibration samples. This dataset is larger than those typically employed in conventional camera calibration routines, rendering traditional algorithms computationally infeasible within a reasonable timeframe. By introducing latency as an independent variable into the principal equation of hand-eye calibration, we developed a weighted algorithm to iteratively solve the equation. This approach eliminates the need for a fixture to stabilize the endoscope during calibration, allowing for quicker calibration through handheld flexible movement. Experimental results demonstrate that our method achieves an average 2D error of 7 ± 3 pixels and a pseudo-3D error of 1.2 ± 0.4 mm for stable scenes within 82.4 ± 16.6 seconds—approximately 68% faster in operation time than conventional methods. In dynamic scenes, our method compensates for the virtual-to-reality latency of 11 ± 2 ms, which is shorter than a single frame interval and 5.7 times shorter than the uncompensated conventional method.
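The Monte Carlo estimation of intrinsics described above can be sketched roughly as follows: repeatedly calibrate on small random subsets of a large pool of calibration views, then aggregate the per-subset estimates. The paper does not specify its sampling scheme or calibration backend, so `calibrate_fn`, the subset size, and the median aggregation below are illustrative assumptions, not the authors' algorithm:

```python
import numpy as np

def monte_carlo_intrinsics(views, calibrate_fn, subset_size=20, n_trials=200, rng=None):
    """Estimate intrinsic parameters from a pool of calibration views that is
    too large for a single batch calibration. Each trial calibrates on a small
    random subset; the per-trial estimates are then aggregated.

    `calibrate_fn(subset) -> (fx, fy, cx, cy)` is a stand-in for any standard
    calibration routine (e.g. Zhang's method on a checkerboard subset)."""
    rng = np.random.default_rng(rng)
    estimates = []
    for _ in range(n_trials):
        # Sample a small subset of views without replacement.
        idx = rng.choice(len(views), size=subset_size, replace=False)
        estimates.append(calibrate_fn([views[i] for i in idx]))
    estimates = np.asarray(estimates)
    # The median across trials is robust to occasional degenerate subsets.
    return np.median(estimates, axis=0)
```

Each trial costs only a small-batch calibration, so the total cost grows linearly in the number of trials rather than with the full dataset size, which is the point of subsampling here.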
Finally, we successfully integrated the proposed method into our surgical navigation system and validated its feasibility in clinical trials for transnasal optic canal decompression surgery. Our method has the potential to improve the safety and efficacy of endoscopic surgeries, leading to better patient outcomes.
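The paper folds latency into the hand-eye calibration equation and solves it iteratively; that algorithm is not reproduced here. As a loose stand-in, the sketch below recovers a tracker-to-image latency by grid-searching the time offset that best aligns a tracker-derived motion signal with an image-derived one. The signal construction and the grid are assumptions for illustration only:

```python
import numpy as np

def estimate_latency(t, tracker_signal, image_signal, taus):
    """Grid-search the latency tau between the tracking stream and the image
    stream: if images lag the tracker by tau, then image_signal(t) should
    match tracker_signal(t - tau). Both inputs are scalar motion summaries
    (e.g. marker speed vs. optical-flow magnitude) resampled onto the common
    time grid `t` (seconds)."""
    costs = [np.mean((np.interp(t - tau, t, tracker_signal) - image_signal) ** 2)
             for tau in taus]
    return float(taus[int(np.argmin(costs))])
```

Once tau is known, the navigation system can render each frame against the tracker pose interpolated at the frame's capture time minus tau, which is the compensation the abstract reports as 11 ± 2 ms.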
Source journal: Medical Image Analysis (Engineering — Biomedical)
CiteScore: 22.10
Self-citation rate: 6.40%
Annual articles: 309
Review time: 6.6 months
Journal description: Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.