Michael Schwimmbeck, Serouj Khajarian, Christopher Auer, Thomas Wittenberg, Stefanie Remmele
International Journal of Computer Assisted Radiology and Surgery. Journal article, published 2025-08-05. DOI: 10.1007/s11548-025-03480-4 (https://doi.org/10.1007/s11548-025-03480-4)
Towards a zero-shot low-latency navigation for open surgery augmented reality applications.
Purpose: Augmented reality (AR) enhances surgical navigation by superimposing three-dimensional virtual models onto visible anatomical structures using head-mounted displays (HMDs). In particular, interventions such as open liver surgery can benefit from AR navigation, as it aids in identifying and distinguishing tumors and risk structures. However, there is a lack of automatic and markerless methods that are robust against real-world challenges, such as partial occlusion and organ motion.
Methods: We introduce a novel multi-device approach for automatic live navigation in open liver surgery that enhances the visualization and interaction capabilities of a HoloLens 2 HMD through precise and reliable registration using an Intel RealSense RGB-D camera. The intraoperative RGB-D segmentation and the preoperative CT data are utilized to register a virtual liver model to the target anatomy. An AR-prompted Segment Anything Model (SAM) enables robust segmentation of the liver in situ without the need for additional training data. To mitigate algorithmic latency, Double Exponential Smoothing (DES) is applied to forecast registration results.
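To illustrate how Double Exponential Smoothing can forecast a registration result a few frames ahead, here is a minimal sketch. The abstract does not state which DES variant, smoothing factor, or forecast horizon the authors use, so everything below is an assumption: the class name `BrownDES`, the one-parameter (Brown) formulation, and the value of `alpha` are illustrative only.

```python
class BrownDES:
    """Brown's one-parameter double exponential smoothing, applied per component.

    Hypothetical helper for illustration; not the authors' implementation.
    """

    def __init__(self, alpha):
        if not 0.0 < alpha < 1.0:
            raise ValueError("alpha must lie strictly between 0 and 1")
        self.alpha = alpha
        self.s1 = None  # singly smoothed series
        self.s2 = None  # doubly smoothed series

    def update(self, x):
        """Feed one new measurement, e.g. a translation vector [tx, ty, tz] in mm."""
        if self.s1 is None:
            # Initialise both smoothed series with the first sample.
            self.s1 = list(x)
            self.s2 = list(x)
            return
        a = self.alpha
        self.s1 = [a * xi + (1 - a) * s for xi, s in zip(x, self.s1)]
        self.s2 = [a * s1i + (1 - a) * s2i for s1i, s2i in zip(self.s1, self.s2)]

    def forecast(self, steps):
        """Extrapolate `steps` frames ahead using the smoothed level and trend."""
        if self.s1 is None:
            raise RuntimeError("call update() at least once before forecast()")
        a = self.alpha
        k = a * steps / (1 - a)
        return [(2 + k) * s1 - (1 + k) * s2 for s1, s2 in zip(self.s1, self.s2)]


if __name__ == "__main__":
    des = BrownDES(alpha=0.5)  # alpha chosen arbitrarily for the demo
    for t in range(50):
        des.update([0.4 * t, 0.0, -0.1 * t])  # synthetic linear drift in mm
    print(des.forecast(steps=2))
```

The same per-component update could be applied to a rotation quaternion, provided the forecast is renormalised to unit length afterwards. How many frames to forecast depends on the frame rate and the latency to be bridged, neither of which this sketch fixes.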
Results: We conducted a phantom study for open liver surgery, investigating various scenarios of liver motion, viewpoints, and occlusion. The mean registration errors (target registration error, TRE, of 8.31 mm to 18.78 mm) are comparable to those reported in prior work, while our approach demonstrates high success rates even for high occlusion factors and strong motion. Using forecasting, we bypassed the algorithmic latency of 79.8 ms per frame, with median forecasting errors below 2 mm and below 1.5 degrees between the quaternions.
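The two kinds of error quoted in the results, a target registration error in millimetres and an angle between quaternions in degrees, can be computed along the following lines. This is a sketch under assumptions: the abstract does not describe the authors' evaluation code, so the function names, the choice of a mean over target points, and the `(w, x, y, z)` quaternion layout are all illustrative.

```python
import math


def target_registration_error(estimated, reference):
    """Mean Euclidean distance between corresponding target points (same unit as input)."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("need two non-empty point lists of equal length")
    distances = [math.dist(p, q) for p, q in zip(estimated, reference)]
    return sum(distances) / len(distances)


def quaternion_angle_deg(q1, q2):
    """Smallest rotation angle in degrees between two orientations given as (w, x, y, z)."""
    n1 = math.sqrt(sum(c * c for c in q1))
    n2 = math.sqrt(sum(c * c for c in q2))
    # abs() treats q and -q as the same rotation; min() guards acos against round-off.
    dot = min(1.0, abs(sum(a * b for a, b in zip(q1, q2))) / (n1 * n2))
    return math.degrees(2.0 * math.acos(dot))


if __name__ == "__main__":
    est = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    ref = [(3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]
    print(target_registration_error(est, ref))

    identity = (1.0, 0.0, 0.0, 0.0)
    quarter_turn_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
    print(quaternion_angle_deg(identity, quarter_turn_z))
```

Taking the absolute value of the dot product matters because a unit quaternion and its negation encode the same rotation; without it, two identical orientations could be reported as 360 degrees apart.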
Conclusion: To our knowledge, this is the first work to approach markerless in situ visualization by combining a multi-device method with forecasting and a foundation model for segmentation and tracking. This enables a more reliable and precise AR registration of surgical targets with low latency. Our approach can be applied to other surgical applications and AR hardware with minimal effort.
About the journal:
The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.