ConsisTNet: a spatio-temporal approach for consistent anatomical localization in endoscopic pituitary surgery.

Zhehua Mao, Adrito Das, Danyal Z Khan, Simon C Williams, John G Hanrahan, Danail Stoyanov, Hani J Marcus, Sophia Bano

International Journal of Computer Assisted Radiology and Surgery, pages 1239-1248. Published 2025-06-01 (Epub 2025-04-29). DOI: 10.1007/s11548-025-03369-2. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12167350/pdf/
Purpose: Automated localization of critical anatomical structures in endoscopic pituitary surgery is crucial for enhancing patient safety and surgical outcomes. While deep learning models have shown promise in this task, their predictions often suffer from frame-to-frame inconsistency. This study addresses this issue by proposing ConsisTNet, a novel spatio-temporal model designed to improve prediction stability.
Methods: ConsisTNet leverages spatio-temporal features extracted from consecutive frames to provide both temporally and spatially consistent predictions, addressing the limitations of single-frame approaches. We employ a semi-supervised strategy, utilizing ground-truth label tracking for pseudo-label generation through label propagation. Consistency is assessed by comparing predictions across consecutive frames using predicted label tracking. The model is optimized and accelerated using TensorRT for real-time intraoperative guidance.
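The pseudo-label generation described above propagates ground-truth annotations from a labelled frame to its neighbours. A minimal sketch of that idea, assuming a precomputed dense flow field between frames (the function name and flow representation are illustrative, not from the paper):

```python
import numpy as np

def propagate_label(mask: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Warp a binary label mask into the next frame using a dense flow field.

    mask : (H, W) binary ground-truth mask for frame t.
    flow : (H, W, 2) per-pixel displacement (dy, dx) from frame t to t+1.
    Returns a pseudo-label mask for frame t+1 (nearest-neighbour warp).
    """
    h, w = mask.shape
    ys, xs = np.nonzero(mask)  # labelled pixels in frame t
    ny = np.clip(np.round(ys + flow[ys, xs, 0]).astype(int), 0, h - 1)
    nx = np.clip(np.round(xs + flow[ys, xs, 1]).astype(int), 0, w - 1)
    pseudo = np.zeros_like(mask)
    pseudo[ny, nx] = 1  # propagated label positions in frame t+1
    return pseudo
```

In practice a tracking method would supply the flow field, and the warped mask would be cleaned up (e.g. morphological closing) before being used as a training target; this sketch only shows the propagation step itself.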
Results: Compared to previous state-of-the-art models, ConsisTNet significantly improves prediction consistency across video frames while maintaining high accuracy in segmentation and landmark detection. Specifically, segmentation consistency improves by 4.56% and 9.45% in IoU for the two segmentation regions, and landmark detection consistency improves with a 43.86% reduction in mean distance error. The accelerated model achieves an inference speed of 202 frames per second (FPS) at 16-bit floating-point (FP16) precision, enabling real-time intraoperative guidance.
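The consistency figures above are reported as IoU between predictions for consecutive frames (after aligning them via tracking). A minimal sketch of such a metric on two already-aligned binary masks (the function name is illustrative, not from the paper):

```python
import numpy as np

def consistency_iou(pred_a: np.ndarray, pred_b: np.ndarray) -> float:
    """IoU between two aligned binary masks; 1.0 means identical predictions."""
    a, b = pred_a.astype(bool), pred_b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both masks empty: trivially consistent
    return float(np.logical_and(a, b).sum() / union)
```

Averaging this score over all consecutive frame pairs in a video gives a single per-region temporal-consistency number of the kind quoted in the results.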
Conclusion: ConsisTNet demonstrates significant improvements in spatio-temporal consistency of anatomical localization during endoscopic pituitary surgery, providing more stable and reliable real-time surgical assistance.
About the journal:
The International Journal of Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.