Transformer-based robotic ultrasound 3D tracking for capsule robot in GI tract.

Impact factor 2.3; CAS Zone 3 (Medicine); JCR Q3 (Engineering, Biomedical)
Xiaoyun Liu, Changyan He, Mulan Wu, Ann Ping, Anna Zavodni, Naomi Matsuura, Eric Diller
Journal: International Journal of Computer Assisted Radiology and Surgery
Publication date: 2025-06-09
Publication type: Journal Article
DOI: 10.1007/s11548-025-03445-7 (https://doi.org/10.1007/s11548-025-03445-7)
Citations: 0

Abstract

Purpose: Ultrasound (US) imaging is a promising modality for real-time monitoring of robotic capsule endoscopes navigating through the gastrointestinal (GI) tract. It offers high temporal resolution and safety but is limited by a narrow field of view, low visibility in gas-filled regions and challenges in detecting out-of-plane motions. This work addresses these issues by proposing a novel robotic ultrasound tracking system capable of long-distance 3D tracking and active re-localization when the capsule is lost due to motion or artifacts.

Methods: We develop a hybrid deep learning-based tracking framework combining convolutional neural networks (CNNs) and a transformer backbone. The CNN component efficiently encodes spatial features, while the transformer captures long-range contextual dependencies in B-mode US images. This model is integrated with a robotic arm that adaptively scans and tracks the capsule. The system's performance is evaluated using ex vivo colon phantoms under varying imaging conditions, with physical perturbations introduced to simulate realistic clinical scenarios.
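To make the division of labor between the two components concrete, the following is a minimal NumPy sketch, not the authors' network. It assumes a single strided convolution as the CNN stage and one self-attention head as the transformer stage; the frame size, filter count, random weights, and the function names `conv_features` and `self_attention` are all hypothetical choices for illustration.

```python
import numpy as np

def conv_features(image, kernels, stride):
    """Strided 'valid' 2D cross-correlation plus ReLU: (H, W) -> (h, w, C)."""
    k = kernels.shape[1]
    h = (image.shape[0] - k) // stride + 1
    w = (image.shape[1] - k) // stride + 1
    out = np.empty((h, w, kernels.shape[0]))
    for i in range(h):
        for j in range(w):
            patch = image[i * stride:i * stride + k, j * stride:j * stride + k]
            # One response per filter for this local patch.
            out[i, j] = np.tensordot(kernels, patch, axes=([1, 2], [0, 1]))
    return np.maximum(out, 0.0)

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over a token sequence."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
frame = rng.random((64, 64))                     # stand-in for one B-mode frame
kernels = rng.standard_normal((4, 8, 8)) * 0.1   # 4 hypothetical 8x8 filters

feats = conv_features(frame, kernels, stride=8)  # local spatial features, (8, 8, 4)
tokens = feats.reshape(-1, feats.shape[-1])      # one token per image region, (64, 4)

wq, wk, wv = (rng.standard_normal((4, 4)) * 0.1 for _ in range(3))
context, attn = self_attention(tokens, wq, wk, wv)
print(context.shape, attn.shape)
```

Each row of `attn` weights every image region against every other region, which is the sense in which attention can relate distant parts of a frame, while the convolution only sees its local 8x8 patch.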

Results: The proposed system achieved continuous 3D tracking over distances exceeding 90 cm, with a mean centroid localization error of 1.5 mm and a detection accuracy above 90%. Tracking was also demonstrated in a more complex workspace with two curved sections that simulate anatomical challenges, which suggests that the system is resilient to motion-induced artifacts and geometric variability. The system maintained real-time tracking at 9-12 FPS and re-localized the capsule within seconds after tracking loss, even under gas artifacts and acoustic shadowing.
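The abstract does not define how these figures were computed. As a hedged illustration, the sketch below assumes that localization error is the mean Euclidean distance between predicted and reference 3D centroids, that detection accuracy is the fraction of frames in which the capsule was detected, and that frame rate is derived from frame timestamps. All data values and function names are synthetic and hypothetical.

```python
import math
from statistics import fmean

def centroid_errors_mm(predicted, ground_truth):
    """Per-frame Euclidean distance between 3D centroids, in millimeters."""
    return [math.dist(p, g) for p, g in zip(predicted, ground_truth)]

def detection_accuracy(detected_flags):
    """Fraction of frames in which the capsule was detected (assumed definition)."""
    return sum(detected_flags) / len(detected_flags)

def frames_per_second(timestamps_s):
    """Average frame rate from the first to the last timestamp, in frames per second."""
    return (len(timestamps_s) - 1) / (timestamps_s[-1] - timestamps_s[0])

# Synthetic example data, not taken from the paper.
predicted = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 1.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (10.0, 2.0, 0.0), (20.0, 1.0, 0.0)]

errors = centroid_errors_mm(predicted, reference)
mean_error = fmean(errors)
accuracy = detection_accuracy([True, True, False, True])
fps = frames_per_second([0.0, 0.1, 0.2, 0.3, 0.4])

print(f"mean error: {mean_error:.2f} mm, accuracy: {accuracy:.0%}, rate: {fps:.1f} FPS")
```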

Conclusion: This study presents a hybrid CNN-transformer system for automatic, real-time 3D ultrasound tracking of capsule robots over long distances. The method reliably handles occlusions, view loss and image artifacts, offering millimeter-level tracking accuracy. It significantly reduces clinical workload through autonomous detection and re-localization. Future work includes improving probe-tissue interaction handling and validating performance in live animal and human trials to assess physiological impacts.
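The re-localization logic itself is not described in the abstract. One simple way to picture it is as a two-state controller that switches from tracking to an active search after several consecutive low-confidence frames and returns to tracking once the capsule is detected again. The sketch below is only that mental model; the `Relocalizer` class, the confidence threshold, and the miss count are hypothetical and are not the authors' method.

```python
from enum import Enum

class Mode(Enum):
    TRACKING = "tracking"
    SEARCHING = "searching"

class Relocalizer:
    """Toy two-state controller driven by per-frame detection confidence."""

    def __init__(self, conf_threshold=0.5, max_misses=3):
        self.conf_threshold = conf_threshold  # hypothetical detection threshold
        self.max_misses = max_misses          # consecutive misses before searching
        self.mode = Mode.TRACKING
        self.misses = 0

    def update(self, confidence):
        """Consume one frame's confidence and return the resulting mode."""
        if confidence >= self.conf_threshold:
            # Capsule visible: reset the miss counter and (re)enter tracking.
            self.misses = 0
            self.mode = Mode.TRACKING
        else:
            self.misses += 1
            if self.misses >= self.max_misses:
                # Too many consecutive misses: a real system would start a probe sweep here.
                self.mode = Mode.SEARCHING
        return self.mode

# Synthetic confidence trace: the capsule is lost for four frames, then found again.
confidences = [0.9, 0.8, 0.2, 0.1, 0.3, 0.2, 0.7, 0.9]
controller = Relocalizer()
modes = [controller.update(c).value for c in confidences]
print(modes)
```

Requiring several consecutive misses before searching is a common way to avoid reacting to a single noisy frame; a real system would also have to decide how the probe sweeps while in the search state.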

Source journal
International Journal of Computer Assisted Radiology and Surgery
Categories: Engineering, Biomedical; Radiology, Nuclear Medicine & Medical Imaging
CiteScore: 5.90
Self-citation rate: 6.70%
Publication volume: 243
Review time: 6-12 weeks
About the journal: The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.