Autonomous navigation of catheters and guidewires in mechanical thrombectomy using inverse reinforcement learning.

IF 2.3 | Zone 3 (Medicine) | JCR Q3 | ENGINEERING, BIOMEDICAL

International Journal of Computer Assisted Radiology and Surgery Pub Date : 2024-08-01 Epub Date: 2024-06-17 DOI:10.1007/s11548-024-03208-w

Harry Robertshaw, Lennart Karstensen, Benjamin Jackson, Alejandro Granados, Thomas C Booth

Citations: 0

Abstract

Purpose: Autonomous navigation of catheters and guidewires can enhance endovascular surgery safety and efficacy, reducing procedure times and operator radiation exposure. Integrating tele-operated robotics could widen access to time-sensitive emergency procedures like mechanical thrombectomy (MT). Reinforcement learning (RL) shows potential in endovascular navigation, yet its application encounters challenges without a reward signal. This study explores the viability of autonomous guidewire navigation in MT vasculature using inverse reinforcement learning (IRL) to leverage expert demonstrations.

Methods: Employing the Simulation Open Framework Architecture (SOFA), this study established a simulation-based training and evaluation environment for MT navigation. We used IRL to infer reward functions from expert behaviour when navigating a guidewire and catheter. We utilized the soft actor-critic algorithm to train models with various reward functions and compared their performance in silico.
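To make the idea of inferring a reward from expert behaviour concrete, the following is a minimal sketch of one classical IRL step: feature-expectation matching with a linear reward. The abstract does not state which IRL algorithm or state features the authors used, so the two features here (progress toward the target, vessel-wall contact) and all function names are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

# A trajectory is a list of feature vectors, one per time step.
# The feature layout (progress, wall contact) is an assumption for this sketch.
Trajectory = List[Sequence[float]]


def feature_expectation(trajectories: List[Trajectory], gamma: float = 0.99) -> List[float]:
    """Average discounted sum of feature vectors over a set of trajectories."""
    dim = len(trajectories[0][0])
    totals = [0.0] * dim
    for traj in trajectories:
        for t, phi in enumerate(traj):
            discount = gamma ** t
            for i in range(dim):
                totals[i] += discount * phi[i]
    return [v / len(trajectories) for v in totals]


def infer_reward_weights(expert: List[Trajectory], learner: List[Trajectory],
                         gamma: float = 0.99) -> List[float]:
    """Weights pointing from the learner's feature expectation to the expert's.

    This is a single update of a feature-matching scheme, not a full IRL loop.
    """
    mu_expert = feature_expectation(expert, gamma)
    mu_learner = feature_expectation(learner, gamma)
    diff = [e - l for e, l in zip(mu_expert, mu_learner)]
    norm = sum(d * d for d in diff) ** 0.5
    if norm == 0.0:
        return diff  # learner already matches the expert on these features
    return [d / norm for d in diff]


def linear_reward(weights: Sequence[float], phi: Sequence[float]) -> float:
    """Reward of one state under the inferred linear reward model."""
    return sum(w * f for w, f in zip(weights, phi))


# Hypothetical demonstrations: the expert advances without wall contact,
# the untrained learner makes no progress and touches the wall.
expert_demos = [[(1.0, 0.0), (1.0, 0.0)]]
learner_rollouts = [[(0.0, 1.0), (0.0, 1.0)]]
weights = infer_reward_weights(expert_demos, learner_rollouts, gamma=1.0)
```

In a full pipeline the inferred reward would be handed to an RL algorithm such as soft actor-critic, the new policy's rollouts would replace `learner_rollouts`, and the step would repeat until the feature expectations are close.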

Results: We demonstrated feasibility of navigation using IRL. When evaluating single- versus dual-device (i.e. guidewire versus catheter and guidewire) tracking, both methods achieved high success rates of 95% and 96%, respectively. Dual tracking, however, utilized both devices mimicking an expert. A success rate of 100% and procedure time of 22.6 s were obtained when training with a reward function obtained through 'reward shaping'. This outperformed a dense reward function (96%, 24.9 s) and an IRL-derived reward function (48%, 59.2 s).
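The abstract does not describe how the 'reward shaping' function was built. As an illustration of the general technique only, the sketch below uses potential-based shaping, where a potential term is added to a base reward. The potential (negative distance to the target in millimetres) and the function names are assumptions, not the authors' formulation.

```python
def potential(distance_to_target_mm: float) -> float:
    """Hypothetical potential: states closer to the target have higher potential."""
    return -distance_to_target_mm


def shaped_reward(base_reward: float, dist_before_mm: float, dist_after_mm: float,
                  gamma: float = 0.99) -> float:
    """Potential-based shaping: r' = r + gamma * Phi(s') - Phi(s)."""
    return base_reward + gamma * potential(dist_after_mm) - potential(dist_before_mm)


# With gamma = 1, advancing 2 mm toward the target adds +2 to the base reward,
# and retreating 2 mm subtracts 2.
advance = shaped_reward(0.0, dist_before_mm=10.0, dist_after_mm=8.0, gamma=1.0)
retreat = shaped_reward(0.0, dist_before_mm=8.0, dist_after_mm=10.0, gamma=1.0)
```

Shaping of this form gives the agent feedback at every step while leaving the optimal policy of the base reward unchanged, which is one common reason a shaped reward can train faster than a purely learned one.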

Conclusions: We have contributed to the advancement of autonomous endovascular intervention navigation, particularly MT, by effectively employing IRL based on demonstrator expertise. The results underscore the potential of using reward shaping to efficiently train models, offering a promising avenue for enhancing the accessibility and precision of MT procedures. We envisage that future research can extend our methodology to diverse anatomical structures to enhance generalizability.

Source journal

International Journal of Computer Assisted Radiology and Surgery (ENGINEERING, BIOMEDICAL; RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING)

CiteScore: 5.90
Self-citation rate: 6.70%
Articles published: 243
Review time: 6-12 weeks

About the journal: The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.