Harry Robertshaw, Lennart Karstensen, Benjamin Jackson, Alejandro Granados, Thomas C Booth
DOI: 10.1007/s11548-024-03208-w (https://doi.org/10.1007/s11548-024-03208-w)
Journal: International Journal of Computer Assisted Radiology and Surgery. Published 2024-08-01 (Epub 2024-06-17).
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7616368/pdf/
Autonomous navigation of catheters and guidewires in mechanical thrombectomy using inverse reinforcement learning.
Purpose: Autonomous navigation of catheters and guidewires can enhance endovascular surgery safety and efficacy, reducing procedure times and operator radiation exposure. Integrating tele-operated robotics could widen access to time-sensitive emergency procedures like mechanical thrombectomy (MT). Reinforcement learning (RL) shows potential in endovascular navigation, yet its application encounters challenges without a reward signal. This study explores the viability of autonomous guidewire navigation in MT vasculature using inverse reinforcement learning (IRL) to leverage expert demonstrations.
Methods: Employing the Simulation Open Framework Architecture (SOFA), this study established a simulation-based training and evaluation environment for MT navigation. We used IRL to infer reward functions from expert behaviour when navigating a guidewire and catheter. We utilized the soft actor-critic algorithm to train models with various reward functions and compared their performance in silico.
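The abstract does not state which IRL algorithm was used or how states were featurised. As a minimal illustration of the general idea of inferring a reward from expert behaviour, the sketch below assumes a linear reward over hand-chosen state features and computes the first weight estimate of the apprenticeship-learning projection method (expert feature expectations minus the current policy's). The function names and the toy feature vectors are hypothetical and are not taken from the study.

```python
def feature_expectation(trajectories, gamma=0.99):
    """Average discounted sum of feature vectors over a set of trajectories.

    Each trajectory is a list of per-step feature vectors (lists of floats).
    """
    dim = len(trajectories[0][0])
    mu = [0.0] * dim
    for traj in trajectories:
        for t, phi in enumerate(traj):
            for i, value in enumerate(phi):
                mu[i] += (gamma ** t) * value
    return [m / len(trajectories) for m in mu]


def reward_weights(expert_trajs, policy_trajs, gamma=0.99):
    """First projection-method estimate: w = mu_expert - mu_policy (assumed linear reward)."""
    mu_expert = feature_expectation(expert_trajs, gamma)
    mu_policy = feature_expectation(policy_trajs, gamma)
    return [e - p for e, p in zip(mu_expert, mu_policy)]


def linear_reward(weights, phi):
    """Reward of a state with feature vector phi under the inferred weights."""
    return sum(w * x for w, x in zip(weights, phi))


# Toy data (hypothetical): feature 0 = "moved toward target", feature 1 = "moved away".
expert = [[[1.0, 0.0], [1.0, 0.0]]]
novice = [[[0.0, 1.0], [0.0, 1.0]]]
weights = reward_weights(expert, novice, gamma=0.5)
```

In a full IRL loop, a policy would then be retrained on the inferred reward (the study used soft actor-critic for policy training) and the weights re-estimated until the policy's feature expectations approach the expert's.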
Results: We demonstrated feasibility of navigation using IRL. When evaluating single- versus dual-device (i.e. guidewire versus catheter and guidewire) tracking, both methods achieved high success rates of 95% and 96%, respectively. Dual-device tracking, however, used both devices, mimicking an expert. Training with a reward function obtained through 'reward shaping' gave a success rate of 100% and a procedure time of 22.6 s. This outperformed a dense reward function (96%, 24.9 s) and an IRL-derived reward function (48%, 59.2 s).
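The abstract does not describe how the shaped reward was constructed. For readers unfamiliar with the term, the sketch below shows one standard form, potential-based reward shaping, in which a base reward is augmented by the discounted change in a potential function. The choice of potential (negative Euclidean distance from device tip to target), the coordinates, and the function names are assumptions for illustration and should not be read as the authors' formulation.

```python
import math


def potential(tip, target):
    """Hypothetical potential: negative Euclidean distance from tip to target."""
    return -math.dist(tip, target)


def shaped_reward(base_reward, tip, next_tip, target, gamma=0.99):
    """Potential-based shaping: r' = r + gamma * phi(s') - phi(s)."""
    return base_reward + gamma * potential(next_tip, target) - potential(tip, target)


# Toy step (hypothetical coordinates): the tip advances 1 unit toward a target 3 units away.
step_reward = shaped_reward(0.0, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), gamma=1.0)
```

With this form, a step that brings the tip closer to the target earns a positive bonus and a step away earns a penalty, which gives the learner feedback at every step rather than only on reaching the target.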
Conclusions: We have contributed to the advancement of autonomous endovascular intervention navigation, particularly MT, by effectively employing IRL based on demonstrator expertise. The results underscore the potential of using reward shaping to efficiently train models, offering a promising avenue for enhancing the accessibility and precision of MT procedures. We envisage that future research can extend our methodology to diverse anatomical structures to enhance generalizability.
About the journal:
The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.