{"title":"外科手术机器人任务自动化的视觉运动策略学习","authors":"Junhui Huang;Qingxin Shi;Dongsheng Xie;Yiming Ma;Xiaoming Liu;Changsheng Li;Xingguang Duan","doi":"10.1109/TMRB.2024.3464090","DOIUrl":null,"url":null,"abstract":"With the increasing adoption of robotic surgery systems, the need for automated surgical tasks has become more pressing. Recent learning-based approaches provide solutions to surgical automation but typically rely on low-dimensional observations. To further imitate the actions of surgeons in an end-to-end paradigm, this paper introduces a novel visual-based approach to automating surgical tasks using generative imitation learning for robotic systems. We develop a hybrid model integrating state space models transformer, and conditional variational autoencoders (CVAE) to enhance performance and generalization called ACMT. The proposed model, leveraging the Mamba block and multi-head cross-attention mechanisms for sequential modeling, achieves a 75-100% success rate with just 100 demonstrations for most of the tasks. This work significantly advances data-driven automation in surgical robotics, aiming to alleviate the burden on surgeons and improve surgical outcomes.","PeriodicalId":73318,"journal":{"name":"IEEE transactions on medical robotics and bionics","volume":null,"pages":null},"PeriodicalIF":3.4000,"publicationDate":"2024-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Visuomotor Policy Learning for Task Automation of Surgical Robot\",\"authors\":\"Junhui Huang;Qingxin Shi;Dongsheng Xie;Yiming Ma;Xiaoming Liu;Changsheng Li;Xingguang Duan\",\"doi\":\"10.1109/TMRB.2024.3464090\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the increasing adoption of robotic surgery systems, the need for automated surgical tasks has become more pressing. 
Recent learning-based approaches provide solutions to surgical automation but typically rely on low-dimensional observations. To further imitate the actions of surgeons in an end-to-end paradigm, this paper introduces a novel visual-based approach to automating surgical tasks using generative imitation learning for robotic systems. We develop a hybrid model integrating state space models transformer, and conditional variational autoencoders (CVAE) to enhance performance and generalization called ACMT. The proposed model, leveraging the Mamba block and multi-head cross-attention mechanisms for sequential modeling, achieves a 75-100% success rate with just 100 demonstrations for most of the tasks. This work significantly advances data-driven automation in surgical robotics, aiming to alleviate the burden on surgeons and improve surgical outcomes.\",\"PeriodicalId\":73318,\"journal\":{\"name\":\"IEEE transactions on medical robotics and bionics\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2024-09-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on medical robotics and bionics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10685114/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, BIOMEDICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on medical robotics and 
bionics","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10685114/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}
Visuomotor Policy Learning for Task Automation of Surgical Robot
With the increasing adoption of robotic surgery systems, the need for automated surgical tasks has become more pressing. Recent learning-based approaches offer solutions for surgical automation but typically rely on low-dimensional observations. To imitate surgeons' actions in an end-to-end paradigm, this paper introduces a novel vision-based approach to automating surgical tasks using generative imitation learning for robotic systems. We develop a hybrid model, termed ACMT, that integrates state space models, transformers, and a conditional variational autoencoder (CVAE) to improve performance and generalization. Leveraging the Mamba block and multi-head cross-attention mechanisms for sequential modeling, the proposed model achieves a 75-100% success rate with just 100 demonstrations on most tasks. This work advances data-driven automation in surgical robotics, aiming to alleviate the burden on surgeons and improve surgical outcomes.
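The abstract names multi-head cross-attention as one of the mechanisms used for sequential modeling, with visual observations conditioning the predicted actions. The following is a minimal, framework-free sketch of that generic mechanism only, not of the paper's ACMT model: random matrices stand in for learned parameters, and the shapes (an action chunk of queries attending to visual-feature tokens) are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_cross_attention(queries, context, num_heads, rng):
    """Toy multi-head cross-attention: query tokens attend to a context
    sequence. Projection weights are random here; a trained model would
    learn them."""
    d_model = queries.shape[-1]
    assert d_model % num_heads == 0
    d_head = d_model // num_heads
    # Random projections standing in for learned Wq, Wk, Wv, Wo.
    Wq, Wk, Wv, Wo = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
                      for _ in range(4))

    def split(x):  # (T, d_model) -> (num_heads, T, d_head)
        return x.reshape(x.shape[0], num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(queries @ Wq), split(context @ Wk), split(context @ Wv)
    # Scaled dot-product attention per head: (heads, Tq, Tk).
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head), axis=-1)
    # Weighted sum of values, heads merged back: (Tq, d_model).
    out = (attn @ v).transpose(1, 0, 2).reshape(queries.shape[0], d_model)
    return out @ Wo

rng = np.random.default_rng(0)
action_queries = rng.standard_normal((10, 32))  # e.g. a 10-step action chunk
visual_tokens = rng.standard_normal((50, 32))   # e.g. 50 image-feature tokens
out = multi_head_cross_attention(action_queries, visual_tokens, num_heads=4, rng=rng)
print(out.shape)  # (10, 32): one fused feature per action step
```

Each action-step query produces a convex combination of the visual value vectors per head, which is how image features condition the action sequence in architectures of this kind.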