Surgical embodied intelligence for generalized task autonomy in laparoscopic robot-assisted surgery
Yonghao Long, Anran Lin, Derek Hang Chun Kwok, Lin Zhang, Zhenya Yang, Kejian Shi, Lei Song, Jiawei Fu, Hongbin Lin, Wang Wei, Kai Chen, Xiangyu Chu, Yang Hu, Hon Chi Yip, Philip Wai Yan Chiu, Peter Kazanzides, Russell H Taylor, Yunhui Liu, Zihan Chen, Zerui Wang, Samuel Kwok Wai Au, Qi Dou
Science Robotics, article eadt3093. Published 2025-07-16. DOI: 10.1126/scirobotics.adt3093 (https://doi.org/10.1126/scirobotics.adt3093)
Abstract
Surgical robots capable of autonomously performing various tasks could enhance efficiency and augment human productivity in addressing clinical needs. Although current solutions have automated specific actions within defined contexts, they are difficult to generalize across the diverse environments of general surgery. Embodied intelligence enables general-purpose robot learning for daily tasks, yet its application in the medical domain remains limited. We introduced an open-source surgical embodied intelligence simulator that provides an interactive environment for developing reinforcement learning methods for minimally invasive surgical robots. Using such embodied artificial intelligence, this study further addresses surgical task automation, enabling zero-shot transfer of simulation-trained policies to real-world scenarios. The proposed method comprises visual parsing, a perceptual regressor, policy learning, and a visual servoing controller, forming a paradigm that combines the advantages of a data-driven policy and a classic controller. The visual parsing uses stereo depth estimation and image segmentation with a visual foundation model to handle complex scenes. Experiments demonstrated autonomy in seven game-based skill training tasks on the da Vinci Research Kit, with a proof-of-concept study on haptic-assisted skill training as a practical application. Moreover, we automated five surgical assistive tasks with the Sentire surgical system on ex vivo animal tissues across varied scenes, object sizes, instrument types, and illuminations. The learned policies were also validated in a live-animal trial for three tasks in dynamic in vivo surgical environments. We hope this open-source infrastructure, coupled with a general-purpose learning paradigm, will inspire and facilitate future research on embodied intelligence toward autonomous surgical robots.
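As a rough illustration of two components named in the abstract, stereo depth estimation and a visual servoing controller, the sketch below uses the standard rectified-stereo relation Z = f·B/d and a clipped proportional position step. It is not the authors' implementation: all function names, the gain, the step limit, and the camera parameters are hypothetical, and the perceptual regressor, the segmentation model, and the learned policy are not represented.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Rectified-stereo relation Z = f * B / d; non-positive disparity maps to inf."""
    d = np.asarray(disparity_px, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(d > 0, focal_px * baseline_m / d, np.inf)

def backproject(u, v, depth_m, focal_px, cx, cy):
    """Lift pixel (u, v) with known depth to a 3D point in the camera frame."""
    x = (u - cx) * depth_m / focal_px
    y = (v - cy) * depth_m / focal_px
    return np.array([x, y, depth_m])

def servo_step(current_xyz, target_xyz, gain=0.5, max_step_m=0.002):
    """One proportional servoing step toward the target, clipped to max_step_m."""
    error = np.asarray(target_xyz, dtype=float) - np.asarray(current_xyz, dtype=float)
    step = gain * error
    norm = np.linalg.norm(step)
    if norm > max_step_m:
        step = step * (max_step_m / norm)
    return step

if __name__ == "__main__":
    # Hypothetical camera: 1000 px focal length, 5 mm baseline, principal point (500, 400).
    z = float(depth_from_disparity(10.0, focal_px=1000.0, baseline_m=0.005))
    target = backproject(600, 400, z, focal_px=1000.0, cx=500.0, cy=400.0)
    print("target point (m):", target)
    print("first step (m):", servo_step([0.0, 0.0, 0.0], target))
```

Clipping the step keeps each commanded motion small regardless of how far the target is. That is a common safety choice in servoing loops and is assumed here, not taken from the paper.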
About the journal:
Science Robotics publishes original, peer-reviewed, science- or engineering-based research articles that advance the field of robotics. The journal also features editor-commissioned Reviews. An international team of academic editors holds Science Robotics articles to the same high-quality standard that is the hallmark of the Science family of journals.
Sub-topics include: actuators, advanced materials, artificial intelligence, autonomous vehicles, bio-inspired design, exoskeletons, fabrication, field robotics, human-robot interaction, humanoids, industrial robotics, kinematics, machine learning, material science, medical technology, motion planning and control, micro- and nano-robotics, multi-robot control, sensors, service robotics, social and ethical issues, soft robotics, and space, planetary and undersea exploration.