{"title":"通过大视觉模型创新机器人辅助手术","authors":"Zhe Min \n (, ), Jiewen Lai \n (, ), Hongliang Ren \n (, )","doi":"10.1038/s44287-025-00166-6","DOIUrl":null,"url":null,"abstract":"The rapid development of generative artificial intelligence and large models, including large vision models (LVMs), has accelerated their wide applications in medicine. Robot-assisted surgery (RAS) or surgical robotics, in which vision has a vital role, typically combines medical images for diagnostic or navigation abilities with robots with precise operative capabilities. In this context, LVMs could serve as a revolutionary paradigm towards surgical autonomy, accomplishing surgical representations with high fidelity and physical intelligence and enabling high-quality data use and long-term learning. In this Perspective, vision-related tasks in RAS are divided into fundamental upstream tasks and advanced downstream counterparts, elucidating their shared technical foundations with state-of-the-art research that could catalyse a paradigm shift in surgical robotics research for the next decade. LVMs have already been extensively explored to tackle upstream tasks in RAS, exhibiting promising performances. Developing vision foundation models for downstream RAS tasks, which is based on upstream counterparts but necessitates further investigations, will directly enhance surgical autonomy. Here, we outline research trends that could accelerate this paradigm shift and highlight major challenges that could impede progress in the way to the ultimate transformation from ‘surgical robots’ to ‘robotic surgeons’. Robot-assisted surgery relies heavily on vision and generally integrates medical imaging for diagnostic and/or navigation purposes with robots that offer accurate surgical functions. This Perspective discusses how large vision models can enhance vision-related tasks in robot-assisted surgery transforming ‘surgical robots’ into ‘robotic surgeons’.","PeriodicalId":501701,"journal":{"name":"Nature Reviews Electrical Engineering","volume":"2 5","pages":"350-363"},"PeriodicalIF":0.0000,"publicationDate":"2025-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Innovating robot-assisted surgery through large vision models\",\"authors\":\"Zhe Min \\n (, ), Jiewen Lai \\n (, ), Hongliang Ren \\n (, )\",\"doi\":\"10.1038/s44287-025-00166-6\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The rapid development of generative artificial intelligence and large models, including large vision models (LVMs), has accelerated their wide applications in medicine. Robot-assisted surgery (RAS) or surgical robotics, in which vision has a vital role, typically combines medical images for diagnostic or navigation abilities with robots with precise operative capabilities. In this context, LVMs could serve as a revolutionary paradigm towards surgical autonomy, accomplishing surgical representations with high fidelity and physical intelligence and enabling high-quality data use and long-term learning. In this Perspective, vision-related tasks in RAS are divided into fundamental upstream tasks and advanced downstream counterparts, elucidating their shared technical foundations with state-of-the-art research that could catalyse a paradigm shift in surgical robotics research for the next decade. LVMs have already been extensively explored to tackle upstream tasks in RAS, exhibiting promising performances. 
Developing vision foundation models for downstream RAS tasks, which is based on upstream counterparts but necessitates further investigations, will directly enhance surgical autonomy. Here, we outline research trends that could accelerate this paradigm shift and highlight major challenges that could impede progress in the way to the ultimate transformation from ‘surgical robots’ to ‘robotic surgeons’. Robot-assisted surgery relies heavily on vision and generally integrates medical imaging for diagnostic and/or navigation purposes with robots that offer accurate surgical functions. This Perspective discusses how large vision models can enhance vision-related tasks in robot-assisted surgery transforming ‘surgical robots’ into ‘robotic surgeons’.\",\"PeriodicalId\":501701,\"journal\":{\"name\":\"Nature Reviews Electrical Engineering\",\"volume\":\"2 5\",\"pages\":\"350-363\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-05-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Nature Reviews Electrical Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.nature.com/articles/s44287-025-00166-6\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nature Reviews Electrical Engineering","FirstCategoryId":"1085","ListUrlMain":"https://www.nature.com/articles/s44287-025-00166-6","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Innovating robot-assisted surgery through large vision models
The rapid development of generative artificial intelligence and large models, including large vision models (LVMs), has accelerated their wide application in medicine. Robot-assisted surgery (RAS), or surgical robotics, in which vision has a vital role, typically combines medical imaging, which provides diagnostic or navigation capabilities, with robots that offer precise operative capabilities. In this context, LVMs could serve as a revolutionary paradigm for surgical autonomy, producing surgical representations with high fidelity and physical intelligence and enabling high-quality data use and long-term learning. In this Perspective, we divide vision-related tasks in RAS into fundamental upstream tasks and advanced downstream counterparts, and we elucidate their shared technical foundations alongside state-of-the-art research that could catalyse a paradigm shift in surgical robotics research over the next decade. LVMs have already been extensively explored to tackle upstream tasks in RAS, exhibiting promising performance. Developing vision foundation models for downstream RAS tasks, which builds on the upstream counterparts but necessitates further investigation, will directly enhance surgical autonomy. Here, we outline research trends that could accelerate this paradigm shift and highlight major challenges that could impede progress on the way to the ultimate transformation from ‘surgical robots’ to ‘robotic surgeons’.

Robot-assisted surgery relies heavily on vision and generally integrates medical imaging, used for diagnostic and/or navigation purposes, with robots that offer accurate surgical functions. This Perspective discusses how large vision models can enhance vision-related tasks in robot-assisted surgery, transforming ‘surgical robots’ into ‘robotic surgeons’.
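To make the notion of an upstream vision task concrete, the sketch below shows how a promptable vision foundation model such as the Segment Anything Model (SAM) could be queried for zero-shot instrument segmentation on a single endoscopic frame. This is a minimal illustration only: SAM is one representative LVM rather than a model prescribed by the Perspective, and the checkpoint path, frame path and click coordinates are hypothetical placeholders.

```python
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

# Placeholder paths and prompt coordinates, for illustration only.
CHECKPOINT = "sam_vit_b_01ec64.pth"   # pretrained SAM weights (ViT-B variant)
FRAME_PATH = "endoscope_frame.png"    # a single endoscopic video frame

# Load the promptable segmentation model (an LVM pretrained on natural images).
sam = sam_model_registry["vit_b"](checkpoint=CHECKPOINT)
predictor = SamPredictor(sam)

# SAM's predictor expects an HxWx3 uint8 image in RGB order.
frame_bgr = cv2.imread(FRAME_PATH)
frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
predictor.set_image(frame_rgb)

# Prompt with a single foreground click on the instrument
# (hypothetical pixel coordinate; label 1 marks foreground).
point_coords = np.array([[320, 240]])
point_labels = np.array([1])

masks, scores, _ = predictor.predict(
    point_coords=point_coords,
    point_labels=point_labels,
    multimask_output=True,
)

# Keep the highest-scoring candidate mask as the instrument segmentation.
best_mask = masks[np.argmax(scores)]
print("Mask shape:", best_mask.shape, "predicted IoU:", scores.max())
```

In a full RAS pipeline, the point prompt would more plausibly come from an upstream detector, a surgeon's interface or a language-conditioned planner rather than being hard-coded, and the resulting mask would feed downstream tasks such as tool tracking or visual servoing.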