{"title":"通过大视觉模型创新机器人辅助手术","authors":"Zhe Min \n (, ), Jiewen Lai \n (, ), Hongliang Ren \n (, )","doi":"10.1038/s44287-025-00166-6","DOIUrl":null,"url":null,"abstract":"The rapid development of generative artificial intelligence and large models, including large vision models (LVMs), has accelerated their wide applications in medicine. Robot-assisted surgery (RAS) or surgical robotics, in which vision has a vital role, typically combines medical images for diagnostic or navigation abilities with robots with precise operative capabilities. In this context, LVMs could serve as a revolutionary paradigm towards surgical autonomy, accomplishing surgical representations with high fidelity and physical intelligence and enabling high-quality data use and long-term learning. In this Perspective, vision-related tasks in RAS are divided into fundamental upstream tasks and advanced downstream counterparts, elucidating their shared technical foundations with state-of-the-art research that could catalyse a paradigm shift in surgical robotics research for the next decade. LVMs have already been extensively explored to tackle upstream tasks in RAS, exhibiting promising performances. Developing vision foundation models for downstream RAS tasks, which is based on upstream counterparts but necessitates further investigations, will directly enhance surgical autonomy. Here, we outline research trends that could accelerate this paradigm shift and highlight major challenges that could impede progress in the way to the ultimate transformation from ‘surgical robots’ to ‘robotic surgeons’. Robot-assisted surgery relies heavily on vision and generally integrates medical imaging for diagnostic and/or navigation purposes with robots that offer accurate surgical functions. This Perspective discusses how large vision models can enhance vision-related tasks in robot-assisted surgery transforming ‘surgical robots’ into ‘robotic surgeons’.","PeriodicalId":501701,"journal":{"name":"Nature Reviews Electrical Engineering","volume":"2 5","pages":"350-363"},"PeriodicalIF":0.0000,"publicationDate":"2025-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Innovating robot-assisted surgery through large vision models\",\"authors\":\"Zhe Min \\n (, ), Jiewen Lai \\n (, ), Hongliang Ren \\n (, )\",\"doi\":\"10.1038/s44287-025-00166-6\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The rapid development of generative artificial intelligence and large models, including large vision models (LVMs), has accelerated their wide applications in medicine. Robot-assisted surgery (RAS) or surgical robotics, in which vision has a vital role, typically combines medical images for diagnostic or navigation abilities with robots with precise operative capabilities. In this context, LVMs could serve as a revolutionary paradigm towards surgical autonomy, accomplishing surgical representations with high fidelity and physical intelligence and enabling high-quality data use and long-term learning. In this Perspective, vision-related tasks in RAS are divided into fundamental upstream tasks and advanced downstream counterparts, elucidating their shared technical foundations with state-of-the-art research that could catalyse a paradigm shift in surgical robotics research for the next decade. LVMs have already been extensively explored to tackle upstream tasks in RAS, exhibiting promising performances. 
Developing vision foundation models for downstream RAS tasks, which is based on upstream counterparts but necessitates further investigations, will directly enhance surgical autonomy. Here, we outline research trends that could accelerate this paradigm shift and highlight major challenges that could impede progress in the way to the ultimate transformation from ‘surgical robots’ to ‘robotic surgeons’. Robot-assisted surgery relies heavily on vision and generally integrates medical imaging for diagnostic and/or navigation purposes with robots that offer accurate surgical functions. This Perspective discusses how large vision models can enhance vision-related tasks in robot-assisted surgery transforming ‘surgical robots’ into ‘robotic surgeons’.\",\"PeriodicalId\":501701,\"journal\":{\"name\":\"Nature Reviews Electrical Engineering\",\"volume\":\"2 5\",\"pages\":\"350-363\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-05-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Nature Reviews Electrical Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.nature.com/articles/s44287-025-00166-6\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nature Reviews Electrical Engineering","FirstCategoryId":"1085","ListUrlMain":"https://www.nature.com/articles/s44287-025-00166-6","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Innovating robot-assisted surgery through large vision models
The rapid development of generative artificial intelligence and large models, including large vision models (LVMs), has accelerated their wide application in medicine. Robot-assisted surgery (RAS), or surgical robotics, in which vision has a vital role, typically combines medical imaging, which provides diagnostic or navigation capabilities, with robots that offer precise operative capabilities. In this context, LVMs could serve as a revolutionary paradigm for surgical autonomy, producing surgical representations with high fidelity and physical intelligence and enabling high-quality data use and long-term learning. In this Perspective, we divide vision-related tasks in RAS into fundamental upstream tasks and advanced downstream counterparts, and we elucidate their shared technical foundations alongside state-of-the-art research that could catalyse a paradigm shift in surgical robotics research over the next decade. LVMs have already been extensively explored to tackle upstream tasks in RAS, exhibiting promising performance. Developing vision foundation models for downstream RAS tasks, which builds on the upstream counterparts but necessitates further investigation, will directly enhance surgical autonomy. Here, we outline research trends that could accelerate this paradigm shift and highlight major challenges that could impede progress on the way to the ultimate transformation from ‘surgical robots’ to ‘robotic surgeons’.

Robot-assisted surgery relies heavily on vision and generally integrates medical imaging, used for diagnostic and/or navigation purposes, with robots that offer accurate surgical functions. This Perspective discusses how large vision models can enhance vision-related tasks in robot-assisted surgery, transforming ‘surgical robots’ into ‘robotic surgeons’.
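To make the notion of an upstream vision task concrete, the sketch below shows how a promptable vision foundation model such as the Segment Anything Model (SAM) could be queried for zero-shot instrument segmentation on a single endoscopic frame. This is a minimal illustration only: SAM is one representative LVM rather than a model prescribed by the Perspective, and the checkpoint path, frame path and click coordinates are hypothetical placeholders.

```python
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

# Placeholder paths and prompt coordinates, for illustration only.
CHECKPOINT = "sam_vit_b_01ec64.pth"   # pretrained SAM weights (ViT-B variant)
FRAME_PATH = "endoscope_frame.png"    # a single endoscopic video frame

# Load the promptable segmentation model (an LVM pretrained on natural images).
sam = sam_model_registry["vit_b"](checkpoint=CHECKPOINT)
predictor = SamPredictor(sam)

# SAM's predictor expects an HxWx3 uint8 image in RGB order.
frame_bgr = cv2.imread(FRAME_PATH)
frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
predictor.set_image(frame_rgb)

# Prompt with a single foreground click on the instrument
# (hypothetical pixel coordinate; label 1 marks foreground).
point_coords = np.array([[320, 240]])
point_labels = np.array([1])

masks, scores, _ = predictor.predict(
    point_coords=point_coords,
    point_labels=point_labels,
    multimask_output=True,
)

# Keep the highest-scoring candidate mask as the instrument segmentation.
best_mask = masks[np.argmax(scores)]
print("Mask shape:", best_mask.shape, "predicted IoU:", scores.max())
```

In a full RAS pipeline, the point prompt would more plausibly come from an upstream detector, a surgeon's interface or a language-conditioned planner rather than being hard-coded, and the resulting mask would feed downstream tasks such as tool tracking or visual servoing.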