Vishwanatha M. Rao, Michael Hla, Michael Moor, Subathra Adithan, Stephen Kwak, Eric J. Topol, Pranav Rajpurkar
{"title":"医学图像判读的多模态生成AI","authors":"Vishwanatha M. Rao, Michael Hla, Michael Moor, Subathra Adithan, Stephen Kwak, Eric J. Topol, Pranav Rajpurkar","doi":"10.1038/s41586-025-08675-y","DOIUrl":null,"url":null,"abstract":"Accurately interpreting medical images and generating insightful narrative reports is indispensable for patient care but places heavy burdens on clinical experts. Advances in artificial intelligence (AI), especially in an area that we refer to as multimodal generative medical image interpretation (GenMI), create opportunities to automate parts of this complex process. In this Perspective, we synthesize progress and challenges in developing AI systems for generation of medical reports from images. We focus extensively on radiology as a domain with enormous reporting needs and research efforts. In addition to analysing the strengths and applications of new models for medical report generation, we advocate for a novel paradigm to deploy GenMI in a manner that empowers clinicians and their patients. Initial research suggests that GenMI could one day match human expert performance in generating reports across disciplines, such as radiology, pathology and dermatology. However, formidable obstacles remain in validating model accuracy, ensuring transparency and eliciting nuanced impressions. If carefully implemented, GenMI could meaningfully assist clinicians in improving quality of care, enhancing medical education, reducing workloads, expanding specialty access and providing real-time expertise. Overall, we highlight opportunities alongside key challenges for developing multimodal generative AI that complements human experts for reliable medical report writing. This Perspective describes how recent advances in artificial intelligence could be used to automate medical image interpretation to complement human expertise and empower physicians and patients.","PeriodicalId":18787,"journal":{"name":"Nature","volume":"639 8056","pages":"888-896"},"PeriodicalIF":48.5000,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multimodal generative AI for medical image interpretation\",\"authors\":\"Vishwanatha M. Rao, Michael Hla, Michael Moor, Subathra Adithan, Stephen Kwak, Eric J. Topol, Pranav Rajpurkar\",\"doi\":\"10.1038/s41586-025-08675-y\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Accurately interpreting medical images and generating insightful narrative reports is indispensable for patient care but places heavy burdens on clinical experts. Advances in artificial intelligence (AI), especially in an area that we refer to as multimodal generative medical image interpretation (GenMI), create opportunities to automate parts of this complex process. In this Perspective, we synthesize progress and challenges in developing AI systems for generation of medical reports from images. We focus extensively on radiology as a domain with enormous reporting needs and research efforts. In addition to analysing the strengths and applications of new models for medical report generation, we advocate for a novel paradigm to deploy GenMI in a manner that empowers clinicians and their patients. Initial research suggests that GenMI could one day match human expert performance in generating reports across disciplines, such as radiology, pathology and dermatology. However, formidable obstacles remain in validating model accuracy, ensuring transparency and eliciting nuanced impressions. If carefully implemented, GenMI could meaningfully assist clinicians in improving quality of care, enhancing medical education, reducing workloads, expanding specialty access and providing real-time expertise. Overall, we highlight opportunities alongside key challenges for developing multimodal generative AI that complements human experts for reliable medical report writing. This Perspective describes how recent advances in artificial intelligence could be used to automate medical image interpretation to complement human expertise and empower physicians and patients.\",\"PeriodicalId\":18787,\"journal\":{\"name\":\"Nature\",\"volume\":\"639 8056\",\"pages\":\"888-896\"},\"PeriodicalIF\":48.5000,\"publicationDate\":\"2025-03-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Nature\",\"FirstCategoryId\":\"103\",\"ListUrlMain\":\"https://www.nature.com/articles/s41586-025-08675-y\",\"RegionNum\":1,\"RegionCategory\":\"综合性期刊\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MULTIDISCIPLINARY SCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nature","FirstCategoryId":"103","ListUrlMain":"https://www.nature.com/articles/s41586-025-08675-y","RegionNum":1,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
Multimodal generative AI for medical image interpretation
Accurately interpreting medical images and generating insightful narrative reports is indispensable for patient care but places heavy burdens on clinical experts. Advances in artificial intelligence (AI), especially in an area that we refer to as multimodal generative medical image interpretation (GenMI), create opportunities to automate parts of this complex process. In this Perspective, we synthesize progress and challenges in developing AI systems for generation of medical reports from images. We focus extensively on radiology as a domain with enormous reporting needs and research efforts. In addition to analysing the strengths and applications of new models for medical report generation, we advocate for a novel paradigm to deploy GenMI in a manner that empowers clinicians and their patients. Initial research suggests that GenMI could one day match human expert performance in generating reports across disciplines, such as radiology, pathology and dermatology. However, formidable obstacles remain in validating model accuracy, ensuring transparency and eliciting nuanced impressions. If carefully implemented, GenMI could meaningfully assist clinicians in improving quality of care, enhancing medical education, reducing workloads, expanding specialty access and providing real-time expertise. Overall, we highlight opportunities alongside key challenges for developing multimodal generative AI that complements human experts for reliable medical report writing. This Perspective describes how recent advances in artificial intelligence could be used to automate medical image interpretation to complement human expertise and empower physicians and patients.
期刊介绍:
Nature is a prestigious international journal that publishes peer-reviewed research in various scientific and technological fields. The selection of articles is based on criteria such as originality, importance, interdisciplinary relevance, timeliness, accessibility, elegance, and surprising conclusions. In addition to showcasing significant scientific advances, Nature delivers rapid, authoritative, insightful news, and interpretation of current and upcoming trends impacting science, scientists, and the broader public. The journal serves a dual purpose: firstly, to promptly share noteworthy scientific advances and foster discussions among scientists, and secondly, to ensure the swift dissemination of scientific results globally, emphasizing their significance for knowledge, culture, and daily life.