From large language models to multimodal AI: a scoping review on the potential of generative AI in medicine.

IF 2.8; Region 4 (Medicine); JCR Q2, ENGINEERING, BIOMEDICAL

Biomedical Engineering Letters. Pub Date: 2025-08-22; eCollection Date: 2025-09-01; DOI: 10.1007/s13534-025-00497-1

Lukas Buess, Matthias Keicher, Nassir Navab, Andreas Maier, Soroosh Tayebi Arasteh

Volume 15, Issue 5, pages 845-863. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12411359/pdf/

Citations: 0


Abstract

Generative artificial intelligence (AI) models, such as diffusion models and OpenAI's ChatGPT, are transforming medicine by enhancing diagnostic accuracy and automating clinical workflows. The field has advanced rapidly, evolving from text-only large language models for tasks such as clinical documentation and decision support to multimodal AI systems capable of integrating diverse data modalities, including imaging, text, and structured data, within a single model. The diverse landscape of these technologies, along with rising interest, highlights the need for a comprehensive review of their applications and potential. This scoping review explores the evolution of multimodal AI, highlighting its methods, applications, datasets, and evaluation in clinical settings. Adhering to PRISMA-ScR guidelines, we systematically queried PubMed, IEEE Xplore, and Web of Science, prioritizing recent studies published up to the end of 2024. After rigorous screening, 145 papers were included, revealing key trends and challenges in this dynamic field. Our findings underscore a shift from unimodal to multimodal approaches, driving innovations in diagnostic support, medical report generation, drug discovery, and conversational AI. However, critical challenges remain, including the integration of heterogeneous data types, improving model interpretability, addressing ethical concerns, and validating AI systems in real-world clinical settings. This review summarizes the current state of the art, identifies critical gaps, and provides insights to guide the development of scalable, trustworthy, and clinically impactful multimodal AI solutions in healthcare.

Supplementary information: The online version contains supplementary material available at 10.1007/s13534-025-00497-1.


Source journal

Biomedical Engineering Letters (ENGINEERING, BIOMEDICAL)

CiteScore: 6.80

Self-citation rate: 0.00%

Articles published:

Journal description: Biomedical Engineering Letters (BMEL) aims to present innovative experimental science and technological developments in the biomedical field, as well as clinical applications of new developments. Each article must contain original biomedical engineering content, defined as the development, theoretical analysis, and evaluation/validation of a new technique. BMEL publishes the following types of papers: original articles, review articles, editorials, and letters to the editor. All papers undergo single-blind review.