Multimodal Large Language Models in Medical Imaging: Current State and Future Directions.

Impact Factor 5.3 · CAS Tier 2 (Medicine) · JCR Q1, RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
Yoojin Nam, Dong Yeong Kim, Sunggu Kyung, Jinyoung Seo, Jeong Min Song, Jimin Kwon, Jihyun Kim, Wooyoung Jo, Hyungbin Park, Jimin Sung, Sangah Park, Heeyeon Kwon, Taehee Kwon, Kanghyun Kim, Namkug Kim
{"title":"Multimodal Large Language Models in Medical Imaging: Current State and Future Directions.","authors":"Yoojin Nam, Dong Yeong Kim, Sunggu Kyung, Jinyoung Seo, Jeong Min Song, Jimin Kwon, Jihyun Kim, Wooyoung Jo, Hyungbin Park, Jimin Sung, Sangah Park, Heeyeon Kwon, Taehee Kwon, Kanghyun Kim, Namkug Kim","doi":"10.3348/kjr.2025.0599","DOIUrl":null,"url":null,"abstract":"<p><p>Multimodal large language models (MLLMs) are emerging as powerful tools in medicine, particularly in radiology, with the potential to serve as trusted artificial intelligence (AI) partners for clinicians. In radiology, these models integrate large language models (LLMs) with diverse multimodal data sources by combining clinical information and text with radiologic images of various modalities, ranging from 2D chest X-rays to 3D CT/MRI. Methods for achieving this multimodal integration are rapidly evolving, and the high performance of freely available LLMs may further accelerate MLLM development. Current applications of MLLMs now span automatic generation of preliminary radiology report, visual question answering, and interactive diagnostic support. Despite these promising capabilities, several significant challenges hinder widespread clinical adoption. MLLMs require access to large-scale, high-quality multimodal datasets, which are scarce in the medical domain. Risks of hallucinated findings, lack of transparency in decision-making processes, and high computational demands further complicate implementation. This review summarizes the current capabilities and limitations of MLLMs in medicine-particularly in radiology-and outlines key directions for future research. Critical areas include incorporating region-grounded reasoning to link model outputs to specific image regions, developing robust foundation models pre-trained on large-scale medical datasets, and establishing strategies for the safe and effective integration of MLLMs into clinical practice.</p>","PeriodicalId":17881,"journal":{"name":"Korean Journal of Radiology","volume":"26 10","pages":"900-923"},"PeriodicalIF":5.3000,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12479233/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Korean Journal of Radiology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.3348/kjr.2025.0599","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
Citations: 0

Abstract

Multimodal large language models (MLLMs) are emerging as powerful tools in medicine, particularly in radiology, with the potential to serve as trusted artificial intelligence (AI) partners for clinicians. In radiology, these models integrate large language models (LLMs) with diverse multimodal data sources by combining clinical information and text with radiologic images of various modalities, ranging from 2D chest X-rays to 3D CT/MRI. Methods for achieving this multimodal integration are rapidly evolving, and the high performance of freely available LLMs may further accelerate MLLM development. Current applications of MLLMs span automatic generation of preliminary radiology reports, visual question answering, and interactive diagnostic support. Despite these promising capabilities, several significant challenges hinder widespread clinical adoption. MLLMs require access to large-scale, high-quality multimodal datasets, which are scarce in the medical domain. Risks of hallucinated findings, lack of transparency in decision-making processes, and high computational demands further complicate implementation. This review summarizes the current capabilities and limitations of MLLMs in medicine, particularly in radiology, and outlines key directions for future research. Critical areas include incorporating region-grounded reasoning to link model outputs to specific image regions, developing robust foundation models pre-trained on large-scale medical datasets, and establishing strategies for the safe and effective integration of MLLMs into clinical practice.
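As a concrete illustration of the image-plus-text prompting pattern the abstract describes (visual question answering and preliminary report drafting from a radiograph), the following minimal sketch pairs an image with a clinical question using an openly available, general-purpose vision-language model through the Hugging Face Transformers library. The checkpoint name, local image path, and prompt wording are illustrative assumptions only; this is not the method of the reviewed work, and the model shown is not medically trained or validated.

```python
# Minimal sketch: multimodal (image + text) prompting with an open
# vision-language model via Hugging Face Transformers. Illustrative only;
# the checkpoint and input file below are assumptions, not a clinical tool.
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # example general-purpose MLLM checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

image = Image.open("chest_xray.png")  # hypothetical local 2D chest radiograph
prompt = (
    "USER: <image>\n"
    "Describe any abnormal findings in this chest radiograph and give a "
    "brief preliminary impression. ASSISTANT:"
)

# Encode the image and prompt together, then generate a free-text answer.
inputs = processor(images=image, text=prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=200)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```

The same pattern generalizes: swapping the question for a structured reporting template approximates preliminary report generation, while domain-adapted medical MLLMs would replace the general-purpose checkpoint in practice.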

Source journal: Korean Journal of Radiology (Medicine; Nuclear Medicine)
CiteScore: 10.60
Self-citation rate: 12.50%
Annual publication volume: 141 articles
Average review time: 1.3 months
About the journal: The inaugural issue of the Korean Journal of Radiology came out in March 2000. The journal aims to produce and propagate knowledge on radiologic imaging and related sciences. A unique feature of the articles published in the journal is their reflection of global trends in radiology combined with an East-Asian perspective. Geographic differences in disease prevalence are reflected in the contents of papers, and this serves to enrich the body of knowledge. Outstanding radiologists from many countries around the world serve on the journal's editorial board.