Navigating the landscape of multimodal AI in medicine: A scoping review on technical challenges and clinical applications

IF 11.8 | Zone 1 (Medicine) | JCR Q1: Computer Science, Artificial Intelligence

Medical Image Analysis | Pub Date: 2025-05-30 | DOI: 10.1016/j.media.2025.103621

Daan Schouten, Giulia Nicoletti, Bas Dille, Catherine Chia, Pierpaolo Vendittelli, Megan Schuurmans, Geert Litjens, Nadieh Khalili

Medical Image Analysis, Volume 105, Article 103621. Full text: https://www.sciencedirect.com/science/article/pii/S1361841525001689

Citations: 0

Navigating the landscape of multimodal AI in medicine: A scoping review on technical challenges and clinical applications

Recent technological advances in healthcare have led to unprecedented growth in patient data quantity and diversity. While artificial intelligence (AI) models have shown promising results in analyzing individual data modalities, there is increasing recognition that models integrating multiple complementary data sources, so-called multimodal AI, could enhance clinical decision-making. This scoping review examines the landscape of deep learning-based multimodal AI applications across the medical domain, analyzing 432 papers published between 2018 and 2024. We provide an extensive overview of multimodal AI development across different medical disciplines, examining various architectural approaches, fusion strategies, and common application areas. Our analysis reveals that multimodal AI models consistently outperform their unimodal counterparts, with an average improvement of 6.2 percentage points in AUC. However, several challenges persist, including cross-departmental coordination, heterogeneous data characteristics, and incomplete datasets. We critically assess the technical and practical challenges in developing multimodal AI systems and discuss potential strategies for their clinical implementation, including a brief overview of commercially available multimodal AI models for clinical decision-making. Additionally, we identify key factors driving multimodal AI development and propose recommendations to accelerate the field’s maturation. This review provides researchers and clinicians with a thorough understanding of the current state, challenges, and future directions of multimodal AI in medicine.
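The abstract reports the multimodal advantage in percentage points of AUC, which is an absolute difference between two AUC values (for example, 0.800 versus 0.862), not a relative change. As a minimal sketch of how such a gain could be computed for one pair of models, the snippet below uses the pairwise (Mann-Whitney) formulation of AUC. The function names `auc` and `auc_gain_pp`, and all labels and scores, are hypothetical and made up for illustration; they are not data or code from the review.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Fraction of positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_gain_pp(
    labels: Sequence[int],
    unimodal_scores: Sequence[float],
    multimodal_scores: Sequence[float],
) -> float:
    """Absolute AUC difference (multimodal minus unimodal) in percentage points."""
    return 100.0 * (auc(labels, multimodal_scores) - auc(labels, unimodal_scores))


# Toy, made-up predictions for six patients (illustration only).
labels = [1, 1, 1, 0, 0, 0]
imaging_only = [0.9, 0.4, 0.6, 0.5, 0.3, 0.7]   # hypothetical unimodal model
fused = [0.9, 0.6, 0.7, 0.5, 0.3, 0.65]         # hypothetical multimodal model

gain = auc_gain_pp(labels, imaging_only, fused)
print(f"AUC gain: {gain:.1f} percentage points")
```

A figure averaged across studies, such as the one quoted in the abstract, would presumably be the mean of per-study differences of this kind. The toy gain printed here is much larger than that average only because the example data are tiny and artificial.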

Source journal: Medical Image Analysis (Engineering Technology; Engineering: Biomedical)

CiteScore: 22.10

Self-citation rate: 6.40%

Articles published: 309

Review time: 6.6 months

About the journal: Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.