Visual-language foundation models in medical imaging: A systematic review and meta-analysis of diagnostic and analytical applications

IF 4.9 | CAS Zone 2 (Medicine) | JCR Q1, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Computer Methods and Programs in Biomedicine | Pub Date: 2025-05-21 | DOI: 10.1016/j.cmpb.2025.108870

Yiyao Sun, Xinran Wen, Yan Zhang, Lijun Jin, Chunna Yang, Qianhui Zhang, Mingchen Jiang, Zhaoyang Xu, Wei Guo, Juan Su, Xiran Jiang

Volume 268, Article 108870 | Journal Article | Source: https://www.sciencedirect.com/science/article/pii/S0169260725002871

Citations: 0

Abstract

Background and objective

Visual-language foundation models (VLMs) have garnered attention for their numerous advantages and significant potential in AI-aided diagnosis and treatment, driving widespread applications in medical tasks. This study analyzes and summarizes the value and prospects of VLMs, highlighting their groundbreaking opportunities in healthcare.

Methods

This systematic review and meta-analysis, registered with PROSPERO (CRD42024575746), included studies indexed in PubMed, Embase, Web of Science, and IEEE from database inception to December 31, 2024. The inclusion criteria covered state-of-the-art VLM developments and applications in medical imaging. Metrics such as AUC (classification), Dice coefficient (segmentation), BLEU score (report generation), and accuracy (Visual Question Answering, VQA) were pooled. Reporting quality and risk of bias were assessed using the QUADAS-AI checklist.
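For readers less familiar with two of the metrics named above, the following is a minimal sketch of how a Dice coefficient and a VQA accuracy are commonly computed for a single study. It is not taken from the reviewed paper: the function names, the flat 0/1 mask representation, the empty-mask convention, and the toy inputs are all assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks are treated as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def accuracy(predictions, answers):
    """Fraction of predictions that exactly match the reference answers."""
    if len(predictions) != len(answers) or not answers:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(1 for p, a in zip(predictions, answers) if p == a) / len(answers)


# Toy inputs, invented purely for illustration.
pred_mask = [0, 1, 1, 1, 0, 0]
true_mask = [0, 1, 1, 0, 0, 1]
dice = dice_coefficient(pred_mask, true_mask)

vqa_acc = accuracy(["yes", "no", "left lung"], ["yes", "yes", "left lung"])

print(f"Dice: {dice:.3f}, VQA accuracy: {vqa_acc:.3f}")
```

Exact string matching is the simplest scoring rule for VQA; individual studies may normalize answers or score open-ended responses differently.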

Results

A total of 106 eligible studies were identified for this systematic review, of which 94 were included in the meta-analysis. The pooled AUC for downstream classification tasks was 0.86 (0.85–0.87); the pooled Dice coefficient for segmentation tasks was 0.73 (0.68–0.78); the pooled BLEU score for report generation tasks was 0.31 (0.20–0.43); and the pooled accuracy for VQA was 0.76 (0.71–0.81). Subgroup analyses were stratified by imaging modality (radiological, pathological, and surface imaging) and publication year (before or after 2023) to explore the heterogeneity within VLM research and to analyze the diagnostic performance of VLMs under different conditions.
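A pooled estimate with an interval, such as those reported above, can be produced by a random-effects model. The abstract does not state which pooling model the authors used, so the sketch below shows one common choice, the DerSimonian-Laird estimator, as an assumption. The function name, the per-study values, and their standard errors are hypothetical and are not data from the review; pooling on the raw metric scale (rather than, say, a logit scale) is also a simplification.

```python
import math


def pool_random_effects(estimates, std_errors, z=1.96):
    """DerSimonian-Laird random-effects pooling of per-study estimates.

    Returns (pooled, (ci_low, ci_high), tau2, i2_percent).
    """
    k = len(estimates)
    if k != len(std_errors) or k < 2:
        raise ValueError("need at least two studies with matching standard errors")

    # Fixed-effect (inverse-variance) weights and estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q and the between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    se_pooled = math.sqrt(1.0 / sum_w_star)

    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, (pooled - z * se_pooled, pooled + z * se_pooled), tau2, i2


# Hypothetical per-study AUCs and standard errors, for illustration only.
study_auc = [0.80, 0.90]
study_se = [0.05, 0.05]
pooled, ci, tau2, i2 = pool_random_effects(study_auc, study_se)
print(f"pooled = {pooled:.3f}, CI = ({ci[0]:.3f}, {ci[1]:.3f}), I^2 = {i2:.0f}%")
```

A subgroup analysis like the one described above would apply the same pooling separately to each stratum (for example, per imaging modality) and compare the resulting estimates.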

Conclusions

VLMs based on medical imaging have demonstrated strong performance and significant potential in computer-assisted clinical diagnosis. Stricter reporting standards addressing the unique challenges of VLM research could enhance study quality.


Source journal

Computer Methods and Programs in Biomedicine (Engineering & Technology: Biomedical Engineering)

CiteScore

12.30

Self-citation rate

6.60%

Articles published

601

Review time

135 days

Journal description: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; to support the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.