Visual-language foundation models in medical imaging: A systematic review and meta-analysis of diagnostic and analytical applications
Yiyao Sun, Xinran Wen, Yan Zhang, Lijun Jin, Chunna Yang, Qianhui Zhang, Mingchen Jiang, Zhaoyang Xu, Wei Guo, Juan Su, Xiran Jiang
Computer Methods and Programs in Biomedicine, Volume 268, Article 108870. Published 2025-05-21. DOI: 10.1016/j.cmpb.2025.108870
https://www.sciencedirect.com/science/article/pii/S0169260725002871
Abstract
Background and objective
Visual-language foundation models (VLMs) have attracted attention for their potential in AI-aided diagnosis and treatment and are now widely applied to medical tasks. This study analyzes and summarizes the value and prospects of VLMs and highlights the opportunities they open in healthcare.
Methods
This systematic review and meta-analysis, registered with PROSPERO (CRD42024575746), included studies from PubMed, Embase, Web of Science, and IEEE published from inception to December 31, 2024. The inclusion criteria covered state-of-the-art VLM developments and applications in medical imaging. Metrics such as AUC, Dice coefficient, BLEU score, and accuracy were pooled for tasks such as classification, segmentation, report generation, and visual question answering (VQA). Reporting quality and risk of bias were assessed using the QUADAS-AI checklist.
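For readers unfamiliar with two of the per-study metrics named above, the following is a minimal sketch of how a Dice coefficient and an accuracy score are commonly computed. The function names and the flat 0/1 mask representation are illustrative assumptions and are not taken from the reviewed studies.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 values (an illustrative assumption;
    real pipelines usually work on image arrays).
    """
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


def accuracy(predicted, expected):
    """Fraction of answers that exactly match the reference answers."""
    if len(predicted) != len(expected):
        raise ValueError("answer lists must have the same length")
    if not expected:
        raise ValueError("at least one answer is required")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)
```

A segmentation study would typically average the Dice value over its test cases, and a VQA study would report accuracy over its question set, before either number enters a pooled estimate.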
Results
A total of 106 eligible studies were identified for this systematic review, of which 94 were included in the meta-analysis. The pooled AUC for downstream classification tasks was 0.86 (0.85–0.87); the pooled Dice coefficient for segmentation tasks was 0.73 (0.68–0.78); the pooled BLEU score for report generation tasks was 0.31 (0.20–0.43); and the pooled accuracy for VQA was 0.76 (0.71–0.81). Subgroup analyses were stratified by imaging modality (radiological, pathological, and surface imaging) and publication year (before or after 2023) to explore the heterogeneity within VLM research and to analyze the diagnostic performance of VLMs under different conditions.
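The abstract does not state which pooling model produced these estimates. As an illustration only, the sketch below shows one widely used approach, DerSimonian-Laird random-effects pooling, applied to per-study estimates and their standard errors. The function name, the choice to pool on the raw scale, and the 1.96 normal quantile for the interval are assumptions, not the authors' confirmed method.

```python
import math


def pool_random_effects(estimates, std_errors):
    """DerSimonian-Laird random-effects pooling (illustrative sketch).

    Returns (pooled, ci_low, ci_high, tau2), where tau2 is the estimated
    between-study variance and the interval uses a normal approximation.
    """
    k = len(estimates)
    if k != len(std_errors) or k < 2:
        raise ValueError("need at least two studies with matching inputs")

    variances = [se ** 2 for se in std_errors]
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    sum_w = sum(weights)

    # Fixed-effect estimate, needed to compute Cochran's Q.
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))

    # Method-of-moments estimate of between-study variance, floored at 0.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Re-weight each study with the between-study variance added.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se_pooled = math.sqrt(1.0 / sum(re_weights))
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled, tau2
```

Calling `pool_random_effects([0.80, 0.90], [0.01, 0.01])` on these made-up inputs gives a pooled value of about 0.85 with a nonzero `tau2`, which widens the interval relative to a fixed-effect model. Bounded metrics such as AUC are often transformed (for example with a logit) before pooling; that step is omitted here for brevity.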
Conclusions
VLMs based on medical imaging have demonstrated strong performance and significant potential in computer-assisted clinical diagnosis. Stricter reporting standards addressing the unique challenges of VLM research could enhance study quality.
About the journal:
To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine.
Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.