A narrative review of foundation models for medical image segmentation: zero-shot performance evaluation on diverse modalities.

IF 2.3, Region 2 (Medicine), JCR Q2: RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING

Quantitative Imaging in Medicine and Surgery Pub Date : 2025-06-06 Epub Date: 2025-06-03 DOI:10.21037/qims-2024-2826

Seungha Noh, Byoung-Dai Lee

Quantitative Imaging in Medicine and Surgery, 15(6): 5825-5858. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12209621/pdf/


Abstract

Background and objective: Foundation models are deep learning models pretrained on extensive datasets that can be adapted to a variety of downstream tasks. They have recently gained prominence across many domains, including medical imaging. Their strong contextual understanding and generalization capabilities have spurred active research in healthcare on versatile artificial intelligence solutions for real-world clinical environments. Motivated by these developments, this study offers a comprehensive review of foundation models in medical image segmentation (MIS), evaluates their zero-shot performance on diverse datasets, and assesses their practical applicability in clinical settings.

Methods: A total of 63 studies on foundation models for MIS were systematically reviewed, utilizing platforms such as arXiv, ResearchGate, Google Scholar, Semantic Scholar, and PubMed. Additionally, we curated 31 unseen medical image datasets from The Cancer Imaging Archive (TCIA), Kaggle, Zenodo, Institute of Electrical and Electronics Engineers (IEEE) DataPort, and Grand Challenge to evaluate the zero-shot performance of six foundation models. Performance analysis was conducted from various perspectives, including modality and anatomical structure.

Key content and findings: Foundation models were categorized based on a taxonomy that incorporates criteria such as data dimensions, modality coverage, prompt type, and training strategy. Furthermore, the zero-shot evaluation revealed key insights into their strengths and limitations across diverse imaging modalities. This analysis underscores the potential of these models in MIS while highlighting areas for improvement to optimize real-world applications.

Conclusions: Our findings provide a valuable resource for understanding the role of foundation models in MIS. By identifying their capabilities and limitations, this review lays the groundwork for advancing their practical deployment in clinical environments, supporting further innovation in medical image analysis.
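As a rough illustration of the per-modality and per-structure analysis described in the Methods, the sketch below averages a segmentation overlap score over groups of cases. The abstract does not name the metric used, so the Dice coefficient is an assumption here, and the function names (`dice`, `mean_dice_by`), the record layout, and the toy masks are all hypothetical and not taken from the review.

```python
from collections import defaultdict
from statistics import mean

Mask = list[list[int]]  # binary 2D mask, 1 = foreground (assumed layout)


def dice(pred: Mask, truth: Mask) -> float:
    """Dice coefficient 2|P∩T| / (|P| + |T|); taken as 1.0 when both masks are empty."""
    p = [v for row in pred for v in row]
    t = [v for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same shape")
    overlap = sum(a and b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    return 1.0 if total == 0 else 2.0 * overlap / total


def mean_dice_by(cases, key):
    """Average Dice over cases grouped by `key` (e.g. 'modality' or 'structure')."""
    groups = defaultdict(list)
    for case in cases:
        groups[case[key]].append(dice(case["pred"], case["truth"]))
    return {name: mean(scores) for name, scores in groups.items()}


# Toy, made-up cases purely for illustration (not data from the review).
cases = [
    {"modality": "CT", "structure": "liver",
     "pred": [[1, 1], [0, 0]], "truth": [[1, 1], [0, 0]]},
    {"modality": "CT", "structure": "kidney",
     "pred": [[1, 0], [0, 0]], "truth": [[1, 1], [0, 0]]},
    {"modality": "MRI", "structure": "liver",
     "pred": [[0, 0], [0, 1]], "truth": [[1, 0], [0, 0]]},
]

per_modality = mean_dice_by(cases, "modality")
per_structure = mean_dice_by(cases, "structure")
```

Grouping the same per-case scores under different keys is what allows one set of zero-shot predictions to be read from several perspectives. A real evaluation would also need to handle 3D volumes and multi-class labels, which this sketch leaves out.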


