Generalist models in medical image segmentation: A survey and performance comparison with task-specific approaches

Impact Factor: 15.5 · CAS Zone 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Andrea Moglia, Matteo Leccardi, Matteo Cavicchioli, Alice Maccarini, Marco Marcon, Luca Mainardi, Pietro Cerveri
{"title":"Generalist models in medical image segmentation: A survey and performance comparison with task-specific approaches","authors":"Andrea Moglia ,&nbsp;Matteo Leccardi ,&nbsp;Matteo Cavicchioli ,&nbsp;Alice Maccarini ,&nbsp;Marco Marcon ,&nbsp;Luca Mainardi ,&nbsp;Pietro Cerveri","doi":"10.1016/j.inffus.2025.103709","DOIUrl":null,"url":null,"abstract":"<div><div>Following the successful paradigm shift of large language models, which leverages pre-training on a massive corpus of data and fine-tuning on various downstream tasks, generalist models have made their foray into computer vision. The introduction of the Segment Anything Model (SAM) marked a milestone in the segmentation of natural images, inspiring the design of numerous architectures for medical image segmentation. In this survey, we offer a comprehensive and in-depth investigation of generalist models for medical image segmentation. We begin with an introduction to the fundamental concepts that underpin their development. Then, we provide a taxonomy based on features fusion on the different declinations of SAM in terms of zero-shot, few-shot, fine-tuning, adapters, on SAM2, on other innovative models trained on images alone, and others trained on both text and images. We thoroughly analyze their performances at the level of both primary research and best-in-literature, followed by a rigorous comparison with the state-of-the-art task-specific models. We emphasize the need to address challenges in terms of compliance with regulatory frameworks, privacy and security laws, budget, and trustworthy artificial intelligence (AI). Finally, we share our perspective on future directions concerning synthetic data, early fusion, lessons learnt from generalist models in natural language processing, agentic AI, physical AI, and clinical translation. We publicly release a database-backed interactive app with all survey data (<span><span>https://hal9000-lab.github.io/GMMIS-Survey/</span><svg><path></path></svg></span>).</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"127 ","pages":"Article 103709"},"PeriodicalIF":15.5000,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S156625352500781X","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

Abstract

Following the successful paradigm shift of large language models, which leverage pre-training on massive corpora of data and fine-tuning on various downstream tasks, generalist models have made their foray into computer vision. The introduction of the Segment Anything Model (SAM) marked a milestone in the segmentation of natural images, inspiring the design of numerous architectures for medical image segmentation. In this survey, we offer a comprehensive and in-depth investigation of generalist models for medical image segmentation. We begin with an introduction to the fundamental concepts that underpin their development. Then, we provide a feature-fusion-based taxonomy covering the different variants of SAM (zero-shot, few-shot, fine-tuning, and adapter-based), SAM2, other innovative models trained on images alone, and models trained on both text and images. We thoroughly analyze their performance at the level of both primary research and best-in-literature, followed by a rigorous comparison with state-of-the-art task-specific models. We emphasize the need to address challenges in terms of compliance with regulatory frameworks, privacy and security laws, budget, and trustworthy artificial intelligence (AI). Finally, we share our perspective on future directions concerning synthetic data, early fusion, lessons learnt from generalist models in natural language processing, agentic AI, physical AI, and clinical translation. We publicly release a database-backed interactive app with all survey data (https://hal9000-lab.github.io/GMMIS-Survey/).
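As a concrete illustration of the zero-shot prompting setting covered by the survey, the minimal Python sketch below applies the publicly released SAM predictor to a single medical image slice using one point prompt. It assumes the official segment_anything package and a downloaded ViT-B checkpoint; the slice file name and prompt coordinates are hypothetical placeholders, and the sketch is illustrative rather than taken from the survey itself.

```python
# Minimal sketch (illustrative, not from the survey): zero-shot prompting with
# Meta's Segment Anything Model (SAM) on a single medical image slice.
# Assumes the official `segment_anything` package and a downloaded ViT-B
# checkpoint; the image file and prompt coordinates are hypothetical.
import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor

# Load a pretrained SAM backbone (ViT-B variant) and wrap it in a predictor.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

# A CT/MRI slice exported as an 8-bit RGB image (hypothetical file).
image = np.array(Image.open("liver_slice.png").convert("RGB"))
predictor.set_image(image)  # computes the image embedding once

# A single foreground point prompt placed inside the target structure.
point_coords = np.array([[256, 300]])  # (x, y) in pixel coordinates
point_labels = np.array([1])           # 1 = foreground, 0 = background

# Zero-shot prediction: SAM returns candidate masks with confidence scores.
masks, scores, _ = predictor.predict(
    point_coords=point_coords,
    point_labels=point_labels,
    multimask_output=True,
)
best_mask = masks[np.argmax(scores)]  # keep the highest-scoring candidate
print("mask shape:", best_mask.shape, "score:", scores.max())
```

Fine-tuning and adapter-based variants discussed in the survey modify or extend this frozen pipeline (for example, by training lightweight adapter layers inside the image encoder on medical data), whereas the zero-shot setting above uses the pretrained weights unchanged.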


Source journal: Information Fusion (Engineering & Technology – Computer Science: Theory & Methods)
CiteScore: 33.20
Self-citation rate: 4.30%
Articles published per year: 161
Average review time: 7.9 months
Journal description: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems are welcome.