SemiSAM+: Rethinking semi-supervised medical image segmentation in the era of foundation models

IF 11.8 · Zone 1, Medicine · Q1 Computer Science, Artificial Intelligence

Medical image analysis Pub Date : 2025-07-25 DOI:10.1016/j.media.2025.103733

Yichi Zhang, Bohao Lv, Le Xue, Wenbo Zhang, Yuchen Liu, Yu Fu, Yuan Cheng, Yuan Qi

Medical Image Analysis, volume 107, Article 103733. https://www.sciencedirect.com/science/article/pii/S1361841525002804

Citations: 0

Abstract

Deep learning-based medical image segmentation typically requires a large amount of labeled data for training, which limits its applicability in clinical settings because annotation is costly. Semi-supervised learning (SSL) has emerged as an appealing strategy because it depends less on abundant expert annotations than fully supervised methods do. Beyond existing model-centric advances in SSL that design novel regularization strategies, we anticipate a paradigm shift driven by the emergence of promptable segmentation foundation models, represented by the Segment Anything Model (SAM), which offer universal segmentation capabilities through positional prompts. In this paper, we present SemiSAM+, a foundation model-driven SSL framework that learns efficiently from limited labeled data for medical image segmentation. SemiSAM+ consists of one or more promptable foundation models serving as generalist models and a trainable task-specific segmentation model serving as the specialist model. For a new segmentation task, training follows a specialist–generalist collaborative learning procedure: the trainable specialist model delivers positional prompts to the frozen generalist models to acquire pseudo-labels, and the generalist output in turn provides the specialist with informative and efficient supervision, which benefits both automatic segmentation and prompt generation. Extensive experiments on three public datasets and one in-house clinical dataset demonstrate that SemiSAM+ achieves significant performance improvements, especially under extremely limited annotation scenarios, and is efficient as a plug-and-play strategy that can be easily adapted to different specialist and generalist models.
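The specialist–generalist loop described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify the prompt type, the loss functions, or the weighting, so everything below is an assumption chosen for illustration: a single point prompt taken from the specialist's most confident pixel, a Dice loss for both terms, and a fixed weight. `toy_generalist` is a hypothetical stand-in for a frozen promptable model such as SAM, not its real API, and all function names are invented for this example.

```python
import numpy as np


def point_prompt_from_prediction(prob, threshold=0.5):
    """Pick the most confident foreground pixel of the specialist's
    probability map as a point prompt; return None if nothing is foreground."""
    mask = prob > threshold
    if not mask.any():
        return None
    flat_index = np.argmax(np.where(mask, prob, -np.inf))
    return np.unravel_index(flat_index, prob.shape)


def toy_generalist(image, point, tol=0.2):
    """Hypothetical frozen generalist: selects pixels whose intensity is
    within `tol` of the prompted pixel. A real promptable model would go here."""
    if point is None:
        return np.zeros_like(image, dtype=np.float32)
    return (np.abs(image - image[point]) < tol).astype(np.float32)


def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between a probability map and a (pseudo-)label mask."""
    intersection = (pred * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


def collaborative_loss(prob_labeled, gt, prob_unlabeled, image_unlabeled, weight=0.5):
    """Supervised loss on labeled data plus a consistency loss that pulls the
    specialist's unlabeled prediction toward the generalist's pseudo-label."""
    supervised = dice_loss(prob_labeled, gt)
    point = point_prompt_from_prediction(prob_unlabeled)       # specialist -> prompt
    pseudo_label = toy_generalist(image_unlabeled, point)      # generalist -> pseudo-label
    consistency = dice_loss(prob_unlabeled, pseudo_label)      # pseudo-label -> supervision
    return supervised + weight * consistency, pseudo_label


if __name__ == "__main__":
    image = np.zeros((8, 8), dtype=np.float32)
    image[2:6, 2:6] = 1.0                      # bright square is the "organ"
    coarse = np.full((8, 8), 0.1, dtype=np.float32)
    coarse[3:5, 3:5] = 0.6                     # specialist only finds the core
    gt = (image > 0.5).astype(np.float32)
    loss, pseudo = collaborative_loss(gt, gt, coarse, image)
    print("loss:", float(loss), "pseudo-label pixels:", int(pseudo.sum()))
```

In this toy setting the specialist's coarse prediction covers only the center of the square, yet the prompt it yields lets the stand-in generalist recover the whole square, which then supervises the specialist on unlabeled data. In a real training loop the specialist would be a trainable network and the consistency term would be backpropagated through it while the generalist stays frozen.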


Source journal

Medical Image Analysis (Engineering & Technology: Engineering, Biomedical)

CiteScore: 22.10
Self-citation rate: 6.40%
Articles published: 309
Review time: 6.6 months

About the journal: Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.