GM-ABS: Promptable Generalist Model Drives Active Barely Supervised Training in Specialist Model for 3D Medical Image Segmentation.

IF 9.8 · Zone 1 (Medicine) · Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

IEEE Transactions on Medical Imaging Pub Date : 2025-08-07 DOI:10.1109/tmi.2025.3596850

Zhe Xu, Cheng Chen, Donghuan Lu, Jinghan Sun, Dong Wei, Yefeng Zheng, Quanzheng Li, Raymond Kai-Yu Tong


Citations: 0

Abstract

Semi-supervised learning (SSL) has greatly advanced 3D medical image segmentation by alleviating the need for intensive labeling by radiologists. While previous efforts focused on model-centric advancements, the emergence of foundational generalist models like the Segment Anything Model (SAM) is expected to reshape the SSL landscape. Although these generalists usually show performance gaps relative to previous specialists in medical imaging, they possess impressive zero-shot segmentation abilities with manual prompts. This capability could thus serve as a "free lunch" for training specialists, offering future SSL a promising data-centric perspective, especially by revolutionizing both pseudo-labeling and expert-labeling strategies to enhance the data pool. To this end, we propose the Generalist Model-driven Active Barely Supervised (GM-ABS) learning paradigm for developing specialized 3D segmentation models under extremely limited ("barely supervised") annotation budgets, e.g., cross-labeling merely three slices per selected scan. Specifically, building upon a basic mean-teacher SSL framework, GM-ABS modernizes the SSL paradigm with two key data-centric designs: (i) Specialist-generalist collaboration, in which the in-training specialist uses class-specific positional prompts derived from class prototypes to interact with the frozen class-agnostic generalist across multiple views, achieving noisy-yet-effective label augmentation; the specialist then robustly assimilates the augmented knowledge via noise-tolerant collaborative learning. (ii) Expert-model collaboration, which promotes active cross-labeling with notably low labeling effort; this design progressively furnishes the specialist with informative and efficient supervision in a human-in-the-loop manner, which in turn improves the quality of the class-specific prompts. Extensive experiments on three benchmark datasets highlight the promising performance of GM-ABS over recent SSL approaches under extremely constrained labeling resources.
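The abstract names two building blocks without giving details: a mean-teacher framework, and positional prompts derived from class prototypes. The following is a minimal NumPy sketch of how such pieces are commonly realized, not the authors' implementation. All function names are hypothetical, the decay value is an arbitrary choice, and the use of a masked feature average as the prototype and of the most similar voxel as a point prompt are assumptions; the call to the frozen generalist (e.g., SAM) is deliberately left out.

```python
import numpy as np

def ema_update(teacher, student, decay=0.99):
    """Mean-teacher step: teacher weights follow an exponential moving
    average of the student weights (decay is an assumed hyperparameter)."""
    return {k: decay * teacher[k] + (1.0 - decay) * student[k] for k in teacher}

def class_prototype(features, mask):
    """Assumed prototype: mean feature vector over voxels labeled as the class.
    features: (C, D, H, W) float array; mask: (D, H, W) boolean array."""
    return features[:, mask].mean(axis=1)

def point_prompt_from_prototype(features, prototype):
    """Assumed prompt rule: pick the voxel whose feature vector has the
    highest cosine similarity to the class prototype; returns (d, h, w)."""
    channels = features.shape[0]
    flat = features.reshape(channels, -1)
    norms = np.linalg.norm(flat, axis=0) * np.linalg.norm(prototype) + 1e-8
    similarity = (prototype @ flat) / norms
    best = np.argmax(similarity)
    return tuple(int(i) for i in np.unravel_index(best, features.shape[1:]))

# Toy volume: every voxel has feature [1, 0] except one voxel with [0, 1].
features = np.zeros((2, 2, 2, 2))
features[0] = 1.0
features[:, 1, 0, 1] = [0.0, 1.0]
mask = np.zeros((2, 2, 2), dtype=bool)
mask[1, 0, 1] = True

prototype = class_prototype(features, mask)
prompt = point_prompt_from_prototype(features, prototype)
teacher = ema_update({"w": np.array([1.0])}, {"w": np.array([0.0])}, decay=0.9)
```

In a full pipeline, the resulting coordinate would be passed as a point prompt to the frozen generalist on 2D slices taken from several views, and the returned masks would be treated as noisy pseudo-labels; how GM-ABS actually does this is described in the paper, not here.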


Source journal

IEEE Transactions on Medical Imaging (Medicine: Imaging Science & Photographic Technology)

CiteScore: 21.80
Self-citation rate: 5.70%
Articles published: 637
Review time: 5.6 months

About the journal: The IEEE Transactions on Medical Imaging (T-MI) is a journal that welcomes the submission of manuscripts focusing on various aspects of medical imaging. The journal encourages the exploration of body structure, morphology, and function through different imaging techniques, including ultrasound, X-rays, magnetic resonance, radionuclides, microwaves, and optical methods. It also promotes contributions related to cell and molecular imaging, as well as all forms of microscopy. T-MI publishes original research papers that cover a wide range of topics, including but not limited to novel acquisition techniques, medical image processing and analysis, visualization and performance, pattern recognition, machine learning, and other related methods. The journal particularly encourages highly technical studies that offer new perspectives. By emphasizing the unification of medicine, biology, and imaging, T-MI seeks to bridge the gap between instrumentation, hardware, software, mathematics, physics, biology, and medicine by introducing new analysis methods. While the journal welcomes strong application papers that describe novel methods, it directs papers that focus solely on important applications using medically adopted or well-established methods without significant innovation in methodology to other journals. T-MI is indexed in PubMed® and MEDLINE®, which are products of the United States National Library of Medicine.