SAM2Med3D: Leveraging video foundation models for 3D breast MRI segmentation

IF 2.8 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computers & Graphics-Uk | Pub Date: 2025-08-19 | DOI: 10.1016/j.cag.2025.104341

Ying Chen, Wenjing Cui, Xiaoyan Dong, Shuai Zhou, Zhongqiu Wang

Computers & Graphics-Uk, volume 132, Article 104341. https://www.sciencedirect.com/science/article/pii/S0097849325001827

Citations: 0

Abstract

Foundation models such as the Segment Anything Model 2 (SAM2) have demonstrated impressive generalization across natural image domains. However, their potential in volumetric medical imaging remains largely underexplored, particularly under limited data conditions. In this paper, we present SAM2Med3D, a multi-stage framework that adapts a general-purpose video foundation model for accurate and consistent 3D breast MRI segmentation by treating a 3D MRI scan as a sequence of images. Unlike existing image-based approaches (e.g., MedSAM) that require large-scale medical data for fine-tuning, our method combines a lightweight, task-specific segmentation network with a video foundation model, achieving strong performance with only modest training data. To guide the foundation model effectively, we introduce a spatial filtering strategy that identifies reliable slices from the initial segmentation to serve as high-quality prompts. Additionally, we propose a confidence-driven fusion mechanism that adaptively integrates coarse and refined predictions across the volume, mitigating segmentation drift and ensuring both local accuracy and global volumetric consistency. We validate SAM2Med3D on two multi-center breast MRI datasets, comprising public and self-collected data. Experimental results demonstrate that our method outperforms both task-specific segmentation networks and recent foundation-model-based methods, achieving superior accuracy and inter-slice consistency.
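The abstract does not specify how reliable slices are chosen or how the coarse and refined predictions are fused, so the following is only a minimal NumPy sketch of the two ideas under stated assumptions. The function names, the area and confidence thresholds, and the "distance from 0.5" confidence weighting are all hypothetical and are not taken from the paper. The video foundation model itself is left out: the sketch assumes its refined output is already available as a probability volume with the same shape as the coarse one.

```python
import numpy as np


def select_prompt_slices(coarse_prob, min_area=20, min_conf=0.8):
    """Return indices of slices whose coarse mask looks reliable.

    Hypothetical criterion (not from the paper): a slice qualifies when its
    thresholded mask has at least `min_area` foreground pixels and the mean
    foreground probability is at least `min_conf`.
    """
    keep = []
    for z, prob in enumerate(coarse_prob):  # each slice plays the role of a video frame
        mask = prob > 0.5
        if mask.sum() >= min_area and prob[mask].mean() >= min_conf:
            keep.append(z)
    return keep


def fuse_predictions(coarse_prob, refined_prob):
    """Blend two probability volumes voxel-wise, weighted by confidence.

    Hypothetical rule (not from the paper): confidence is the distance of a
    probability from 0.5, rescaled to [0, 1].
    """
    w_coarse = np.abs(coarse_prob - 0.5) * 2.0
    w_refined = np.abs(refined_prob - 0.5) * 2.0
    total = w_coarse + w_refined
    # Avoid division by zero where both predictions are exactly 0.5.
    safe_total = np.where(total > 0, total, 1.0)
    fused = np.where(
        total > 0,
        (w_coarse * coarse_prob + w_refined * refined_prob) / safe_total,
        0.5 * (coarse_prob + refined_prob),
    )
    return fused > 0.5  # boolean segmentation volume
```

In this sketch the selected slice indices would be the frames whose coarse masks are passed to the video model as prompts, and `fuse_predictions` would be applied once the refined volume comes back. A voxel on which the refined model is nearly undecided then keeps the coarse label, which is one simple way to limit drift on slices far from any prompt.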


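Source journal

Computers & Graphics-Uk (Engineering & Technology: Computer Science, Software Engineering)

CiteScore: 5.30

Self-citation rate: 12.00%

Articles published: 173

Review time: 38 days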

About the journal: Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on: 1. Research and applications of interactive computer graphics, particularly novel interaction techniques and applications of CG to problem domains. 2. State-of-the-art papers on late-breaking, cutting-edge research on CG. 3. Information on innovative uses of graphics principles and technologies. 4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.