MA-SAM: A Multi-Atlas Guided SAM Using Pseudo Mask Prompts Without Manual Annotation for Spine Image Segmentation

Dingwei Fan, Junyong Zhao, Chunlin Li, Xinlong Wang, Ronghan Zhang, Qi Zhu, Mingliang Wang, Haipeng Si, Daoqiang Zhang, Liang Sun
IEEE Transactions on Medical Imaging, vol. 44, no. 5, pp. 2157-2169. Published: 2025-01-01. DOI: 10.1109/TMI.2024.3524570. Available at: https://ieeexplore.ieee.org/document/10819446/

Abstract

Accurate spine segmentation is crucial in the clinical diagnosis and treatment of spine diseases. However, due to the complexity of the spine's anatomical structure, accurately segmenting spine images has remained a challenging task. Recently, the segment anything model (SAM) has achieved superior performance for image segmentation. However, generating high-quality point and box prompts is still laborious for high-dimensional medical images, and an accurate mask prompt is difficult to obtain. To address these issues, in this paper, we propose a multi-atlas guided SAM using multiple pseudo mask prompts for spine image segmentation, called MA-SAM. Specifically, we first design a multi-atlas prompt generation sub-network to obtain anatomical structure prompts: a network produces a coarse mask of the input image, and atlas label maps are then registered to this coarse mask. Subsequently, a SAM-based segmentation sub-network is used to segment images. In this sub-network, we first utilize adapters to fine-tune the image encoder, while a prompt encoder learns anatomical structure prior knowledge from the multi-atlas prompts. Finally, a mask decoder fuses the image and prompt features to obtain the segmentation results. Moreover, to boost segmentation performance, different-scale features from the prompt encoder are concatenated to the Upsample Block in the mask decoder. We validate MA-SAM on two spine segmentation tasks: spine anatomical structure segmentation with CT images and lumbosacral plexus segmentation with MR images. Experimental results suggest that our method achieves better segmentation performance than SAM with point, box, and mask prompts.
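The abstract describes registering multiple atlas label maps to a coarse mask of the target image and combining them into a pseudo mask prompt. A common way to turn several registered label maps into a single label map is per-pixel majority voting; the sketch below illustrates that fusion step only (registration itself is assumed to have already happened), and is a simplified illustration, not the paper's exact fusion rule.

```python
import numpy as np

def fuse_atlas_labels(registered_labels: list) -> np.ndarray:
    """Fuse atlas label maps (already registered to the target's
    coarse mask) into one pseudo mask prompt by per-pixel majority
    vote. Label 0 is treated as background."""
    stacked = np.stack(registered_labels)        # (n_atlas, H, W)
    n_labels = int(stacked.max()) + 1
    # Count, for every pixel, how many atlases voted for each label.
    votes = np.zeros((n_labels,) + stacked.shape[1:], dtype=np.int64)
    for lbl in range(n_labels):
        votes[lbl] = (stacked == lbl).sum(axis=0)
    # The winning label per pixel becomes the pseudo mask prompt.
    return votes.argmax(axis=0)

# Toy example: three 4x4 atlas label maps with two foreground labels
# (say, two vertebrae). Atlases a and c agree everywhere; b deviates
# at a few pixels, so the vote recovers the consensus map.
a = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 0, 0], [2, 2, 0, 0]])
b = np.array([[0, 1, 1, 1], [0, 0, 1, 1], [2, 2, 0, 0], [2, 0, 0, 0]])
c = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 0, 0], [2, 2, 0, 0]])
prompt = fuse_atlas_labels([a, b, c])
```

Majority voting is only the simplest label-fusion choice; weighted or locally adaptive fusion schemes would slot into the same interface.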
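The abstract also mentions fine-tuning the SAM image encoder with adapters. The usual adapter design is a small bottleneck (down-projection, nonlinearity, up-projection) added with a residual connection, so that only the adapter weights are trained while the large pretrained encoder stays frozen. The following is a minimal numpy sketch of that bottleneck pattern under assumed dimensions; it is not the paper's adapter architecture, and the names `Adapter`, `W_down`, and `W_up` are illustrative.

```python
import numpy as np

class Adapter:
    """Bottleneck adapter: down-project to a small width r, apply a
    nonlinearity, up-project back to width d, and add a residual.
    With r << d the adapter adds few trainable parameters compared
    to the frozen backbone."""
    def __init__(self, d: int, r: int, rng: np.random.Generator):
        self.W_down = rng.normal(0.0, 0.02, size=(d, r))
        # Zero-initializing the up-projection makes the adapter start
        # as an identity map, so training begins from the frozen
        # backbone's behavior.
        self.W_up = np.zeros((r, d))

    def __call__(self, x: np.ndarray) -> np.ndarray:
        h = np.maximum(x @ self.W_down, 0.0)   # ReLU bottleneck
        return x + h @ self.W_up               # residual connection

rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 256))    # 16 patch tokens, width 256
adapter = Adapter(d=256, r=32, rng=rng)
out = adapter(tokens)                  # identity before any training
```

In a real setup the adapter would be inserted inside each transformer block of the SAM image encoder and trained with backpropagation while the encoder's own weights keep `requires_grad=False`.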