Medical Image Analysis: Latest Articles

Learning lifespan brain anatomical correspondence via cortical developmental continuity transfer
IF 10.7, Tier 1 (Medicine)
Medical image analysis, Pub Date: 2024-08-30, DOI: 10.1016/j.media.2024.103328
Identifying anatomical correspondences in the human brain throughout the lifespan is an essential prerequisite for studying brain development and aging. However, given the tremendous individual variability in cortical folding patterns, the heterogeneity of different neurodevelopmental stages, and the scarcity of neuroimaging data, it is difficult to infer reliable lifespan anatomical correspondence at finer scales. To solve this problem, we take advantage of the developmental continuity of the cerebral cortex and propose a novel transfer learning strategy: the model is trained from scratch using the age group with the largest sample size, and then transferred and adapted to the other groups following the cortical developmental trajectory. A novel loss function is designed to ensure that during the transfer process the common patterns are extracted and preserved, while group-specific new patterns are captured. The proposed framework was evaluated using multiple datasets covering four lifespan age groups with 1,000+ brains (from 34 gestational weeks to young adulthood). Our experimental results show that: (1) the proposed transfer strategy can dramatically improve model performance on populations (e.g., early neurodevelopment) with a very limited number of training samples; and (2) with transfer learning we are able to robustly infer the complicated many-to-many anatomical correspondences among different brains at different neurodevelopmental stages. (Code will be released soon: https://github.com/qidianzl/CDC-transfer)
Citations: 0
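A minimal sketch (PyTorch) of the transfer idea described in the abstract above: a model pre-trained on the largest age group is adapted to a new age group while a penalty keeps its weights close to the source, so patterns shared across groups are preserved and group-specific patterns can still be learned. The network, data loader, loss weight, and penalty form are placeholders; the paper's actual loss function is not reproduced here.

```python
import copy
import torch
import torch.nn as nn

def transfer_to_new_group(source_model: nn.Module,
                          dataloader,
                          task_loss_fn,
                          preserve_weight: float = 1e-3,
                          epochs: int = 10,
                          lr: float = 1e-4) -> nn.Module:
    """Adapt a source model to a new age group with a weight-preservation penalty (illustrative)."""
    target_model = copy.deepcopy(source_model)            # start from the source weights
    source_params = [p.detach().clone() for p in source_model.parameters()]
    optimizer = torch.optim.Adam(target_model.parameters(), lr=lr)

    for _ in range(epochs):
        for inputs, labels in dataloader:
            preds = target_model(inputs)
            loss = task_loss_fn(preds, labels)             # group-specific correspondence loss
            # Penalize drift from the source model so common patterns are preserved.
            drift = sum(((p - p0) ** 2).sum()
                        for p, p0 in zip(target_model.parameters(), source_params))
            loss = loss + preserve_weight * drift
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return target_model
```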
O-PRESS: Boosting OCT axial resolution with Prior guidance, Recurrence, and Equivariant Self-Supervision
IF 10.7, Tier 1 (Medicine)
Medical image analysis, Pub Date: 2024-08-28, DOI: 10.1016/j.media.2024.103319
Optical coherence tomography (OCT) is a noninvasive technology that enables real-time imaging of tissue microanatomies. The axial resolution of OCT is intrinsically constrained by the spectral bandwidth of the employed light source while maintaining a fixed center wavelength for a specific application. Physically extending this bandwidth faces strong limitations and incurs substantial cost. We present a novel computational approach, called O-PRESS, for boosting the axial resolution of OCT with Prior guidance, a Recurrent mechanism, and Equivariant Self-Supervision. Diverging from conventional deconvolution methods that rely on physical models or data-driven techniques, our method seamlessly integrates OCT modeling and deep learning, enabling real-time axial-resolution enhancement exclusively from measurements, without the need for paired images. Our approach solves the two primary tasks of resolution enhancement and noise reduction with one treatment. Both tasks are executed in a self-supervised manner, with equivariant imaging and free-space priors guiding their respective processes. Experimental evaluations, encompassing both quantitative metrics and visual assessments, consistently verify the efficacy and superiority of our approach, which exhibits performance on par with fully supervised methods. Importantly, the robustness of our model is affirmed, showcasing its dual capability to enhance axial resolution while concurrently improving the signal-to-noise ratio.
Citations: 0
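A minimal sketch of an equivariance-based self-supervised loss in the spirit of the abstract above: the enhancement network f is asked to commute with a simple geometric transform (here a flip along the axial dimension), so it can be trained from measurements alone without paired images. This is an illustration only; the paper's specific transforms, priors, and recurrent mechanism are not reproduced.

```python
import torch
import torch.nn as nn

def equivariance_loss(f: nn.Module, y: torch.Tensor) -> torch.Tensor:
    """y: batch of OCT B-scans, shape (B, 1, depth, width); assumed axis layout."""
    t = lambda x: torch.flip(x, dims=[2])          # transform acting on the axial (depth) axis
    out = f(y)
    out_t = f(t(y))
    return nn.functional.mse_loss(out_t, t(out))   # f(T(y)) should equal T(f(y))
```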
Domain adaptive noise reduction with iterative knowledge transfer and style generalization learning
IF 10.7, Tier 1 (Medicine)
Medical image analysis, Pub Date: 2024-08-24, DOI: 10.1016/j.media.2024.103327
Low-dose computed tomography (LDCT) denoising faces significant challenges in practical imaging scenarios. Supervised methods encounter difficulties in real-world settings because no paired data are available for training. Moreover, when applied to datasets with varying noise patterns, these methods may suffer decreased performance owing to the domain gap. Conversely, unsupervised methods do not require paired data and can be trained directly on real-world data, but they often exhibit inferior performance compared to supervised methods. To address this issue, it is necessary to leverage the strengths of both. In this paper, we propose a novel domain adaptive noise reduction framework (DANRF), which integrates knowledge transfer and style generalization learning to effectively tackle the domain gap problem. Specifically, an iterative knowledge transfer method with knowledge distillation is used to train the target model using unlabeled target data and a pre-trained source model trained with paired simulation data. Meanwhile, we introduce the mean teacher mechanism to update the source model, enabling it to adapt to the target domain. Furthermore, an iterative style generalization learning process is designed to enrich the style diversity of the training dataset. We evaluate the performance of our approach through experiments conducted on multi-source datasets. The results demonstrate the feasibility and effectiveness of the proposed DANRF model in multi-source LDCT image processing tasks. Given its hybrid nature, which combines the advantages of supervised and unsupervised learning, and its ability to bridge domain gaps, our approach is well suited for improving practical low-dose CT imaging in clinical settings. Code for our proposed approach is publicly available at https://github.com/tyfeiii/DANRF.
Citations: 0
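A minimal sketch of the mean-teacher update mentioned in the abstract: the teacher's parameters are an exponential moving average (EMA) of the student's, which is one common way to let the source/teacher model drift toward the target domain during iterative knowledge transfer. The network definitions and the distillation loss itself are omitted.

```python
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module, momentum: float = 0.999):
    """In-place EMA: teacher <- momentum * teacher + (1 - momentum) * student."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)
```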
VSmTrans: A hybrid paradigm integrating self-attention and convolution for 3D medical image segmentation
IF 10.7, Tier 1 (Medicine)
Medical image analysis, Pub Date: 2024-08-24, DOI: 10.1016/j.media.2024.103295
Purpose: Vision Transformers have recently achieved competitive performance compared with CNNs due to their excellent capability of learning global representations. However, there are two major challenges when applying them to 3D image segmentation: (i) because of the large size of 3D medical images, comprehensive global information is hard to capture due to the enormous computational cost; (ii) insufficient local inductive bias in Transformers affects the ability to segment detailed features such as ambiguous and subtly defined boundaries. Hence, to apply the Vision Transformer mechanism to medical image segmentation, the above challenges need to be adequately overcome.
Methods: We propose a hybrid paradigm, called Variable-Shape Mixed Transformer (VSmTrans), that integrates self-attention and convolution and enjoys both the free learning of complex relationships from the self-attention mechanism and the local prior knowledge from convolution. Specifically, we designed a Variable-Shape self-attention mechanism, which can rapidly expand the receptive field without extra computing cost and achieve a good trade-off between global awareness and local details. In addition, the parallel convolution paradigm introduces strong local inductive bias to facilitate the ability to capture fine details. Meanwhile, a pair of learnable parameters automatically adjusts the importance of the above two paradigms. Extensive experiments were conducted on two public medical image datasets with different modalities: the AMOS CT dataset and the BraTS2021 MRI dataset.
Results: Our method achieves the best average Dice scores of 88.3% and 89.7% on these datasets, which are superior to the previous state-of-the-art Swin Transformer-based and CNN-based architectures. A series of ablation experiments were also conducted to verify the efficiency of the proposed hybrid mechanism and its components, and to explore the effectiveness of the key parameters in VSmTrans.
Conclusions: The proposed hybrid Transformer-based backbone network for 3D medical image segmentation tightly integrates self-attention and convolution to exploit the advantages of these two paradigms. The experimental results demonstrate our method's superiority compared to other state-of-the-art methods, and suggest that the hybrid paradigm is well suited to the medical image segmentation field. The ablation experiments also demonstrate that the proposed hybrid mechanism can effectively balance large receptive fields with local inductive biases, resulting in highly accurate segmentation, especially in capturing details. Our code is available at https://github.com/qingze-bai/VSmTrans.
Citations: 0
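A minimal sketch of the hybrid idea in the abstract above: a self-attention branch and a parallel convolution branch are combined through a pair of learnable scalars that weight the two paradigms. The Variable-Shape attention itself is replaced here by ordinary multi-head attention; this only illustrates the mixing mechanism, not the published architecture.

```python
import torch
import torch.nn as nn

class HybridBlock(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.conv = nn.Conv3d(dim, dim, kernel_size=3, padding=1)
        self.alpha = nn.Parameter(torch.ones(1))   # learnable importance of the attention branch
        self.beta = nn.Parameter(torch.ones(1))    # learnable importance of the convolution branch

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, D, H, W) feature volume
        b, c, d, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)                 # (B, D*H*W, C)
        attn_out, _ = self.attn(tokens, tokens, tokens)
        attn_out = attn_out.transpose(1, 2).view(b, c, d, h, w)
        conv_out = self.conv(x)
        return self.alpha * attn_out + self.beta * conv_out   # weighted fusion of the two paradigms
```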
Metadata-conditioned generative models to synthesize anatomically-plausible 3D brain MRIs
IF 10.7, Tier 1 (Medicine)
Medical image analysis, Pub Date: 2024-08-24, DOI: 10.1016/j.media.2024.103325
Recent advances in generative models have paved the way for enhanced generation of natural and medical images, including synthetic brain MRIs. However, current AI research mainly focuses on optimizing synthetic MRIs with respect to visual quality (such as signal-to-noise ratio) while lacking insights into their relevance to neuroscience. To generate high-quality T1-weighted MRIs relevant for neuroscience discovery, we present a two-stage Diffusion Probabilistic Model (called BrainSynth) to synthesize high-resolution MRIs conditionally dependent on metadata (such as age and sex). We then propose a novel procedure to assess the quality of BrainSynth according to how well its synthetic MRIs capture macrostructural properties of brain regions and how accurately they encode the effects of age and sex. Results indicate that more than half of the brain regions in our synthetic MRIs are anatomically plausible, i.e., the effect size between real and synthetic MRIs is small relative to biological factors such as age and sex. Moreover, anatomical plausibility varies across cortical regions according to their geometric complexity. As is, the MRIs generated by BrainSynth significantly improve the training of a predictive model to identify accelerated aging effects in an independent study. These results indicate that our model accurately captures the brain's anatomical information and thus could enrich the data of underrepresented samples in a study. The code of BrainSynth will be released as part of the MONAI project at https://github.com/Project-MONAI/GenerativeModels.
Citations: 0
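A minimal sketch of metadata conditioning in the spirit of the abstract above: age and sex are embedded into a single conditioning vector that a diffusion denoiser could consume (for example, added to its timestep embedding). The embedding sizes, the sex coding, and the denoiser are assumptions of this illustration; the two-stage BrainSynth model is not reproduced.

```python
import torch
import torch.nn as nn

class MetadataEmbedding(nn.Module):
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.age_mlp = nn.Sequential(nn.Linear(1, embed_dim), nn.SiLU(),
                                     nn.Linear(embed_dim, embed_dim))
        self.sex_embed = nn.Embedding(2, embed_dim)   # 0 = female, 1 = male (assumed coding)

    def forward(self, age: torch.Tensor, sex: torch.Tensor) -> torch.Tensor:
        # age: (B, 1) in years; sex: (B,) integer labels
        return self.age_mlp(age) + self.sex_embed(sex)

# Usage: cond = MetadataEmbedding()(torch.tensor([[67.0]]), torch.tensor([1]))
```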
Establishing group-level brain structural connectivity incorporating anatomical knowledge under latent space modeling
IF 10.7, Tier 1 (Medicine)
Medical image analysis, Pub Date: 2024-08-23, DOI: 10.1016/j.media.2024.103309
Brain structural connectivity, capturing the white matter fiber tracts among brain regions inferred by diffusion MRI (dMRI), provides a unique characterization of brain anatomical organization. One fundamental question is how to properly summarize and perform statistical inference for a group-level connectivity architecture, for instance, across different sex groups or disease cohorts. Existing analyses commonly summarize group-level brain connectivity by a simple entry-wise sample mean or median across individual brain connectivity matrices. However, such a heuristic approach fully ignores the associations among structural connections and the topological properties of brain networks. In this work, we propose a latent space-based generative network model to estimate group-level brain connectivity. Within our modeling framework, we incorporate the anatomical information of brain regions as node attributes to enhance the plausibility of our estimation and improve biological interpretation. We name our method the attributes-informed brain connectivity (ABC) model, which, compared with existing group-level connectivity estimations, (1) offers an interpretable latent space representation of the group-level connectivity, (2) incorporates the anatomical knowledge of nodes and tests its co-varying relationship with connectivity, and (3) quantifies the uncertainty and evaluates the likelihood of the estimated group-level effects against chance. We devise a novel Bayesian MCMC algorithm to estimate the model and evaluate its performance through extensive simulations. By applying the ABC model to study brain structural connectivity stratified by sex among Alzheimer's Disease (AD) subjects and healthy controls, incorporating anatomical attributes (volume, thickness, and area) on nodes, our method shows superior predictive power on out-of-sample structural connectivity and identifies meaningful sex-specific network neuromarkers for AD.
Citations: 0
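A minimal sketch of a latent-space connectivity model in the spirit of the abstract above: the probability of a connection between two regions is driven by the inner product of their latent positions plus a term involving node (anatomical) attributes. The ABC model is fitted with a Bayesian MCMC algorithm; here only the generative form of a single edge is shown, and the exact link function and attribute term are assumptions.

```python
import numpy as np

def edge_probability(z_i: np.ndarray, z_j: np.ndarray,
                     x_i: np.ndarray, x_j: np.ndarray,
                     beta: np.ndarray, alpha: float = 0.0) -> float:
    """z_*: latent positions; x_*: node attributes (e.g., volume, thickness, area); beta: attribute effects."""
    logit = alpha + z_i @ z_j + beta @ (x_i * x_j)   # latent-space similarity plus attribute co-variation
    return 1.0 / (1.0 + np.exp(-logit))              # sigmoid link to an edge probability
```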
Vessel-promoted OCT to OCTA image translation by heuristic contextual constraints
IF 10.7, Tier 1 (Medicine)
Medical image analysis, Pub Date: 2024-08-23, DOI: 10.1016/j.media.2024.103311
Optical Coherence Tomography Angiography (OCTA) is a crucial tool in the clinical screening of retinal diseases, allowing for accurate 3D imaging of blood vessels through non-invasive scanning. However, the hardware-based approach for acquiring OCTA images presents challenges due to the need for specialized sensors and expensive devices. In this paper, we introduce a novel method called TransPro, which translates readily available 3D Optical Coherence Tomography (OCT) images into 3D OCTA images without requiring any additional hardware modifications. Our TransPro method is primarily driven by two novel ideas that have been overlooked by prior work. The first idea derives from the critical observation that the OCTA projection map is generated by averaging pixel values from its corresponding B-scans along the Z-axis. Hence, we introduce a hybrid architecture incorporating a 3D adversarial generative network and a novel Heuristic Contextual Guidance (HCG) module, which effectively maintains the consistency of the generated OCTA images between 3D volumes and projection maps. The second idea is to improve the vessel quality in the translated OCTA projection maps. To this end, we propose a novel Vessel Promoted Guidance (VPG) module to enhance the network's attention to retinal vessels. Experimental results on two datasets demonstrate that TransPro outperforms state-of-the-art approaches, with relative improvements of around 11.4% in MAE, 2.7% in PSNR, 2% in SSIM, 40% in VDE, and 9.1% in VDC compared to the baseline method. The code is available at: https://github.com/ustlsh/TransPro.
Citations: 0
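A minimal sketch of the projection-consistency idea stated in the abstract above: because an OCTA projection map is obtained by averaging B-scans along the Z-axis, the Z-mean of the generated 3D OCTA volume can be tied to a (generated or reference) projection map. The axis convention and the full HCG/VPG modules are assumptions of this illustration.

```python
import torch

def projection_consistency_loss(fake_octa_volume: torch.Tensor,
                                projection_map: torch.Tensor) -> torch.Tensor:
    # fake_octa_volume: (B, 1, Z, H, W); projection_map: (B, 1, H, W)
    projected = fake_octa_volume.mean(dim=2)          # average the B-scans along the Z axis
    return torch.nn.functional.l1_loss(projected, projection_map)
```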
3DSAM-adapter: Holistic adaptation of SAM from 2D to 3D for promptable tumor segmentation
IF 10.7, Tier 1 (Medicine)
Medical image analysis, Pub Date: 2024-08-23, DOI: 10.1016/j.media.2024.103324
Although the Segment Anything Model (SAM) has achieved impressive results on general-purpose semantic segmentation with strong generalization on everyday images, its performance on medical image segmentation is less precise and less stable, especially for tumor segmentation tasks involving objects of small size, irregular shape, and low contrast. Notably, the original SAM architecture is designed for 2D natural images and therefore cannot effectively extract 3D spatial information from volumetric medical data. In this paper, we propose a novel adaptation method for transferring SAM from 2D to 3D for promptable medical image segmentation. Through a holistically designed scheme of architecture modification, we transfer SAM to support volumetric inputs while retaining the majority of its pre-trained parameters for reuse. The fine-tuning process is conducted in a parameter-efficient manner, wherein most of the pre-trained parameters remain frozen and only a few lightweight spatial adapters are introduced and tuned. Despite the domain gap between natural and medical data and the disparity in spatial arrangement between 2D and 3D, the transformer trained on natural images can effectively capture the spatial patterns present in volumetric medical images with only lightweight adaptations. We conduct experiments on four open-source tumor segmentation datasets, and with a single click prompt, our model outperforms domain-specific state-of-the-art medical image segmentation models and interactive segmentation models. We also compared our adaptation method with existing popular adapters and observed significant performance improvements on most datasets. Our code and models are available at: https://github.com/med-air/3DSAM-adapter.
Citations: 0
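A minimal sketch of parameter-efficient adaptation as described in the abstract above: the pre-trained backbone is frozen and only small bottleneck adapters are trained. The adapter shape, its insertion points, and the name-based freezing rule are simplifications; the paper's holistic 2D-to-3D modifications are not reproduced here.

```python
import torch
import torch.nn as nn

class SpatialAdapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual connection."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)      # zero-init so the adapter starts as an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.nn.functional.gelu(self.down(x)))

def freeze_except_adapters(model: nn.Module):
    # Assumes adapter submodules carry "adapter" in their parameter names.
    for name, param in model.named_parameters():
        param.requires_grad = "adapter" in name   # only adapter weights stay trainable
```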
Adaptive dynamic inference for few-shot left atrium segmentation
IF 10.7, Tier 1 (Medicine)
Medical image analysis, Pub Date: 2024-08-23, DOI: 10.1016/j.media.2024.103321
Accurate segmentation of the left atrium (LA) from late gadolinium-enhanced cardiac magnetic resonance (LGE CMR) images is crucial for aiding the treatment of patients with atrial fibrillation. Few-shot learning holds significant potential for achieving accurate LA segmentation with low demand for costly labeled LGE CMR data and fast generalization across different centers. However, accurate LA segmentation with few-shot learning is challenging due to the low intensity contrast between the LA and neighboring organs in LGE CMR images. To address this issue, we propose an Adaptive Dynamic Inference Network (ADINet) that explicitly models the differences between foreground and background. Specifically, ADINet leverages dynamic collaborative inference (DCI) and dynamic reverse inference (DRI) to adaptively allocate semantic-aware and spatial-specific convolution weights and indication information. These allocations are conditioned on the support foreground and background knowledge, utilizing pixel-wise correlations, for different spatial positions of query images. The convolution weights adapt to different visual patterns based on spatial position, enabling effective encoding of differences between foreground and background regions. Meanwhile, the indication information adapts to the background visual pattern to reversely decode foreground LA regions, leveraging their spatial complementarity. To promote the learning of ADINet, we propose hierarchical supervision, which enforces spatial consistency and differences between background and foreground regions through pixel-wise semantic supervision and pixel-pixel correlation supervision. We demonstrate the performance of ADINet on three LGE CMR datasets from different centers. Compared to state-of-the-art methods with ten available samples, ADINet yielded better segmentation performance in terms of four metrics.
Citations: 0
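A minimal sketch of conditioning on support foreground/background knowledge, as discussed in the abstract above: masked average pooling turns support features into a foreground and a background prototype, and cosine similarity with query features gives pixel-wise correlation maps that could drive dynamic weight/indication generation. The dynamic convolutions themselves are not reproduced, and this prototype-based formulation is an assumption of the illustration.

```python
import torch
import torch.nn.functional as F

def support_prototypes(feat: torch.Tensor, mask: torch.Tensor):
    """feat: (B, C, H, W) support features; mask: (B, 1, H, W) binary LA mask."""
    fg = (feat * mask).sum(dim=(2, 3)) / mask.sum(dim=(2, 3)).clamp(min=1e-6)
    bg = (feat * (1 - mask)).sum(dim=(2, 3)) / (1 - mask).sum(dim=(2, 3)).clamp(min=1e-6)
    return fg, bg                                        # each (B, C)

def correlation_maps(query_feat: torch.Tensor, fg: torch.Tensor, bg: torch.Tensor):
    """Cosine similarity of every query position with the foreground and background prototypes."""
    q = F.normalize(query_feat, dim=1)
    fg_map = (q * F.normalize(fg, dim=1)[..., None, None]).sum(dim=1, keepdim=True)
    bg_map = (q * F.normalize(bg, dim=1)[..., None, None]).sum(dim=1, keepdim=True)
    return fg_map, bg_map                                # each (B, 1, H, W)
```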
MA-SAM: Modality-agnostic SAM adaptation for 3D medical image segmentation
IF 10.7, Tier 1 (Medicine)
Medical image analysis, Pub Date: 2024-08-22, DOI: 10.1016/j.media.2024.103310
The Segment Anything Model (SAM), a foundation model for general image segmentation, has demonstrated impressive zero-shot performance across numerous natural image segmentation tasks. However, SAM's performance declines significantly when applied to medical images, primarily due to the substantial disparity between the natural and medical image domains. To effectively adapt SAM to medical images, it is important to incorporate critical third-dimensional information, i.e., volumetric or temporal knowledge, during fine-tuning. Simultaneously, we aim to harness SAM's pre-trained weights within its original 2D backbone to the fullest extent. In this paper, we introduce a modality-agnostic SAM adaptation framework, named MA-SAM, that is applicable to various volumetric and video medical data. Our method is rooted in a parameter-efficient fine-tuning strategy that updates only a small portion of weight increments while preserving the majority of SAM's pre-trained weights. By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from the input data. We comprehensively evaluate our method on five medical image segmentation tasks, using 11 public datasets across CT, MRI, and surgical video data. Remarkably, without using any prompt, our method consistently outperforms various state-of-the-art 3D approaches, surpassing nnU-Net by 0.9%, 2.6%, and 9.9% in Dice for CT multi-organ segmentation, MRI prostate segmentation, and surgical scene segmentation, respectively. Our model also demonstrates strong generalization and excels in challenging tumor segmentation when prompts are used. Our code is available at: https://github.com/cchen-cc/MA-SAM.
Citations: 0
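A minimal sketch of injecting third-dimensional information into a 2D backbone, in the spirit of the abstract above: patch tokens from a stack of D slices are temporarily reshaped into a volume so a small 3D convolution can mix information across slices, then reshaped back. The placement, bottleneck size, and exact design of MA-SAM's 3D adapters are not reproduced here; this is an assumed layout for illustration.

```python
import torch
import torch.nn as nn

class Adapter3D(nn.Module):
    def __init__(self, dim: int, bottleneck: int = 32):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.conv3d = nn.Conv3d(bottleneck, bottleneck, kernel_size=3, padding=1)
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)    # zero-init so the adapter starts as an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor, depth: int) -> torch.Tensor:
        # x: (B*depth, H, W, C) patch tokens from a 2D image encoder block
        bd, h, w, c = x.shape
        y = self.down(x).view(bd // depth, depth, h, w, -1).permute(0, 4, 1, 2, 3)  # (B, c', D, H, W)
        y = self.conv3d(y).permute(0, 2, 3, 4, 1).reshape(bd, h, w, -1)             # back to tokens
        return x + self.up(y)
```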