Towards a CMR Foundation Model for Multi-Task Cardiac Image Analysis
Athira J Jacob, Indraneel Borgohain, Teodora Chitiboi, Puneet Sharma, Dorin Comaniciu, Daniel Rueckert
Journal of Cardiovascular Magnetic Resonance, article 101967. Published 2025-10-02.
DOI: https://doi.org/10.1016/j.jocmr.2025.101967
Abstract
Background: Cardiac magnetic resonance (CMR) is a complex imaging modality requiring a broad variety of image processing tasks for comprehensive assessment of the study. Recently, foundation models (FMs) have shown promise for automated image analysis in natural images (NI). In this study, a CMR-specific vision FM was developed and then finetuned in a supervised manner for 9 different imaging tasks typical of a CMR workflow, including classification, segmentation, landmark localization, and pathology detection.
Methods: A ViT-S/8 model was trained in a self-supervised manner using DINO on 36 million CMR images from 27,524 subjects, drawn from three sources (UK Biobank and two clinical centers). The model was then finetuned for 9 tasks: classification (sequence, cine view), segmentation (cine SAX, cine LAX, LGE SAX, Mapping SAX), landmark localization, and pathology detection (LGE, cardiac disease), using data from public sources and 3 clinical datasets. The results were compared against metrics from state-of-the-art methods on the same tasks. A comparable baseline model was also trained on the same datasets for direct comparison. Additionally, the effect of pretraining strategy, as well as generalization and few-shot performance (training on few labeled samples), was explored for the pretrained model and compared to the baseline.
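To make the self-supervised step more concrete, the following is a minimal sketch of the DINO objective for a single pair of augmented views: the teacher output is centered and sharpened with a low temperature, and the student is trained to match it through a cross-entropy loss. The function names, the temperatures (0.1 and 0.04), and the momentum value used in the test are assumptions taken from commonly used DINO settings, not values reported in this abstract, and the sketch omits the ViT backbone, multi-crop augmentation, and optimizer entirely.

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of a list of floats at a given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_total for x in scaled]


def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between teacher and student distributions for one view pair.

    The teacher logits are centered (subtracting `center`) and sharpened with a
    low temperature. Temperatures are assumed defaults, not values from the study.
    """
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = [math.exp(v) for v in log_softmax(centered, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lp for p, lp in zip(teacher_probs, student_log_probs))


def ema_update(old, new, momentum):
    """Exponential moving average, usable for the center or the teacher weights."""
    return [momentum * o + (1.0 - momentum) * n for o, n in zip(old, new)]
```

In a full training loop, the loss would be summed over pairs of different views, gradients would flow only through the student, and `ema_update` would refresh both the center and the teacher parameters after each step.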
Results: The proposed model matched or moderately improved on results reported in the literature in most tasks (except disease detection), without any task-specific optimization of methodology. The proposed model outperformed the baseline in most cases, with an average increase of 6.8 percentage points (pp) for cine view classification, and 0.1 to 1.8 pp for segmentation tasks. The proposed method also obtained generally lower standard deviations in the metrics. Improvements of 3.7 and 6.6 pp for hyperenhancement detection from LGE and 14 pp for disease detection were observed. Ablation studies highlighted the importance of pretraining strategy and architecture, and the impact of domain shifts from pretraining to finetuning. Moreover, the CMR-pretrained model achieved better generalization and few-shot performance compared to the baseline.
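Because the gains above are stated in percentage points rather than percent, the short sketch below shows the difference between the two. The scores are hypothetical and chosen only to illustrate the arithmetic (a baseline of 0.80 is assumed so that a 6.8 pp gain gives 0.868); they are not values from the study.

```python
def percentage_point_gain(baseline, proposed):
    """Absolute difference between two fractional scores, in percentage points."""
    return (proposed - baseline) * 100.0


def relative_gain_percent(baseline, proposed):
    """Change relative to the baseline score, in percent."""
    return (proposed - baseline) / baseline * 100.0


# Hypothetical scores, for illustration only (not results from the paper).
baseline_score = 0.80
proposed_score = 0.868

pp_gain = percentage_point_gain(baseline_score, proposed_score)    # about 6.8 pp
rel_gain = relative_gain_percent(baseline_score, proposed_score)   # about 8.5 %
```

The same absolute gain corresponds to a larger relative gain when the baseline is lower, which is why the two figures should not be used interchangeably.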
Conclusions: Vision FMs specialized for medical imaging can improve accuracy and robustness over NI-FMs. Self-supervised pretraining offers a resource-efficient, unified framework for CMR assessment, with the potential to accelerate the development of deep learning-based solutions for image analysis tasks, even when little annotated data is available.
About the journal:
Journal of Cardiovascular Magnetic Resonance (JCMR) publishes high-quality articles on all aspects of basic, translational and clinical research on the design, development, manufacture, and evaluation of cardiovascular magnetic resonance (CMR) methods applied to the cardiovascular system. Topical areas include, but are not limited to:
New applications of magnetic resonance to improve the diagnostic strategies, risk stratification, characterization and management of diseases affecting the cardiovascular system.
New methods to enhance or accelerate image acquisition and data analysis.
Results of multicenter, or larger single-center studies that provide insight into the utility of CMR.
Basic biological perceptions derived by CMR methods.