AEM: An interpretable multi-task multi-modal framework for cardiac disease prediction
Authors: Jiachuan Peng, Marcel Beetz, Abhirup Banerjee, Min Chen, Vicente Grau
Journal: Medical Image Analysis, volume 109, Article 103951 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 11.8)
DOI: 10.1016/j.media.2026.103951
Published: 2026-03-01 (Epub 2026-01-16)
URL: https://www.sciencedirect.com/science/article/pii/S1361841526000204
Citations: 0
Abstract
Cardiovascular disease (CVD) is one of the leading causes of death and illness worldwide. In particular, early prediction of heart failure (HF) is complicated by the heterogeneity of its clinical presentations and symptoms. These challenges underscore the need for a multidisciplinary approach to the comprehensive evaluation of cardiac state. To this end, we select the electrocardiogram (ECG) and 3D cardiac anatomy for their complementary coverage of cardiac electrical activity and fine-grained structural modeling. Building on this, we present a novel pre-training framework, the Anatomy-Electrocardiogram Model (AEM), to explore their complex interactions. AEM adopts a multi-task self-supervised scheme that combines a masked reconstruction objective with a cardiac measurement (CM) regression branch to embed cardiac functional priors and structural details. Unlike image-domain models, which typically localize the whole heart within the image, our 3D anatomy is background-free and continuous in 3D space, so the model can naturally concentrate on finer structures at the patch level. Further integration with the ECG captures functional dynamics through electrical conduction, yielding a holistic cardiac representation. Extensive experiments are conducted on multi-modal datasets collected from the UK Biobank, which contain paired biventricular point cloud anatomy and 12-lead ECG data. Under linear evaluation, AEM achieves an area under the receiver operating characteristic curve (AUROC) of 0.8192 for incident HF prediction and a concordance index of 0.6976 for survival prediction, outperforming state-of-the-art multi-modal methods. Additionally, we study the interpretability of the disease predictions and observe that our model recognizes clinically plausible patterns and exhibits a high association with clinical features.
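The two metrics reported in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code: the function names `auroc` and `concordance_index` are hypothetical, and the abstract does not say which concordance-index estimator was used, so Harrell's pairwise definition is assumed here.

```python
from itertools import combinations


def auroc(labels, scores):
    """AUROC as the probability that a random positive case is scored
    above a random negative case (tied scores count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def concordance_index(times, events, risks):
    """Harrell's C (assumed variant): among comparable pairs, the fraction
    in which the subject who fails earlier has the higher predicted risk."""
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the
        # shorter time is an observed event (not censored).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Both functions are quadratic in the number of subjects, which is acceptable for illustration; for a cohort on the scale of the UK Biobank, a rank-based or library implementation would be the more practical choice.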
About the journal:
Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.