Large Language Model-Informed ECG Dual Attention Network for Heart Failure Risk Prediction

IF 5.7 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Big Data Pub Date : 2025-01-30 DOI:10.1109/TBDATA.2025.3536922

Chen Chen;Lei Li;Marcel Beetz;Abhirup Banerjee;Ramneek Gupta;Vicente Grau

IEEE Transactions on Big Data, vol. 11, no. 3, pp. 948-960. Full text: https://ieeexplore.ieee.org/document/10858425/

Citations: 0

Abstract

Heart failure (HF) poses a significant public health challenge, with a rising global mortality rate. Early detection and prevention of HF could significantly reduce its impact. We introduce a novel methodology for predicting HF risk using 12-lead electrocardiograms (ECGs). We present a novel, lightweight dual attention ECG network designed to capture complex ECG features essential for early HF risk prediction, despite the notable imbalance between low- and high-risk groups. This network incorporates a cross-lead attention module and 12 lead-specific temporal attention modules, focusing on cross-lead interactions and each lead's local dynamics. To further alleviate model overfitting, we leverage a large language model (LLM) with a public ECG-Report dataset for pretraining on an ECG-Report alignment task. The network is then fine-tuned for HF risk prediction using two specific cohorts from the U.K. Biobank study, focusing on patients with hypertension (UKB-HYP) and those who have had a myocardial infarction (UKB-MI). The results reveal that LLM-informed pretraining substantially enhances HF risk prediction in these cohorts. The dual attention design improves not only interpretability but also predictive accuracy, outperforming existing competitive methods with C-index scores of 0.6349 for UKB-HYP and 0.5805 for UKB-MI. This demonstrates our method's potential in advancing HF risk assessment with clinically complex ECG data.
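The C-index scores reported above measure how well predicted risks rank patients by time to event. As a minimal sketch, assuming Harrell's concordance index (the abstract does not say which variant the authors use), the function below counts, among comparable patient pairs, how often the patient with the earlier observed event was given the higher risk. The function name, argument names, and toy data are illustrative assumptions, not taken from the paper, and pairs with tied follow-up times are skipped for simplicity.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. HF) was observed, 0 if censored
    risks  -- predicted risk score; higher means an earlier event is expected
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient i has the shorter follow-up.
        if times[j] < times[i]:
            i, j = j, i
        # A pair is comparable only if the shorter follow-up ended in an
        # observed event; tied times are skipped in this simplified version.
        if times[i] == times[j] or not events[i]:
            continue
        comparable += 1
        if risks[i] > risks[j]:
            concordant += 1.0   # earlier event received the higher risk
        elif risks[i] == risks[j]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, two of them censored.
times = [2, 4, 6, 8]
events = [1, 0, 1, 0]
risks = [0.9, 0.3, 0.1, 0.5]
print(concordance_index(times, events, risks))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for scores such as 0.6349 and 0.5805.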


Source journal

IEEE Transactions on Big Data

CiteScore: 11.80

Self-citation rate: 2.80%

Articles published: 114

Journal description: The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.