Disentangled representational learning for anomaly detection in single-lead electrocardiogram signals using variational autoencoder

Impact factor 7; CAS Zone 2 (Medicine); JCR Q1 (Biology)

Computers in Biology and Medicine. Pub Date: 2024-11-23. DOI: 10.1016/j.compbiomed.2024.109422

Maximilian Kapsecker, Matthias C. Möller, Stephan M. Jonas

Volume 184, Article 109422. Full text: https://www.sciencedirect.com/science/article/pii/S0010482524015075

Citations: 0

Abstract

Wearable technology enables the unsupervised recording of electrocardiogram (ECG) signals. Analyzing these high-dimensional ECG data poses challenges regarding statistical approaches and explainability. This work investigates the feasibility of medically explainable anomaly detection through disentangled representational learning of ECGs and personalization to mitigate inter-subject variations. Five open-source ECG datasets were converted into a set of denoised one-second traces of lead I signal, each covering individual features such as wave morphologies and pathologies. A beta total correlation variational autoencoder was optimized on four of these datasets for 68 systematic parameterization variants. The best-performing model revealed disentanglement in the 12-dimensional embedding space, specifically between atrial and ventricular features. Within the embedding space, a k-nearest neighbor classifier was evaluated on a left-out test set tailored for anomaly detection. The result was an F1 score of 0.94 for the binary prediction of sinus rhythm versus the pathological classes: left bundle branch block, right bundle branch block, myocardial infarction, and first-degree AV block. The 90.94% accuracy in anomaly detection falls within the range of established detectors (89.38%–99.77%) but offers the advantage of being explainable and largely unsupervised. Model fine-tuning for each of 100 randomly sampled individuals from the Icentia11k dataset mitigated inter-subject variations. The associated F1 score for predicting normal, premature atrial contraction, and premature ventricular contraction from the embedding space was 0.93. The distribution plots of pathologies along the explainable axis were reasonably consistent with medical expertise. The results suggest that the presented disentangled variational autoencoder is a robust method for explainable ECG representation.
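
The evaluation step described in the abstract (a k-nearest neighbor classifier applied in the 12-dimensional embedding space and scored with F1) can be illustrated with a minimal sketch. This is not the authors' code: the embeddings below are synthetic Gaussian clusters rather than encoder outputs, and the choice of k = 5, the Euclidean distance, and all function names are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k training embeddings closest to `query`.

    Euclidean distance and k=5 are assumptions; the abstract does not
    state which metric or k was used.
    """
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda item: math.dist(item[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def make_embeddings(n, center, rng, dim=12):
    """Synthetic stand-in for encoder outputs: n points around `center`."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)

# Label 0 = sinus rhythm, label 1 = anomaly (hypothetical toy data).
train_x = make_embeddings(100, 0.0, rng) + make_embeddings(100, 3.0, rng)
train_y = [0] * 100 + [1] * 100
test_x = make_embeddings(30, 0.0, rng) + make_embeddings(30, 3.0, rng)
test_y = [0] * 30 + [1] * 30

pred = [knn_predict(train_x, train_y, q, k=5) for q in test_x]
score = f1_score(test_y, pred)
print(f"F1 on synthetic embeddings: {score:.2f}")
```

Because the toy clusters are well separated, the score here says nothing about real ECG data; the sketch only shows how a label is read off from neighbors in the embedding space and how the F1 metric is computed from the resulting predictions.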

Source journal

Computers in Biology and Medicine (Engineering & Technology: Engineering, Biomedical)

CiteScore: 11.70

Self-citation rate: 10.40%

Articles published: 1086

Review time: 74 days

About the journal: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.