An explainable machine learning (XAI) framework to enhance types of cardiovascular disease diagnosis and prognosis.

IF 2, Zone 4 (Medicine), Q3 ENGINEERING, BIOMEDICAL

Physical and Engineering Sciences in Medicine. Pub Date: 2025-09-30. DOI: 10.1007/s13246-025-01653-8

K Adalarasu, B Raghavan, B Madhavan, Sivanandam Venkatesh, Rengarajan Amirtharajan

Citations: 0

Abstract

The World Health Organisation's 2024 report identifies cardiovascular disease (CVD) as the leading cause of death worldwide, responsible for an estimated 17.9 million deaths annually, or about 32% of all deaths. Of these, about 85% are due to myocardial infarction and stroke. This study aims to diagnose heart disorders early, so that timely medical intervention can reduce the risks associated with abnormal heart structures. To that end, a data-driven model has been developed. The CVD and standard electrocardiogram (ECG) datasets are obtained from PhysioNet in CSV format. The combined dataset comprises 305 samples of normal heart function, 15 samples of congestive heart failure, 32 samples of intracardiac atrial fibrillation, and 77 samples of supraventricular arrhythmia. The key steps are preprocessing the raw ECG data, extracting the relevant features, and passing them to the machine learning (ML) model for training. After preprocessing, the ECG characteristic features (mean heart interval, RR interval, and P-wave, Q-wave, R-wave, and T-wave amplitudes) and two derived features (root mean square of successive differences, RMSSD, and mean standard deviation of the normal-to-normal interval, SDNN) are extracted from the ECG signal, and eXplainable Artificial Intelligence (XAI) methods are used to explain the feature contributions. Several ML algorithms, including ensemble (EN), Naive Bayes (NB), and Support Vector Machine (SVM), are implemented and compared. Performance is assessed with tenfold cross-validation using accuracy and recall. Among the four models evaluated, SVM performs best across the feature sets, achieving 99.5% accuracy with all features, 77% accuracy with the two derived features alone, and 99.5% accuracy with the ECG wave characteristic features. To address limitations such as the small dataset and class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) is applied to further enhance model performance. This study demonstrates the effectiveness of ML models, notably SVM, in predicting CVD abnormalities from ECG characteristics. The results suggest that future research should focus on refining methods for identifying the key ECG wave features, which could streamline and speed up real-time CVD prediction. This work uses XAI techniques to make the models more transparent and understandable, and improves the SVM model accuracy to 99.8%. Furthermore, increasing model transparency with XAI might facilitate quicker clinical adoption for the diagnosis of heart disease.
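The two derived features named in the abstract have standard textbook definitions, which can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `rmssd` and `sdnn`, the sample RR values, and the use of the sample (n - 1) standard deviation are all assumptions, and the abstract's "mean" standard deviation may refer to averaging over segments, which is not shown here.

```python
import math
from statistics import stdev


def rmssd(rr_ms):
    """Root mean square of successive differences between adjacent RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def sdnn(rr_ms):
    """Standard deviation of the normal-to-normal intervals (ms).

    Assumption: sample standard deviation (n - 1 denominator).
    """
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    return stdev(rr_ms)


# Hypothetical RR intervals in milliseconds, for illustration only.
rr_example = [800.0, 810.0, 790.0, 805.0, 795.0]

print(f"RMSSD = {rmssd(rr_example):.2f} ms")
print(f"SDNN  = {sdnn(rr_example):.2f} ms")
```

Both functions take a plain list of RR intervals, so they could be applied per record after R-peak detection; how the study segments or filters the intervals is not stated in the abstract.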

Source journal

Physical and Engineering Sciences in Medicine Multiple-

CiteScore

8.40

Self-citation rate

4.50%

Articles published

110