An explainable machine learning (XAI) framework to enhance types of cardiovascular disease diagnosis and prognosis
K Adalarasu, B Raghavan, B Madhavan, Sivanandam Venkatesh, Rengarajan Amirtharajan
Physical and Engineering Sciences in Medicine, published 2025-09-30
DOI: 10.1007/s13246-025-01653-8 (https://doi.org/10.1007/s13246-025-01653-8)
Abstract
According to the World Health Organisation's 2024 report, cardiovascular disease (CVD) is the leading cause of death worldwide, responsible for an estimated 17.9 million deaths annually, or about 32% of all deaths. Of these, about 85% are due to myocardial infarction and stroke. This study aims to support the early diagnosis of heart disorders so that timely medical intervention can reduce the risks associated with abnormal heart structures. To this end, a data-driven model has been developed. The CVD and standard electrocardiogram (ECG) datasets are obtained from PhysioNet in CSV format. The combined dataset comprises 305 samples of normal heart function, 15 samples of congestive heart failure, 32 samples of intracardiac atrial fibrillation, and 77 samples of supraventricular arrhythmia. The key steps are preprocessing the raw ECG data, extracting the relevant features, and feeding them to the machine learning (ML) model for training. After preprocessing, the ECG characteristic features (mean heart interval, RR interval, and P-wave, Q-wave, R-wave, and T-wave amplitudes) and two derived features, the root mean square of successive differences (RMSSD) and the mean standard deviation of the normal-to-normal interval (SDNN), are extracted from the ECG signal, and eXplainable Artificial Intelligence (XAI) methods are used to explain the contribution of each feature. Several ML algorithms, including ensemble (EN), Naive Bayes (NB), and Support Vector Machine (SVM) classifiers, are implemented and compared. The models are evaluated with tenfold cross-validation, and performance is assessed using accuracy and recall. Among the four models, SVM performs best across the feature sets, achieving 99.5% accuracy with all features, 77% accuracy with the two derived features alone, and 99.5% accuracy with the ECG wave characteristic features.
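The two derived features named above can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it assumes the RR intervals are already available in milliseconds, takes RMSSD as the square root of the mean squared difference between successive intervals, and takes the standard deviation of normal-to-normal intervals (conventionally abbreviated SDNN) as the sample standard deviation. The function names and the RR values are hypothetical.

```python
import math
import statistics


def rmssd(rr_ms):
    """Root mean square of successive differences between adjacent RR intervals."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def sdnn(rr_ms):
    """Sample standard deviation of the RR (normal-to-normal) intervals.

    Assumption: the abstract does not state which estimator is used;
    the sample (n - 1) form is chosen here for illustration.
    """
    return statistics.stdev(rr_ms)


# Hypothetical RR intervals in milliseconds (illustrative, not from the paper).
rr = [800, 810, 790, 805, 795]

features = {
    "mean_rr": statistics.mean(rr),
    "rmssd": rmssd(rr),
    "sdnn": sdnn(rr),
}
print(features)
```

RMSSD reacts to beat-to-beat changes, while SDNN summarises overall variability across the recording, so the two values can differ considerably for the same signal.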
To address limitations such as the small dataset and class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) is applied to further enhance model performance. This study demonstrates the effectiveness of ML models, notably SVM, in predicting CVD abnormalities from ECG characteristics. The results suggest that future research should focus on refining methods for identifying the key ECG wave features, which could streamline and speed up real-time CVD prediction. This work uses XAI techniques to make the models more transparent and understandable and to improve model accuracy, reaching 99.8% for SVM. Furthermore, increasing model transparency with XAI might facilitate quicker clinical adoption for the diagnosis of heart disease.
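The core idea of SMOTE can be illustrated with a short sketch: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The code below is a simplified, hypothetical version written for illustration. It is not the authors' pipeline, and both the choice to oversample every class up to the majority count and the two-feature sample values are assumptions. Only the class counts come from the abstract.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j]))
        )
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Class counts as reported in the abstract.
counts = {"normal": 305, "chf": 15, "afib": 32, "sva": 77}

# Assumption: balance every class up to the size of the largest one.
target = max(counts.values())
needed = {name: target - n for name, n in counts.items()}
print(needed)  # {'normal': 0, 'chf': 290, 'afib': 273, 'sva': 228}

# Hypothetical two-feature minority samples (illustrative values only).
chf = [[0.80, 12.0], [0.82, 15.0], [0.78, 11.0], [0.85, 14.0]]
new_points = smote(chf, n_new=6, k=3)
```

Because every synthetic point is an interpolation of existing samples, oversampling is usually applied only to the training folds during cross-validation, so that synthetic copies of a sample do not leak into the fold used for testing.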