{"title":"Predicting stroke severity of patients using interpretable machine learning algorithms.","authors":"Amir Sorayaie Azar, Tahereh Samimi, Ghanbar Tavassoli, Amin Naemi, Bahlol Rahimi, Zahra Hadianfard, Uffe Kock Wiil, Surena Nazarbaghi, Jamshid Bagherzadeh Mohasefi, Hadi Lotfnezhad Afshar","doi":"10.1186/s40001-024-02147-1","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Stroke is a significant global health concern, ranking as the second leading cause of death and placing a substantial financial burden on healthcare systems, particularly in low- and middle-income countries. Timely evaluation of stroke severity is crucial for predicting clinical outcomes, with standard assessment tools being the Rapid Arterial Occlusion Evaluation (RACE) and the National Institutes of Health Stroke Scale (NIHSS). This study aims to utilize Machine Learning (ML) algorithms to predict stroke severity using these two distinct scales.</p><p><strong>Methods: </strong>We conducted this study using two datasets collected from hospitals in Urmia, Iran, corresponding to stroke severity assessments based on RACE and NIHSS. Seven ML algorithms were applied, including K-Nearest Neighbor (KNN), Decision Tree (DT), Random Forest (RF), Adaptive Boosting (AdaBoost), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), and Artificial Neural Network (ANN). Hyperparameter tuning was performed using grid search to optimize model performance, and SHapley Additive Explanations (SHAP) were used to interpret the contribution of individual features.</p><p><strong>Results: </strong>Among the models, the RF achieved the highest performance, with accuracies of 92.68% for the RACE dataset and 91.19% for the NIHSS dataset. The Area Under the Curve (AUC) was 92.02% and 97.86% for the RACE and NIHSS datasets, respectively. The SHAP analysis identified triglyceride levels, length of hospital stay, and age as critical predictors of stroke severity.</p><p><strong>Conclusions: </strong>This study is the first to apply ML models to the RACE and NIHSS scales for predicting stroke severity. The use of SHAP enhances the interpretability of the models, increasing clinicians' trust in these ML algorithms. The best-performing ML model can be a valuable tool for assisting medical professionals in predicting stroke severity in clinical settings.</p>","PeriodicalId":11949,"journal":{"name":"European Journal of Medical Research","volume":null,"pages":null},"PeriodicalIF":2.8000,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11562860/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"European Journal of Medical Research","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1186/s40001-024-02147-1","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MEDICINE, RESEARCH & EXPERIMENTAL","Score":null,"Total":0}
引用次数: 0
Abstract
Background: Stroke is a significant global health concern, ranking as the second leading cause of death and placing a substantial financial burden on healthcare systems, particularly in low- and middle-income countries. Timely evaluation of stroke severity is crucial for predicting clinical outcomes, with standard assessment tools being the Rapid Arterial Occlusion Evaluation (RACE) and the National Institutes of Health Stroke Scale (NIHSS). This study aims to utilize Machine Learning (ML) algorithms to predict stroke severity using these two distinct scales.
Methods: We conducted this study using two datasets collected from hospitals in Urmia, Iran, corresponding to stroke severity assessments based on RACE and NIHSS. Seven ML algorithms were applied, including K-Nearest Neighbor (KNN), Decision Tree (DT), Random Forest (RF), Adaptive Boosting (AdaBoost), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), and Artificial Neural Network (ANN). Hyperparameter tuning was performed using grid search to optimize model performance, and SHapley Additive Explanations (SHAP) were used to interpret the contribution of individual features.
Results: Among the models, the RF achieved the highest performance, with accuracies of 92.68% for the RACE dataset and 91.19% for the NIHSS dataset. The Area Under the Curve (AUC) was 92.02% and 97.86% for the RACE and NIHSS datasets, respectively. The SHAP analysis identified triglyceride levels, length of hospital stay, and age as critical predictors of stroke severity.
Conclusions: This study is the first to apply ML models to the RACE and NIHSS scales for predicting stroke severity. The use of SHAP enhances the interpretability of the models, increasing clinicians' trust in these ML algorithms. The best-performing ML model can be a valuable tool for assisting medical professionals in predicting stroke severity in clinical settings.
期刊介绍:
European Journal of Medical Research publishes translational and clinical research of international interest across all medical disciplines, enabling clinicians and other researchers to learn about developments and innovations within these disciplines and across the boundaries between disciplines. The journal publishes high quality research and reviews and aims to ensure that the results of all well-conducted research are published, regardless of their outcome.