Predicting stroke severity of patients using interpretable machine learning algorithms.

IF 2.8 · Zone 3 (Medicine) · JCR Q2 MEDICINE, RESEARCH & EXPERIMENTAL

European Journal of Medical Research Pub Date : 2024-11-14 DOI:10.1186/s40001-024-02147-1

Amir Sorayaie Azar, Tahereh Samimi, Ghanbar Tavassoli, Amin Naemi, Bahlol Rahimi, Zahra Hadianfard, Uffe Kock Wiil, Surena Nazarbaghi, Jamshid Bagherzadeh Mohasefi, Hadi Lotfnezhad Afshar


Citations: 0

Abstract


Predicting stroke severity of patients using interpretable machine learning algorithms.

Background: Stroke is a significant global health concern, ranking as the second leading cause of death and placing a substantial financial burden on healthcare systems, particularly in low- and middle-income countries. Timely evaluation of stroke severity is crucial for predicting clinical outcomes, with standard assessment tools being the Rapid Arterial Occlusion Evaluation (RACE) and the National Institutes of Health Stroke Scale (NIHSS). This study aims to utilize Machine Learning (ML) algorithms to predict stroke severity using these two distinct scales.

Methods: We conducted this study using two datasets collected from hospitals in Urmia, Iran, corresponding to stroke severity assessments based on RACE and NIHSS. Seven ML algorithms were applied, including K-Nearest Neighbor (KNN), Decision Tree (DT), Random Forest (RF), Adaptive Boosting (AdaBoost), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), and Artificial Neural Network (ANN). Hyperparameter tuning was performed using grid search to optimize model performance, and SHapley Additive Explanations (SHAP) were used to interpret the contribution of individual features.
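The grid-search step described in the Methods can be sketched in a few lines. The abstract does not give the search grid, the validation scheme, or the features used, so everything below is an assumption for illustration: the data are synthetic, the model is a hand-written K-Nearest Neighbor classifier (one of the seven algorithms named), and the grid over `k` and `p` is hypothetical. In practice a library routine such as scikit-learn's `GridSearchCV` would normally replace this loop.

```python
import itertools
import random


def knn_predict(train_X, train_y, x, k, p):
    """Majority label among the k nearest training points (Minkowski distance of order p)."""
    dists = sorted(
        (sum(abs(a - b) ** p for a, b in zip(row, x)) ** (1 / p), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def cv_accuracy(X, y, k, p, n_folds=5):
    """Mean accuracy over n_folds contiguous folds (data assumed already shuffled)."""
    n = len(X)
    scores = []
    for f in range(n_folds):
        test_idx = set(range(f * n // n_folds, (f + 1) * n // n_folds))
        train_X = [X[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        hits = sum(knn_predict(train_X, train_y, X[i], k, p) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / n_folds


def grid_search(X, y, grid):
    """Score every parameter combination in the grid; return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, -1.0
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = cv_accuracy(X, y, **params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Synthetic, made-up data: two numeric features and a binary "severity" label.
rng = random.Random(0)
X, y = [], []
for _ in range(60):
    label = rng.randint(0, 1)
    X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0)])
    y.append(label)

# Hypothetical grid: odd k avoids voting ties for binary labels.
best_params, best_score = grid_search(X, y, {"k": [1, 3, 5], "p": [1, 2]})
print(best_params, round(best_score, 3))
```

Scoring each combination by cross-validated accuracy, rather than by accuracy on the training data, is what keeps the chosen hyperparameters from simply memorising the sample.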

Results: Among the models, the RF achieved the highest performance, with accuracies of 92.68% for the RACE dataset and 91.19% for the NIHSS dataset. The Area Under the Curve (AUC) was 92.02% and 97.86% for the RACE and NIHSS datasets, respectively. The SHAP analysis identified triglyceride levels, length of hospital stay, and age as critical predictors of stroke severity.
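For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how accuracy and AUC are computed for a binary classifier. The labels and scores are invented for illustration and have no connection to the study's data. The abstract does not state how many severity classes were used or how AUC was aggregated, so the binary setting here is an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Tied scores count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = severe) and predicted probabilities.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.70, 0.60]
y_pred = [int(s >= 0.5) for s in scores]  # threshold of 0.5 is an assumption

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.3f}, auc={auc:.3f}")
```

Accuracy depends on the chosen decision threshold, while AUC summarises ranking quality across all thresholds, which is why the two numbers can differ noticeably on the same model.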

Conclusions: This study is the first to apply ML models to the RACE and NIHSS scales for predicting stroke severity. The use of SHAP enhances the interpretability of the models, increasing clinicians' trust in these ML algorithms. The best-performing ML model can be a valuable tool for assisting medical professionals in predicting stroke severity in clinical settings.
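To make the interpretability claim concrete, here is a small sketch of the Shapley value computation that SHAP is built on. It enumerates every feature coalition exactly, which is feasible only for a handful of features; the SHAP library uses faster approximations and model-specific algorithms. The toy model, its coefficients, the instance, and the baseline are all hypothetical. Only the three feature names are borrowed from the abstract, and nothing here reproduces the study's results.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    """Hypothetical severity score; coefficients are invented for illustration."""
    triglycerides, stay_days, age = z
    return 0.01 * triglycerides + 0.2 * stay_days + 0.03 * age + 0.001 * stay_days * age


x = [200.0, 10.0, 70.0]        # made-up patient
baseline = [150.0, 5.0, 60.0]  # made-up reference values
phi = shapley_values(toy_model, x, baseline)

for name, contribution in zip(["triglycerides", "length of stay", "age"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the model's output for the patient and for the baseline, which is the property that lets a clinician read each value as that feature's share of the prediction.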


Source journal

European Journal of Medical Research (Medicine: Research & Experimental)

CiteScore: 3.20
Self-citation rate: 0.00%
Articles published: 247
Review time: >12 weeks

About the journal: European Journal of Medical Research publishes translational and clinical research of international interest across all medical disciplines, enabling clinicians and other researchers to learn about developments and innovations within these disciplines and across the boundaries between disciplines. The journal publishes high-quality research and reviews and aims to ensure that the results of all well-conducted research are published, regardless of their outcome.