Development and validation of an explainable machine learning prediction model of hemorrhagic transformation after intravenous thrombolysis in stroke.

IF 2.7 · CAS Zone 3 (Medicine) · JCR Q2 (Clinical Neurology)

Frontiers in Neurology · Pub Date: 2025-01-15 · eCollection Date: 2024-01-01 · DOI: 10.3389/fneur.2024.1446250

Yanan Lin, Yan Li, Yayin Luo, Jie Han

Frontiers in Neurology, volume 15, article 1446250. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11775651/pdf/

Citations: 0

Abstract

Objective: To develop and validate an explainable machine learning (ML) model predicting the risk of hemorrhagic transformation (HT) after intravenous thrombolysis.

Methods: We retrospectively enrolled patients who received intravenous tissue plasminogen activator (IV-tPA) thrombolysis within 4.5 h of symptom onset to form the original modeling cohort. HT was defined as any hemorrhage on a head CT scan completed within 48 h after IV-tPA administration. We used the Random Forest (RF), Multilayer Perceptron (MLP), Adaptive Boosting (AdaBoost), and Gaussian Naive Bayes (GauNB) algorithms to develop ML-HT models. Predictive performance was evaluated with confusion-matrix metrics (accuracy, precision, recall, and F1 score) and with discriminative analysis (area under the receiver-operating-characteristic curve, ROC-AUC) in the original cohort, followed by validation in an independent external cohort. Model explainability was assessed using the SHapley Additive exPlanations (SHAP) global feature plot, the SHAP summary plot, and partial dependence plots.
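The evaluation metrics named in the Methods can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the labels, scores, and 0.5 threshold below are made-up toy values, and treating label 1 as the positive class is an assumption, since the abstract does not state which class the reported precision and recall refer to.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 from the 2x2 confusion matrix (positive class = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.65, 0.3, 0.2, 0.55, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas ROC-AUC summarizes ranking quality across all thresholds. With an outcome as imbalanced as HT, the two kinds of metric can diverge noticeably.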

Results: A total of 1,007 patients were included in the original modeling cohort, with an HT incidence of 8.94%. The RF-based ML-HT model achieved an accuracy of 0.874, precision of 0.972, recall of 0.890, and F1 score of 0.929, with an ROC-AUC of 0.7847 in the original cohort and 0.7119 in the external validation cohort. The corresponding values (accuracy, precision, recall, F1 score, original-cohort ROC-AUC, external-cohort ROC-AUC) were 0.878, 0.967, 0.989, 0.978, 0.7710, and 0.6768 for the MLP model; 0.907, 0.967, 0.989, 0.978, 0.7798, and 0.6606 for the AdaBoost model; and 0.848, 0.983, 0.598, 0.716, 0.6953, and 0.6289 for the GauNB model. The explainability analysis of the RF-based ML model indicated that the National Institutes of Health Stroke Scale (NIHSS) score, age, platelet count, and atrial fibrillation were the primary determinants of HT following IV-tPA thrombolysis.
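As a quick arithmetic check, the F1 score for a single class is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below applies that formula to the precision and recall values reported above. The helper name `f1_from` is illustrative, and the assumption that the reported F1 is a single-class (not macro- or weighted-average) score is ours, not the abstract's.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (single-class F1)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as given in the Results.
reported = {
    "RF": (0.972, 0.890, 0.929),
    "MLP": (0.967, 0.989, 0.978),
    "AdaBoost": (0.967, 0.989, 0.978),
    "GauNB": (0.983, 0.598, 0.716),
}

recomputed = {name: round(f1_from(p, r), 3) for name, (p, r, _) in reported.items()}
for name, (_, _, f1_reported) in reported.items():
    print(f"{name}: recomputed F1 = {recomputed[name]:.3f}, reported F1 = {f1_reported:.3f}")
```

Under this formula the RF, MLP, and AdaBoost values reproduce the reported F1 to three decimals, while the GauNB pair gives about 0.744 and not 0.716. The abstract does not say how F1 was averaged, so the gap may reflect a different averaging scheme or rounding and is not necessarily an error.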

Conclusion: The RF-based explainable ML model showed promising ability to estimate the risk of HT after IV-tPA thrombolysis and may assist clinical decision-making in emergency settings.

Source journal

Frontiers in Neurology (Clinical Neurology; Neurosciences)

CiteScore: 4.90

Self-citation rate: 8.80%

Articles published: 2792

Review time: 14 weeks

Journal description: The section Stroke aims to quickly and accurately publish important experimental, translational and clinical studies, and reviews that contribute to the knowledge of stroke, its causes, manifestations, diagnosis, and management.