Title: An Explainable Two-Stage Machine Learning Model for Predicting the Post-Thrombolysis Complications in Stroke Patients: A Multi-Center Study
Authors: Hongling Zhu, Qing Ye, Shurui Wang, Hongsen Cai, Mairihaba Maimaiti, Jinsheng Lai, Chuan Qin, Ping Zhang, Yanyan Chen, Qiushi Luo, Hong Wu, Danyang Chen, Shiling Chen, Shudan Zhu, Yuting Lv, Yanxiang Xu, Jian Zhang, Benshan Hu, Yuanxiang Yin, Yan Xie, Dongmei Zhu, Xiaoxing Ming, Zhouping Tang, Hesong Zeng
Journal: Research, volume 8, article 0817 (Journal Article; JCR Q1, Multidisciplinary; impact factor 10.7)
Published: 2025-08-19
DOI: https://doi.org/10.34133/research.0817
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12364525/pdf/
Citations: 0
Abstract
Current tools for predicting thrombolysis risk in stroke patients have limited ability to predict early post-thrombolysis hemorrhagic events, which highlights an unmet need for better stroke management tools. We developed an explainable 2-stage machine learning model for stroke risk stratification that predicts the risk of bleeding, composite complications, and all-cause death in patients before and after thrombolysis therapy. The model integrated LightGBM, XGBoost, random forest (RF), decision tree (DT), and logistic regression (LR) models and was trained on data from 5,333 patients at Tongji Hospital. Predictive accuracy was higher in the post-thrombolysis stage than in the pre-thrombolysis stage: the area under the curve (AUC) was 0.7581 [95% confidence interval (CI), 0.6955 to 0.8177] versus 0.7234 (0.6527 to 0.7909) for bleeding, 0.7625 (0.7324 to 0.7936) versus 0.7035 (0.6685 to 0.7392) for composite complications, and 0.9264 (0.8736 to 0.9660) versus 0.845 (0.7454 to 0.9375) for death. External validation on data from 526 patients across 2 different hospitals confirmed the robustness of the model. Key predictors, including temperature, vital signs, and demographic factors, were identified. A prototype embedding the best-performing model was constructed. The model improves thrombolysis risk prediction and supports personalized patient care, demonstrating its potential for integration into clinical decision support systems for stroke management.
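The abstract reports each AUC with a 95% confidence interval but does not say how the intervals were obtained. As an illustration only, the sketch below shows one common approach, a percentile bootstrap over patients, using the Python standard library. The function names (`auc_score`, `bootstrap_auc_ci`), the toy labels and scores, and the choice of bootstrap are all assumptions made for this example, not the authors' method or data.

```python
import random


def auc_score(labels, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate plus a percentile-bootstrap (1 - alpha) interval."""
    point = auc_score(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample has a single class
        stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper


# Hypothetical toy data: 1 = complication occurred, scores = predicted risk.
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.50, 0.90]

auc, ci_low, ci_high = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=500)
print(f"AUC = {auc:.4f} (95% CI, {ci_low:.4f} to {ci_high:.4f})")
```

Run once on pre-thrombolysis predictions and once on post-thrombolysis predictions, a routine like this would yield paired estimates in the same format as those reported above. The pairwise AUC loop is quadratic in cohort size, so a rank-based implementation would be preferable for thousands of patients.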
About the journal:
Research serves as a global platform for academic exchange, collaboration, and technological advancements. This journal welcomes high-quality research contributions from any domain, with open arms to authors from around the globe.
Comprising fundamental research in the life and physical sciences, Research also highlights significant findings and issues in engineering and applied science. The journal proudly features original research articles, reviews, perspectives, and editorials, fostering a diverse and dynamic scholarly environment.