An interpretable XGBoost model for risk prediction of progression from sepsis-associated acute kidney injury to chronic kidney disease

JCR: Q1 (Medicine)
Yingying Lin, Jingqi Gao, Linfang Chen, Yixiao Hong, Min Li, Peiling Chen, Xiuling Shang
Informatics in Medicine Unlocked, vol. 58, Article 101685. Published 2025-01-01. Journal Article.
DOI: 10.1016/j.imu.2025.101685
https://www.sciencedirect.com/science/article/pii/S2352914825000747
Citations: 0

Abstract

Objective

To develop an interpretable machine learning (ML) model that predicts the risk of progression from sepsis-associated acute kidney injury (SA-AKI) to chronic kidney disease (CKD), in order to guide early stratified interventions.

Methods

Using data from 1315 SA-AKI patients in the Medical Information Mart for Intensive Care IV (MIMIC-IV) database, we constructed an extreme gradient boosting (XGBoost) model and interpreted it with SHapley Additive exPlanations (SHAP). Performance was evaluated in terms of discrimination, calibration, and clinical utility [decision curve analysis (DCA)].
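The three evaluation criteria named above can be illustrated with a minimal sketch. It assumes binary CKD labels and predicted risks are already available. The function names, the Brier score as a calibration summary, and the toy numbers are assumptions made for illustration and are not taken from the study.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Discrimination: chance that a random case is ranked above a random non-case."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    # Count concordant case/non-case pairs; tied predictions count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Calibration summary (assumed metric): mean squared error of predicted risk."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Decision curve analysis: net benefit of acting on patients with risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the chosen risk threshold.
    return tp / n - fp / n * threshold / (1 - threshold)


# Made-up labels and predicted risks, for illustration only (not study data).
y = [1, 1, 1, 0, 0, 0, 0, 1]
risk = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
print(auc(y, risk), brier_score(y, risk), net_benefit(y, risk, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the "treat all" and "treat none" strategies.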

Results

CKD incidence was 36.7% (median onset: 7.6 months). The XGBoost model achieved superior discrimination [training area under the curve (AUC) 0.920; validation AUC 0.951, versus 0.616 for the Sequential Organ Failure Assessment (SOFA) renal score and 0.822 for logistic regression (LR)], robust calibration, and clinical applicability. SHAP identified actionable thresholds (age >65 years, maximum serum creatinine >0.9 mg/dL) for early intervention. Feature stability analysis revealed a stage-dependent coefficient drift for serum creatinine (Δβ = +0.84), reflecting dynamic pathophysiology. Crucially, the model provides clinically interpretable outputs without requiring SHAP expertise, enabling seamless integration into clinical workflows.
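To show what a SHAP-style attribution expresses, the sketch below computes exact Shapley values by brute force for a two-feature toy risk function. Everything in it is hypothetical: `toy_risk`, its weights, and the patient and baseline values are invented for illustration, and only the two cut-offs (65 and 0.9) echo the thresholds reported above. The published model is an XGBoost ensemble, for which SHAP tooling typically uses tree-specific algorithms and a background dataset rather than this enumeration against a single reference patient.

```python
from itertools import combinations
from math import factorial


def toy_risk(age: float, creatinine: float) -> float:
    """Hypothetical risk score for illustration; NOT the published model."""
    score = 0.1
    if age > 65:
        score += 0.3
    if creatinine > 0.9:
        score += 0.2
    if age > 65 and creatinine > 0.9:
        score += 0.1  # assumed interaction between the two risk factors
    return score


def shapley_values(model, patient: dict, baseline: dict) -> dict:
    """Exact Shapley values of each feature relative to a single baseline patient."""
    names = list(patient)
    n = len(names)

    def value(subset: set) -> float:
        # Features in the subset take the patient's value; the rest stay at baseline.
        args = {k: (patient[k] if k in subset else baseline[k]) for k in names}
        return model(**args)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for r in range(n):
            for s in combinations(others, r):
                # Shapley weight for a coalition of size r out of n features.
                weight = factorial(r) * factorial(n - r - 1) / factorial(n)
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi[i] = total
    return phi


# Hypothetical patient and reference values (not study data).
patient = {"age": 72, "creatinine": 1.4}
baseline = {"age": 50, "creatinine": 0.7}
phi = shapley_values(toy_risk, patient, baseline)
print(phi)
```

The attributions are additive: they sum to the difference between the patient's score and the baseline score, which is the property that lets a per-patient risk alert be broken down feature by feature.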

Conclusion

Our model delivers personalized, interpretable CKD risk alerts for SA-AKI patients, empowering clinicians to stratify follow-up care. External validation is warranted to confirm generalizability.
Source journal
Informatics in Medicine Unlocked (Medicine-Health Informatics)
CiteScore: 9.50
Self-citation rate: 0.00%
Articles published: 282
Review time: 39 days
About the journal: Informatics in Medicine Unlocked (IMU) is an international gold open access journal covering a broad spectrum of topics within medical informatics, including (but not limited to) papers focusing on imaging, pathology, teledermatology, public health, ophthalmological, nursing and translational medicine informatics. The full papers that are published in the journal are accessible to all who visit the website.