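Development and external validation of a machine learning model for predicting drug-induced immune thrombocytopenia in a real-world hospital cohort.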

IF 3.3 · Zone 3 (Medicine) · JCR Q2 (Medical Informatics)
Hoang Van Dung, Vu Manh Tan, Nguyen Thi Dieu, Pham Van Linh, Nguyen Van Khai, Tran Thi Ngan, Nguyen Thi Thu Phuong
BMC Medical Informatics and Decision Making, 25(1):265. Published 2025-07-15. Journal Article.
DOI: 10.1186/s12911-025-03107-3 (https://doi.org/10.1186/s12911-025-03107-3)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12261740/pdf/
Citations: 0

Abstract


Background: Drug-induced immune thrombocytopenia (DITP) is a rare but potentially life-threatening adverse drug reaction, often underrecognized due to its nonspecific presentation and the lack of real-time diagnostic tools. Early identification of at-risk patients is critical to improving medication safety and preventing severe complications.

Objective: To develop and externally validate a machine learning model for predicting the risk of DITP using routinely collected hospital data, and to optimize its clinical applicability through threshold adjustment.

Methods: We conducted a retrospective cohort study using electronic medical records from Hai Phong International Hospital (2018-2024) for model development and internal validation. An independent cohort from Hai Phong International Hospital - Vinh Bao (2024) served as the external validation cohort. Eligible patients had received at least one drug previously implicated in DITP and had serial platelet counts. A Light Gradient Boosting Machine (LightGBM) model was trained on demographic, clinical, laboratory, and pharmacological features. Model performance was assessed using area under the ROC curve (AUC), accuracy, recall, and F1-score. SHapley Additive exPlanations (SHAP) were used to interpret feature contributions. Threshold tuning and decision curve analysis (DCA) were used to assess clinical applicability.
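
The threshold tuning and decision curve analysis named in the Methods can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code. The labels and scores are made up, all function names are hypothetical, and choosing the threshold by maximum F1 is an assumption, since the abstract does not state the tuning criterion. Net benefit follows the usual decision-curve definition, TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability.

```python
from typing import List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_prob: Sequence[float],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), predicting positive when prob >= threshold."""
    tp = fp = fn = tn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def recall_and_f1(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Recall and F1 for the positive class, returning 0.0 when undefined."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Decision-curve net benefit at threshold probability `threshold`."""
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


def best_f1_threshold(y_true: Sequence[int], y_prob: Sequence[float],
                      candidates: List[float]) -> Tuple[float, float]:
    """Pick the candidate with the highest F1 (ASSUMED tuning criterion)."""
    best_t, best_f1 = candidates[0], -1.0
    for t in candidates:
        tp, fp, fn, _ = confusion_counts(y_true, y_prob, t)
        _, f1 = recall_and_f1(tp, fp, fn)
        if f1 > best_f1:  # strict '>' keeps the lowest threshold on ties
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Toy, made-up labels and scores for a rare outcome (NOT data from the study).
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
y_prob = [0.01, 0.02, 0.03, 0.05, 0.06, 0.08, 0.12, 0.10, 0.04, 0.30]
candidates = [0.05, 0.09, 0.20, 0.50]

threshold, f1 = best_f1_threshold(y_true, y_prob, candidates)
tp, fp, fn, tn = confusion_counts(y_true, y_prob, threshold)
nb = net_benefit(tp, fp, len(y_true), threshold)
print(f"threshold={threshold}, F1={f1:.3f}, net benefit={nb:.4f}")
```

With rare outcomes, predicted probabilities tend to cluster well below 0.5, which is why a tuned cut-off can sit far below the default. The toy scores were constructed so that a low candidate wins here; they say nothing about the study's actual score distribution.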

Results: Among 17,546 patients in the training cohort and 1,403 in the external cohort, DITP occurred in 432 (2.46%) and 70 (4.99%) patients, respectively. In internal validation, LightGBM achieved an AUC of 0.860, recall of 0.392, and F1-score of 0.310. External validation confirmed model robustness with an AUC of 0.813 and an F1-score of 0.341 at the optimized threshold (0.09). SHAP analysis identified AST, baseline platelet count, and renal function as key contributors. DCA and clinical impact curves demonstrated potential benefit in supporting real-time risk stratification. Clopidogrel and vancomycin were frequently associated with suspected DITP cases.
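
The reported counts and metrics can be cross-checked with a few lines of arithmetic. The sketch below recomputes the two incidence percentages from the stated counts and derives the precision implied by the internal-validation recall and F1. The derived precision is not reported in the abstract: it assumes that recall and F1 refer to the positive class at the same threshold, and it is only approximate because the inputs are rounded. Function names are hypothetical.

```python
def incidence_pct(cases: int, total: int) -> float:
    """Incidence as a percentage, rounded to two decimals."""
    return round(100.0 * cases / total, 2)


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    return f1 * recall / (2.0 * recall - f1)


# Counts and metrics as stated in the Results.
train_incidence = incidence_pct(432, 17546)     # training cohort
external_incidence = incidence_pct(70, 1403)    # external cohort

# ASSUMPTION: recall 0.392 and F1 0.310 were measured at the same threshold.
internal_precision = implied_precision(f1=0.310, recall=0.392)

print(train_incidence, external_incidence, round(internal_precision, 3))
```

Under those assumptions the implied internal-validation precision is roughly 0.26, meaning about one in four flagged patients would be a true DITP case.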

Conclusion: This externally validated machine learning model enables early identification of hospitalized patients at risk of DITP using data available in routine care. Its integration into electronic medical records may support clinical decision-making, reduce diagnostic delays, and improve pharmacovigilance practices in hospital settings.

Source journal
CiteScore: 7.20
Self-citation rate: 5.70%
Articles published: 297
Review time: 1 month
About the journal: BMC Medical Informatics and Decision Making is an open access journal that publishes original peer-reviewed research articles on the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.