Risk factors for tuberculosis treatment outcomes: a statistical learning-based exploration using the SINAN database with incomplete observations.

Impact factor 3.8 | Zone 3 (Medicine) | JCR Q2 (Medical Informatics)
Nguyen Ky Phat, Yoonah Lee, Dinh Hoa Vu, Nguyen Phuoc Long, Seongoh Park
Journal: BMC Medical Informatics and Decision Making, 25(1):301. Published 2025-08-11. Publication type: Journal Article.
DOI: 10.1186/s12911-025-03139-9 (https://doi.org/10.1186/s12911-025-03139-9)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12341307/pdf/

Abstract

Background: Understanding early predictors of treatment outcomes allows better outcome prediction and resource allocation for efficient tuberculosis (TB) management.

Objectives: This study aimed to predict treatment outcomes of TB patients from a real-world, population-wide health record dataset with a substantial rate of incomplete observations. In addition, potential risk factors associated with death during TB treatment were investigated.

Methods: We used upweighting to address the extreme imbalance in responses and multiple imputation analysis (MIA) to address missing data. Three algorithms were used for TB treatment outcome prediction: logistic regression (LOGIT), random forest, and stochastic gradient boosting. The three models exhibited similar performance in predicting the treatment outcomes. The LOGIT model was then interpreted: adjusted odds ratios (aORs) were computed, and the interpretation results were compared between MIA and complete case analysis (CCA).
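
The abstract does not say which upweighting scheme was used. As a minimal sketch, assuming inverse class-frequency weights (a common choice, not confirmed by the source), the rare outcome class can be given a larger weight in the loss that a classifier such as LOGIT minimizes. The function names below are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / class frequency.

    Assumption: weight_c = n / (k * n_c), where n is the sample size,
    k the number of classes, and n_c the count of class c.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def weighted_log_loss(labels, probs, class_weights):
    """Weighted binary cross-entropy, normalized by the sum of weights.

    labels: 0/1 outcomes; probs: predicted probability of class 1.
    """
    total, weight_sum = 0.0, 0.0
    for y, p in zip(labels, probs):
        w = class_weights[y]
        p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        weight_sum += w
    return total / weight_sum


if __name__ == "__main__":
    # Made-up labels: 9 survivors (0) and 1 death (1), for illustration only.
    y = [0] * 9 + [1]
    weights = inverse_frequency_weights(y)
    print(weights)
    print(weighted_log_loss(y, [0.1] * 10, weights))
```

With these weights, an error on the rare class costs as much in total as errors on the common class, which keeps a model from trivially predicting the majority outcome.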

Results: MIA was an appropriate method for coping with missing data. In addition, compared to CCA, the interpretation results of the MIA-derived LOGIT showed more statistically significant covariates associated with TB treatment outcomes. In MIA, factors such as TB clinical form involving both pulmonary TB and extrapulmonary TB [aOR = 3.077, 95% confidence interval (CI) = 2.994-3.163], retreatment after abandonment (aOR = 2.272, 95% CI = 2.209-2.338), and the absence of isoniazid (aOR = 2.072, 95% CI = 1.892-2.269) or rifampicin (aOR = 1.968, 95% CI = 1.746-2.218) in the treatment regimen were associated with increased odds of death.
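
For readers unfamiliar with how an aOR and its 95% CI arise under multiple imputation, the sketch below pools a logistic regression coefficient across imputed datasets with Rubin's rules and exponentiates the result. This is an illustration under stated assumptions, not the authors' code: the abstract does not describe the pooling procedure, the function names are hypothetical, the interval uses a normal approximation rather than the t reference distribution, and the example inputs are made up.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation log-odds estimates with Rubin's rules.

    estimates: coefficient from each of the m imputed datasets
    variances: squared standard error of that coefficient in each dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations with matching lengths")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


def odds_ratio_ci(log_odds, total_var, z=1.959964):
    """Convert a pooled log-odds estimate into an odds ratio with a 95% CI.

    Assumption: normal approximation (z = 1.96), reasonable for large samples.
    """
    se = math.sqrt(total_var)
    return (
        math.exp(log_odds),
        math.exp(log_odds - z * se),
        math.exp(log_odds + z * se),
    )


if __name__ == "__main__":
    # Hypothetical coefficients and variances from m = 3 imputations.
    q, t = pool_rubin([1.0, 1.2, 1.1], [0.04, 0.04, 0.04])
    print(odds_ratio_ci(q, t))
```

An aOR above 1 whose interval excludes 1, as with the factors reported above, indicates higher adjusted odds of death for that covariate level.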

Conclusion: Our results shed light on potential risk factors for death during TB treatment and suggest using the simple yet interpretable LOGIT model to predict TB treatment outcomes.

Source journal: BMC Medical Informatics and Decision Making
CiteScore: 7.20
Self-citation rate: 5.70%
Articles published: 297
Review time: 1 month
About the journal: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.