Interpretable machine learning model to predict 90-day radiographically confirmed pneumonia after chemotherapy initiation in non-Hodgkin lymphoma: development and internal validation of a single-center cohort.

IF 3.1 | Zone 3, Medicine | JCR Q1, MEDICINE, GENERAL & INTERNAL
Frontiers in Medicine Pub Date : 2025-09-22 eCollection Date: 2025-01-01 DOI:10.3389/fmed.2025.1674896
Zhanna Zhang, Manqi Su, Panruo Jiang, Xiaoxia Wang, Lingling Kong, Xiangmin Tong, Gongqiang Wu
Frontiers in Medicine, volume 12, article 1674896. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12497835/pdf/ Article page: https://doi.org/10.3389/fmed.2025.1674896
Citations: 0

Abstract

Background: Radiographically confirmed pneumonia within 90 days of chemotherapy initiation is a frequent and clinically important complication in patients with non-Hodgkin lymphoma, yet interpretable tools for early individualized risk estimation are limited.

Objective: To develop and internally validate an interpretable machine-learning model that predicts the 90-day risk of radiographically confirmed pneumonia after chemotherapy initiation in non-Hodgkin lymphoma.

Methods: We retrospectively analyzed 205 chemotherapy-treated NHL patients. A two-step feature selection (LASSO followed by random-forest-based recursive feature elimination) identified four predictors: high-grade malignancy, drinking (alcohol use), estimated glomerular filtration rate (eGFR), and smoking. Five algorithms were trained and compared under a stratified 70/30 split (training n = 145; internal hold-out test set n = 60) with leakage-safe preprocessing (within-fold kNN imputation, SMOTE, and scaling). The gradient boosting machine (GBM) performed best and was interpreted using SHAP. A web-based prototype was implemented for research use only.
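
The abstract does not name the software used, so the following is only a minimal sketch of what "leakage-safe" preprocessing under a stratified 70/30 split can look like. It uses the Python standard library and synthetic data (not the study cohort). The function names `stratified_split`, `fit_scaler`, and `apply_scaler` are hypothetical, and kNN imputation and SMOTE are left out for brevity. The point it illustrates is that scaling parameters are learned from the training portion alone and then reused unchanged on the hold-out portion.

```python
import random
from statistics import mean, pstdev


def stratified_split(labels, test_frac=0.3, seed=0):
    """Return (train_idx, test_idx), splitting each class at roughly test_frac."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def fit_scaler(values):
    """Learn mean and standard deviation from the training values only."""
    sd = pstdev(values)
    return mean(values), (sd if sd > 0 else 1.0)


def apply_scaler(values, params):
    """Standardize values with previously learned (mean, sd)."""
    mu, sd = params
    return [(v - mu) / sd for v in values]


# Synthetic, purely illustrative data: 50 events, 150 non-events,
# and one continuous predictor standing in for eGFR.
rng = random.Random(42)
labels = [1] * 50 + [0] * 150
egfr = [rng.gauss(85, 20) for _ in labels]

train_idx, test_idx = stratified_split(labels, test_frac=0.3, seed=1)

# Fit on the training portion, then apply the same parameters to both portions.
params = fit_scaler([egfr[i] for i in train_idx])
train_scaled = apply_scaler([egfr[i] for i in train_idx], params)
test_scaled = apply_scaler([egfr[i] for i in test_idx], params)
```

In a cross-validated workflow the same rule would apply inside every fold: imputation, oversampling, and scaling are fitted on that fold's training part only, and oversampling is never applied to the data used for evaluation.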

Results: On the internal hold-out test set (n = 60), the GBM achieved an AUC of 0.855 (95% CI 0.746-0.964), an F1 score of 0.679, and a Brier score of 0.155. SHAP identified reduced eGFR, smoking, drinking, and high-grade malignancy as influential contributors; case-level waterfall and force plots enhanced transparency. These estimates reflect internal validation only and were obtained without systematic microbiological confirmation or standardized radiologic rescoring. Accordingly, performance may be optimistic, and real-world use is not advised pending temporal and multicenter external validation (with potential recalibration) and prospective evaluation.
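
For readers less familiar with the three reported metrics, here is a small standard-library sketch of how AUC, F1, and the Brier score are defined. The toy labels and probabilities are invented for illustration and have no relation to the study data. The 0.5 classification threshold used for F1 is an assumption, since the abstract does not state the threshold the authors used.

```python
def auc(y_true, y_prob):
    """AUC as the probability that a random positive case is ranked above
    a random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true, y_prob, threshold=0.5):
    """F1 = 2*TP / (2*TP + FP + FN) after thresholding the probabilities."""
    pred = [int(p >= threshold) for p in y_prob]
    tp = sum(1 for y, h in zip(y_true, pred) if y == 1 and h == 1)
    fp = sum(1 for y, h in zip(y_true, pred) if y == 0 and h == 1)
    fn = sum(1 for y, h in zip(y_true, pred) if y == 1 and h == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy example (hypothetical values)
y_true = [1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.6, 0.4, 0.2, 0.3, 0.7]

toy_auc = auc(y_true, y_prob)
toy_f1 = f1_score(y_true, y_prob)
toy_brier = brier_score(y_true, y_prob)
```

AUC measures discrimination (ranking), while the Brier score also reflects calibration, with lower values being better; F1 depends on the chosen threshold, which is why it should be reported together with that threshold.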

Conclusion: The interpretable GBM model demonstrated promising discrimination and calibration on an internal hold-out test set; however, clinical deployment requires temporal and multicenter external validation (as well as prospective assessment with potential recalibration). The accompanying web calculator is a research-only prototype and is not intended for clinical decision-making until such validation is completed.


Source journal
Frontiers in Medicine (Medicine: General Medicine)
CiteScore: 5.10
Self-citation rate: 5.10%
Articles published: 3710
Review time: 12 weeks
Journal description: Frontiers in Medicine publishes rigorously peer-reviewed research linking basic research to clinical practice and patient care, as well as translating scientific advances into new therapies and diagnostic tools. Led by an outstanding Editorial Board of international experts, this multidisciplinary open-access journal is at the forefront of disseminating and communicating scientific knowledge and impactful discoveries to researchers, academics, clinicians and the public worldwide. In addition to papers that provide a link between basic research and clinical practice, a particular emphasis is given to studies that are directly relevant to patient care. In this spirit, the journal publishes the latest research results and medical knowledge that facilitate the translation of scientific advances into new therapies or diagnostic tools. The full listing of the Specialty Sections represented by Frontiers in Medicine is as listed below. As well as the established medical disciplines, Frontiers in Medicine is launching new sections that together will facilitate:
- the use of patient-reported outcomes under real world conditions
- the exploitation of big data and the use of novel information and communication tools in the assessment of new medicines
- the scientific bases for guidelines and decisions from regulatory authorities
- access to medicinal products and medical devices worldwide
- addressing the grand health challenges around the world