An interpretable machine learning model for predicting bone marrow invasion in patients with lymphoma via ¹⁸F-FDG PET/CT: a multicenter study.

IF 3.3 · CAS Zone 3 (Medicine) · JCR Q2 (Medical Informatics)

BMC Medical Informatics and Decision Making Pub Date : 2025-07-15 DOI:10.1186/s12911-025-03110-8

Xinyu Zhu, Denglu Lu, Yang Wu, Yanqi Lu, Liang He, Yanyun Deng, Xingyu Mu, Wei Fu

Volume 25, issue 1, article 264. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12261613/pdf/

Citations: 0

Abstract

Purpose: Accurate identification of bone marrow invasion (BMI) is critical for determining prognosis and treatment strategy in lymphoma. Although bone marrow biopsy (BMB) is the current gold standard, its invasiveness and susceptibility to sampling error highlight the need for noninvasive alternatives. We aimed to develop and validate an interpretable machine learning model that integrates clinical data, ¹⁸F-fluorodeoxyglucose positron emission tomography/computed tomography (¹⁸F-FDG PET/CT) parameters, radiomic features, and deep learning features to predict BMI in lymphoma patients.

Methods: We included 159 newly diagnosed lymphoma patients (118 from Center I and 41 from Center II), excluding those who had received prior treatment, had incomplete data, or were under 18 years of age. Patients from Center I were randomly allocated to a training set (n = 94) and an internal test set (n = 24); Center II served as the external validation set (n = 41). Clinical parameters, PET/CT features, radiomic features, and deep learning features were analyzed and integrated into machine learning models. Model predictions were interpreted with SHapley Additive exPlanations (SHAP). Additionally, a comparative diagnostic study evaluated reader performance with and without model assistance.
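The cohort partition described in the Methods can be sketched as follows. This is a minimal illustration only: the abstract does not say how the random allocation was carried out (for example, whether it was stratified by BMI status), so the plain shuffle, the `split_cohort` helper, the seed, and the placeholder patient identifiers are all assumptions.

```python
import random


def split_cohort(patient_ids, n_train, seed=0):
    """Shuffle a copy of the IDs and cut it into training and internal-test lists.

    Hypothetical helper: a plain random split, not necessarily the authors' procedure.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


# Placeholder identifiers; the real patient records are not part of this page.
center_1 = [f"C1-{i:03d}" for i in range(118)]
center_2 = [f"C2-{i:03d}" for i in range(41)]

train_ids, internal_test_ids = split_cohort(center_1, n_train=94, seed=42)
external_ids = center_2  # Center II is held out entirely for external validation

print(len(train_ids), len(internal_test_ids), len(external_ids))
```

Keeping Center II completely separate from the split is what makes the third set an external check: no patient from that site influences training.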

Results: BMI was confirmed in 70 (44%) patients. The key clinical predictors were B symptoms and platelet count. Among the tested models, the ExtraTrees classifier performed best. In external validation, the combined model (clinical + PET/CT + radiomics + deep learning) achieved an area under the receiver operating characteristic curve (AUC) of 0.886, outperforming models that used only clinical (AUC 0.798), radiomic (AUC 0.708), or deep learning (AUC 0.662) features. SHAP analysis identified PET radiomic features (especially PET_lbp_3D_m1_glcm_DependenceEntropy), platelet count, and B symptoms as important predictors of BMI. Model assistance significantly improved junior reader performance (AUC from 0.663 to 0.818, p = 0.03) and also improved senior reader performance, although not significantly (AUC from 0.768 to 0.867, p = 0.10).
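For readers less familiar with the metric reported above, the AUC can be read as the probability that a randomly chosen BMI-positive patient receives a higher model score than a randomly chosen BMI-negative patient. The sketch below computes it by direct pairwise comparison. The `roc_auc` function and the toy labels and scores are illustrative assumptions, not the study's data or code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 4 positive and 4 negative cases.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1, 0.7]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)  # 15 of 16 pairs are ranked correctly -> 0.9375
```

On this reading, an AUC of 0.886 means the combined model ranks a positive case above a negative one in roughly 89% of such pairs, while 0.5 would correspond to chance.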

Conclusion: Our interpretable machine learning model, which integrates clinical, imaging, radiomic, and deep learning features, predicted BMI well in external validation and improved physician diagnostic performance, significantly so for the junior reader. These findings underscore the clinical potential of interpretable AI to complement medical expertise and potentially reduce reliance on invasive BMB for lymphoma staging.


Source journal

BMC Medical Informatics and Decision Making (Medicine: Medical Informatics)

CiteScore: 7.20

Self-citation rate: 5.70%

Articles published: 297

Review time: 1 month

About the journal: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles on the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.