Interpretable machine learning models for survival prediction in prostate cancer bone metastases.

IF 3.9 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES
Hua Zhang, Bingtian Dong, Jialin Han, Lewen Huang
Scientific Reports 15(1): 24150. Journal Article. Published 2025-07-06.
DOI: 10.1038/s41598-025-09691-8 (https://doi.org/10.1038/s41598-025-09691-8)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12230155/pdf/
Citation count: 0

Abstract

Prostate cancer bone metastasis (PCBM) is a highly lethal condition with limited survival. Accurate survival prediction is essential for managing these typically incurable patients. However, existing clinical models lack precision. This study seeks to establish machine learning models to improve survival predictions for PCBM patients. We extracted data for PCBM patients from the SEER database spanning 2010 to 2019. Prognostic features were identified through univariate and multivariate Cox regression analyses. To predict survival outcomes, we developed and validated XGBoost models with five-fold cross-validation. Model performance was assessed based on the area under the receiver operating characteristic curve (AUC) and overall accuracy. Feature importance was assessed using SHAP (SHapley Additive exPlanations) values, while decision curve analysis was conducted to determine the clinical applicability of the models. Additionally, Kaplan-Meier (K-M) analysis was employed to examine the impact of surgery, radiotherapy, and chemotherapy on the survival of PCBM patients. The XGBoost models achieved robust performance in predicting survival for PCBM patients, with AUC values of 0.76, 0.83, and 0.91 for 1-year, 3-year, and 5-year survival predictions, respectively, in the test set. Key prognostic factors included T stage, grade, age, PSA, and Gleason score. Single patients exhibited a significantly higher mortality risk than their married counterparts (HR = 1.23, 95% CI 1.19-1.27, p < 0.001). Conversely, a median household income exceeding $75,000 was associated with a notably reduced mortality risk (HR = 0.87, 95% CI 0.85-0.90, p < 0.001). Univariate Cox analysis showed that surgery, chemotherapy, and radiotherapy were all significantly associated with improved survival. However, multivariate Cox regression analysis indicated that only chemotherapy (HR = 0.85, 95% CI 0.81-0.89, p < 0.001) and radiotherapy (HR = 0.96, 95% CI 0.93-0.99, p = 0.032) remained significant, while surgery (HR = 0.98, 95% CI 0.93-1.03, p = 0.387) did not. SHAP summary and force plots were used to analyze the XGBoost model at both the global and local levels. Subsequently, a web-based tool was created to streamline the integration of this predictive model into clinical settings. Our study examined the clinical features of patients with PCBM and developed six machine learning models for prognosis, with the XGBoost model demonstrating the highest performance. The model's high accuracy and interpretability provide valuable support for developing personalized treatment plans for PCBM patients.
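
The abstract names two evaluation quantities, AUC and decision curve analysis, without spelling out how they are computed. The sketch below shows one common way to compute each in plain Python; it is illustrative only and is not the authors' code. The function names (`roc_auc`, `net_benefit`, `net_benefit_treat_all`) and the toy labels and scores are assumptions invented for this example and do not come from the study or from SEER. AUC is computed here as the fraction of (event, non-event) pairs in which the event case receives the higher predicted risk, and net benefit uses the standard decision-curve formula TP/n - FP/n * pt/(1 - pt) at a threshold probability pt.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], scores: Sequence[float],
                threshold: float) -> float:
    """Net benefit of acting on every case with predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy for a decision curve: act on every case."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data, made up for illustration (1 = event, 0 = no event).
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

print("AUC:", roc_auc(toy_labels, toy_scores))
print("Net benefit (model, pt=0.5):", net_benefit(toy_labels, toy_scores, 0.5))
print("Net benefit (treat all, pt=0.5):", net_benefit_treat_all(toy_labels, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing it with the treat-all line and with zero (treat none). A model is usually considered clinically useful at thresholds where its net benefit exceeds both references.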


Source journal: Scientific Reports
Category: Natural Science Disciplines
CiteScore: 7.50
Self-citation rate: 4.30%
Articles published: 19567
Review time: 3.9 months
Journal description: We publish original research from all areas of the natural sciences, psychology, medicine and engineering. You can learn more about what we publish by browsing our specific scientific subject areas below or explore Scientific Reports by browsing all articles and collections. Scientific Reports has a 2-year impact factor: 4.380 (2021), and is the 6th most-cited journal in the world, with more than 540,000 citations in 2020 (Clarivate Analytics, 2021).
• Engineering: Engineering covers all aspects of engineering, technology, and applied science. It plays a crucial role in the development of technologies to address some of the world's biggest challenges, helping to save lives and improve the way we live.
• Physical sciences: Physical sciences are those academic disciplines that aim to uncover the underlying laws of nature, often written in the language of mathematics. It is a collective term for areas of study including astronomy, chemistry, materials science and physics.
• Earth and environmental sciences: Earth and environmental sciences cover all aspects of Earth and planetary science and broadly encompass solid Earth processes, surface and atmospheric dynamics, Earth system history, climate and climate change, marine and freshwater systems, and ecology. It also considers the interactions between humans and these systems.
• Biological sciences: Biological sciences encompass all the divisions of natural sciences examining various aspects of vital processes. The concept includes anatomy, physiology, cell biology, biochemistry and biophysics, and covers all organisms from microorganisms and animals to plants.
• Health sciences: The health sciences study health, disease and healthcare. This field of study aims to develop knowledge, interventions and technology for use in healthcare to improve the treatment of patients.