Machine-Learning-Based Survival Prediction in Castration-Resistant Prostate Cancer: A Multi-Model Analysis Using a Comprehensive Clinical Dataset
Jeong Hyun Lee, Jaeyun Jeong, Young Jin Ahn, Kwang Suk Lee, Jong Soo Lee, Seung Hwan Lee, Won Sik Ham, Byung Ha Chung, Kyo Chul Koo
Journal of Personalized Medicine, volume 15, issue 9. Published 2025-09-08. Publication type: Journal Article. Impact factor: 3.0. JCR: Q2 (Health Care Sciences & Services).
DOI: 10.3390/jpm15090432 (https://doi.org/10.3390/jpm15090432)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12471436/pdf/
Citation count: 0
Abstract
Purpose: Accurate survival prediction is essential for optimizing treatment planning in patients with castration-resistant prostate cancer (CRPC). However, traditional statistical models often underperform because they include few variables and cannot account for complex, multidimensional interactions in the data.
Methods: We retrospectively collected 46 clinical, laboratory, and pathological variables from 801 patients with CRPC, covering the disease course from initial diagnosis to CRPC progression. Multiple machine learning (ML) models, including random survival forests (RSFs), XGBoost, LightGBM, and logistic regression, were developed to predict cancer-specific mortality (CSM), overall mortality (OM), and 2- and 3-year survival status. The dataset was split into training and test cohorts (80:20), with 10-fold cross-validation. Performance was assessed using the C-index for regression models and the AUC, accuracy, precision, recall, and F1-score for classification models. Model interpretability was assessed using SHapley Additive exPlanations (SHAP).
Results: Over a median follow-up of 24 months, 70.6% of patients experienced CSM. RSFs achieved the highest C-index in the test set for both CSM (0.772) and OM (0.771). For classification tasks, RSFs performed best in predicting 2-year survival, while XGBoost yielded the highest F1-score for 3-year survival. The SHAP analysis identified time to first-line CRPC treatment, hemoglobin level, and alkaline phosphatase level as key predictors of survival outcomes.
Conclusion: The RSF and XGBoost ML models outperformed traditional statistical methods in predicting survival in CRPC. These models offer accurate and interpretable prognostic tools that may inform personalized treatment strategies. External validation and the integration of emerging therapies are warranted for broader clinical applicability.
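For readers unfamiliar with the metrics named in the Methods, the following is a minimal sketch of how a C-index and an F1-score can be computed. It is not the authors' code: the function names `harrell_c_index` and `precision_recall_f1` and the toy data are illustrative assumptions, and the C-index shown is a simplified form of Harrell's estimator that skips pairs with tied follow-up times.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's concordance index for right-censored data.

    times:       observed follow-up times
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: higher score = higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the
        # shorter time ended in an observed event (not censoring).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # model ranked the earlier death as riskier
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical toy cohort: follow-up in months, event flag, model risk.
    times = [5, 10, 12, 20, 24]
    events = [1, 1, 0, 1, 0]
    risk = [0.9, 0.4, 0.5, 0.3, 0.1]
    print("C-index:", harrell_c_index(times, events, risk))

    # Hypothetical 2-year status labels (1 = died within 2 years).
    y_true = [1, 1, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 1]
    print("precision, recall, F1:", precision_recall_f1(y_true, y_pred))
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported test-set values of about 0.77.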
About the journal:
Journal of Personalized Medicine (JPM; ISSN 2075-4426) is an international, open access journal aimed at bringing all aspects of personalized medicine to one platform. JPM publishes cutting-edge, innovative preclinical and translational scientific research and technologies related to personalized medicine (e.g., pharmacogenomics/proteomics, systems biology). JPM recognizes that personalized medicine (the assessment of genetic, environmental and host factors that cause variability among individuals) is a challenging, transdisciplinary topic that requires discussions from a range of experts. For a comprehensive perspective on personalized medicine, JPM aims to integrate expertise from the molecular and translational sciences, therapeutics and diagnostics, as well as discussions of regulatory, social, ethical and policy aspects. We provide a forum to bring together academic and clinical researchers, biotechnology, diagnostic and pharmaceutical companies, health professionals, regulatory and ethical experts, and government and regulatory authorities.