Machine learning-based prediction of disease-free survival in breast cancer patients with non-pathological complete response after neoadjuvant chemotherapy: a retrospective multicenter cohort study.

IF 2.9 | Zone 3 (Medicine) | JCR Q2, Oncology
American Journal of Cancer Research. Pub Date: 2025-06-15. eCollection Date: 2025-01-01. DOI: 10.62347/MHSV3723
Zi-Ran Zhang, Chao-Xian Wang, Huan Wang, Si-Li Jin
Volume 15, Issue 6, pages 2482-2499. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12256414/pdf/
Citations: 0

Abstract

This study aimed to construct a robust machine learning (ML) model for predicting disease-free survival (DFS) and stratifying risk in breast cancer (BC) patients with non-pathological complete response (non-PCR) after neoadjuvant chemotherapy (NAC), with the goal of enabling early intervention for high-risk patients. This retrospective multicenter cohort study included BC patients from two hospitals in China who received NAC but did not achieve PCR. Four ML algorithms were used to construct models from patients' clinicopathological data, and the performance of these models was then evaluated. To improve interpretability, the Shapley additive explanations (SHAP) method was used to analyze the contribution of each feature to the predictions. A total of 463 non-PCR patients were included. Of these, 385 patients from Ruijin Hospital, affiliated with Shanghai Jiao Tong University, were randomly split in a 3:1 ratio into a training cohort and an internal validation cohort for model development and preliminary performance evaluation. The other 78 patients, enrolled from Jiaxing Women and Children's Hospital, formed the external validation cohort used to evaluate the model's generalizability. Univariate and multivariate Cox regression analyses showed that age, residual tumor size, Ki67 change, molecular subtype, and axillary lymph node metastasis were independent factors influencing DFS. Among the four ML models, the random survival forest (RSF) model performed best, with a concordance index of 0.820 in the training cohort, 0.642 in the internal validation cohort, and 0.689 in the external validation cohort. Further analysis indicated that the RSF model had excellent discriminative ability, with a high area under the curve (AUC), while its low Brier score indicated excellent calibration. Decision curve analysis indicated that the RSF model offered a higher clinical net benefit at various time points and stratified risk effectively, identifying high-risk patients. SHAP analysis identified residual tumor size as the most influential predictive feature. The RSF model can effectively predict DFS and stratify risk in BC patients with non-PCR following NAC, offering a critical reference for developing individualized treatment strategies.
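
The abstract compares models by concordance index but does not say how the metric is computed. As a rough illustration only, the sketch below implements Harrell's concordance index for right-censored data in plain Python. It is not the authors' code: the function name `harrell_c_index` and the five-patient toy data are invented for this example, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when patient i had the event and a strictly
    shorter follow-up time than patient j. The pair is concordant when
    patient i also has the higher predicted risk; tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy cohort: DFS time in months, event flag
# (True = recurrence or death observed, False = censored), model risk score.
toy_times = [5, 12, 20, 30, 42]
toy_events = [True, True, False, True, False]
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]

print(f"C-index: {harrell_c_index(toy_times, toy_events, toy_risk):.3f}")
```

On this toy cohort, 7 of the 8 comparable pairs are ordered correctly, so the index is 0.875. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.820, 0.642, and 0.689 should be read.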

Source journal: American Journal of Cancer Research
Self-citation rate: 3.80%
Articles published: 263
Journal description: The American Journal of Cancer Research (AJCR) (ISSN 2156-6976) is an independent, open access, online-only journal that facilitates rapid dissemination of novel discoveries in the basic science and treatment of cancer. It was founded by a group of cancer research scientists and clinical academic oncologists from around the world who are devoted to advancing our understanding of cancer and its treatment. The scope of AJCR encompasses multidisciplinary research from any scientific discipline where the primary focus is to increase and integrate knowledge about the etiology and molecular mechanisms of carcinogenesis, with the ultimate aim of advancing the cure and prevention of this increasingly devastating disease. To achieve these aims, AJCR publishes review articles, original articles, and new techniques in cancer research and therapy. It also publishes hypotheses, case reports, and letters to the editor. Unlike most other open access online journals, AJCR keeps most of the traditional features of print journals, such as continuous volume and issue numbers and continuous page numbers, to retain the familiar feel of an academic journal.