Comparison of machine learning and deep learning models for survival prediction in early-stage hormone receptor-positive/HER2-negative breast cancer receiving neoadjuvant chemotherapy

ESMO Real World Data and Digital Oncology. Pub Date: 2025-09-16. DOI: 10.1016/j.esmorw.2025.100184

L. Mastrantoni, G. Garufi, G. Giordano, N. Maliziola, E. Di Monte, G. Arcuri, V. Frescura, A. Rotondi, A. Orlandi, L. Carbognin, A. Palazzo, L. Pontolillo, A. Fabi, S. Pannunzio, I. Paris, F. Marazzi, A. Franco, G. Franceschini, G. Scambia, D. Giannarelli, E. Bria

Volume 10, Article 100184. https://www.sciencedirect.com/science/article/pii/S2949820125000736

Abstract

Background

We compared machine learning (ML) and deep learning (DL) models to predict disease-free survival (DFS) and overall survival (OS) in patients with hormone receptor (HR)-positive/human epidermal growth factor receptor 2 (HER2)-negative breast cancer (BC) receiving neoadjuvant chemotherapy (NACT), using routine clinicopathological features before and after surgery.

Materials and methods

In this retrospective cohort, 572 patients with stage I-III HR-positive/HER2-negative BC treated with anthracycline/taxane-based NACT and surgery were analyzed. Data were split into training (n = 463) and validation (n = 109) sets. Five ML models (random survival forest, extra survival tree, gradient boosting machine, support vector machine, regularized Cox) and four neural networks (DeepSurv, DeepHit, logistic hazard, multi-task logistic regression) were trained via five-fold cross-validation. Performance was assessed on the validation cohort by the C-index and integrated Brier score (iBS).
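The C-index used for evaluation can be illustrated with a minimal sketch of Harrell's concordance index for right-censored data. This is not the authors' code: the function name `harrell_c_index`, the toy data, and the choice to skip pairs with tied follow-up times are assumptions made for illustration, and the abstract does not state which C-index estimator was used.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative).

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if the subject was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the subject with the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Pairs with tied times are skipped here (a simplifying assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event received the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example (hypothetical data): the second subject is censored at month 10.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risk = [0.9, 0.4, 0.1, 0.6]
print(harrell_c_index(times, events, risk))  # 3 of 4 comparable pairs agree: 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of comparable pairs.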

Results

Median age was 49 years and pathological complete response (pCR) rate was 15%. Median DFS was 103 months [95% confidence interval (CI) 84.4 months-not estimable (NE)], and 5-year OS was 78.6% (95% CI 74.8% to 82.5%). DeepSurv yielded the best overall performance, with a C-index of 0.70 (95% CI 0.60-0.78, iBS 0.22) for DFS and 0.68 (95% CI 0.56-0.79, iBS 0.17) for OS. The top ML model achieved C-indices of 0.64 (DFS) and 0.68 (OS). Key predictors were nodal status, estrogen receptor/progesterone receptor expression, tumor size, Ki-67 and pCR.
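For reading the iBS values above, the following is the commonly used inverse-probability-of-censoring-weighted definition of the Brier score and its integral. The abstract does not specify which estimator or time horizon the authors used, so the notation here is an assumption: $T_i$ is the observed time, $\delta_i$ the event indicator, $\hat{S}(t \mid x_i)$ the predicted survival probability, $\hat{G}$ the Kaplan-Meier estimate of the censoring distribution, and $t_{\max}$ the evaluation horizon.

```latex
\mathrm{BS}(t) = \frac{1}{N} \sum_{i=1}^{N} \left[
  \frac{\hat{S}(t \mid x_i)^2 \, \mathbf{1}(T_i \le t,\ \delta_i = 1)}{\hat{G}(T_i)}
  + \frac{\bigl(1 - \hat{S}(t \mid x_i)\bigr)^2 \, \mathbf{1}(T_i > t)}{\hat{G}(t)}
\right],
\qquad
\mathrm{iBS} = \frac{1}{t_{\max}} \int_{0}^{t_{\max}} \mathrm{BS}(t)\, dt
```

Lower values indicate better combined calibration and discrimination; an uninformative model that predicts a survival probability of 0.5 at every time point scores about 0.25.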

Conclusions

Both ML and DL models predicted survival post-NACT in HR-positive/HER2-negative BC, suggesting that simple models can perform as well as DL architectures in small datasets. The marginally higher discrimination of DL models should be weighed against computational demands and lower interpretability compared with ML methods.
