Prediction and validation of pathologic complete response for locally advanced rectal cancer under neoadjuvant chemoradiotherapy based on a novel predictor using interpretable machine learning

IF 3.5 Zone 2 Medicine Q2 ONCOLOGY
Ejso Pub Date : 2024-10-06 DOI:10.1016/j.ejso.2024.108738
Ye Wang, Zhen Pan, Shoufeng Li, Huajun Cai, Ying Huang, Jinfu Zhuang, Xing Liu, Xingrong Lu, Guoxian Guan
Ejso 50 (12), Article 108738. https://www.sciencedirect.com/science/article/pii/S0748798324007959

Abstract

Background

Accurate assessment of pathological complete response (pCR) is essential for determining the prognosis of patients with locally advanced rectal cancer (LARC) undergoing neoadjuvant chemoradiotherapy (NCRT) and can inform the selection of subsequent treatment strategies. Most current predictive models for pCR focus primarily on pre-treatment factors, neglecting the dynamic systemic changes that occur during NCRT, and are limited by low accuracy and incompleteness.

Purpose

This study devised a novel predictor of pCR using dynamic alterations in systemic inflammation-nutritional marker indexes (SINI) during neoadjuvant therapy and developed a machine-learning model to predict pCR.

Methods

Two cohorts of patients with LARC, one from center one (2012 to 2017) and one from center two (2020 to 2023), were integrated for analysis. This study compared dynamic changes in blood indexes before and after neoadjuvant therapy and surgery. A least absolute shrinkage and selection operator (LASSO) regression analysis was conducted to mitigate collinearity and identify key indexes, from which the SINI was constructed. Univariate and multiple logistic regression analyses were used to identify the independent risk factors associated with pCR. Additionally, 10 machine learning algorithms were employed to develop predictive models to assess risk. The hyperparameters of the machine learning models were optimized using a random search and 10-fold cross-validation. The models were assessed using several metrics, including the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AUPRC), decision curve analysis, calibration curves, and precision and accuracy in the internal and external validation cohorts. Additionally, SHapley Additive exPlanations (SHAP) were employed to interpret the machine learning models.
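The tuning step described above (random search scored by 10-fold cross-validation) can be sketched in plain Python. This is only an illustration under stated assumptions: the study's actual models, search space, and scoring metric are not given in the abstract, so a toy one-dimensional k-nearest-neighbour classifier, synthetic data, and accuracy as the score stand in for them. All function names here are hypothetical.

```python
import random
from statistics import mean

def k_fold_indices(n_samples, n_folds, rng):
    """Shuffle sample indices and deal them into n_folds disjoint folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def knn_predict(train, x, k):
    """Toy stand-in model (assumption): majority vote of the k nearest 1-D neighbours."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def cv_accuracy(data, k, folds):
    """Mean held-out accuracy: each fold is scored by a model fit on the other folds."""
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in range(len(data)) if i not in held_out]
        correct = sum(knn_predict(train, data[i][0], k) == data[i][1] for i in fold)
        scores.append(correct / len(fold))
    return mean(scores)

def random_search(data, candidates, n_iter, n_folds=10, seed=0):
    """Sample hyperparameter values at random and keep the best cross-validated one."""
    rng = random.Random(seed)
    folds = k_fold_indices(len(data), n_folds, rng)
    best_k, best_score = None, -1.0
    for _ in range(n_iter):
        k = rng.choice(candidates)
        score = cv_accuracy(data, k, folds)
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score

# Synthetic two-class data (assumption): one noisy feature per sample.
data_rng = random.Random(42)
data = []
for _ in range(200):
    label = data_rng.randint(0, 1)
    data.append((data_rng.gauss(1.0 if label else -1.0, 1.0), label))

candidates = [1, 3, 5, 7, 9, 11, 15]  # hypothetical search space for k
best_k, best_score = random_search(data, candidates, n_iter=5)
print(best_k, round(best_score, 3))
```

Reusing the same folds for every sampled hyperparameter value keeps the comparison between candidates fair; a real pipeline would also hold out a separate test set, as the study does with its internal and external validation cohorts.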

Results

The study cohort comprised 677 patients from center one and 224 patients from center two. Six key indexes were identified, and a predictive index, SINI, was constructed. Univariate and multiple logistic regression analyses revealed that SINI, clinical T-stage, clinical N-stage, tumor size, and the distance from the anal verge were independent predictors of pCR in patients with LARC following NCRT. The mean AUC value of the extreme gradient boosting (XGB) model in the 10-fold cross-validation of the training set was 0.877. The XGB model demonstrated superior performance in the internal and external validation sets. Specifically, in the internal test set, the XGB model achieved an AUC of 0.86, AUPRC of 0.707, accuracy of 0.82, and precision of 0.80. In the external validation set, the XGB model exhibited an AUC of 0.83, AUPRC of 0.702, accuracy of 0.81, and precision of 0.81. Additionally, the predictions generated by the XGB model were analyzed using SHAP.
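For readers less familiar with the four metrics reported above, the following minimal sketch shows how AUC, AUPRC, accuracy, and precision can be computed from predicted probabilities. The labels and scores are made-up illustrative values, not study data; AUPRC is estimated here by average precision, which is one common estimator and may differ from the one the authors used; the 0.5 decision threshold is likewise an assumption.

```python
def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve (average precision)."""
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    n_pos = sum(y_true)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each rank where a positive is found
    return total / n_pos

def accuracy_and_precision(y_true, scores, threshold=0.5):
    """Binarize scores at the threshold, then compute accuracy and precision."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, y_true))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, y_true))
    correct = sum(p == y for p, y in zip(preds, y_true))
    precision = tp / (tp + fp) if tp + fp else 0.0
    return correct / len(y_true), precision

# Illustrative values only (1 = pCR, 0 = no pCR); not taken from the study.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1]

auc = roc_auc(y_true, scores)
auprc = average_precision(y_true, scores)
acc, prec = accuracy_and_precision(y_true, scores)
print(round(auc, 3), round(auprc, 3), acc, prec)
```

AUC and AUPRC summarize ranking quality across all thresholds, whereas accuracy and precision depend on the single threshold chosen, which is why the abstract reports both kinds of metric.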

Conclusion

This study developed and validated an XGB model using SINI to predict pCR in patients with LARC. The SINI-based machine learning model shows promise for accurately predicting pCR following NCRT in patients with resectable LARC and may inform personalized treatment approaches.
Source journal: Ejso (Medicine, Surgery)
CiteScore: 6.40
Self-citation rate: 2.60%
Articles published: 1148
Review time: 41 days
About the journal: EJSO - European Journal of Surgical Oncology ("the Journal of Cancer Surgery") is the Official Journal of the European Society of Surgical Oncology and BASO ~ the Association for Cancer Surgery. The EJSO aims to advance surgical oncology research and practice through the publication of original research articles, review articles, editorials, debates and correspondence.