Ye Wang, Zhen Pan, Shoufeng Li, Huajun Cai, Ying Huang, Jinfu Zhuang, Xing Liu, Xingrong Lu, Guoxian Guan
Ejso, volume 50, issue 12, Article 108738. Published 2024-10-06. DOI: 10.1016/j.ejso.2024.108738. URL: https://www.sciencedirect.com/science/article/pii/S0748798324007959
Prediction and validation of pathologic complete response for locally advanced rectal cancer under neoadjuvant chemoradiotherapy based on a novel predictor using interpretable machine learning
Background
Precise evaluation of pathological complete response (pCR) is essential for determining the prognosis of patients with locally advanced rectal cancer (LARC) undergoing neoadjuvant chemoradiotherapy (NCRT) and can guide the selection of subsequent treatment strategies. Most current predictive models for pCR focus primarily on pre-treatment factors, neglect the dynamic systemic changes that occur during neoadjuvant chemoradiotherapy, and are limited by low accuracy and incompleteness.
Purpose
This study devised a novel predictor of pCR using dynamic alterations in systemic inflammation-nutritional marker indexes (SINI) during neoadjuvant therapy and developed a machine-learning model to predict pCR.
Methods
Two cohorts of patients with LARC were combined for analysis: one from center one (2012 to 2017) and one from center two (2020 to 2023). Dynamic changes in blood indexes were compared before and after neoadjuvant therapy and surgery. Least absolute shrinkage and selection operator (LASSO) regression was used to mitigate collinearity and identify the key indexes from which the SINI was constructed. Univariate and multivariable logistic regression analyses were used to identify independent factors associated with pCR. In addition, 10 machine learning algorithms were used to develop predictive models, with hyperparameters optimized by random search and 10-fold cross-validation. The models were assessed using the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AUPRC), decision curve analysis, calibration curves, and precision and accuracy in the internal and external validation cohorts. SHapley Additive exPlanations (SHAP) were used to interpret the machine learning models.
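The LASSO step described in the Methods can be illustrated with a minimal sketch. It assumes a linear least-squares loss solved by coordinate descent; the abstract does not state which loss or software the authors used, so this is not their implementation. The function names, the penalty value, and the four-row toy matrix are hypothetical and are not study data. The point is only to show how the L1 penalty shrinks a weak coefficient to exactly zero, which is what makes LASSO usable for selecting key indexes.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; anything within [-penalty, penalty] becomes 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_sweeps=200):
    """Minimise (1 / (2n)) * ||y - X b||^2 + penalty * ||b||_1 by cyclic coordinate descent.

    X is a list of rows (assumed already centred/standardised), y a list of targets.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_scale[j] == 0.0:
                continue  # constant-zero column carries no information
            # Correlation of column j with the residual that excludes column j's own contribution.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * coef[j]) for i in range(n)) / n
            new_value = soft_threshold(rho, penalty) / col_scale[j]
            delta = new_value - coef[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                coef[j] = new_value
    return coef


# Hypothetical toy data: column 0 has a strong effect (2.0), column 1 a weak one (0.1).
X_toy = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y_toy = [2.0 * row[0] + 0.1 * row[1] for row in X_toy]

coef_toy = lasso_coordinate_descent(X_toy, y_toy, penalty=0.5)
print(coef_toy)  # the weak coefficient is driven to exactly zero
```

In practice the penalty would itself be chosen by cross-validation, and a binary outcome such as pCR would more naturally use a logistic loss; both are left out here to keep the sketch short.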
Results
The study cohort comprised 677 patients from center one and 224 patients from center two. Six key indexes were identified and combined into a predictive index, the SINI. Univariate and multivariable logistic regression analyses showed that SINI, clinical T-stage, clinical N-stage, tumor size, and distance from the anal verge were independent factors associated with pCR in patients with LARC following NCRT. The mean AUC of the extreme gradient boosting (XGB) model in 10-fold cross-validation on the training set was 0.877. The XGB model demonstrated superior performance in the internal and external validation sets. In the internal test set, it achieved an AUC of 0.86, an AUPRC of 0.707, an accuracy of 0.82, and a precision of 0.80. In the external validation set, it achieved an AUC of 0.83, an AUPRC of 0.702, an accuracy of 0.81, and a precision of 0.81. The predictions generated by the XGB model were also analyzed using SHAP.
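For readers less familiar with the four metrics reported above, the following sketch shows one common way each is computed from predicted probabilities. The labels and scores are made-up toy values, not study data, and the function names and the 0.5 decision threshold are assumptions; the abstract does not state the threshold the authors used. AUPRC is computed here as average precision, which is one of several conventions for the area under the precision-recall curve.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive scores above a random negative (ties count 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive case (assumes distinct scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits


def accuracy_and_precision(labels, scores, threshold=0.5):
    """Classify as positive when score >= threshold, then return (accuracy, precision)."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    true_pos = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    false_pos = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    return correct / len(labels), precision


# Hypothetical toy example: 1 = pCR, 0 = no pCR; scores are predicted probabilities.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1]

print("AUC:", roc_auc(toy_labels, toy_scores))
print("AUPRC (average precision):", average_precision(toy_labels, toy_scores))
print("accuracy, precision:", accuracy_and_precision(toy_labels, toy_scores))
```

AUC and AUPRC depend only on how the scores rank the cases, whereas accuracy and precision change with the chosen threshold, which is why the two pairs of metrics are usually reported together.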
Conclusion
This study developed and validated an XGB model that uses SINI to predict pCR in patients with LARC. The SINI-based machine learning model shows promise for accurately predicting pCR following NCRT in patients with resectable LARC and may inform personalized treatment approaches.
About the journal:
EJSO - European Journal of Surgical Oncology ("the Journal of Cancer Surgery") is the Official Journal of the European Society of Surgical Oncology and BASO ~ the Association for Cancer Surgery.
The EJSO aims to advance surgical oncology research and practice through the publication of original research articles, review articles, editorials, debates and correspondence.