OncoE25: an AI model for predicting postoperative prognosis in early-onset stage I-III colon and rectal cancer-a population-based study using SEER with dual-center cohort validation.

IF 7.5; Zone 2 (Medicine); JCR Q1, MEDICINE, RESEARCH & EXPERIMENTAL
Luyun Yuan, Liyu Wang, Jiamin Gao, Xin Chen, Haoyue Wang, Wei Shan Tan, Kexiang Sun, Yabin Gong, Wanli Deng
Journal of Translational Medicine, 23(1):695. Published 2025-06-22. Journal Article.
DOI: 10.1186/s12967-025-06663-4 (https://doi.org/10.1186/s12967-025-06663-4)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12183820/pdf/
Citations: 0

Abstract

Background: Although colorectal cancer (CRC) incidence is declining overall, the incidence of early-onset CRC is increasing. No prognostic models currently exist for predicting postoperative survival in stage I-III early-onset colon or rectal cancer. Such tools are urgently needed to enable individualized risk assessment.

Methods: We identified patients with early-onset (EO) and late-onset (LO) colon or rectal cancer in the SEER database and randomly split them into training and test cohorts at a 7:3 ratio. External validation cohorts of EO colon and rectal cancer were collected from two Chinese hospitals. After LASSO-Cox feature selection, six models (random survival forest [RSF], LASSO-Cox, S-SVM, XGBSE, GBSA, and DeepSurv) were developed to predict cancer-specific survival (CSS). Performance was assessed using the C-index, Brier score, time-dependent AUC, calibration curves, and decision curves. SHAP was used for model interpretation. A risk stratification system and an online calculator were built on the best-performing model.
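The 7:3 random split and the C-index named in the Methods can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names `split_7_3` and `harrell_c_index`, the fixed seed, and the choice to skip pairs with tied follow-up times are assumptions made for illustration, and the abstract does not state which C-index variant was used.

```python
import random
from itertools import combinations


def split_7_3(patient_ids, seed=0):
    """Randomly split IDs into training (70%) and test (30%) lists.

    The seed is a hypothetical choice; the abstract does not report one.
    """
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times:       follow-up time per patient
    events:      1 if cancer-specific death was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        if times[a] == times[b]:
            continue  # simplification: tied times are skipped
        if not events[a]:
            continue  # earlier patient was censored, pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk died first: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    train_ids, test_ids = split_7_3(range(10))
    print(len(train_ids), len(test_ids))

    # Toy data, not taken from the study.
    times = [5, 10, 15, 20, 25]
    events = [1, 0, 1, 1, 0]
    risks = [0.9, 0.4, 0.2, 0.3, 0.1]
    print(round(harrell_c_index(times, events, risks), 3))
```

In the toy data, seven pairs are comparable and six of them are ordered correctly by the risk score, so the index is 6/7. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported C-indices should be read.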

Results: The study included 3,997 EO colon cancer, 2,016 EO rectal cancer, 30,621 LO colon cancer, and 8,667 LO rectal cancer patients from SEER, along with 205 EO colon cancer and 153 EO rectal cancer patients from the two Chinese institutions. Across multiple datasets and metrics, the RSF model showed the best and most stable performance, outperforming both the other machine learning models and the traditional TNM staging system. In EO colon cancer, the RSF model achieved C-indices of 0.738 (test cohort) and 0.829 (external validation), mean AUCs of 0.765 and 0.889, and integrated Brier scores of 0.084 and 0.077, respectively. In EO rectal cancer, the corresponding C-indices were 0.728 and 0.722, mean AUCs were 0.753 and 0.900, and integrated Brier scores were 0.106 and 0.095. Calibration and decision curves indicated good calibration and clinical net benefit. The RSF model also performed robustly in the LO colon and rectal cancer cohorts. SHAP analysis quantified the marginal contribution of each predictor within each cancer subtype. Based on the RSF model, we developed a CSS-based risk stratification framework and deployed an online prediction tool.
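To make the idea of a risk stratification framework concrete, the sketch below assigns patients to risk groups from a model's predicted risk scores. The abstract does not say how many groups OncoE25 uses or how its cutoffs were chosen, so the tertile cutoffs, the three labels, and the function names `fit_cutoffs` and `assign_group` are all hypothetical choices for illustration only.

```python
from statistics import quantiles


def fit_cutoffs(train_scores, n_groups=3):
    """Derive cut points from training-cohort risk scores.

    Assumption: equal-sized quantile groups (tertiles by default).
    Returns n_groups - 1 ascending cut points.
    """
    return quantiles(train_scores, n=n_groups)


def assign_group(score, cutoffs, labels=("low", "medium", "high")):
    """Map one predicted risk score to a group label."""
    if len(labels) != len(cutoffs) + 1:
        raise ValueError("need exactly one more label than cut points")
    for cut, label in zip(cutoffs, labels):
        if score <= cut:
            return label
    return labels[-1]  # above every cut point


if __name__ == "__main__":
    # Toy training scores, not taken from the study.
    train_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
    cutoffs = fit_cutoffs(train_scores)
    for s in (0.2, 0.5, 0.95):
        print(s, assign_group(s, cutoffs))
```

Fixing the cutoffs on the training cohort and reusing them unchanged on the test and external cohorts keeps the group definitions comparable across datasets, which is the usual reason for separating the two steps.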

Conclusions: We selected the RSF model for its predictive performance and named it OncoE25. It is intended to support personalized health management for patients with EO colon and rectal cancer.

Source journal: Journal of Translational Medicine (Medicine: Research & Experimental)
CiteScore: 10.00
Self-citation rate: 1.40%
Articles published: 537
Review time: 1 month
About the journal: The Journal of Translational Medicine is an open-access journal that publishes articles focusing on information derived from human experimentation to enhance communication between basic and clinical science. It covers all areas of translational medicine.