Combining traditional analysis and machine learning to predict early, middle, and long-term recurrence of intrahepatic cholangiocarcinoma

IF 3.5, Zone 2 (Medicine), JCR Q2 (Oncology)

Ejso. Pub Date: 2025-05-09. DOI: 10.1016/j.ejso.2025.110141

Ruoyu Zhang, Zengshuai Wang, Min Yang, Bo Chen, Mei Liu, Minhua Zheng, Peter Xiaoping Liu, Liming Wang

Journal article. Ejso, vol. 51, no. 9, Article 110141. Full text: https://www.sciencedirect.com/science/article/pii/S0748798325005694

Citations: 0

Abstract

Introduction

Intrahepatic cholangiocarcinoma (ICC) is a rare and highly aggressive cancer. Few patients are eligible for radical surgery, and most face a high risk of recurrence.

Methods

We developed early-, middle-, and long-term (1-, 2-, and 3-year) disease-free survival (DFS) prediction models for ICC using traditional logistic analysis combined with machine learning (ML), and systematically compared the performance of the traditional analysis with that of the ML models.

Results

The 1-, 2-, and 3-year DFS groups included 275, 256, and 238 ICC patients who underwent radical surgery, respectively. Five-fold cross-validation showed that both the traditional logistic models and the ML models were robust. The ML models outperformed the traditional logistic models for DFS prediction in AUC, accuracy, and F1-score. Specifically, the average training-cohort AUCs of the ML models were 0.878, 0.897, and 0.917 in the three groups, compared with 0.657 (P < 0.001), 0.817 (P = 0.05), and 0.798 (P = 0.005) for the traditional models. The average testing-cohort AUCs of the ML models were 0.831, 0.768, and 0.803, compared with 0.619 (P < 0.001), 0.719 (P = 0.008), and 0.698 (P < 0.001) for the traditional models. SHAP analysis indicated that lymph node metastasis played a significant role in recurrence at all three time horizons, while T stage and neural invasion were strongly correlated with middle- and long-term recurrence in ICC patients.
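For readers unfamiliar with the three metrics reported above, the following is a minimal sketch of how AUC, accuracy, and F1-score can be computed for one set of predictions. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the abstract does not state which ML algorithms or software were used.

```python
from typing import Sequence, Tuple


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case (recurrence = 1)
    receives a higher score than a random negative case; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_and_f1(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Accuracy and F1-score after thresholding scores (threshold is an assumption)."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy data (hypothetical): 1 = recurrence within the horizon, 0 = no recurrence.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]

print(auc(y_true, scores))              # 0.9375
print(accuracy_and_f1(y_true, scores))  # (0.75, 0.75)
```

In a five-fold cross-validation setting such as the one described, these functions would be applied to each held-out fold and the five values averaged, which is presumably what "average AUC" refers to here.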

Conclusion

Models with high predictive efficiency for early, middle, and long-term recurrence were successfully built. The ML models outperformed the logistic models for DFS prediction in ICC patients. This study suggests new possibilities for advancing statistical analysis software, such as SPSS and Stata, through ML integration.


Source journal

Ejso (Medicine: Surgery)

CiteScore: 6.40

Self-citation rate: 2.60%

Articles published: 1148

Review time: 41 days

About the journal: EJSO - European Journal of Surgical Oncology ("the Journal of Cancer Surgery") is the Official Journal of the European Society of Surgical Oncology and BASO ~ the Association for Cancer Surgery. The EJSO aims to advance surgical oncology research and practice through the publication of original research articles, review articles, editorials, debates and correspondence.