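Interpretable machine learning models based on body composition and inflammatory nutritional index (BCINI) to predict early postoperative recurrence of colorectal cancer: Multi-center study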

IF 4.8 | CAS Zone 2 (Medicine) | JCR Q1: Computer Science, Interdisciplinary Applications
Yongjie Zhou, Jinhong Zhao, Fei Zou, Yongming Tan, Wei Zeng, Jiahui Jiang, Jiale Hu, Qiao Zeng, Lianggeng Gong, Lan Liu, Linhua Zhong
DOI: 10.1016/j.cmpb.2025.108874
Journal: Computer Methods and Programs in Biomedicine, vol. 269, Article 108874
Published: 2025-05-22
Publication type: Journal Article
Source: https://www.sciencedirect.com/science/article/pii/S0169260725002913
Citations: 0

Abstract

Interpretable machine learning models based on body composition and inflammatory nutritional index (BCINI) to predict early postoperative recurrence of colorectal cancer: Multi-center study

Background and objective

Colorectal cancer (CRC) ranks among the most prevalent cancers worldwide, with early postoperative recurrence remaining a major cause of mortality. Body composition and inflammatory-nutritional indices (BCINI) have demonstrated potential in reflecting patients’ physiological states; however, their association with early recurrence (ER) after CRC resection remains unclear. This study aimed to establish and validate interpretable machine learning (ML) models based on BCINI to predict ER after CRC resection.

Methods

Data from three hospitals were collected, including CT-based body composition metrics and blood test variables. After variable selection, six ML algorithms (XGBoost, Complement Naive Bayes (CNB), support vector machine (SVM), k-nearest neighbors (KNN), random forest (RF), and Gaussian Naive Bayes (GNB)) were used to construct ER prediction models. The optimal model was selected by receiver operating characteristic (ROC) curve analysis. The selected model was externally validated on independent datasets to assess generalizability, while its accuracy and clinical utility were evaluated via calibration curves and decision curve analysis. Additionally, SHapley Additive exPlanations (SHAP) were employed to visualize the prediction process for clinical interpretability.
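For readers unfamiliar with decision curve analysis, the following is a minimal sketch of the standard net-benefit calculation behind such a curve. It is not the authors' code: the function names and the toy labels and probabilities are hypothetical and chosen only for illustration.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on patients whose predicted risk >= threshold.

    Standard decision-curve formula: TP/n - FP/n * (pt / (1 - pt)),
    where pt is the threshold probability.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: intervene on every patient regardless of risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = early recurrence, 0 = no early recurrence.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_probs = [0.9, 0.2, 0.7, 0.4, 0.1, 0.3, 0.6, 0.05]

for pt in (0.2, 0.5):
    model_nb = net_benefit(toy_labels, toy_probs, pt)
    all_nb = net_benefit_treat_all(toy_labels, pt)
    print(f"threshold={pt:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is zero by definition.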

Results

The XGBoost algorithm outperformed other methods in model selection, demonstrating superior accuracy and stability with area under the ROC curve (AUC) values of 0.837 and 0.777 in internal training and validation sets, respectively. This model achieved the lowest Brier score of 0.131 on calibration curves, surpassing the five other ML algorithms. External validation further confirmed its generalizability, yielding AUC values of 0.783 and 0.773 in two independent datasets. Consistent predictive performance was observed across age subgroups (<60 years: AUC 0.762–0.834; ≥60 years: AUC 0.777–0.800) and tumor location subgroups (colon: AUC 0.785–0.845; rectum: AUC 0.751–0.799).
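The two headline metrics above, AUC and the Brier score, can be computed directly from predicted probabilities. The sketch below shows their textbook definitions in plain Python; it is illustrative only, the function names and toy data are hypothetical, and it does not reproduce the study's values.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win
    (the Mann-Whitney U formulation of the area under the ROC curve).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy data: 1 = early recurrence, 0 = no early recurrence.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_probs = [0.9, 0.2, 0.7, 0.4, 0.1, 0.3, 0.6, 0.05]

print(f"AUC   = {roc_auc(toy_labels, toy_probs):.3f}")
print(f"Brier = {brier_score(toy_labels, toy_probs):.3f}")
```

AUC measures discrimination (higher is better, 0.5 is chance), while the Brier score measures how close the predicted probabilities are to the observed outcomes (lower is better), which is why the two are reported side by side.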

Conclusions

The interpretable ML model developed based on BCINI shows promise for predicting ER after CRC resection. This approach may provide valuable insights for clinical decision-making, enabling early detection and intervention to improve patient outcomes.
Source journal
Computer Methods and Programs in Biomedicine (Engineering and Technology: Engineering, Biomedical)
CiteScore: 12.30
Self-citation rate: 6.60%
Articles published: 601
Review time: 135 days
About the journal: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.