Prediction Model of Ocular Metastases in Gastric Adenocarcinoma: Machine Learning-Based Development and Interpretation Study

Impact factor 2.7 · Zone 4 (Medicine) · JCR Q3 (Oncology)
Jie Zou, Yan-Kun Shen, Shi-Nan Wu, Hong Wei, Qing-Jian Li, San Hua Xu, Qian Ling, Min Kang, Zhao-Lin Liu, Hui Huang, Xu Chen, Yi-Xin Wang, Xu-Lin Liao, Gang Tan, Yi Shao
Journal: Technology in Cancer Research & Treatment, volume 23, article 15330338231219352
Published: 2024-01-01 (Journal Article)
DOI: https://doi.org/10.1177/15330338231219352
Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10865948/pdf/
Citations: 0

Prediction Model of Ocular Metastases in Gastric Adenocarcinoma: Machine Learning-Based Development and Interpretation Study.

Background: Although ocular metastasis (OM) related to gastric adenocarcinoma (GA) is rare, its occurrence indicates more severe disease. We aimed to use machine learning (ML) to analyze the risk factors for GA-related OM and to predict its risk.

Methods: This is a retrospective cohort study. Clinical data from 3532 GA patients were collected and randomly divided into training and validation sets in a 7:3 ratio. Patients with and without OM were assigned to the OM and non-OM (NOM) groups, respectively. Univariate and multivariate logistic regression analyses and least absolute shrinkage and selection operator (LASSO) regression were performed. We combined the variables identified through feature importance ranking and refined the selection using forward sequential feature selection based on the random forest (RF) algorithm before entering them into the ML models. We applied six ML algorithms to construct the model for predicting OM in GA patients. Predictive ability was measured by the area under the receiver operating characteristic (ROC) curve (AUC). We also built a web-based risk calculator on the best-performing model. We used SHapley Additive exPlanations (SHAP) to identify risk factors and to make the black-box model interpretable. All patient details were de-identified.

Results: The ML model, consisting of 13 variables, achieved its best predictive performance with the gradient boosting machine (GBM), reaching an AUC of 0.997 in the test set. Using SHAP, we identified key factors for OM in GA patients, including LDL, CA724, CEA, AFP, CA125, Hb, CA153, and Ca²⁺. In addition, we checked the model's reliability by analyzing two patient cases and developed a working online prediction calculator based on the GBM model.

Conclusion: We used ML to establish a risk prediction model for GA-related OM and showed that GBM performed best among the six ML models. The model may help identify patients with GA-related OM so that treatment can be provided early.
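
Three ideas in the abstract can be made concrete with a small sketch: the 7:3 split, the AUC used to compare models, and the Shapley attribution that SHAP is built on. The code below is an illustration in plain Python, not the authors' pipeline (they report a GBM model and the SHAP method). The function names `stratified_split`, `roc_auc`, and `shapley_values` are hypothetical. Stratifying the split by class is an assumption made here because OM is rare; the abstract only says the split was random. The toy scoring function in the demo is invented for illustration.

```python
import random
from itertools import permutations


def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, valid_idx), splitting each class separately.

    Assumption: stratification keeps the rare positive class present in both
    sets; the abstract itself only states a random 7:3 split.
    """
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train += idx[:cut]
        valid += idx[cut:]
    return sorted(train), sorted(valid)


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form equals the area under the
    ROC curve and is quadratic in sample size, which is fine for a sketch.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    Enumerates every feature ordering, so it is only practical for a handful
    of features; SHAP exists to compute or approximate this efficiently.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = predict(current)
        for j in order:
            current[j] = x[j]          # switch feature j from baseline to x
            new = predict(current)
            phi[j] += new - prev       # marginal contribution in this order
            prev = new
    return [v / len(orders) for v in phi]


if __name__ == "__main__":
    # Hypothetical labels: 10 positives among 100 patients.
    labels = [1] * 10 + [0] * 90
    train_idx, valid_idx = stratified_split(labels)
    print(len(train_idx), len(valid_idx))

    # Hypothetical scores for four patients.
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))

    # Hypothetical model with one additive term and one interaction term.
    toy_model = lambda v: 2 * v[0] + v[1] * v[2]
    print(shapley_values(toy_model, [1, 1, 1], [0, 0, 0]))
```

The Shapley values always sum to the difference between the prediction for the patient and the prediction for the baseline, which is what lets a per-patient explanation be read as a breakdown of the predicted risk.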

Source journal
CiteScore: 4.40
Self-citation rate: 0.00%
Articles published: 202
Review time: 2 months
About the journal: Technology in Cancer Research & Treatment (TCRT) is a JCR-ranked, broad-spectrum, open access, peer-reviewed publication whose aim is to provide researchers and clinicians with a platform to share and discuss developments in the prevention, diagnosis, treatment, and monitoring of cancer.