Zhiqiang Wang, Xingqing Jia, Yukun Yang, Ning Meng, Le Wang, Jie Zheng, Yuanqing Xu
American Journal of Cancer Research, vol. 14, no. 11, pp. 5521-5538. Published 2024-11-25 (eCollection). Publication type: Journal Article.
DOI: 10.62347/MTBM7462 (https://doi.org/10.62347/MTBM7462)
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11626261/pdf/
Impact factor: 3.6. JCR: Q2 (Oncology).
Citations: 0
Abstract
Machine learning-based dynamic predictive models for prognosis and treatment decisions in patients with liver metastases from gastric cancer.
Gastric cancer with liver metastasis (GCLM) often has a poor prognosis, so it is crucial to identify the risk factors affecting these patients' overall survival (OS) and cancer-specific survival (CSS). This study aimed to construct practical machine learning models to predict survival time and help clinicians choose appropriate treatments. We reviewed the clinical and survival data of GCLM patients diagnosed from 2010 to 2017 in the Surveillance, Epidemiology, and End Results (SEER) database and divided the patients into training and testing groups. The risk factors affecting OS and CSS were determined by least absolute shrinkage and selection operator (LASSO) regression, univariate Cox regression, best subset regression (BSR), and stepwise backward regression. Then, five machine learning models, namely random survival forest (RSF), Gradient Boosting Machine (GBM), Cox proportional hazards (CPH), Survival Support Vector Machine (survivalSVM), and eXtreme Gradient Boosting (XGBoost), were built using the identified risk factors. The model with the best predictive ability was determined using the concordance index (c-index), area under the curve (AUC), Brier score, and decision curve analysis (DCA), and was externally verified with data from 233 cases diagnosed with liver metastasis of cancer at The Shijiazhuang People's Hospital, Jinan City People's Hospital, and The Sixth People's Hospital of Huizhou from 2017 to 2018. The study involved a total of 1300 GCLM patients. The prognostic risk factors affecting OS and CSS were the same: grade, histology, T stage, N stage, surgery, and chemotherapy. The XGBoost model was found to have the best predictive ability for OS, with an AUC of 0.891 [95% CI 0.841-0.941], a Brier score of 0.061 [95% CI 0.046-0.076], and a c-index of 0.752 [95% CI 0.742-0.761], as well as for CSS, with an AUC of 0.895 [95% CI 0.848-0.942], a Brier score of 0.064 [95% CI 0.050-0.079], and a c-index of 0.746 [95% CI 0.736-0.756]. The AUC, Brier score, and c-index all illustrated the accuracy of the model, and validation on the external dataset further confirmed its reliability. Therefore, the XGBoost model demonstrated significant potential in predicting survival times and selecting appropriate treatment plans.
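For readers unfamiliar with the concordance index reported above, the following is a minimal sketch of Harrell's c-index for right-censored survival data. It is an illustration only and not the authors' implementation: the function name `concordance_index`, the simplified handling of tied survival times (they are skipped), and the toy data are all assumptions made for this example.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Simplified Harrell's c-index (illustrative sketch, not the study's code).

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk scores; higher means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk for the earlier event: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.6, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported c-index values of about 0.75 should be read.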
About the journal:
The American Journal of Cancer Research (AJCR) (ISSN 2156-6976) is an independent, open access, online-only journal that facilitates rapid dissemination of novel discoveries in the basic science and treatment of cancer. It was founded by a group of cancer research scientists and clinical academic oncologists from around the world who are devoted to advancing the understanding of cancer and its treatment. The scope of AJCR encompasses multidisciplinary research from any scientific discipline where the primary focus is to increase and integrate knowledge about the etiology and molecular mechanisms of carcinogenesis, with the ultimate aim of advancing the cure and prevention of this increasingly devastating disease. To achieve these aims, AJCR publishes review articles, original articles, and new techniques in cancer research and therapy, as well as hypotheses, case reports, and letters to the editor. Unlike most other open access online journals, AJCR keeps most of the traditional features of print journals, such as continuous volume and issue numbers and continuous page numbers, to retain the familiar feel of an academic journal.