The establishment of machine learning prognostic prediction models for pineal region tumors based on SEER-A multicenter real-world study

IF 3.5, Zone 2 (Medicine), JCR Q2 (Oncology)
Ejso Pub Date : 2025-04-22 DOI:10.1016/j.ejso.2025.110058
Hao Wu, Aierpati Maimaiti, Jinlong Huang, Jing Xue, Qiang Fu, Zening Wang, Mamutijiang Muertizha, Yang Li, Di Li, Qingjiu Zhou, Yongxin Wang
Ejso, 51(8), Article 110058. https://www.sciencedirect.com/science/article/pii/S074879832500486X

Abstract

Background

Pineal region tumors (PRTs) are rare intracranial neoplasms with diverse pathological types and growth characteristics, which lead to varied clinical manifestations. This study aims to develop machine learning (ML) models for survival prediction to inform the clinical management of PRTs.

Methods

Clinical information on PRTs was extracted from the Surveillance, Epidemiology, and End Results (SEER) database. Kaplan-Meier (K-M) analysis was used to estimate the survival of PRT patients. Univariate and multivariate Cox regression analyses were conducted to identify risk factors for survival, and nomograms were then constructed. Seven ML models, namely Decision Tree, Logistic Regression, LightGBM, Random Forest (RF), XGBoost, K-Nearest Neighbor (KNN), and Support Vector Machine (SVM), were developed to predict the prognosis of PRT patients. The predictive value of the ML models was evaluated by the area under the receiver operating characteristic curve (AUC-ROC), tenfold cross-validation, calibration curves, and decision curve analysis (DCA).
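
As an illustration of the K-M step named above, the product-limit estimator can be sketched in a few lines of plain Python. This is a minimal sketch, not the authors' code: the function name `kaplan_meier` and the toy follow-up times are hypothetical and are not SEER data.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed death.

    times  : follow-up durations (e.g. months)
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit update: multiply by the fraction surviving time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]  # drop deaths and censored patients after time t
    return curve


# Hypothetical toy cohort (assumption for illustration only)
times = [2, 3, 3, 5, 8, 8, 12]
events = [1, 1, 0, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Censored patients reduce the number at risk without lowering the curve, which is why the estimate only steps down at observed death times.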

Results

Univariate and multivariate Cox regression identified age, histopathology, radiotherapy, and tumor size as independent risk factors for overall survival (OS), and histopathology, surgery, radiotherapy, and tumor size as risk factors for cancer-specific survival (CSS). K-M survival analysis showed that age, histopathology, marital status, radiotherapy, sex, and surgery significantly affected OS, while age, histopathology, marital status, race, radiotherapy, sex, and surgery significantly affected CSS. For OS prediction, the ML models with the best clinical utility were RF, Logistic Regression, and XGBoost. For CSS, the most effective models were RF and LightGBM.
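
Clinical utility in decision curve analysis is usually summarized as net benefit at a chosen threshold probability. The sketch below shows the standard net-benefit formula under stated assumptions: the outcome is treated as a binary event (for example, death within a fixed horizon), and the labels, predicted probabilities, and function names are hypothetical rather than taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability
    return tp / n - fp / n * threshold / (1.0 - threshold)


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy: act on every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical labels and model outputs (assumption for illustration only)
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.7, 0.3, 0.2, 0.1, 0.4, 0.5, 0.05]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = treat_all_net_benefit(y_true, threshold=0.5)
print(f"model: {nb_model:.3f}, treat-all: {nb_all:.3f}")
```

A decision curve repeats this calculation over a range of thresholds; a model is considered useful where its net benefit exceeds both the treat-all strategy and the treat-none strategy (net benefit 0).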

Conclusion

ML models demonstrate significant potential and high predictive efficacy in forecasting long-term postoperative survival in PRT patients, providing substantial clinical value.
Source journal: Ejso (Medicine: Surgery)
CiteScore: 6.40
Self-citation rate: 2.60%
Articles published: 1148
Review time: 41 days
About the journal: EJSO - European Journal of Surgical Oncology ("the Journal of Cancer Surgery") is the Official Journal of the European Society of Surgical Oncology and BASO, the Association for Cancer Surgery. The EJSO aims to advance surgical oncology research and practice through the publication of original research articles, review articles, editorials, debates and correspondence.