Predicting liver metastasis in pancreatic neuroendocrine tumors with an interpretable machine learning algorithm: a SEER-based study.

IF 3.1 · Zone 3 (Medicine) · Q1 MEDICINE, GENERAL & INTERNAL
Frontiers in Medicine Pub Date : 2025-05-01 eCollection Date: 2025-01-01 DOI:10.3389/fmed.2025.1533132
Jinzhe Bi, Yaqun Yu
Frontiers in Medicine, volume 12, article 1533132. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12078274/pdf/
Citations: 0

Predicting liver metastasis in pancreatic neuroendocrine tumors with an interpretable machine learning algorithm: a SEER-based study.

Background: The liver is the most common site of metastasis in pancreatic neuroendocrine tumors (PaNETs), and liver metastasis significantly affects patient prognosis. This study aims to develop machine learning models that predict liver metastasis in patients with PaNETs and support clinicians in personalized treatment decisions.

Methods: We collected data on eligible patients with PaNETs from the Surveillance, Epidemiology, and End Results (SEER) database for 2010 to 2021. The Boruta algorithm and the Least Absolute Shrinkage and Selection Operator (LASSO) were used for feature selection. We applied 10 machine learning algorithms to develop models predicting the risk of liver metastasis in these patients. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AUPRC), decision curve analysis (DCA), calibration curves, accuracy, sensitivity, specificity, F1 score, and Kappa score. SHapley Additive exPlanations (SHAP) were used to interpret the models, and the best-performing model was used to build a web-based calculator.
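The threshold-based metrics named in the Methods (accuracy, sensitivity, specificity, F1 score, and Kappa score) can all be derived from a 2x2 confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code. The function name `binary_metrics` and the example counts are hypothetical, and the sketch assumes every denominator is non-zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Threshold-based metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives
    and true negatives. Assumes no denominator below is zero.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # recall of the positive class
    specificity = tn / (tn + fp)   # recall of the negative class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_no = ((fn + tn) / n) * ((fp + tn) / n)
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only; not data from the study.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike AUC and AUPRC, these five values depend on the probability cutoff used to turn a predicted risk into a yes/no label, so they should be read together with that cutoff.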

Results: The cohort comprised 7,463 patients with PaNETs, of whom 1,356 (18.2%) had liver metastasis at initial diagnosis. Combining the Boruta and LASSO methods identified T-stage, N-stage, tumor size, grade, surgery, lymphadenectomy, chemotherapy, and bone metastasis as independent risk factors for liver metastasis in PaNETs. Among the algorithms compared, the gradient boosting machine (GBM) model performed best, achieving an AUC of 0.937 (95% CI: 0.931-0.943), an AUPRC of 0.94, and an accuracy of 0.87. DCA and calibration curve analyses showed that the GBM model offered better clinical decision-making value and predictive performance. Furthermore, SHAP analysis showed that surgery, N-stage, and T-stage were the main factors driving the model's predictions. Finally, we built an accessible web-based calculator on the GBM model to predict the risk of liver metastasis in PaNETs.
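Two of the reported figures can be made concrete with a short sketch. The first part recomputes the prevalence from the counts given in the Results. The second shows one standard way to compute an AUC, as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. This is not the authors' code: `roc_auc` is a hypothetical helper, and the labels and scores are toy values rather than study data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Runs in O(n_pos * n_neg), which is fine for illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Prevalence of liver metastasis, from the counts reported in the Results.
prevalence = 1356 / 7463
print(f"prevalence: {prevalence:.1%}")

# Toy labels and scores for illustration only; not data from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(f"toy AUC: {roc_auc(toy_labels, toy_scores):.2f}")
```

The prevalence matters when reading the AUPRC: a classifier with no discriminative ability has an expected AUPRC roughly equal to the prevalence (about 0.18 for this cohort), whereas its expected AUC is 0.5 regardless of class balance.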

Conclusion: The GBM model outperformed the other machine learning models in predicting the risk of liver metastasis in patients with PaNETs and can support the development of personalized treatment strategies in clinical practice.

Source journal: Frontiers in Medicine (Medicine, General Medicine)
CiteScore: 5.10
Self-citation rate: 5.10%
Articles published: 3710
Review time: 12 weeks
About the journal: Frontiers in Medicine publishes rigorously peer-reviewed research linking basic research to clinical practice and patient care, as well as translating scientific advances into new therapies and diagnostic tools. Led by an Editorial Board of international experts, this multidisciplinary open-access journal disseminates scientific knowledge and discoveries to researchers, academics, clinicians, and the public worldwide. In addition to papers that link basic research and clinical practice, particular emphasis is given to studies that are directly relevant to patient care. As well as the established medical disciplines, Frontiers in Medicine is launching new sections that together will facilitate:
- the use of patient-reported outcomes under real-world conditions
- the exploitation of big data and the use of novel information and communication tools in the assessment of new medicines
- the scientific bases for guidelines and decisions from regulatory authorities
- access to medicinal products and medical devices worldwide
- addressing the grand health challenges around the world