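Construction and validation of machine learning models for predicting lymph node metastasis in cutaneous malignant melanoma: a large population-based study.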

IF 1.5 | Zone 4, Medicine | JCR Q4, ONCOLOGY
Translational cancer research Pub Date : 2025-02-28 Epub Date: 2025-02-18 DOI:10.21037/tcr-24-1672
Ling-Feng Lan, Yi-Long Kai, Xiao-Ling Xu, Jun-Kun Zhang, Guang-Bo Xu, Yan-Bi Dai, Yan Shen, Hua-Ya Lu, Ben Wang
Translational cancer research 14(2):706-716. Journal Article. Epub 2025-02-18.
DOI link: https://doi.org/10.21037/tcr-24-1672
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11912072/pdf/
Citations: 0

Abstract

Construction and validation of machine learning models for predicting lymph node metastasis in cutaneous malignant melanoma: a large population-based study.

Background: Lymph node status is essential for determining the prognosis of cutaneous malignant melanoma (CMM). This study aimed to develop a machine learning (ML) model for predicting lymph node metastases (LNM) in CMM.

Methods: We obtained data on 6,196 patients, including known clinicopathologic variables, from the Surveillance, Epidemiology, and End Results (SEER) database. We built models to predict the presence of LNM in CMM using six ML algorithms: logistic regression (LR), support vector machine (SVM), Complement Naive Bayes (CNB), Extreme Gradient Boosting (XGBoost), random forest (RF), and k-nearest neighbors (kNN). The adaptive synthetic (ADASYN) sampling method was used to address class imbalance. Model performance was assessed by average precision (AP), sensitivity, specificity, accuracy, F1 score, precision-recall curves, calibration plots, and decision curve analysis (DCA). SHapley Additive exPlanation (SHAP) analysis was used to generate visual explanations tailored to individual patients.
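The evaluation metrics named in the Methods can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names, the toy labels and scores, and the 0.5 threshold are all assumptions made for illustration. Average precision is computed as the mean of the precision values at the rank of each true positive (assuming distinct scores), and net benefit uses the usual decision-curve formula TP/n - FP/n * t/(1 - t), which presumes the scores are calibrated probabilities.

```python
def average_precision(y_true, scores):
    """Mean of precision-at-rank over all positives (assumes distinct scores)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")
    # Rank cases from highest to lowest predicted score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


def threshold_metrics(y_true, scores, threshold=0.5):
    """Confusion-matrix metrics and net benefit at one probability threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    n = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "accuracy": (tp + tn) / n,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
        # Decision-curve net benefit at threshold probability t.
        "net_benefit": tp / n - (fp / n) * threshold / (1 - threshold),
    }


# Toy data, invented for illustration only (1 = LNM present).
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.35, 0.2, 0.1]

if __name__ == "__main__":
    print("AP:", round(average_precision(toy_labels, toy_scores), 3))
    print(threshold_metrics(toy_labels, toy_scores, threshold=0.5))
```

AP summarizes ranking quality across all thresholds, while sensitivity, specificity, accuracy, F1, and net benefit each depend on the chosen cutoff, which is one reason a study with imbalanced outcomes might report both kinds of metric.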

Results: Among the 6,196 CMM cases, 19.9% (n=1,234) presented with LNM. The XGBoost model showed the best predictive performance when compared with the other algorithms (AP of 0.805). XGBoost showed that age and Breslow thickness were the two most important factors related to LNM.
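To put the reported AP in context, the short sketch below recomputes the LNM prevalence from the counts above and compares it with the AP of 0.805. The comparison rests on an assumption that is not stated in the abstract: a model that ranks patients at random has an expected AP roughly equal to the positive-class prevalence, a common rule of thumb for precision-recall analysis. The variable names are illustrative.

```python
# Counts and AP as reported in the Results.
n_total = 6196
n_lnm = 1234
reported_ap = 0.805

# Prevalence of LNM; also the approximate AP of a no-skill (random) ranker.
prevalence = n_lnm / n_total

# How many times higher the reported AP is than the no-skill baseline.
lift_over_baseline = reported_ap / prevalence

if __name__ == "__main__":
    print(f"LNM prevalence: {prevalence:.3f}")
    print(f"AP relative to no-skill baseline: {lift_over_baseline:.2f}x")
```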

Conclusions: The XGBoost model predicted LNM in CMM with high average precision (AP of 0.805). We hope this model can assist surgeons in evaluating surgical approaches and determining the extent of surgery, and can guide subsequent adjuvant therapy, thereby improving patient prognosis.

Source journal
CiteScore: 2.10
Self-citation rate: 0.00%
Articles published: 252
About the journal: Translational Cancer Research (Transl Cancer Res TCR; Print ISSN: 2218-676X; Online ISSN 2219-6803; http://tcr.amegroups.com/) is an Open Access, peer-reviewed journal, indexed in Science Citation Index Expanded (SCIE). TCR publishes laboratory studies of novel therapeutic interventions as well as clinical trials that evaluate new treatment paradigms for cancer, and results of novel research investigations that bridge the laboratory and clinical settings, including risk assessment, cellular and molecular characterization, prevention, detection, diagnosis and treatment of human cancers, with the overall goal of improving the clinical care of cancer patients. The focus of TCR is original, peer-reviewed, science-based research that successfully advances clinical medicine toward the goal of improving patients' quality of life. The editors and an international advisory group of scientists and clinician-scientists, as well as other experts, will hold TCR articles to high-quality standards. We accept Original Articles as well as Review Articles, Editorials and Brief Articles.