MoLPre: A Machine Learning Model to Predict Metastasis of cT1 Solid Lung Cancer

IF 3.1 | CAS Zone 3 (Medicine) | JCR Q2 (Medicine, Research & Experimental)
Jie Lan, Heng Wang, Jing Huang, Weiyi Li, Min Ao, Wanfeng Zhang, Junhao Mu, Li Yang, Longke Ran
Clinical and Translational Science, vol. 18, no. 4. Published 2025-03-26. DOI: 10.1111/cts.70186. Full text: https://onlinelibrary.wiley.com/doi/10.1111/cts.70186
Citations: 0

Abstract

More than 20% of patients with cT1 solid NSCLC show nodal or extrathoracic metastasis, so detecting metastasis early is important for therapeutic planning and patient risk stratification in clinical practice. This study collected clinicopathological variables from the pulmonary nodule and lung cancer database of the First Affiliated Hospital of Chongqing Medical University, covering patients with early-stage (cT1) solitary lung cancer evaluated from November 2018 to October 2022. For feature selection, a random forest model and Shapley Additive Explanations (SHAP) were used to assess the importance of clinical features; on that basis, 9 features were retained for the prediction model. Random Forest, Gradient Boosting, and AdaBoost classifiers were then trained, and their predictive discrimination was compared using receiver operating characteristic (ROC) and precision-recall curves. In the internal validation dataset, the Random Forest model achieved an average precision of 0.93 and an area under the curve (AUC) of 0.92 (95% CI: 0.88–0.94), whereas the Gradient Boosting and AdaBoost classifiers achieved average precisions of 0.87 and 0.91 with AUCs of 0.87 (95% CI: 0.84–0.93) and 0.90 (95% CI: 0.86–0.92), respectively. The Random Forest classifier also performed best on 5 other diagnostic indices. The model is embedded in a web application, MoLPre (https://molpre.cqmu.edu.cn/), a user-friendly tool that assists in predicting metastasis of cT1 solid lung cancer.
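
The abstract compares models by ROC AUC and average precision, each reported with a 95% confidence interval. As a rough illustration of what those numbers measure, the sketch below computes both metrics from scratch and wraps them in a percentile bootstrap. It is not the authors' code: the labels and scores are invented toy data, the helper names (`roc_auc`, `average_precision`, `bootstrap_ci`) are hypothetical, and the percentile bootstrap is an assumption, since the abstract does not say how the intervals were obtained.

```python
import random


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive case outscores a random
    negative case (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values at each rank where a positive case
    is retrieved. Assumes no tied scores."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("no positive cases")
    return total / hits


def bootstrap_ci(metric, y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; draw again
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data (invented): 1 = metastasis, 0 = no metastasis.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

auc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
auc_ci = bootstrap_ci(roc_auc, y_true, scores)
print(f"AUC={auc:.2f}, AP={ap:.2f}, 95% CI for AUC={auc_ci}")
```

With only eight toy cases the interval is very wide; intervals as narrow as those in the abstract require a validation set of realistic size.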

Source journal: Clinical and Translational Science (category: Medicine, Research & Experimental)
CiteScore: 6.70
Self-citation rate: 2.60%
Articles published: 234
Review time: 6-12 weeks
About the journal: Clinical and Translational Science (CTS), an official journal of the American Society for Clinical Pharmacology and Therapeutics, highlights original translational medicine research that helps bridge laboratory discoveries with the diagnosis and treatment of human disease. Translational medicine is a multi-faceted discipline with a focus on translational therapeutics. In a broad sense, translational medicine bridges across the discovery, development, regulation, and utilization spectrum. Research may appear as Full Articles, Brief Reports, Commentaries, Phase Forwards (clinical trials), Reviews, or Tutorials. CTS also includes invited didactic content that covers the connections between clinical pharmacology and translational medicine. Best-in-class methodologies and best practices are also welcomed as Tutorials. These additional features provide context for research articles and facilitate understanding for a wide array of individuals interested in clinical and translational science. CTS welcomes high-quality, scientifically sound, original manuscripts focused on clinical pharmacology and translational science, including animal, in vitro, in silico, and clinical studies supporting the breadth of drug discovery, development, regulation, and clinical use of both traditional drugs and innovative modalities.