Progress in Development of Lung Cancer Survival Prediction Models Using Machine Learning Based on SEER Database
Authors: Ye Zhang, Jiaye Wang, Shiyu Hu, Yufen Xu, Qi Yang, Wenyu Chen
Journal: Cancer Investigation, pp. 1-12 (Journal Article, published 2025-09-29)
DOI: https://doi.org/10.1080/07357907.2025.2563716
Impact factor: 1.9; JCR: Q3 (Oncology)
The SEER (Surveillance, Epidemiology, and End Results) database, a comprehensive public repository of clinical oncology data, has been increasingly used to construct clinical models for predicting cancer prognosis. With advances in machine learning, algorithms including logistic regression (LR), support vector machines (SVM), decision trees (DT), random forest (RF), artificial neural networks (ANN), and extreme gradient boosting (XGBoost) have been successively employed to develop lung cancer survival prediction models (LCSPMs). This study reviews the progress of these machine learning algorithms in constructing LCSPMs, identifies the problems of data imbalance, poor model interpretability, and lack of external validation, and outlines directions for future development.
About the journal:
Cancer Investigation is one of the most highly regarded and recognized journals in the field of basic and clinical oncology. It is designed to give physicians a comprehensive resource on the current state of progress in the cancer field as well as a broad background of reliable information necessary for effective decision making. In addition to presenting original papers of fundamental significance, it also publishes reviews, essays, specialized presentations of controversies, considerations of new technologies and their applications to specific laboratory problems, discussions of public issues, miniseries on major topics, new and experimental drugs and therapies, and an innovative letters to the editor section. One of the unique features of the journal is its departmentalized editorial sections reporting on more than 30 subject categories covering the broad spectrum of specialized areas that together comprise the field of oncology. Edited by leading physicians and research scientists, these sections make Cancer Investigation the prime resource for clinicians seeking to make sense of the sometimes-overwhelming amount of information available throughout the field. In addition to its peer-reviewed clinical research, the journal also features translational studies that bridge the gap between the laboratory and the clinic.