Machine learning and deep learning to improve overall survival prediction in cervical cancer patients
Nan Jiang, Xing Xiong, Xue Chen, Mengmeng Feng, Yan Guo, Chunhong Hu
Translational Cancer Research, 14(5): 3057-3068. Published 2025-05-30 (Epub 2025-05-26). DOI: https://doi.org/10.21037/tcr-2024-2304
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12170117/pdf/
Abstract
Background: Cervical cancer (CC) is one of the most common gynecological malignancies. Previous studies have shown that the prognosis of CC is affected by many factors. Our study aimed to identify key prognostic factors and use machine learning and deep learning algorithms to construct models to predict the overall survival (OS) of CC patients.
Methods: Data on CC patients from 2007 to 2016 were obtained from the Surveillance, Epidemiology, and End Results (SEER) database and randomly divided into a training set (1,743 patients) and a test set (747 patients). To enhance the practical applicability of the model, we conducted an X-tile analysis to categorize the patients into three distinct strata based on their age and tumor size. Least absolute shrinkage and selection operator (LASSO) and multivariate Cox regression were performed to identify the independent prognostic factors for OS, which were then used to construct CoxBoost, RandomForest, SuperPC, XGBoost, and DeepSurv survival models to predict 1-, 3-, and 5-year OS.
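The random split described above can be sketched as follows. The reported set sizes (1,743 and 747) add up to 2,490 patients, which corresponds to a 70/30 partition. This is a minimal illustration only: the function name, the seed, and the integer placeholder IDs are assumptions, and the abstract does not state how the authors performed the randomization.

```python
import random


def split_train_test(ids, n_train, seed=0):
    """Randomly partition ids into a training list and a test list.

    The seed is an arbitrary choice for reproducibility (an assumption,
    not something reported in the abstract).
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder patient IDs standing in for the 2,490 records; not SEER data.
patient_ids = list(range(1743 + 747))
train_ids, test_ids = split_train_test(patient_ids, n_train=1743)

print(len(train_ids), len(test_ids))
print(round(len(train_ids) / len(patient_ids), 2))  # training fraction
```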
Results: Age, marital status, grade, tumor size, surgery, radiation, race, American Joint Committee on Cancer (AJCC) stage, AJCC_T, and AJCC_M were associated with survival and were incorporated into the five models. The concordance index (C-index) values were 0.858, 0.848, 0.849, 0.840, and 0.869 for CoxBoost, RandomForest, SuperPC, XGBoost, and DeepSurv, respectively, and the receiver operating characteristic (ROC) curves showed exceptional predictive performance. Among the five models, DeepSurv performed best. ROC analysis yielded area under the curve (AUC) values of 0.936, 0.915, and 0.900 for 1-year, 3-year, and 5-year OS, respectively.
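For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index on toy data. It is not the authors' code: the function name and the toy values are hypothetical, and pairs with tied survival times are simply skipped here, whereas survival libraries handle ties in more refined ways.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs that the model ranks correctly.

    times       -- observed follow-up time per patient
    events      -- 1 if death was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: tied times are skipped
        # Let a be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death was given the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from SEER): four patients, one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.6, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risks)
print(toy_c)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported range of 0.840 to 0.869.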
Conclusions: The prognostic model constructed with the DeepSurv algorithm, together with the independent prognostic factors, can potentially be applied in making personalized treatment plans and evaluating the prognosis of CC patients.
About the journal:
Translational Cancer Research (Transl Cancer Res; TCR; Print ISSN: 2218-676X; Online ISSN: 2219-6803; http://tcr.amegroups.com/) is an Open Access, peer-reviewed journal indexed in Science Citation Index Expanded (SCIE). TCR publishes laboratory studies of novel therapeutic interventions, clinical trials that evaluate new treatment paradigms for cancer, and results of novel research investigations that bridge the laboratory and clinical settings, including risk assessment, cellular and molecular characterization, prevention, detection, diagnosis, and treatment of human cancers, with the overall goal of improving the clinical care of cancer patients. The focus of TCR is original, peer-reviewed, science-based research that successfully advances clinical medicine toward the goal of improving patients' quality of life. The editors and an international advisory group of scientists, clinician-scientists, and other experts hold TCR articles to high-quality standards. The journal accepts Original Articles as well as Review Articles, Editorials, and Brief Articles.