Development and validation of a machine learning-based model to predict postoperative overall survival in patients with soft tissue sarcoma: a retrospective cohort study.
Authors: Xu Liu, Jin Yuan, Xinfeng Wang, Shengji Yu
Journal: American Journal of Cancer Research, 14(10), pp. 4731-4746. Published 2024-10-15.
DOI: 10.62347/ZQVY3877 (https://doi.org/10.62347/ZQVY3877)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11560808/pdf/
Citation count: 0
Abstract
Background: This study aimed to develop a machine learning-based model with superior overall performance for predicting postoperative overall survival (OS) in patients with soft tissue sarcoma (STS).
Methods: This analysis used data from the SEER database spanning 2010-2020, alongside an STS cohort from the National Cancer Center. Machine learning methods were applied both for predictor selection, using wrapper methods, and for development of the predictive model. The optimal model was determined using the concordance index (C-index), time-dependent calibration curves, time-dependent receiver operating characteristic (ROC) curves, and decision curve analysis (DCA).
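For readers unfamiliar with the C-index named above, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration only, not the authors' implementation: the function name `harrell_c_index` and the toy data are assumptions, and pairs with tied follow-up times are simply skipped for brevity.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times:       observed follow-up time per patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter follow-up ended in an
        # observed event. Tied times are skipped (a simplification).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the second one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.4, 0.1, 0.6]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
print(toy_c)  # 3 of 4 comparable pairs are concordant -> 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.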
Results: Six machine learning learners identified six feature subsets. The six feature subsets were then combined with the six learners, yielding 36 prognostic models. The CAM model, which showed the highest predictive performance, was selected. It achieved a C-index of 0.849 (95% CI 0.837-0.859) in the training cohort and 0.837 (95% CI 0.809-0.871) in the validation cohort. Furthermore, time-dependent calibration curves, time-dependent ROC curves, and DCA indicated that the CAM has excellent calibration, predictive accuracy, and clinical net benefit. A publicly accessible web tool was developed for the CAM. Notably, the CAM's performance exceeded that of all existing STS prognostic nomograms and prediction models.
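The count of 36 models follows from crossing every feature subset with every learner (6 x 6). A minimal sketch of that enumeration and of selecting the best candidate by a score such as the C-index is given below. The learner names, the `build_model_grid` and `select_best` helpers, and the scoring function are all hypothetical placeholders; the abstract does not name the learners or describe the authors' code.

```python
from itertools import product


def build_model_grid(feature_subsets, learners):
    """Return every (feature subset, learner) pairing as a candidate model."""
    return list(product(feature_subsets, learners))


def select_best(grid, score_fn):
    """Pick the candidate with the highest score (e.g. validation C-index)."""
    return max(grid, key=lambda candidate: score_fn(*candidate))


# Hypothetical placeholder names; the abstract does not list the learners.
learners = [f"learner_{k}" for k in range(1, 7)]
# One feature subset per learner, as produced by wrapper-based selection.
feature_subsets = [f"subset_from_{name}" for name in learners]

grid = build_model_grid(feature_subsets, learners)
print(len(grid))  # 36 candidate models


def dummy_score(subset, learner):
    """Stand-in for fitting a model and computing its C-index (assumption)."""
    return 1.0 if (subset, learner) == ("subset_from_learner_3", "learner_5") else 0.5


best = select_best(grid, dummy_score)
print(best)
```

In practice `dummy_score` would be replaced by a routine that trains the learner on the chosen features and evaluates it on held-out data.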
Conclusions: The CAM has the potential to predict postoperative OS in STS patients. This can assist clinicians in assessing disease severity, planning patient follow-up, and formulating adjuvant treatment strategies.
About the journal:
The American Journal of Cancer Research (AJCR) (ISSN 2156-6976) is an independent, open-access, online-only journal that facilitates rapid dissemination of novel discoveries in the basic science and treatment of cancer. It was founded by a group of cancer research scientists and clinical academic oncologists from around the world who are devoted to advancing the understanding of cancer and its treatment. The scope of AJCR encompasses multi-disciplinary research from any scientific discipline where the primary focus is to increase and integrate knowledge about the etiology and molecular mechanisms of carcinogenesis, with the ultimate aim of advancing the cure and prevention of this increasingly devastating disease. To achieve these aims, AJCR publishes review articles, original articles, and new techniques in cancer research and therapy. It also publishes hypotheses, case reports, and letters to the editor. Unlike most other open-access online journals, AJCR keeps most of the traditional features of print journals, such as continuous volume and issue numbers and continuous page numbers, to retain the familiar feel of an academic journal.