Using a machine learning algorithm and clinical data to predict the risk factors of disease recurrence after adjuvant treatment of advanced-stage oral cavity cancer
Authors: Sheng-Yao Huang, Ren-Jun Hsu, Dai-Wei Liu, Wen-Lin Hsu
Journal: Tzu Chi Medical Journal
Publication date: 2024-07-08
DOI: 10.4103/tcmj.tcmj_56_24 (https://doi.org/10.4103/tcmj.tcmj_56_24)
Abstract
Head-and-neck cancer is a major cancer in Taiwan, and most patients present with advanced-stage disease at initial diagnosis. In addition to primary surgery, these patients require adjuvant therapy, including chemotherapy and radiotherapy. We used a machine learning tool to identify factors that may be associated with, and predictive of, treatment outcome.
We retrospectively reviewed 187 patients diagnosed with advanced-stage head-and-neck cancer who received surgery and adjuvant radiotherapy with or without chemotherapy. We used eXtreme Gradient Boosting (XGBoost), a gradient-boosted decision tree model, to analyze the data. Features were extracted from entries recorded in the electronic health-care system and in paper medical records. The patient data were split into training and testing datasets and labeled according to recurrence status within the 5-year follow-up. The primary endpoint was prediction of disease recurrence. Risk factors were identified by analyzing feature importance in the model. For comparison, we also identified risk factors using regression analysis.
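The workflow described above (split labeled patients, fit a gradient-boosted tree model, rank features by importance) can be sketched in a few lines. This is a minimal illustration under loud assumptions, not the authors' pipeline: it uses only the Python standard library with a toy boosted-stump learner standing in for XGBoost, the feature names are hypothetical, and the data are synthetic (only the cohort size of 187 is taken from the abstract). The importance measure here is total squared-error gain per feature, which is similar in spirit to gain-based importance in tree boosting libraries.

```python
import math
import random

# Hypothetical feature names, loosely echoing factors named in the abstract.
FEATURES = ["node_positive", "chemotherapy", "rt_interrupted"]

def make_synthetic_patients(n, seed=0):
    """Toy binary features and recurrence labels. Synthetic, NOT patient data."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x = [rng.randint(0, 1) for _ in FEATURES]
        # Assumed toy risk model: node status dominates by construction.
        p = 0.15 + 0.45 * x[0] + 0.10 * x[1] + 0.05 * x[2]
        rows.append(x)
        labels.append(1 if rng.random() < p else 0)
    return rows, labels

def stratified_split(rows, labels, test_fraction=0.2, seed=0):
    """Split so both sets keep roughly the same recurrence rate."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * test_fraction)
        test_idx += idx[:cut]
        train_idx += idx[cut:]
    def pick(ids):
        return [rows[i] for i in ids], [labels[i] for i in ids]
    return pick(train_idx), pick(test_idx)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_boosted_stumps(rows, labels, n_rounds=20, learning_rate=0.3):
    """Gradient boosting on log-loss with one-split trees (stumps).

    Returns the fitted stumps and a gain-based importance per feature.
    """
    scores = [0.0] * len(rows)   # current log-odds per patient
    stumps = []                  # (feature index, value if 0, value if 1)
    importance = {name: 0.0 for name in FEATURES}
    for _ in range(n_rounds):
        # Negative gradient of log-loss: label minus predicted probability.
        resid = [y - sigmoid(s) for y, s in zip(labels, scores)]
        total = sum(resid)
        best = None
        for j in range(len(FEATURES)):
            left = [r for x, r in zip(rows, resid) if x[j] == 0]
            right = [r for x, r in zip(rows, resid) if x[j] == 1]
            if not left or not right:
                continue
            # Reduction in squared error from splitting on feature j.
            gain = (sum(left) ** 2 / len(left) + sum(right) ** 2 / len(right)
                    - total ** 2 / len(resid))
            if best is None or gain > best[0]:
                best = (gain, j, sum(left) / len(left), sum(right) / len(right))
        if best is None:
            break
        gain, j, v0, v1 = best
        importance[FEATURES[j]] += gain
        stumps.append((j, v0, v1))
        scores = [s + learning_rate * (v1 if x[j] else v0)
                  for s, x in zip(scores, rows)]
    return stumps, importance

def predict_proba(stumps, x, learning_rate=0.3):
    """Predicted recurrence probability for one feature vector."""
    score = sum(learning_rate * (v1 if x[j] else v0) for j, v0, v1 in stumps)
    return sigmoid(score)

rows, labels = make_synthetic_patients(187)  # 187 matches the cohort size only
(train_x, train_y), (test_x, test_y) = stratified_split(rows, labels)
stumps, importance = fit_boosted_stumps(train_x, train_y)
ranked = sorted(importance, key=importance.get, reverse=True)
print("features ranked by total gain:", ranked)
```

Because the synthetic generator makes node status the strongest driver of recurrence, that feature should rank first here; the ranking says nothing about the real cohort. A real analysis would use the XGBoost library itself, which applies second-order (Newton) leaf updates and regularization that this sketch omits.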
The model achieved an accuracy of 57.89%, a sensitivity of 57.14%, and a positive predictive value of 44.44%; the area under the receiver operating characteristic curve was 0.58. Pathological lymph node status was the most important feature, followed by whether the patient received chemotherapy. Among radiotherapy-related factors, fraction size, early termination, and treatment interruption were important and might affect treatment outcome. The risk factors identified by XGBoost were consistent with those found by regression.
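For readers who want to see how these metrics relate, the sketch below computes accuracy, sensitivity, and positive predictive value from confusion-matrix counts, plus a rank-based area under the ROC curve. The abstract does not report the underlying counts; the values used here (4 true positives, 5 false positives, 3 false negatives, 7 true negatives) are an assumption, chosen only because they reproduce the reported percentages. The function names and the toy scores passed to `roc_auc` are likewise illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity (recall), and positive predictive value (precision)."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "ppv": tp / (tp + fp),
    }

def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# ASSUMED counts (not given in the abstract), consistent with the reported figures.
metrics = classification_metrics(tp=4, fp=5, fn=3, tn=7)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")

# Toy labels and scores, purely illustrative.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking, so the reported 0.58 indicates only modest discrimination, in line with the accuracy figure.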
We found that several factors were associated with treatment outcome in advanced-stage head-and-neck cancer. In the future, we hope to collect data according to the features introduced in this study and to build a stronger model to explain and predict outcomes.