Explainable machine learning and feature interpretation to predict survival outcomes in the treatment of lung cancer
Authors: Eyachew Misganew Tegaw, Betelhem Bizuneh Asfaw
Journal: Seminars in Oncology, volume 52, issue 3, Article 152364
Published: 2025-05-24
Publication type: Journal Article
DOI: 10.1016/j.seminoncol.2025.152364
URL: https://www.sciencedirect.com/science/article/pii/S0093775425000569
Impact factor: 3.0 (JCR Q2, Oncology)
Citations: 0
Abstract
Treatment outcomes in lung cancer are highly variable, and machine learning (ML) models can provide insight into how clinical and biochemical factors influence survival across different treatments. This study investigates patient survival after four major lung cancer treatments by using SHapley Additive exPlanations (SHAP) to interpret the impact of biomarkers on survival. We analyzed 23,658 lung cancer patient records derived from a Kaggle dataset. Using the most relevant clinical and biochemical variables, ML models were employed to study survival outcomes for the different treatments, and SHAP analysis identified the major survival predictors for each treatment. Survival outcomes are visualized in SHAP waterfall plots as f(x) (predicted survival) and E[f(x)] (baseline expectation). The best-performing model was Gradient Boosting, with an accuracy of 88.99%, precision of 89.06%, recall of 88.99%, F1-score of 88.91%, and an area under the receiver operating characteristic curve (AUC-ROC) of 0.9332. Chemotherapy contributed positively to survival; the key factors were phosphorus levels (+0.05), low alanine aminotransferase levels (+0.04), and low glucose levels (+0.04). Targeted therapy and radiation were associated with worse survival, while surgery was favorable, especially in cases with high white blood cell and lactate dehydrogenase (LDH) levels. SHAP-based ML analysis thus highlights how clinical and biochemical factors influence survival, and it suggests that ML-driven interpretability might support personalized treatment approaches in lung cancer.
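The relationship between f(x) and E[f(x)] in a waterfall plot can be illustrated with a small worked example. The sketch below is not the authors' pipeline: it uses a hypothetical toy scoring function, made-up feature values, and a four-row background set, and it computes exact Shapley values by brute force instead of calling the SHAP library. It shows only the additivity property that a waterfall plot relies on: the baseline E[f(x)] plus the per-feature contributions equals the prediction f(x).

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names and coefficients, chosen for illustration only.
FEATURES = ["phosphorus", "alt", "glucose"]


def model(x):
    """Toy score with linear terms and one interaction (not the paper's model)."""
    return (0.5 + 0.05 * x["phosphorus"] - 0.04 * x["alt"]
            - 0.04 * x["glucose"] * x["alt"])


# Made-up background rows used to estimate the baseline E[f(x)].
BACKGROUND = [
    {"phosphorus": 0.0, "alt": 1.0, "glucose": 1.0},
    {"phosphorus": 1.0, "alt": 0.0, "glucose": 1.0},
    {"phosphorus": 1.0, "alt": 1.0, "glucose": 0.0},
    {"phosphorus": 0.0, "alt": 0.0, "glucose": 0.0},
]


def value(subset, x):
    """Mean model output with features in `subset` fixed to x's values
    and the remaining features taken from each background row."""
    total = 0.0
    for row in BACKGROUND:
        mixed = {f: (x[f] if f in subset else row[f]) for f in FEATURES}
        total += model(mixed)
    return total / len(BACKGROUND)


def shapley_values(x):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        contrib = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                contrib += weight * (value(set(s) | {f}, x) - value(set(s), x))
        phi[f] = contrib
    return phi


patient = {"phosphorus": 1.0, "alt": 0.0, "glucose": 0.0}  # hypothetical patient
base_value = value(set(), patient)  # E[f(x)]: mean output over the background
phi = shapley_values(patient)       # per-feature contributions
prediction = model(patient)         # f(x)

print(f"E[f(x)] = {base_value:.4f}")
for name, contribution in phi.items():
    print(f"  {name:>10}: {contribution:+.4f}")
print(f"f(x)    = {prediction:.4f}")
```

Brute-force enumeration is practical only for a handful of features. For tree ensembles such as Gradient Boosting, SHAP implementations use faster specialized algorithms, but the additive reading of the resulting plot is the same.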
About the journal:
Seminars in Oncology brings you current, authoritative, and practical reviews of developments in the etiology, diagnosis, and management of cancer. Each issue examines topics of clinical importance, with an emphasis on providing both the basic knowledge needed to better understand a topic and evidence-based opinions from leaders in the field. Seminars in Oncology also seeks to be a venue for sharing a diversity of opinions, including those that might be considered "outside the box". We welcome a healthy and respectful exchange of opinions and urge you to approach us with your insights as well as suggestions of topics that you deem worthy of coverage. By helping the reader understand the basic biology and the therapy of cancer as they learn the nuances from experts, all in a journal that encourages the exchange of ideas, we aim to help move the treatment of cancer forward.