Explainable machine learning and feature interpretation to predict survival outcomes in the treatment of lung cancer

IF 3.0 | Zone 3, Medicine | JCR Q2, Oncology
Eyachew Misganew Tegaw, Betelhem Bizuneh Asfaw
Seminars in Oncology, vol. 52 (3), Article 152364. Published 2025-05-24. Journal Article.
DOI: 10.1016/j.seminoncol.2025.152364
https://www.sciencedirect.com/science/article/pii/S0093775425000569
Citations: 0

Abstract

The treatment outcomes of lung cancer are highly variable, and machine learning (ML) models can show how clinical and biochemical factors influence survival across different treatments. This study investigates patient survival after four major treatments for lung cancer, using SHapley Additive exPlanations (SHAP) to interpret the impact of biomarkers on survival. We analyzed 23,658 lung cancer patient records derived from a Kaggle dataset. Using the most relevant clinical and biochemical variables, ML models were employed to study survival outcomes for the different treatments. SHAP analysis revealed the major survival predictors for each treatment. Survival outcomes are visualized as f(x) (predicted survival) and E[f(x)] (baseline expectation) in SHAP waterfall plots. The best-performing model was Gradient Boosting, with an accuracy of 88.99%, precision of 89.06%, recall of 88.99%, F1-score of 88.91%, and an area under the receiver operating characteristic curve (AUC-ROC) of 0.9332. Chemotherapy had a positive effect on survival; the key contributors were phosphorus levels (+0.05), low alanine aminotransferase levels (+0.04), and low glucose levels (+0.04). Targeted therapy and radiation were associated with worse survival, while surgery was favorable, especially in cases with high white blood cell and lactate dehydrogenase (LDH) levels. SHAP-based ML analysis highlights how clinical and biochemical factors influence the survival rate, and it indicates that ML-driven interpretability might support personalized treatment approaches in lung cancer.
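The relationship between f(x) and E[f(x)] that a SHAP waterfall plot displays can be illustrated with a minimal sketch. It computes exact Shapley values by brute force for a toy scoring function and checks that the baseline plus the per-feature contributions adds up to the prediction. Everything here is an assumption made for illustration: the `model` function, the `BACKGROUND` rows, the feature scaling, and the `patient` values are hypothetical, and neither the study's Gradient Boosting model nor the `shap` library is used.

```python
from itertools import combinations
from math import factorial

# Hypothetical background rows: (phosphorus, alt, glucose) on an arbitrary 0-1 scale.
BACKGROUND = [
    (0.2, 0.9, 0.8),
    (0.5, 0.4, 0.6),
    (0.8, 0.3, 0.2),
    (0.4, 0.7, 0.5),
]
FEATURES = ("phosphorus", "alt", "glucose")


def model(row):
    """Hypothetical survival score; a stand-in, not the paper's model."""
    phosphorus, alt, glucose = row
    return 0.5 + 0.3 * phosphorus - 0.2 * alt - 0.25 * glucose * alt


def value(subset, x):
    """Mean model output when features in `subset` are fixed to x's values
    and the remaining features are taken from each background row."""
    total = 0.0
    for bg in BACKGROUND:
        mixed = tuple(x[i] if i in subset else bg[i] for i in range(len(x)))
        total += model(mixed)
    return total / len(BACKGROUND)


def shapley_values(x):
    """Exact Shapley values by enumerating every feature subset."""
    n = len(x)
    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}, x) - value(s, x))
        phis.append(phi)
    return phis


patient = (0.9, 0.2, 0.3)          # hypothetical patient
baseline = value(set(), patient)   # E[f(x)]: mean prediction over the background
prediction = model(patient)        # f(x): prediction for this patient
phis = shapley_values(patient)

print(f"E[f(x)] = {baseline:.4f}")
for name, phi in zip(FEATURES, phis):
    print(f"{name:>10}: {phi:+.4f}")
print(f"f(x)    = {prediction:.4f}")
```

A waterfall plot draws the same decomposition: it starts at E[f(x)], stacks each feature's signed contribution, and ends at f(x). Brute-force enumeration is only practical for a handful of features; tree-based models are usually explained with faster, specialized algorithms.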
Source journal
Seminars in Oncology (Medicine: Oncology)
CiteScore: 6.60
Self-citation rate: 0.00%
Articles published: 58
Review time: 104 days
About the journal: Seminars in Oncology brings you current, authoritative, and practical reviews of developments in the etiology, diagnosis and management of cancer. Each issue examines topics of clinical importance, with an emphasis on providing both the basic knowledge needed to better understand a topic as well as evidence-based opinions from leaders in the field. Seminars in Oncology also seeks to be a venue for sharing a diversity of opinions including those that might be considered "outside the box". We welcome a healthy and respectful exchange of opinions and urge you to approach us with your insights as well as suggestions of topics that you deem worthy of coverage. By helping the reader understand the basic biology and the therapy of cancer as they learn the nuances from experts, all in a journal that encourages the exchange of ideas, we aim to help move the treatment of cancer forward.