Performance Comparison of 10 State-of-the-Art Machine Learning Algorithms for Outcome Prediction Modeling of Radiation-Induced Toxicity.

Impact factor: 2.2 (JCR Q3, Oncology)
Advances in Radiation Oncology. Pub Date: 2024-11-13. eCollection Date: 2025-02-01. DOI: 10.1016/j.adro.2024.101675
Ramon M Salazar, Saurabh S Nair, Alexandra O Leone, Ting Xu, Raymond P Mumme, Jack D Duryea, Brian De, Kelsey L Corrigan, Michael K Rooney, Matthew S Ning, Prajnan Das, Emma B Holliday, Zhongxing Liao, Laurence E Court, Joshua S Niedzielski
Volume 10, Issue 2, Article 101675. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11665468/pdf/
Citations: 0

Abstract

Purpose: To evaluate the efficacy of prominent machine learning algorithms in predicting normal tissue complication probability using clinical data obtained from 2 distinct disease sites and to create a software tool that facilitates the automatic determination of the optimal algorithm to model any given labeled data set.

Methods and materials: We obtained 3 sets of radiation toxicity data (478 patients) from our clinic: gastrointestinal toxicity, radiation pneumonitis, and radiation esophagitis. These data comprised clinicopathological and dosimetric information for patients diagnosed with non-small cell lung cancer and anal squamous cell carcinoma. Each data set was modeled using 11 commonly employed machine learning algorithms (elastic net, least absolute shrinkage and selection operator [LASSO], random forest, random forest regression, support vector machine, extreme gradient boosting, light gradient boosting machine, k-nearest neighbors, neural network, Bayesian-LASSO, and Bayesian neural network) by randomly dividing the data set into a training set and a test set. The training set was used to create and tune the model, and the test set was used to assess it by calculating performance metrics. This process was repeated 100 times for each algorithm and each data set. Figures were generated to visually compare the performance of the algorithms. A graphical user interface was developed to automate this whole process.
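The repeated random-split procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' software: the abstract does not give the split ratio, whether splits were stratified, or how models were tuned, so the 30% test fraction, the single-feature stand-in model `fit_mean_difference`, and the synthetic data are all hypothetical. The two metrics are written out in plain Python so the example has no dependencies.

```python
import random
import statistics


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve (average precision)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive label")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / n_pos


def roc_auc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs both classes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_mean_difference(x_train, y_train):
    """Hypothetical stand-in model: signed distance from the class-mean midpoint."""
    mu_pos = statistics.mean(x for x, y in zip(x_train, y_train) if y == 1)
    mu_neg = statistics.mean(x for x, y in zip(x_train, y_train) if y == 0)
    midpoint = (mu_pos + mu_neg) / 2
    sign = 1.0 if mu_pos >= mu_neg else -1.0
    return lambda x: sign * (x - midpoint)


def repeated_holdout(x, y, fit, n_repeats=100, test_fraction=0.3, seed=0):
    """Random train/test split repeated n_repeats times; returns (mean, sd) per metric."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    n_test = max(1, round(test_fraction * len(y)))  # assumed ratio, not from the paper
    auprc, auc = [], []
    for _ in range(n_repeats):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        y_train = [y[i] for i in train]
        y_test = [y[i] for i in test]
        if len(set(y_train)) < 2 or len(set(y_test)) < 2:
            continue  # metrics are undefined when a class is missing
        predict = fit([x[i] for i in train], y_train)
        scores = [predict(x[i]) for i in test]
        auprc.append(average_precision(y_test, scores))
        auc.append(roc_auc(y_test, scores))
    return ((statistics.mean(auprc), statistics.stdev(auprc)),
            (statistics.mean(auc), statistics.stdev(auc)))


# Synthetic data for illustration only (not patient data, not from the study).
data_rng = random.Random(42)
labels = [1 if data_rng.random() < 0.3 else 0 for _ in range(200)]
feature = [data_rng.gauss(30.0 if lab else 20.0, 5.0) for lab in labels]
auprc_summary, auc_summary = repeated_holdout(feature, labels, fit_mean_difference)
print("AUPRC mean, sd:", auprc_summary)
print("AUC mean, sd:", auc_summary)
```

Reporting each metric as mean ± standard deviation over the repeats mirrors the form of the values in the Results. In practice one would likely swap in an established library for both the models and the metrics.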

Results: LASSO achieved the highest area under the precision-recall curve for radiation esophagitis (0.807 ± 0.067), random forest for gastrointestinal toxicity (0.726 ± 0.096), and the neural network for radiation pneumonitis (0.878 ± 0.060). The corresponding area under the curve values were 0.754 ± 0.069, 0.889 ± 0.043, and 0.905 ± 0.045, respectively. The graphical user interface was used to compare all algorithms for each data set automatically. When the area under the precision-recall curve was averaged across all toxicities, Bayesian-LASSO was the best model.
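The difference between the best algorithm per data set and the best algorithm on average can be shown with a small sketch. The algorithm names, data set names, and scores below are hypothetical placeholders chosen for illustration; they are not results from the study.

```python
from statistics import mean


def best_per_dataset(scores):
    """scores[dataset][algorithm] -> metric value; returns {dataset: top algorithm}."""
    return {d: max(by_algo, key=by_algo.get) for d, by_algo in scores.items()}


def best_on_average(scores):
    """Average each algorithm's metric across data sets and return the top one."""
    algos = set.intersection(*(set(by_algo) for by_algo in scores.values()))
    averages = {a: mean(scores[d][a] for d in scores) for a in algos}
    return max(averages, key=averages.get), averages


# Hypothetical mean AUPRC values (placeholders, not from the paper).
toy_scores = {
    "dataset_1": {"algo_a": 0.80, "algo_b": 0.78, "algo_c": 0.60},
    "dataset_2": {"algo_a": 0.60, "algo_b": 0.70, "algo_c": 0.72},
    "dataset_3": {"algo_a": 0.70, "algo_b": 0.80, "algo_c": 0.85},
}

winners = best_per_dataset(toy_scores)
overall, averages = best_on_average(toy_scores)
print(winners)
print(overall, averages)
```

In this toy table, `algo_b` has the highest average without winning any single data set, which is the kind of pattern that makes a per-data-set comparison worthwhile.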

Conclusions: Our results show that no single algorithm is best for all data sets. Therefore, it is important to compare multiple algorithms when training an outcome prediction model on a new data set. The graphical user interface created for this study automatically compares the performance of these 11 algorithms for any data set.

Source journal
Advances in Radiation Oncology (Medicine: Radiology, Nuclear Medicine and Imaging)
CiteScore: 4.60
Self-citation rate: 4.30%
Articles published: 208
Review time: 98 days
Journal description: The purpose of Advances is to provide information for clinicians who use radiation therapy by publishing: Clinical trial reports and reanalyses. Basic science original reports. Manuscripts examining health services research, comparative and cost-effectiveness research, and systematic reviews. Case reports documenting unusual problems and solutions. High-quality multi- and single-institutional series, as well as other novel retrospective hypothesis-generating series. Timely critical reviews on important topics in radiation oncology, such as side effects. Articles reporting the natural history of disease and patterns of failure, particularly as they relate to treatment volume delineation. Articles on safety and quality in radiation therapy. Essays on clinical experience. Articles on practice transformation in radiation oncology, in particular: Aspects of health policy that may impact the future practice of radiation oncology. How information technology, such as data analytics and systems innovations, will change radiation oncology practice. Articles on imaging as they relate to radiation therapy treatment.