A scoping methodological review of simulation studies comparing statistical and machine learning approaches to risk prediction for time-to-event data.

Diagnostic and Prognostic Research (impact factor 2.6). Published 2022-06-02. DOI: 10.1186/s41512-022-00124-y

Hayley Smith, Michael Sweeting, Tim Morris, Michael J Crowther



Abstract

Background: There is substantial interest in the adaptation and application of so-called machine learning approaches to prognostic modelling of censored time-to-event data. These methods must be compared and evaluated against existing methods in a variety of scenarios to determine their predictive performance. A scoping review of how machine learning methods have been compared to traditional survival models is important to identify the comparisons that have been made and issues where they are lacking, biased towards one approach or misleading.

Methods: We conducted a scoping review of research articles published between 1 January 2000 and 2 December 2020 using PubMed. Eligible articles were those that used simulation studies to compare statistical and machine learning methods for risk prediction with a time-to-event outcome in a medical/healthcare setting. We focus on data-generating mechanisms (DGMs), the methods that have been compared, the estimands of the simulation studies, and the performance measures used to evaluate them.

Results: A total of ten articles were identified as eligible for the review. Six of the articles evaluated a method that was developed by the authors, four of which were machine learning methods, and the results almost always stated that the authors' method performed as well as or better than the other methods compared. Comparisons were often biased towards the novel approach: the majority compared it only against a basic Cox proportional hazards model, and did so in scenarios where the Cox model would clearly not perform well. In many of the articles reviewed, key information was unclear, such as the number of simulation repetitions and how performance measures were calculated.

Conclusion: It is vital that method comparisons are unbiased and comprehensive, and this should be the goal even if realising it is difficult. Fully assessing how newly developed methods perform and how they compare to a variety of traditional statistical methods for prognostic modelling is imperative as these methods are already being applied in clinical contexts. Evaluations of the performance and usefulness of recently developed methods for risk prediction should be continued and reporting standards improved as these methods become increasingly popular.
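The components the review examines (a data-generating mechanism, a performance measure, and a stated number of repetitions) can be illustrated with a minimal sketch. Everything specific here is an assumption chosen for illustration and is not taken from any reviewed article: the exponential proportional hazards DGM with one standard-normal covariate, independent exponential censoring, Harrell's concordance index as the performance measure, and all parameter values. No model is fitted; the true covariate is used directly as the risk score, so the sketch shows only the reporting structure of a simulation study.

```python
import math
import random
import statistics


def simulate_dataset(n, beta, baseline_rate, censor_rate, rng):
    """One draw from an assumed DGM: exponential proportional hazards with a
    single standard-normal covariate and independent exponential censoring."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        event_time = rng.expovariate(baseline_rate * math.exp(beta * x))
        censor_time = rng.expovariate(censor_rate)
        # Store (observed time, event indicator, covariate).
        rows.append((min(event_time, censor_time), event_time <= censor_time, x))
    return rows


def harrell_c(rows, risk_score):
    """Harrell's concordance index for right-censored data.
    A pair is usable when the shorter observed time is an event."""
    concordant = 0.0
    usable = 0
    for t_i, d_i, x_i in rows:
        if not d_i:
            continue
        for t_j, _, x_j in rows:
            if t_i < t_j:
                usable += 1
                r_i, r_j = risk_score(x_i), risk_score(x_j)
                if r_i > r_j:
                    concordant += 1.0  # higher risk failed first
                elif r_i == r_j:
                    concordant += 0.5  # tied risk scores count half
    if usable == 0:
        raise ValueError("no usable pairs (no events observed)")
    return concordant / usable


def run_simulation(n_rep=200, n=100, seed=2022):
    """Repeat the DGM n_rep times and report the mean C-index together with
    its Monte Carlo standard error and the number of repetitions."""
    rng = random.Random(seed)
    estimates = [
        harrell_c(simulate_dataset(n, 0.7, 0.1, 0.05, rng), lambda x: x)
        for _ in range(n_rep)
    ]
    mean_c = statistics.mean(estimates)
    mcse = statistics.stdev(estimates) / math.sqrt(n_rep)
    return mean_c, mcse, n_rep


if __name__ == "__main__":
    mean_c, mcse, n_rep = run_simulation()
    print(f"C-index: {mean_c:.3f} (Monte Carlo SE {mcse:.4f}, {n_rep} repetitions)")
```

Reporting the number of repetitions and the Monte Carlo standard error alongside each performance measure addresses the kind of missing information noted in the Results.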

