Measuring the Performance of Survival Models to Personalize Treatment Choices.

IF 1.8 | Zone 4, Medicine | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Statistics in Medicine Pub Date : 2025-03-30 DOI:10.1002/sim.70050

Orestis Efthimiou, Jeroen Hoogland, Thomas P A Debray, Valerie Aponte Ribero, Wilma Knol, Huiberdina L Koek, Matthias Schwenkglenks, Séverine Henrard, Matthias Egger, Nicolas Rodondi, Ian R White

Statistics in Medicine, volume 44, issue 7, e70050. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11983264/pdf/

Citations: 0

Abstract

Various statistical and machine learning algorithms can be used to predict treatment effects at the patient level using data from randomized clinical trials (RCTs). Such predictions can facilitate individualized treatment decisions. Recently, a range of methods and metrics has been developed for assessing the accuracy of such predictions. Here, we extend these methods, focusing on the case of survival (time-to-event) outcomes. We start by providing alternative definitions of the participant-level treatment benefit; subsequently, we summarize existing measures and propose new ones for assessing the performance of models estimating participant-level treatment benefits. We explore metrics assessing discrimination and calibration for benefit and decision accuracy. These measures can be used to assess the performance of statistical as well as machine learning models and can be useful during model development (i.e., for model selection or for internal validation) or when testing a model in new settings (i.e., in an external validation). We illustrate the methods using simulated data and real data from the OPERAM trial, an RCT in multimorbid older people, which randomized participants to either standard care or a pharmacotherapy optimization intervention. We provide R code for implementing all models and measures.
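To make the idea of a participant-level benefit for a survival outcome concrete, the following is a minimal Python sketch, not the authors' R code and not necessarily one of the definitions used in the paper. It assumes one possible definition (the difference in survival probability at a fixed time horizon between treatment and control), a constant-hazard model for the prediction, and a Kaplan-Meier comparison between arms as the "observed" benefit within a group of participants. All function names and the toy data are hypothetical.

```python
import math


def km_survival(times, events, horizon):
    """Kaplan-Meier estimate of the survival probability at `horizon`.

    `times` are follow-up times; `events` are 1 for an event, 0 for censoring.
    """
    # Distinct event times up to the horizon, in increasing order.
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    surv = 1.0
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, e in zip(times, events) if e and u == t)
        surv *= 1.0 - n_events / at_risk
    return surv


def predicted_benefit(hazard_control, hazard_treated, horizon):
    """Assumed benefit definition: S_treated(horizon) - S_control(horizon),
    with survival computed under a constant (exponential) hazard in each arm."""
    return math.exp(-hazard_treated * horizon) - math.exp(-hazard_control * horizon)


def _select_arm(times, events, treated, flag):
    """Return the (times, events) of participants whose arm indicator equals `flag`."""
    rows = [(t, e) for t, e, a in zip(times, events, treated) if a == flag]
    return [t for t, _ in rows], [e for _, e in rows]


def observed_benefit(times, events, treated, horizon):
    """Observed benefit in a group: Kaplan-Meier survival at `horizon`
    in the treated arm minus that in the control arm."""
    t1, e1 = _select_arm(times, events, treated, 1)
    t0, e0 = _select_arm(times, events, treated, 0)
    return km_survival(t1, e1, horizon) - km_survival(t0, e0, horizon)


# Toy data (hypothetical): four treated and four control participants.
times = [2, 4, 5, 6, 1, 2, 3, 6]
events = [0, 1, 0, 0, 1, 1, 1, 0]
treated = [1, 1, 1, 1, 0, 0, 0, 0]

pred = predicted_benefit(hazard_control=0.2, hazard_treated=0.1, horizon=5.0)
obs = observed_benefit(times, events, treated, horizon=5.0)
```

A calibration-for-benefit check in this spirit would group participants by predicted benefit (for example, into quantiles) and compare the mean `pred` with `obs` in each group; with randomized arms, the between-arm comparison within a group is what licenses reading `obs` as a treatment effect.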


Source journal: Statistics in Medicine (Medicine: Public, Environmental & Occupational Health)

CiteScore: 3.40

Self-citation rate: 10.00%

Articles published: 334

Review time: 2-4 weeks

About the journal: The journal aims to influence practice in medicine and its associated sciences through the publication of papers on statistical and other quantitative methods. Papers will explain new methods and demonstrate their application, preferably through a substantive, real, motivating example or a comprehensive evaluation based on an illustrative example. Alternatively, papers will report on case studies where creative use or technical generalizations of established methodology are directed towards a substantive application. Reviews of, and tutorials on, general topics relevant to the application of statistics to medicine will also be published. The main criteria for publication are appropriateness of the statistical methods to a particular medical problem and clarity of exposition. Papers with primarily mathematical content will be excluded. The journal aims to enhance communication between statisticians, clinicians and medical researchers.