Methods for Integrating Trials and Non-experimental Data to Examine Treatment Effect Heterogeneity

IF 3.4 | Zone 1 (Mathematics) | Q1 STATISTICS & PROBABILITY

Statistical Science Pub Date : 2023-11-01 DOI:10.1214/23-sts890

Carly Lupton Brantner, Ting-Hsuan Chang, Trang Quynh Nguyen, Hwanhee Hong, Leon Di Stefano, Elizabeth A. Stuart

Citations: 5

Abstract

Estimating treatment effects conditional on observed covariates can improve the ability to tailor treatments to particular individuals. Doing so effectively requires addressing potential confounding and having enough data to estimate effect moderation adequately. A recent influx of work has examined the estimation of treatment effect heterogeneity using data from multiple randomized controlled trials and/or observational datasets. With many new methods available for assessing treatment effect heterogeneity across multiple studies, it is important to understand which methods suit which settings, how the methods compare with one another, and what is needed to continue progress in this field. This paper reviews these methods by data setting: aggregate-level data, federated learning, and individual participant-level data. We define the conditional average treatment effect, discuss differences between parametric and nonparametric estimators, and list key assumptions, both those required within a single study and those necessary for data combination. After describing existing approaches, we compare and contrast them and identify open areas for future research. This review demonstrates that there are many possible approaches for estimating treatment effect heterogeneity by combining datasets, but that substantial work remains to compare these methods through case studies and simulations, extend them to different settings, and refine them to account for various challenges present in real data.
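
To make the central quantity concrete: the conditional average treatment effect is commonly written as tau(x) = E[Y(1) - Y(0) | X = x], where Y(1) and Y(0) are potential outcomes under treatment and control. The sketch below is not taken from the paper. It is a toy illustration, under an assumed data-generating model with one binary covariate, of a simple nonparametric estimator in a single randomized trial: the difference in arm means within each covariate stratum. The function names, effect sizes, and sample size are all hypothetical.

```python
import random
from statistics import mean


def simulate_trial(n, seed=0):
    """Simulate a two-arm randomized trial with one binary covariate x.

    Assumed (hypothetical) model: the true treatment effect is 1.0
    when x == 0 and 3.0 when x == 1.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)       # binary effect moderator
        a = rng.randint(0, 1)       # randomized treatment assignment
        effect = 1.0 + 2.0 * x      # true CATE at this value of x
        y = 0.5 * x + effect * a + rng.gauss(0.0, 1.0)
        rows.append((x, a, y))
    return rows


def cate_by_stratum(rows):
    """Estimate the CATE at each covariate value as a difference in arm means."""
    estimates = {}
    for x in sorted({r[0] for r in rows}):
        treated = [y for (xi, a, y) in rows if xi == x and a == 1]
        control = [y for (xi, a, y) in rows if xi == x and a == 0]
        estimates[x] = mean(treated) - mean(control)
    return estimates


rows = simulate_trial(20000)
cate_hat = cate_by_stratum(rows)
for x, tau in cate_hat.items():
    print(f"estimated CATE at x={x}: {tau:.2f}")
```

Because treatment is randomized here, the within-stratum difference in means needs no confounding adjustment. With observational data, or when several studies are combined, that simplification would no longer hold, and the additional assumptions the abstract mentions would come into play.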

Source journal

Statistical Science (Mathematics: Statistics & Probability)

CiteScore

6.50

Self-citation rate

1.80%

Publication volume

Review time

>12 weeks

About the journal: The central purpose of Statistical Science is to convey the richness, breadth and unity of the field by presenting the full range of contemporary statistical thought at a moderate technical level, accessible to the wide community of practitioners, researchers and students of statistics and probability.