Machine-learning approaches to predict individualized treatment effect using a randomized controlled trial

IF 7.7 · Zone 1 (Medicine) · Q1 Public, Environmental & Occupational Health

European Journal of Epidemiology Pub Date : 2025-02-13 DOI:10.1007/s10654-024-01185-7

Rikuta Hamaya, Konan Hara, JoAnn E. Manson, Eric B. Rimm, Frank M. Sacks, Qiaochu Xue, Lu Qi, Nancy R. Cook


Citations: 0

Abstract

Recent advancements in machine learning (ML) for analyzing heterogeneous treatment effects (HTE) are gaining prominence within the medical and epidemiological communities, offering potential breakthroughs in precision medicine by enabling the prediction of individual responses to treatments. This paper introduces the methodological frameworks used to study HTE, particularly based on a single randomized controlled trial (RCT). We focus on methods to estimate the conditional average treatment effect (CATE) for multiple covariates, aiming to predict individualized treatment effects. We explore a range of methodologies, from basic frameworks like the T-learner, S-learner, and Causal Forest to more advanced ones such as the DR-learner and R-learner, as well as cross-validation for CATE estimation to enhance statistical efficiency by estimating CATE for all RCT participants. We also provide a practical application of these approaches using the Preventing Overweight Using Novel Dietary Strategies (POUNDS Lost) trial, which compared the effects of high- versus low-fat diet interventions on 2-year weight changes. We compared different sets of covariates for CATE estimation, showing that the DR- and R-learners are useful for the estimation of CATE in high-dimensional settings. This paper aims to explain the theoretical underpinnings and methodological nuances of ML-based HTE analysis without relying on technical jargon, making these concepts more accessible to the clinical and epidemiological research communities.
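
To make two of the estimators named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simulated 1:1 RCT with a single binary covariate (the data, the effect sizes, and all function names are hypothetical and unrelated to POUNDS Lost), and it uses stratum means as a deliberately simple base learner where the paper would use ML models. The T-learner fits one outcome model per arm and subtracts them; the DR-learner builds a doubly robust pseudo-outcome with two-fold cross-fitting and the known randomization probability of 0.5, then averages it within each covariate level.

```python
import random
from statistics import fmean


def simulate_rct(n, seed=0):
    """Simulate a 1:1 RCT (hypothetical data): binary covariate x,
    true CATE of -1 when x == 0 and -3 when x == 1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)
        t = rng.randint(0, 1)  # randomized assignment, propensity 0.5
        tau = -1.0 if x == 0 else -3.0
        y = 2.0 * x + tau * t + rng.gauss(0.0, 1.0)
        rows.append((x, t, y))
    return rows


def fit_arm_means(rows):
    """Base learner: mean outcome for each (arm, covariate level) cell."""
    return {
        (t, x): fmean(y for xi, ti, y in rows if xi == x and ti == t)
        for t in (0, 1)
        for x in (0, 1)
    }


def t_learner_cate(rows):
    """T-learner: separate outcome models per arm, CATE = mu1(x) - mu0(x)."""
    mu = fit_arm_means(rows)
    return {x: mu[(1, x)] - mu[(0, x)] for x in (0, 1)}


def dr_learner_cate(rows, propensity=0.5):
    """DR-learner with two-fold cross-fitting.

    Outcome models are fit on one fold; the doubly robust pseudo-outcome
    is computed on the other fold and then averaged within each x level.
    """
    half = len(rows) // 2
    folds = [rows[:half], rows[half:]]
    pseudo = {0: [], 1: []}
    for k in (0, 1):
        mu = fit_arm_means(folds[1 - k])  # nuisance models from the other fold
        for x, t, y in folds[k]:
            m1, m0 = mu[(1, x)], mu[(0, x)]
            phi = (
                m1 - m0
                + t * (y - m1) / propensity
                - (1 - t) * (y - m0) / (1 - propensity)
            )
            pseudo[x].append(phi)
    return {x: fmean(pseudo[x]) for x in (0, 1)}


if __name__ == "__main__":
    data = simulate_rct(4000)
    print("T-learner :", t_learner_cate(data))
    print("DR-learner:", dr_learner_cate(data))
```

With this toy setup both estimators should land close to the simulated effects of -1 and -3. In a realistic high-dimensional setting the stratum means would be replaced by flexible regression models, which is where the choice between learners starts to matter.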




Source journal

European Journal of Epidemiology (Medicine: Public, Environmental & Occupational Health)

CiteScore: 21.40

Self-citation rate: 1.50%

Articles published: 109

Review time: 6-12 weeks

Journal description: The European Journal of Epidemiology, established in 1985, is a peer-reviewed publication that provides a platform for discussions on epidemiology in its broadest sense. It covers various aspects of epidemiologic research and statistical methods. The journal facilitates communication between researchers, educators, and practitioners in epidemiology, including those in clinical and community medicine. Contributions from diverse fields such as public health, preventive medicine, clinical medicine, health economics, and computational biology and data science, in relation to health and disease, are encouraged. While accepting submissions from all over the world, the journal particularly emphasizes European topics relevant to epidemiology. The published articles consist of empirical research findings, developments in methodology, and opinion pieces.