From Prediction to Prescription: Machine Learning and Causal Inference for the Heterogeneous Treatment Effect
Authors: Judith Abécassis, Élise Dumas, Julie Alberge, Gaël Varoquaux
Journal: Annual Review of Biomedical Data Science
Publication date: 2025-04-09
DOI: 10.1146/annurev-biodatasci-103123-095750 (https://doi.org/10.1146/annurev-biodatasci-103123-095750)
Abstract
The increasing accumulation of medical data brings the hope of data-driven medical decision-making, but the data's increasing complexity (as text or images in electronic health records) calls for complex models, such as machine learning. Here, we review how machine learning can be used to inform decisions for individualized interventions, a causal question. Going from prediction to causal effects is challenging, as no individual is observed both treated and untreated. We detail how some data can support some causal claims and how to build causal estimators with machine learning. Beyond variable selection to adjust for confounding bias, we cover the broader notions of study design that make or break causal inference. As the problems span diverse scientific communities, we use didactic yet statistically precise formulations to bridge machine learning and epidemiology.
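
To make the idea of a heterogeneous treatment effect concrete, the following is a minimal sketch, not taken from the article. It assumes a T-learner (one common way of building a causal estimator from predictive models), linear outcome models, and simulated data in which treatment is randomized, so there is no confounding to adjust for. All function names, variable names, and the data-generating process are hypothetical. Because the data are simulated, both potential outcomes are known and the estimate can be checked, which is never possible with real patients.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predict outcomes from coefficients returned by fit_linear."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def t_learner_cate(X, t, y):
    """T-learner: fit one outcome model per treatment arm, then take the
    difference of their predictions as the per-individual effect estimate."""
    coef_treated = fit_linear(X[t == 1], y[t == 1])
    coef_control = fit_linear(X[t == 0], y[t == 0])
    return predict_linear(coef_treated, X) - predict_linear(coef_control, X)


def simulate(n=5000, seed=0):
    """Hypothetical data: one covariate, randomized binary treatment,
    and a treatment effect that varies with the covariate."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 1))
    t = rng.integers(0, 2, size=n)        # assumption: randomized, no confounding
    true_cate = 1.0 + 2.0 * x[:, 0]       # effect differs between individuals
    y = 0.5 * x[:, 0] + t * true_cate + rng.normal(scale=0.5, size=n)
    return x, t, y, true_cate


X, t, y, true_cate = simulate()
cate_hat = t_learner_cate(X, t, y)
rmse = float(np.sqrt(np.mean((cate_hat - true_cate) ** 2)))
print(f"RMSE of estimated vs. true individual effect: {rmse:.3f}")
```

In observational data, treatment assignment typically depends on patient characteristics, so a sketch like this would only be valid after the confounding adjustment and study-design considerations that the abstract describes.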
About the journal:
The Annual Review of Biomedical Data Science provides comprehensive expert reviews in biomedical data science, focusing on advanced methods to store, retrieve, analyze, and organize biomedical data and knowledge. The scope of the journal encompasses informatics, computational, artificial intelligence (AI), and statistical approaches to biomedical data, including the sub-fields of bioinformatics, computational biology, biomedical informatics, clinical and clinical research informatics, biostatistics, and imaging informatics. The mission of the journal is to identify both emerging and established areas of biomedical data science, and the leaders in these fields.