Exploring modern machine learning methods to improve causal-effect estimation

Impact factor: 0.6 (JCR Q4, Statistics & Probability)

Communications for Statistical Applications and Methods | Pub Date: 2022-03-31 | DOI: 10.29220/csam.2022.29.2.177

Yeji Kim, Tae-Kil Choi, Sangbum Choi

Citations: 0

Abstract

This paper addresses the use of machine learning methods for causal estimation of treatment effects from observational data. Although randomized experimental trials are the gold standard for revealing potential causal relationships, observational studies are another rich source for investigating exposure effects, for example in research on the comparative effectiveness and safety of treatments, where the causal effect can be identified if the covariates contain all confounding variables. In this context, statistical regression models for the expected outcome and the probability of treatment are often imposed, and the two can be combined to yield more efficient and robust causal estimators. Recently, targeted maximum likelihood estimation and causal random forests have been proposed and extensively studied as ways to use data-adaptive regression in the estimation of causal inference parameters. Machine learning methods are a natural choice in these settings for improving the quality of the final estimate of the treatment effect. We explore how the design and training of several machine learning algorithms can be adapted for causal inference, and we study their finite-sample performance through simulation experiments under various scenarios. An application to percutaneous coronary intervention (PCI) data shows that these adaptations can improve on simple linear regression-based methods.
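
The abstract mentions combining a model for the expected outcome with a model for the probability of treatment. One standard way to do this is the augmented inverse probability weighting (AIPW) estimator, sketched below as an illustration only; it is not taken from the paper, and it is not the paper's TMLE or causal forest procedure. The simulated data, the true effect of 2.0, and all function names are assumptions made for this example. Simple per-stratum means stand in for the machine learning regressors that the paper studies.

```python
import random


def simulate(n, seed=0):
    """Simulate (x, a, y) triples with one binary confounder x.

    Assumed data-generating process (hypothetical): the true average
    treatment effect is 2.0, and x raises both the chance of treatment
    and the outcome, which confounds a naive comparison.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        p_treat = 0.8 if x == 1 else 0.2
        a = 1 if rng.random() < p_treat else 0
        y = 2.0 * a + 3.0 * x + rng.gauss(0.0, 1.0)
        data.append((x, a, y))
    return data


def mean(values):
    return sum(values) / len(values)


def naive_difference(data):
    """Unadjusted difference in mean outcome, treated minus control."""
    treated = [y for _, a, y in data if a == 1]
    control = [y for _, a, y in data if a == 0]
    return mean(treated) - mean(control)


def aipw_ate(data):
    """AIPW estimate of the average treatment effect.

    Assumes every stratum of x contains both treated and control units
    (positivity); otherwise the stratum means below are undefined.
    """
    levels = sorted({x for x, _, _ in data})
    # Propensity score e(x) = P(A=1 | X=x): treated fraction per stratum.
    e = {x: mean([a for xi, a, _ in data if xi == x]) for x in levels}
    # Outcome regression m_a(x) = E[Y | A=a, X=x]: mean outcome per cell.
    m = {
        (a, x): mean([y for xi, ai, y in data if xi == x and ai == a])
        for x in levels
        for a in (0, 1)
    }
    scores = []
    for x, a, y in data:
        m1, m0 = m[(1, x)], m[(0, x)]
        # Outcome-model contrast plus inverse-probability-weighted residuals.
        score = (
            m1 - m0
            + a * (y - m1) / e[x]
            - (1 - a) * (y - m0) / (1 - e[x])
        )
        scores.append(score)
    return mean(scores)


if __name__ == "__main__":
    sample = simulate(20000, seed=1)
    print("naive difference:", round(naive_difference(sample), 3))
    print("AIPW estimate:   ", round(aipw_ate(sample), 3))
```

In this simulated setting the naive difference is biased upward by the confounder, while the AIPW estimate should land close to the assumed effect of 2.0. In practice the two stratum-mean dictionaries would be replaced by fitted regression or machine learning models for the propensity score and the outcome.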

Source journal

Communications for Statistical Applications and Methods (Statistics & Probability)

CiteScore: 0.90

Self-citation rate: 0.00%

About the journal: Communications for Statistical Applications and Methods (Commun. Stat. Appl. Methods, CSAM) is an official journal of the Korean Statistical Society and Korean International Statistical Society. It is an international, Open Access journal dedicated to publishing peer-reviewed, high-quality, and innovative statistical research. CSAM publishes articles on applied and methodological research in statistics and probability. It features rapid publication and broad coverage of statistical applications and methods. It welcomes papers on novel applications of statistical methodology in areas including medicine (pharmaceutical, biotechnology, medical device), business, management, economics, ecology, education, computing, engineering, operational research, biology, sociology, and earth science; papers from other areas are also considered.