Data Integration in Causal Inference
Authors: Xu Shi, Ziyang Pan, Wang Miao
Journal: Wiley Interdisciplinary Reviews-Computational Statistics
Publication date: 2023-01-01 (Epub 2022/4/8)
DOI: 10.1002/wics.1581 (https://doi.org/10.1002/wics.1581)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9880960/pdf/
Citations: 9
Abstract
Integrating data from multiple heterogeneous sources has become increasingly popular as a way to achieve a large sample size and a diverse study population. This paper reviews developments in causal inference methods that combine multiple datasets collected under potentially different designs from potentially heterogeneous populations. We summarize recent advances in: combining randomized clinical trials with external information from observational studies or historical controls; combining samples when no single sample has all relevant variables, with application to two-sample Mendelian randomization; distributed data settings under privacy concerns for comparative effectiveness and safety research using real-world data; Bayesian causal inference; and causal discovery methods.