基于介入实验数据的网络结构贝叶斯学习

IF 2.8 2区数学 Q2 BIOLOGY

Biometrika Pub Date : 2023-05-11 DOI:10.1093/biomet/asad032

F. Castelletti, S. Peluso

{"title":"基于介入实验数据的网络结构贝叶斯学习","authors":"F. Castelletti, S. Peluso","doi":"10.1093/biomet/asad032","DOIUrl":null,"url":null,"abstract":"\n Directed Acyclic Graphs (DAGs) provide an effective framework for learning causal relationships among variables given multivariate observations. Under pure observational data, DAGs encoding the same conditional independencies cannot be distinguished and are collected into Markov equivalence classes. In many contexts however, observational measurements are supplemented by interventional data that improve DAG identifiability and enhance causal effect estimation. We propose a Bayesian framework for multivariate data partially generated after stochastic interventions. To this end, we introduce an effective prior elicitation procedure leading to a closed-form expression for the DAG marginal likelihood and guaranteeing score equivalence among DAGs that are Markov equivalent post intervention. Under the Gaussian setting we show, in terms of posterior ratio consistency, that the true network will be asymptotically recovered, regardless of the specific distribution of the intervened variables and of the relative asymptotic dominance between observational and interventional measurements. We validate our theoretical results in simulation and we implement on both synthetic and biological protein expression data a Markov chain Monte Carlo sampler for posterior inference on the space of DAGs.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":" ","pages":""},"PeriodicalIF":2.8000,"publicationDate":"2023-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Bayesian learning of network structures from interventional experimental data\",\"authors\":\"F. Castelletti, S. Peluso\",\"doi\":\"10.1093/biomet/asad032\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\n Directed Acyclic Graphs (DAGs) provide an effective framework for learning causal relationships among variables given multivariate observations. Under pure observational data, DAGs encoding the same conditional independencies cannot be distinguished and are collected into Markov equivalence classes. In many contexts however, observational measurements are supplemented by interventional data that improve DAG identifiability and enhance causal effect estimation. We propose a Bayesian framework for multivariate data partially generated after stochastic interventions. To this end, we introduce an effective prior elicitation procedure leading to a closed-form expression for the DAG marginal likelihood and guaranteeing score equivalence among DAGs that are Markov equivalent post intervention. Under the Gaussian setting we show, in terms of posterior ratio consistency, that the true network will be asymptotically recovered, regardless of the specific distribution of the intervened variables and of the relative asymptotic dominance between observational and interventional measurements. We validate our theoretical results in simulation and we implement on both synthetic and biological protein expression data a Markov chain Monte Carlo sampler for posterior inference on the space of DAGs.\",\"PeriodicalId\":9001,\"journal\":{\"name\":\"Biometrika\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2023-05-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Biometrika\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://doi.org/10.1093/biomet/asad032\",\"RegionNum\":2,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biometrika","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1093/biomet/asad032","RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOLOGY","Score":null,"Total":0}

引用次数: 1

摘要

有向无环图(dag)提供了一个有效的框架来学习变量之间的因果关系给定的多变量观察。在纯观测数据下，无法区分编码相同条件独立性的dag，并将其收集到马尔可夫等价类中。然而，在许多情况下，通过干预数据补充观察测量，可提高DAG的可识别性并增强因果效应估计。我们提出了一个贝叶斯框架，用于随机干预后部分生成的多变量数据。为此，我们引入了一个有效的先验启发程序，导致DAG边际似然的封闭形式表达式，并保证干预后马尔可夫等效DAG之间的分数相等。在高斯设置下，根据后验比一致性，我们表明，无论干预变量的具体分布以及观察和干预测量之间的相对渐近优势如何，真实网络都将渐近恢复。我们在模拟中验证了我们的理论结果，并在合成和生物蛋白表达数据上实现了一个马尔可夫链蒙特卡罗采样器，用于对dag空间的后验推理。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Bayesian learning of network structures from interventional experimental data

Directed Acyclic Graphs (DAGs) provide an effective framework for learning causal relationships among variables given multivariate observations. Under pure observational data, DAGs encoding the same conditional independencies cannot be distinguished and are collected into Markov equivalence classes. In many contexts however, observational measurements are supplemented by interventional data that improve DAG identifiability and enhance causal effect estimation. We propose a Bayesian framework for multivariate data partially generated after stochastic interventions. To this end, we introduce an effective prior elicitation procedure leading to a closed-form expression for the DAG marginal likelihood and guaranteeing score equivalence among DAGs that are Markov equivalent post intervention. Under the Gaussian setting we show, in terms of posterior ratio consistency, that the true network will be asymptotically recovered, regardless of the specific distribution of the intervened variables and of the relative asymptotic dominance between observational and interventional measurements. We validate our theoretical results in simulation and we implement on both synthetic and biological protein expression data a Markov chain Monte Carlo sampler for posterior inference on the space of DAGs.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Biometrika 生物-生物学

CiteScore

5.50

自引率

3.70%

发文量

审稿时长

6-12 weeks

期刊介绍： Biometrika is primarily a journal of statistics in which emphasis is placed on papers containing original theoretical contributions of direct or potential value in applications. From time to time, papers in bordering fields are also published.