基于介入实验数据的网络结构贝叶斯学习

IF 2.4 2区 数学 Q2 BIOLOGY
Biometrika Pub Date : 2023-05-11 DOI:10.1093/biomet/asad032
F. Castelletti, S. Peluso
{"title":"基于介入实验数据的网络结构贝叶斯学习","authors":"F. Castelletti, S. Peluso","doi":"10.1093/biomet/asad032","DOIUrl":null,"url":null,"abstract":"\n Directed Acyclic Graphs (DAGs) provide an effective framework for learning causal relationships among variables given multivariate observations. Under pure observational data, DAGs encoding the same conditional independencies cannot be distinguished and are collected into Markov equivalence classes. In many contexts however, observational measurements are supplemented by interventional data that improve DAG identifiability and enhance causal effect estimation. We propose a Bayesian framework for multivariate data partially generated after stochastic interventions. To this end, we introduce an effective prior elicitation procedure leading to a closed-form expression for the DAG marginal likelihood and guaranteeing score equivalence among DAGs that are Markov equivalent post intervention. Under the Gaussian setting we show, in terms of posterior ratio consistency, that the true network will be asymptotically recovered, regardless of the specific distribution of the intervened variables and of the relative asymptotic dominance between observational and interventional measurements. We validate our theoretical results in simulation and we implement on both synthetic and biological protein expression data a Markov chain Monte Carlo sampler for posterior inference on the space of DAGs.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":" ","pages":""},"PeriodicalIF":2.4000,"publicationDate":"2023-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Bayesian learning of network structures from interventional experimental data\",\"authors\":\"F. Castelletti, S. Peluso\",\"doi\":\"10.1093/biomet/asad032\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\n Directed Acyclic Graphs (DAGs) provide an effective framework for learning causal relationships among variables given multivariate observations. Under pure observational data, DAGs encoding the same conditional independencies cannot be distinguished and are collected into Markov equivalence classes. In many contexts however, observational measurements are supplemented by interventional data that improve DAG identifiability and enhance causal effect estimation. We propose a Bayesian framework for multivariate data partially generated after stochastic interventions. To this end, we introduce an effective prior elicitation procedure leading to a closed-form expression for the DAG marginal likelihood and guaranteeing score equivalence among DAGs that are Markov equivalent post intervention. Under the Gaussian setting we show, in terms of posterior ratio consistency, that the true network will be asymptotically recovered, regardless of the specific distribution of the intervened variables and of the relative asymptotic dominance between observational and interventional measurements. We validate our theoretical results in simulation and we implement on both synthetic and biological protein expression data a Markov chain Monte Carlo sampler for posterior inference on the space of DAGs.\",\"PeriodicalId\":9001,\"journal\":{\"name\":\"Biometrika\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":2.4000,\"publicationDate\":\"2023-05-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Biometrika\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://doi.org/10.1093/biomet/asad032\",\"RegionNum\":2,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biometrika","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1093/biomet/asad032","RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOLOGY","Score":null,"Total":0}
引用次数: 1

摘要

有向无环图(dag)提供了一个有效的框架来学习变量之间的因果关系给定的多变量观察。在纯观测数据下,无法区分编码相同条件独立性的dag,并将其收集到马尔可夫等价类中。然而,在许多情况下,通过干预数据补充观察测量,可提高DAG的可识别性并增强因果效应估计。我们提出了一个贝叶斯框架,用于随机干预后部分生成的多变量数据。为此,我们引入了一个有效的先验启发程序,导致DAG边际似然的封闭形式表达式,并保证干预后马尔可夫等效DAG之间的分数相等。在高斯设置下,根据后验比一致性,我们表明,无论干预变量的具体分布以及观察和干预测量之间的相对渐近优势如何,真实网络都将渐近恢复。我们在模拟中验证了我们的理论结果,并在合成和生物蛋白表达数据上实现了一个马尔可夫链蒙特卡罗采样器,用于对dag空间的后验推理。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Bayesian learning of network structures from interventional experimental data
Directed Acyclic Graphs (DAGs) provide an effective framework for learning causal relationships among variables given multivariate observations. Under pure observational data, DAGs encoding the same conditional independencies cannot be distinguished and are collected into Markov equivalence classes. In many contexts however, observational measurements are supplemented by interventional data that improve DAG identifiability and enhance causal effect estimation. We propose a Bayesian framework for multivariate data partially generated after stochastic interventions. To this end, we introduce an effective prior elicitation procedure leading to a closed-form expression for the DAG marginal likelihood and guaranteeing score equivalence among DAGs that are Markov equivalent post intervention. Under the Gaussian setting we show, in terms of posterior ratio consistency, that the true network will be asymptotically recovered, regardless of the specific distribution of the intervened variables and of the relative asymptotic dominance between observational and interventional measurements. We validate our theoretical results in simulation and we implement on both synthetic and biological protein expression data a Markov chain Monte Carlo sampler for posterior inference on the space of DAGs.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Biometrika
Biometrika 生物-生物学
CiteScore
5.50
自引率
3.70%
发文量
56
审稿时长
6-12 weeks
期刊介绍: Biometrika is primarily a journal of statistics in which emphasis is placed on papers containing original theoretical contributions of direct or potential value in applications. From time to time, papers in bordering fields are also published.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信