用经验贝叶斯方法估计共调控基因表达的动态模型

Sara Venkatraman, Sumanta Basu, Andrew G. Clark, Sofie Delbare, Myung Hee Lee, Martin T. Wells
{"title":"用经验贝叶斯方法估计共调控基因表达的动态模型","authors":"Sara Venkatraman, Sumanta Basu, Andrew G. Clark, Sofie Delbare, Myung Hee Lee, Martin T. Wells","doi":"10.1080/26941899.2023.2219707","DOIUrl":null,"url":null,"abstract":"Time-course gene expression datasets provide insight into the dynamics of complex biological processes, such as immune response and organ development. It is of interest to identify genes with similar temporal expression patterns because such genes are often biologically related. However, this task is challenging due to the high dimensionality of these datasets and the nonlinearity of gene expression time dynamics. We propose an empirical Bayes approach to estimating ordinary differential equation (ODE) models of gene expression, from which we derive a similarity metric between genes called the Bayesian lead-lag R2 (LLR2). Importantly, the calculation of the LLR2 leverages biological databases that document known interactions amongst genes; this information is automatically used to define informative prior distributions on the ODE model’s parameters. As a result, the LLR2 is a biologically-informed metric that can be used to identify clusters or networks of functionally-related genes with co-moving or time-delayed expression patterns. We then derive data-driven shrinkage parameters from Stein’s unbiased risk estimate that optimally balance the ODE model’s fit to both data and external biological information. Using real gene expression data, we demonstrate that our methodology allows us to recover interpretable gene clusters and sparse networks. These results reveal new insights about the dynamics of biological systems.","PeriodicalId":72770,"journal":{"name":"Data science in science","volume":"16 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"An Empirical Bayes Approach to Estimating Dynamic Models of Co-Regulated Gene Expression\",\"authors\":\"Sara Venkatraman, Sumanta Basu, Andrew G. Clark, Sofie Delbare, Myung Hee Lee, Martin T. Wells\",\"doi\":\"10.1080/26941899.2023.2219707\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Time-course gene expression datasets provide insight into the dynamics of complex biological processes, such as immune response and organ development. It is of interest to identify genes with similar temporal expression patterns because such genes are often biologically related. However, this task is challenging due to the high dimensionality of these datasets and the nonlinearity of gene expression time dynamics. We propose an empirical Bayes approach to estimating ordinary differential equation (ODE) models of gene expression, from which we derive a similarity metric between genes called the Bayesian lead-lag R2 (LLR2). Importantly, the calculation of the LLR2 leverages biological databases that document known interactions amongst genes; this information is automatically used to define informative prior distributions on the ODE model’s parameters. As a result, the LLR2 is a biologically-informed metric that can be used to identify clusters or networks of functionally-related genes with co-moving or time-delayed expression patterns. We then derive data-driven shrinkage parameters from Stein’s unbiased risk estimate that optimally balance the ODE model’s fit to both data and external biological information. Using real gene expression data, we demonstrate that our methodology allows us to recover interpretable gene clusters and sparse networks. These results reveal new insights about the dynamics of biological systems.\",\"PeriodicalId\":72770,\"journal\":{\"name\":\"Data science in science\",\"volume\":\"16 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-08-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Data science in science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1080/26941899.2023.2219707\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data science in science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1080/26941899.2023.2219707","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

时间过程基因表达数据集提供了深入了解复杂生物过程的动态,如免疫反应和器官发育。鉴定具有相似时间表达模式的基因是有意义的,因为这些基因通常具有生物学相关性。然而,由于这些数据集的高维性和基因表达时间动态的非线性,这项任务具有挑战性。我们提出了一种经验贝叶斯方法来估计基因表达的常微分方程(ODE)模型,从中我们得出了一个称为贝叶斯超前滞后R2 (LLR2)的基因之间的相似性度量。重要的是,LLR2的计算利用了记录基因之间已知相互作用的生物数据库;这些信息被自动用于定义ODE模型参数的先验分布。因此,LLR2是一种生物学信息指标,可用于识别具有共同移动或延迟表达模式的功能相关基因的集群或网络。然后,我们从Stein的无偏风险估计中得出数据驱动的收缩参数,以最佳地平衡ODE模型对数据和外部生物信息的拟合。使用真实的基因表达数据,我们证明了我们的方法允许我们恢复可解释的基因簇和稀疏网络。这些结果揭示了关于生物系统动力学的新见解。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
An Empirical Bayes Approach to Estimating Dynamic Models of Co-Regulated Gene Expression
Time-course gene expression datasets provide insight into the dynamics of complex biological processes, such as immune response and organ development. It is of interest to identify genes with similar temporal expression patterns because such genes are often biologically related. However, this task is challenging due to the high dimensionality of these datasets and the nonlinearity of gene expression time dynamics. We propose an empirical Bayes approach to estimating ordinary differential equation (ODE) models of gene expression, from which we derive a similarity metric between genes called the Bayesian lead-lag R2 (LLR2). Importantly, the calculation of the LLR2 leverages biological databases that document known interactions amongst genes; this information is automatically used to define informative prior distributions on the ODE model’s parameters. As a result, the LLR2 is a biologically-informed metric that can be used to identify clusters or networks of functionally-related genes with co-moving or time-delayed expression patterns. We then derive data-driven shrinkage parameters from Stein’s unbiased risk estimate that optimally balance the ODE model’s fit to both data and external biological information. Using real gene expression data, we demonstrate that our methodology allows us to recover interpretable gene clusters and sparse networks. These results reveal new insights about the dynamics of biological systems.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
6.60
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信