Sparse factor model for co-expression networks with an application using prior biological knowledge.

IF 0.8 · Zone 4 (Mathematics) · Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY

Statistical Applications in Genetics and Molecular Biology Pub Date : 2016-06-01 DOI:10.1515/sagmb-2015-0002

Yuna Blum, Magalie Houée-Bigot, David Causeur

Volume 15, issue 3, pages 253-72. Publication type: Journal Article.

Citations: 6

Abstract

Inference on gene regulatory networks from high-throughput expression data is one of the main current challenges in systems biology. Such networks can give deep insight into interactions between genes. Because gene-gene interactions are often viewed as joint contributions to known biological mechanisms, inference on the dependence among gene expressions is expected to be consistent, to some extent, with the functional characterization of genes that can be derived from ontologies (GO, KEGG, …). The present paper introduces a sparse factor model as a general framework either to account for prior knowledge on joint contributions of modules of genes to latent biological processes or to infer the corresponding co-expression network. We propose an ℓ1-regularized EM algorithm to fit a sparse factor model for correlation. We demonstrate how it helps to extract modules of genes and, more generally, improves gene clustering performance. The method is compared to alternative estimation procedures for sparse factor models of relevance networks in a simulation study. The integration of biological knowledge based on the Gene Ontology (GO) is also illustrated on liver expression data generated to understand adiposity variability in chicken.
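To make the idea of a "sparse factor model for correlation" more concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the textbook factor decomposition R = B Bᵀ + Ψ with a diagonal Ψ, and shows the soft-thresholding operator that ℓ1-penalized updates typically rely on. The function names, the toy loading matrix `B`, and the two-module layout are all hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(x, lam):
    """Elementwise soft-thresholding: the proximal operator of lam * |x|.

    Entries with magnitude below lam are set exactly to zero, which is
    how an l1 penalty produces sparse loadings.
    """
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def factor_correlation(loadings):
    """Correlation matrix implied by a factor model, R = B B^T + Psi.

    Psi is diagonal and chosen so that every diagonal entry of R is 1
    (an assumption of this sketch, matching a model for correlations).
    """
    common = loadings @ loadings.T
    uniqueness = 1.0 - np.diag(common)
    if np.any(uniqueness <= 0):
        raise ValueError("each row's squared loadings must sum to less than 1")
    return common + np.diag(uniqueness)


# Hypothetical toy example: 6 genes, 2 latent processes, 2 gene modules.
# Genes 0-2 load only on factor 0, genes 3-5 only on factor 1.
B = np.array([
    [0.8, 0.0],
    [0.7, 0.0],
    [0.6, 0.0],
    [0.0, 0.9],
    [0.0, 0.5],
    [0.0, 0.4],
])
R = factor_correlation(B)

# Shrinking noisy loading estimates pushes small entries to exactly zero.
shrunk = soft_threshold(np.array([-0.5, 0.05, 0.3]), 0.1)
```

In this toy setting, genes that share a factor are correlated (for example, genes 0 and 1), while genes from different modules have zero implied correlation, which is the block structure a co-expression network would display.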

Source journal

Statistical Applications in Genetics and Molecular Biology (BIOCHEMISTRY & MOLECULAR BIOLOGY; MATHEMATICAL & COMPUTATIONAL BIOLOGY)

Self-citation rate

11.10%

Journal description: Statistical Applications in Genetics and Molecular Biology seeks to publish significant research on the application of statistical ideas to problems arising from computational biology. The focus of the papers should be on the relevant statistical issues, but papers should contain a succinct description of the biological problem being considered. The range of topics is wide and includes linkage mapping, association studies, gene finding and sequence alignment, protein structure prediction, design and analysis of microarray data, molecular evolution and phylogenetic trees, DNA topology, and database search strategies. Both original research and review articles will be warmly received.