Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks.

Microarrays Pub Date : 2015-05-14 DOI:10.3390/microarrays4020255

Alina Sîrbu, Martin Crane, Heather J Ruskin

Microarrays, volume 4, issue 2, pages 255-69. Publication type: Journal Article.

Citations: 4

Abstract

Microarray technologies have been the basis of numerous important findings regarding gene expression in the last few decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight into features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, and we indicate which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.




Source journal: Microarrays

Self-citation rate: 0.00%

Review time: 11 weeks

Journal description: High-Throughput (formerly Microarrays, ISSN 2076-3905) is a multidisciplinary peer-reviewed scientific journal that provides an advanced forum for the publication of studies reporting high-dimensional approaches and developments in Life Sciences, Chemistry and related fields. Our aim is to encourage scientists to publish their experimental and theoretical results based on high-throughput techniques as well as computational and statistical tools for data analysis and interpretation. The full experimental or methodological details must be provided so that the results can be reproduced. There is no restriction on the length of the papers. High-Throughput invites submissions covering several topics, including, but not limited to: Microarrays, DNA Sequencing, RNA Sequencing, Protein Identification and Quantification, Cell-based Approaches, Omics Technologies, Imaging, Bioinformatics, Computational Biology/Chemistry, Statistics, Integrative Omics, Drug Discovery and Development, Microfluidics, Lab-on-a-chip, Data Mining, Databases, Multiplex Assays.