Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks
Alina Sîrbu, Martin Crane, Heather J Ruskin
Microarrays, vol. 4, no. 2, pp. 255-269. Published 2015-05-14. Journal Article.
DOI: 10.3390/microarrays4020255 (https://doi.org/10.3390/microarrays4020255)
Citations: 4
Abstract
Microarray technologies have been the basis of numerous important findings regarding gene expression in the last few decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight into features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, and indicate which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.
About the journal:
High-Throughput (formerly Microarrays, ISSN 2076-3905) is a multidisciplinary peer-reviewed scientific journal that provides an advanced forum for the publication of studies reporting high-dimensional approaches and developments in Life Sciences, Chemistry and related fields. Our aim is to encourage scientists to publish their experimental and theoretical results based on high-throughput techniques as well as computational and statistical tools for data analysis and interpretation. The full experimental or methodological details must be provided so that the results can be reproduced. There is no restriction on the length of the papers. High-Throughput invites submissions covering several topics, including, but not limited to: Microarrays, DNA Sequencing, RNA Sequencing, Protein Identification and Quantification, Cell-based Approaches, Omics Technologies, Imaging, Bioinformatics, Computational Biology/Chemistry, Statistics, Integrative Omics, Drug Discovery and Development, Microfluidics, Lab-on-a-chip, Data Mining, Databases, Multiplex Assays.