{"title":"t-Test at the Probe Level: An Alternative Method to Identify Statistically Significant Genes for Microarray Data.","authors":"Marcelo Boareto, Nestor Caticha","doi":"10.3390/microarrays3040340","DOIUrl":null,"url":null,"abstract":"<p><p>Microarray data analysis typically consists in identifying a list of differentially expressed genes (DEG), i.e., the genes that are differentially expressed between two experimental conditions. Variance shrinkage methods have been considered a better choice than the standard t-test for selecting the DEG because they correct the dependence of the error with the expression level. This dependence is mainly caused by errors in background correction, which more severely affects genes with low expression values. Here, we propose a new method for identifying the DEG that overcomes this issue and does not require background correction or variance shrinkage. Unlike current methods, our methodology is easy to understand and implement. It consists of applying the standard t-test directly on the normalized intensity data, which is possible because the probe intensity is proportional to the gene expression level and because the t-test is scale- and location-invariant. This methodology considerably improves the sensitivity and robustness of the list of DEG when compared with the t-test applied to preprocessed data and to the most widely used shrinkage methods, Significance Analysis of Microarrays (SAM) and Linear Models for Microarray Data (LIMMA). Our approach is useful especially when the genes of interest have small differences in expression and therefore get ignored by standard variance shrinkage methods. </p>","PeriodicalId":56355,"journal":{"name":"Microarrays","volume":"3 4","pages":"340-51"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.3390/microarrays3040340","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Microarrays","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/microarrays3040340","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
Microarray data analysis typically consists in identifying a list of differentially expressed genes (DEG), i.e., the genes that are differentially expressed between two experimental conditions. Variance shrinkage methods have been considered a better choice than the standard t-test for selecting the DEG because they correct the dependence of the error with the expression level. This dependence is mainly caused by errors in background correction, which more severely affects genes with low expression values. Here, we propose a new method for identifying the DEG that overcomes this issue and does not require background correction or variance shrinkage. Unlike current methods, our methodology is easy to understand and implement. It consists of applying the standard t-test directly on the normalized intensity data, which is possible because the probe intensity is proportional to the gene expression level and because the t-test is scale- and location-invariant. This methodology considerably improves the sensitivity and robustness of the list of DEG when compared with the t-test applied to preprocessed data and to the most widely used shrinkage methods, Significance Analysis of Microarrays (SAM) and Linear Models for Microarray Data (LIMMA). Our approach is useful especially when the genes of interest have small differences in expression and therefore get ignored by standard variance shrinkage methods.
期刊介绍:
High-Throughput (formerly Microarrays, ISSN 2076-3905) is a multidisciplinary peer-reviewed scientific journal that provides an advanced forum for the publication of studies reporting high-dimensional approaches and developments in Life Sciences, Chemistry and related fields. Our aim is to encourage scientists to publish their experimental and theoretical results based on high-throughput techniques as well as computational and statistical tools for data analysis and interpretation. The full experimental or methodological details must be provided so that the results can be reproduced. There is no restriction on the length of the papers. High-Throughput invites submissions covering several topics, including, but not limited to: Microarrays, DNA Sequencing, RNA Sequencing, Protein Identification and Quantification, Cell-based Approaches, Omics Technologies, Imaging, Bioinformatics, Computational Biology/Chemistry, Statistics, Integrative Omics, Drug Discovery and Development, Microfluidics, Lab-on-a-chip, Data Mining, Databases, Multiplex Assays.