Sampling bias in microarray data analysis: A demonstration in the field of reproductive biology
S. Manafi, A. Uyar, A. Bener
2013 8th International Symposium on Health Informatics and Bioinformatics, published 2013-11-14
DOI: 10.1109/HIBIT.2013.6661684 (https://doi.org/10.1109/HIBIT.2013.6661684)
Citation count: 1
Abstract
The benefit gained from high-throughput microarray experiments depends strongly on eliminating all possible sources of bias during both the experimental procedure and the data analysis. In reproductive biology, microarray-based transcriptomic analysis of the oocyte and the surrounding cumulus/granulosa cells poses significant challenges because sample amounts are limited and/or adjacent cells may contaminate the sample. In this study, we investigated the effect of sampling bias on the consistency of microarray differential expression analysis in the field of reproduction. Experiments were conducted on five datasets obtained from publicly available microarray repositories. For each dataset, probe-level expression values were extracted, and background adjustment, inter-array quantile normalization, and probe set summarization were performed according to the Robust Multi-Chip Average (RMA) algorithm. Genes with a false discovery rate-corrected p value < 0.05 and |fold change| > 2 were considered differentially expressed. The results demonstrate that both the number of replicates and the choice of which subset of the available samples is included in the analysis alter the number of differentially expressed genes. We suggest that assessing inter-sample variance prior to differential expression analysis is an important step in microarray experiments, and that proper handling of that variance may require alternative normalization and/or statistical testing methods.
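The selection rule stated in the abstract (FDR-corrected p value below 0.05 and fold change above 2 in magnitude) can be illustrated with a minimal sketch. It rests on two assumptions that the abstract does not confirm: that the false discovery rate correction is the Benjamini-Hochberg procedure, and that fold changes are supplied as log2 differences (RMA reports expression on the log2 scale, so a two-fold change corresponds to |log2 FC| > 1). The function names `bh_adjust` and `differentially_expressed` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def bh_adjust(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Assumption: the abstract only says "false discovery rate-corrected";
    BH is used here as a common choice, not as the authors' stated method.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def differentially_expressed(
    pvalues: Sequence[float],
    log2_fold_changes: Sequence[float],
    alpha: float = 0.05,
    min_fold: float = 2.0,
) -> List[bool]:
    """Flag genes with adjusted p < alpha and |fold change| > min_fold.

    Assumption: fold changes are log2 differences between group means.
    """
    adjusted = bh_adjust(pvalues)
    return [
        q < alpha and 2 ** abs(lfc) > min_fold
        for q, lfc in zip(adjusted, log2_fold_changes)
    ]


# Toy input: four genes with made-up p-values and log2 fold changes.
flags = differentially_expressed(
    [0.001, 0.02, 0.04, 0.5],
    [1.5, -0.5, 2.0, 3.0],
)
print(flags)
```

Because the adjusted p value of each gene depends on the full set of p-values and the fold change depends on which samples enter the group means, rerunning such a filter on different subsets of replicates can return different gene lists, which is the kind of inconsistency the study examines.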