Abdallah M Eteleeb, Robert M Flight, Benjamin J Harrison, Jeffrey C Petruska, Eric C Rouchka
Title: An Island-Based Approach for Differential Expression Analysis
Venue: 2013 ACM Conference on Bioinformatics, Computational Biology and Biomedical Informatics (ACM-BCB 2013), Washington, D.C., U.S.A., September 22-25, 2013
DOI: 10.1145/2506583.2506589 (https://doi.org/10.1145/2506583.2506589)
Publication date: 2013-12-31
Citation count: 3
Abstract
High-throughput mRNA sequencing (RNA-Seq) promises to be the technique of choice for studying transcriptome profiles. It enables precise methods for transcript and gene expression quantification, novel transcript and exon discovery, and splice variant detection. One limitation of current RNA-Seq methods is their dependence on annotated biological features (e.g., exons, transcripts, genes) to detect expression differences across samples. This dependence restricts the identification of expression levels and the detection of significant changes to known genomic regions, so significant changes that occur in unannotated regions are not captured. To overcome this limitation, we developed a novel segmentation approach, Island-Based (IB), for analyzing differential expression in RNA-Seq and targeted sequencing (exome capture) data without specific knowledge of an isoform. The IB segmentation determines individual islands of expression from windowed read counts; these islands can be compared across experimental conditions to determine differential island expression. To detect differentially expressed genes, the significance values (p-values) of the islands are combined using Fisher's method. We tested and evaluated our approach by comparing it with three existing differentially expressed gene (DEG) methods, CuffDiff, DESeq, and edgeR, on two benchmark MAQC RNA-Seq datasets. The IB algorithm outperforms all three methods on both datasets, as illustrated by an increased area under the ROC curve (auROC).
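The two steps named in the abstract, segmenting windowed read counts into islands and combining island p-values with Fisher's method, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `find_islands` and `fisher_combine`, the `min_count` threshold, and the rule that an island is a run of consecutive windows at or above that threshold are all assumptions made for illustration, since the abstract does not give these details. Fisher's method itself is standard: for k independent p-values, the statistic X = -2 Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom under the null hypothesis.

```python
import math
from typing import List, Tuple


def find_islands(window_counts: List[int], min_count: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) window-index ranges, end exclusive, for each run of
    consecutive windows whose read count is at least min_count.

    Assumption: this threshold rule and min_count are hypothetical; the
    abstract only says islands are based on windowed read counts.
    """
    islands: List[Tuple[int, int]] = []
    start = None
    for i, count in enumerate(window_counts):
        if count >= min_count:
            if start is None:
                start = i  # a new island opens at this window
        elif start is not None:
            islands.append((start, i))  # the island closed at the previous window
            start = None
    if start is not None:
        islands.append((start, len(window_counts)))
    return islands


def fisher_combine(p_values: List[float]) -> float:
    """Combine independent p-values into one p-value with Fisher's method.

    Uses the closed-form chi-square survival function for even degrees of
    freedom: sf(x; 2k) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i  # (x/2)^i / i!, built up incrementally
        total += term
    return math.exp(-half_x) * total


if __name__ == "__main__":
    counts = [0, 3, 5, 0, 0, 2, 0]  # made-up windowed read counts
    print(find_islands(counts))
    print(fisher_combine([0.05, 0.05]))  # made-up island p-values for one gene
```

Fisher's method assumes the combined p-values are independent; whether islands within the same gene satisfy that assumption is not addressed in the abstract, so the combined value here should be read as an illustration of the arithmetic only.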