JULiP: An efficient model for accurate intron selection from multiple RNA-seq samples
Authors: Guangyu Yang, L. Florea
Venue: 2016 IEEE 6th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS)
Publication date: 2016-10-01
DOI: https://doi.org/10.1109/ICCABS.2016.7802790
Citations: 1
Abstract
Accurate detection of alternative splicing and accurate transcript reconstruction are essential for characterizing gene regulation and function and for understanding development and disease. However, current methods for extracting splicing variation from RNA-seq data analyze signals from only a single sample, which limits transcript reconstruction and fails to detect the complete set of alternative splicing events. We developed JULiP, a novel feature selection method that analyzes information across multiple samples to identify alternative splicing variation in the form of splice junctions (introns). It formulates the selection problem as a regularized program, using the latent information in multiple RNA-seq samples to construct an accurate and comprehensive intron set. JULiP is highly accurate: it detects thousands more introns in any one sample, >30% more than the most sensitive single-sample method, and 10% more introns than in the cumulative set of samples, at higher or comparable precision (>98%). Tested assemblers included Cufflinks, CLASS2, StringTie and FlipFlop, as well as the multi-sample assembler ISP. JULiP is multi-threaded and parallelized, taking only one minute to analyze up to 100 data sets on a multi-computer cluster, and can easily scale to analyses of hundreds to thousands of RNA-seq samples.
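
The abstract does not spell out the regularized program, so the following is only an illustrative sketch of the general idea, not JULiP's actual formulation. It assumes each candidate intron has a junction read count in every sample and fits one weight per intron with an L1 penalty, which has a closed-form (soft-thresholding) solution. The names `select_introns`, `soft_threshold`, the penalty `lam` and the toy counts are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical key type for a candidate intron: (chromosome, start, end).
Intron = Tuple[str, int, int]


def soft_threshold(x: float, lam: float) -> float:
    """Shrink x toward zero by lam; values within [-lam, lam] become 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def select_introns(counts: Dict[Intron, List[float]], lam: float) -> Dict[Intron, float]:
    """Illustrative L1-regularized selection (an assumption, not the paper's model).

    For each intron with per-sample counts c_s, solve
        min_w  0.5 * sum_s (c_s - w)^2 + lam * |w|
    whose minimizer is soft_threshold(sum_s c_s, lam) / n_samples.
    Introns whose weight is shrunk to exactly zero are dropped.
    """
    selected: Dict[Intron, float] = {}
    for intron, per_sample in counts.items():
        n = len(per_sample)
        if n == 0:
            continue
        w = soft_threshold(sum(per_sample), lam) / n
        if w > 0:
            selected[intron] = w
    return selected


# Toy, made-up junction read counts over 5 samples (not data from the paper).
counts = {
    ("chr1", 100, 200): [2, 3, 2, 2, 3],      # weak in each sample, but consistent
    ("chr1", 300, 400): [1, 0, 0, 0, 0],      # isolated read, likely noise
    ("chr1", 500, 900): [40, 35, 0, 50, 42],  # strong in most samples
}
selected = select_introns(counts, lam=5.0)
for intron, weight in sorted(selected.items()):
    print(intron, round(weight, 2))
```

In this toy setting, a per-sample cutoff of 5 reads would reject the first intron in every individual sample, whereas pooling evidence across samples retains it and still discards the isolated single-read junction. That is the kind of gain from multi-sample analysis the abstract describes, under the simplifying assumptions stated above.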