FusionScan: accurate prediction of fusion genes from RNA-Seq data

Genomics & Informatics. Pub Date: 2019-07-23. DOI: 10.5808/GI.2019.17.3.e26

P. Kim, Y. Jang, Sanghyuk Lee

Citations: 9

Abstract

Identifying fusion genes is of great importance in cancer research because of their potential as carcinogenic drivers. RNA sequencing (RNA-Seq) data have been the most useful source for identifying fusion transcripts. Although a number of algorithms have been developed thus far, most programs produce too many false positives, making experimental confirmation almost impossible. We still lack a reliable program that achieves high precision with a reasonable recall rate. Here, we present FusionScan, a highly optimized tool for predicting fusion transcripts from RNA-Seq data. We specifically search for split reads composed of intact exons at the fusion boundaries. Using 269 known fusion cases as the reference, we implemented various mapping and filtering strategies to remove false positives without discarding genuine fusions. In a performance test using three cell line datasets with validated fusion cases (NCI-H660, K562, and MCF-7), FusionScan outperformed other existing programs by a considerable margin, achieving precision and recall rates of 60% and 79%, respectively. A simulation test also demonstrated that FusionScan recovered most true positives without producing an overwhelming number of false positives, regardless of sequencing depth and read length. The computation time was comparable to that of other leading tools. We also provide several curation aids to help users investigate the details of fusion candidates easily. We believe that FusionScan is a reliable, efficient, and convenient program for detecting fusion transcripts that meets the requirements of the clinical and experimental communities. FusionScan is freely available at http://fusionscan.ewha.ac.kr/.
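For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how precision and recall are typically computed when a set of predicted fusions is compared against a validated set. The function name `precision_recall` and the gene-pair labels are hypothetical placeholders chosen for illustration; they are not part of FusionScan, and the numbers are not results from the paper.

```python
def precision_recall(predicted, validated):
    """Return (precision, recall) of predicted fusions against a validated set."""
    predicted, validated = set(predicted), set(validated)
    true_positives = len(predicted & validated)
    # Precision: fraction of predictions that are validated fusions.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of validated fusions that were predicted.
    recall = true_positives / len(validated) if validated else 0.0
    return precision, recall


# Hypothetical placeholder gene pairs (assumption, not data from the paper).
validated = {
    ("GENE_A", "GENE_B"),
    ("GENE_C", "GENE_D"),
    ("GENE_E", "GENE_F"),
    ("GENE_G", "GENE_H"),
}
predicted = {
    ("GENE_A", "GENE_B"),
    ("GENE_C", "GENE_D"),
    ("GENE_E", "GENE_F"),
    ("GENE_X", "GENE_Y"),
    ("GENE_Y", "GENE_Z"),
}

precision, recall = precision_recall(predicted, validated)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the five predictions are validated and three of the four validated fusions are recovered, so a tool that reports fewer spurious candidates raises precision without necessarily changing recall.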
