Full-length isoform constructor (FLIC) - a tool for isoform discovery based on long reads.

IF 5.4

Bioinformatics (Oxford, England) Pub Date : 2025-09-30 DOI:10.1093/bioinformatics/btaf551

Alexandra M Kasianova, Anna V Klepikova, Oleg A Gusev, Guzel R Gazizova, Maria D Logacheva, Aleksey A Penin

Citations: 0

Abstract

Full-length isoform constructor (FLIC) - a tool for isoform discovery based on long reads.

Motivation: Advances in high-throughput sequencing have illuminated the complexity of the transcriptome landscape in eukaryotes. An inherent part of this complexity is the presence of multiple isoforms generated by alternative splicing and by the use of alternative transcription start and polyadenylation sites. However, currently available tools have limited capacity to infer full-length isoforms.

Results: We developed a new pipeline, FLIC (Full-Length Isoform Constructor). FLIC is based on long-read transcriptome data and integrates several key features: 1) utilizing biological replicate concordance to filter out noise and artifacts; 2) employing peak calling to precisely identify transcription start and polyadenylation sites; 3) enabling robust isoform reconstruction with minimal reliance on existing annotations. We evaluated FLIC using a dedicated set of real and simulated Arabidopsis thaliana cDNA sequencing data. The results demonstrate that FLIC accurately reconstructs known and novel isoforms, outperforming existing tools, especially in the absence of reference annotations. A direct comparison with CAGE, currently regarded as the gold standard for transcription start site identification, shows that FLIC is equally accurate while being much less time-consuming. Thus, FLIC provides a valuable tool for comprehensive transcript characterization, particularly for non-model organisms or when dealing with incomplete or inaccurate annotations.
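To make the first two features concrete, the following is a minimal illustrative sketch of the general idea: call peaks from read-end coordinates in each replicate, then keep only peaks that recur across replicates. It is not FLIC's actual implementation. The function names (`call_peaks`, `concordant_peaks`), the gap-based clustering, and the parameters `window`, `min_reads`, and `min_replicates` are all assumptions made for illustration.

```python
from collections import Counter


def _summarize(cluster, counts):
    """Return (summit, support) for one cluster of positions."""
    support = sum(counts[p] for p in cluster)
    # Summit = most frequent position; ties go to the smallest coordinate.
    summit = max(cluster, key=lambda p: (counts[p], -p))
    return summit, support


def call_peaks(positions, window=10, min_reads=3):
    """Group read-end positions separated by <= window bp into peaks.

    Hypothetical stand-in for a real peak caller: a peak is kept only
    if at least min_reads reads end inside it.
    """
    counts = Counter(positions)
    peaks, cluster = [], []
    for pos in sorted(counts):
        if cluster and pos - cluster[-1] > window:
            peaks.append(_summarize(cluster, counts))
            cluster = []
        cluster.append(pos)
    if cluster:
        peaks.append(_summarize(cluster, counts))
    return [(summit, support) for summit, support in peaks if support >= min_reads]


def concordant_peaks(replicates, window=10, min_reads=3, min_replicates=2):
    """Keep peak summits found in at least min_replicates replicates.

    Simplification (assumption): peaks of the first replicate serve as
    the reference set; a peak in another replicate matches if its summit
    lies within window bp.
    """
    per_rep = [call_peaks(r, window, min_reads) for r in replicates]
    kept = []
    for summit, _ in per_rep[0]:
        hits = sum(
            any(abs(summit - other) <= window for other, _ in peaks)
            for peaks in per_rep
        )
        if hits >= min_replicates:
            kept.append(summit)
    return kept


# Toy 5' read-end coordinates for two replicates (made-up numbers).
rep1 = [100, 100, 101, 102, 500, 900, 900, 901]
rep2 = [99, 100, 100, 900, 902, 902, 1500, 1500, 1500]
print(concordant_peaks([rep1, rep2]))  # [100, 900]
```

In this toy input, the lone read at position 500 fails the per-replicate read threshold, and the peak at 1500 appears in only one replicate, so both are discarded as likely noise. Only the sites at 100 and 900 are reported.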

Availability: FLIC is available at https://github.com/albidgy/FLIC.

Supplementary information: Supplementary data are available at Bioinformatics online.
