TransAnnot: a fast transcriptome annotation pipeline

Bioinformatics Advances. Published: 2024-10-22; eCollection date: 2024-01-01. DOI: 10.1093/bioadv/vbae152

Mariia Zelenskaia, Yazhini Arangasamy, Milot Mirdita, Johannes Söding, Venket Raghavan

Abstract

Summary: The annotation of deeply sequenced, de novo assembled transcriptomes remains a challenge because some of the state-of-the-art tools are slow, difficult to install, and hard to use. We have tackled these issues with TransAnnot, a fast, automated transcriptome annotation pipeline that is easy to install and use. Leveraging the fast sequence searches provided by the MMseqs2 suite, TransAnnot offers one-step annotation of homologs from Swiss-Prot, gene ontology terms and orthogroups from eggNOG, and functional domains from Pfam. Users can also annotate against custom databases. TransAnnot accepts sequencing reads (short and long), nucleotide sequences, or amino acid sequences as input for annotation. When benchmarked with test data sets of amino acid sequences, TransAnnot was 333, 284, and 18 times faster than the comparable tools EnTAP, Trinotate, and eggNOG-mapper, respectively.
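
To give a feel for what the reported speed-up factors mean in practice, the short Python sketch below converts a baseline runtime into the runtime implied by each factor. Only the three factors (333, 284, 18) come from the abstract; the 24-hour baseline, the function name `equivalent_runtime`, and the constant names are assumptions chosen purely for illustration and are not measurements from the paper.

```python
# Speed-up factors reported in the abstract (amino acid sequence benchmark).
SPEEDUPS = {"EnTAP": 333, "Trinotate": 284, "eggNOG-mapper": 18}


def equivalent_runtime(baseline_seconds: float, speedup: float) -> float:
    """Return the runtime of the faster tool, given the slower tool's
    runtime and the speed-up factor (speedup = baseline / faster)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_seconds / speedup


# ASSUMPTION: a hypothetical 24-hour run for each comparison tool.
# This value is illustrative only and does not come from the paper.
ASSUMED_BASELINE_SECONDS = 24 * 3600

for tool, factor in SPEEDUPS.items():
    minutes = equivalent_runtime(ASSUMED_BASELINE_SECONDS, factor) / 60
    print(f"{tool}: a 24 h run at {factor}x corresponds to about {minutes:.1f} min")
```

The actual runtimes depend on the data set and hardware used in the benchmark, so the sketch should be read as unit conversion rather than as a performance prediction.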

Availability and implementation: TransAnnot is free to use, open source under GPLv3, and implemented in C++ and Bash. Source code, documentation, and pre-compiled binaries are available at https://github.com/soedinglab/transannot. TransAnnot is also available via bioconda (https://anaconda.org/bioconda/transannot).
