TransAnnot-a fast transcriptome annotation pipeline.

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY
Bioinformatics advances Pub Date : 2024-10-22 eCollection Date: 2024-01-01 DOI:10.1093/bioadv/vbae152
Mariia Zelenskaia, Yazhini Arangasamy, Milot Mirdita, Johannes Söding, Venket Raghavan
{"title":"TransAnnot-a fast transcriptome annotation pipeline.","authors":"Mariia Zelenskaia, Yazhini Arangasamy, Milot Mirdita, Johannes Söding, Venket Raghavan","doi":"10.1093/bioadv/vbae152","DOIUrl":null,"url":null,"abstract":"<p><strong>Summary: </strong>The annotation of deeply sequenced, <i>de novo</i> assembled transcriptomes continues to be a challenge as some of the state-of-the-art tools are slow, difficult to install, and hard to use. We have tackled these issues with TransAnnot, a fast, automated transcriptome annotation pipeline that is easy to install and use. Leveraging the fast sequence searches provided by the MMseqs2 suite, TransAnnot offers one-step annotation of homologs from Swiss-Prot, gene ontology terms and orthogroups from eggNOG, and functional domains from Pfam. Users also have the option to annotate against custom databases. TransAnnot accepts sequencing reads (short and long), nucleotide sequences, or amino acid sequences as input for annotation. When benchmarked with test data sets of amino acid sequences, TransAnnot was 333, 284, and 18 times faster than comparable tools such as EnTAP, Trinotate, and eggNOG-mapper respectively.</p><p><strong>Availability and implementation: </strong>TransAnnot is free to use, open sourced under GPLv3, and is implemented in C++ and Bash. Source code, documentation, and pre-compiled binaries are available at https://github.com/soedinglab/transannot. TransAnnot is also available via bioconda (https://anaconda.org/bioconda/transannot).</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":null,"pages":null},"PeriodicalIF":2.4000,"publicationDate":"2024-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11530227/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Bioinformatics advances","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/bioadv/vbae152","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

Summary: The annotation of deeply sequenced, de novo assembled transcriptomes continues to be a challenge as some of the state-of-the-art tools are slow, difficult to install, and hard to use. We have tackled these issues with TransAnnot, a fast, automated transcriptome annotation pipeline that is easy to install and use. Leveraging the fast sequence searches provided by the MMseqs2 suite, TransAnnot offers one-step annotation of homologs from Swiss-Prot, gene ontology terms and orthogroups from eggNOG, and functional domains from Pfam. Users also have the option to annotate against custom databases. TransAnnot accepts sequencing reads (short and long), nucleotide sequences, or amino acid sequences as input for annotation. When benchmarked with test data sets of amino acid sequences, TransAnnot was 333, 284, and 18 times faster than comparable tools such as EnTAP, Trinotate, and eggNOG-mapper respectively.

Availability and implementation: TransAnnot is free to use, open sourced under GPLv3, and is implemented in C++ and Bash. Source code, documentation, and pre-compiled binaries are available at https://github.com/soedinglab/transannot. TransAnnot is also available via bioconda (https://anaconda.org/bioconda/transannot).

TransAnnot--快速转录组注释管道。
摘要:对深度测序、从头组装的转录组进行注释仍然是一项挑战,因为一些最先进的工具速度慢、安装困难、难以使用。我们利用 TransAnnot 解决了这些问题,它是一种易于安装和使用的快速自动转录组注释管道。利用 MMseqs2 套件提供的快速序列搜索,TransAnnot 可以一步注释 Swiss-Prot、eggNOG 中的基因本体术语和正交群,以及 Pfam 中的功能域。用户还可以根据自定义数据库进行注释。TransAnnot 接受测序读数(长短)、核苷酸序列或氨基酸序列作为注释输入。在使用氨基酸序列测试数据集进行基准测试时,TransAnnot 的速度分别是 EnTAP、Trinotate 和 eggNOG-mapper 等同类工具的 333 倍、284 倍和 18 倍:TransAnnot 可免费使用,根据 GPLv3 开放源码,用 C++ 和 Bash 实现。源代码、文档和预编译二进制文件可从 https://github.com/soedinglab/transannot 获取。TransAnnot 也可通过 bioconda (https://anaconda.org/bioconda/transannot) 获取。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
1.60
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信