Microbial abundances retrieved from sequencing data—automated NCBI taxonomy (MARS): a pipeline to create relative microbial abundance data for the microbiome modelling toolbox and utilising homosynonyms for efficient mapping to resources

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Bioinformatics Advances. Pub Date: 2024-05-10. DOI: 10.1093/bioadv/vbae068

T. Hulshof, Bram Nap, Filippo Martinelli, Ines Thiele


Citations: 0

Abstract

Computational approaches to the functional characterisation of the microbiome, such as the Microbiome Modelling Toolbox, require precise information on microbial composition and relative abundances. However, challenges arise from homosynonyms (different names referring to the same taxon), which can hinder the mapping process and cause species to be missed when mapping to microbial metabolic reconstruction resources such as AGORA and APOLLO. We introduce the integrated MARS pipeline, a user-friendly Python-based solution that addresses these challenges. MARS automates the extraction of relative abundances from metagenomic reads, maps species and genera onto microbial metabolic reconstructions, and accounts for alternative taxonomic names. It normalises microbial reads, provides an optional cut-off for low-abundance taxa, and produces relative abundance tables suitable for integration with the Microbiome Modelling Toolbox. A sub-component of the pipeline automates the identification of homosynonyms by using web scraping to find the taxonomic IDs of given species, searching NCBI for alternative names, and cross-referencing them with microbial reconstruction resources. Taken together, MARS streamlines the entire process from processed metagenomic reads to relative abundances, thereby significantly reducing the time and effort needed to work with microbiome data. MARS is implemented in Python. It is available as an interactive application at https://mars-pipeline.streamlit.app/ and detailed documentation can be found at https://github.com/ThieleLab/mars-pipeline.
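
To make the steps named in the abstract concrete, here is a minimal Python sketch of read normalisation, an optional low-abundance cut-off, and synonym-aware mapping onto a reconstruction resource. It is an illustration only and not the MARS code: the function names `relative_abundance` and `map_to_resource`, the renormalisation after the cut-off, and the hand-written synonym table (which MARS would instead derive from NCBI) are all assumptions made for this example.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def relative_abundance(
    reads: Dict[str, float], cutoff: Optional[float] = None
) -> Dict[str, float]:
    """Convert read counts to fractions that sum to 1.

    If `cutoff` is given, taxa whose fraction is below it are dropped and
    the remainder is renormalised (an assumption of this sketch).
    """
    total = sum(reads.values())
    if total <= 0:
        raise ValueError("no reads to normalise")
    rel = {taxon: count / total for taxon, count in reads.items()}
    if cutoff is not None:
        rel = {taxon: frac for taxon, frac in rel.items() if frac >= cutoff}
        kept = sum(rel.values())
        if kept <= 0:
            raise ValueError("cut-off removed every taxon")
        rel = {taxon: frac / kept for taxon, frac in rel.items()}
    return rel


def map_to_resource(
    abundances: Dict[str, float],
    synonyms: Dict[str, List[str]],
    resource_names: Iterable[str],
) -> Tuple[Dict[str, float], List[str]]:
    """Map taxa onto names present in a reconstruction resource.

    Each taxon is tried under its own name first, then under each
    alternative name in `synonyms` (a hypothetical, hand-written table).
    Returns the mapped abundances and the list of taxa with no match.
    """
    known = set(resource_names)
    mapped: Dict[str, float] = {}
    unmapped: List[str] = []
    for taxon, frac in abundances.items():
        candidates = [taxon] + synonyms.get(taxon, [])
        hit = next((name for name in candidates if name in known), None)
        if hit is None:
            unmapped.append(taxon)
        else:
            # Two input names may resolve to the same resource entry.
            mapped[hit] = mapped.get(hit, 0.0) + frac
    return mapped, unmapped


# Illustrative input: made-up counts and a one-entry synonym table.
example_reads = {
    "Clostridium difficile": 60,
    "Bacteroides fragilis": 39,
    "Rare taxon": 1,
}
example_synonyms = {"Clostridium difficile": ["Clostridioides difficile"]}
example_resource = {"Clostridioides difficile", "Bacteroides fragilis"}

example_rel = relative_abundance(example_reads, cutoff=0.05)
example_mapped, example_unmapped = map_to_resource(
    example_rel, example_synonyms, example_resource
)
```

In this example the species reported under an older name is still matched because its alternative name appears in the resource, which is the kind of missed mapping the abstract says homosynonym handling is meant to prevent.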


Source journal

Bioinformatics advances

CiteScore

1.60

Self-citation rate

0.00%

Number of publications