QIIME2 enhances multi-amplicon sequencing data analysis: a standardized and validated open-source pipeline for comprehensive 16S rRNA gene profiling.

Impact factor 3.8 · CAS Zone 2 (Biology) · JCR Q2 (Microbiology)
Armando G Licata, Marica Zoppi, Chiara Dossena, Federico Rossignoli, Davide Rizzo, Manuela Marra, Giorgio Gargari, Giacomo Mantegazza, Simone Guglielmetti, Luca Bergamaschi, Olga Nigro, Stefano Chiaravalli, Maura Massimino, Loris De Cecco
Microbiology Spectrum, article e0167325. Published 2025-07-25. DOI: 10.1128/spectrum.01673-25 (https://doi.org/10.1128/spectrum.01673-25)

Abstract

Multi-amplicon sequencing is a cost-effective method for profiling multiple regions of the 16S rRNA gene, offering a more comprehensive view of microbial diversity. However, implementing such pipelines on open-source platforms (e.g., QIIME2) is often hindered by limited documentation and a lack of validation against established tools. This lack of standardization poses challenges for researchers, particularly in clinical and experimental settings. This study aims to: (i) develop and benchmark a standardized, open-source QIIME2- and R-based pipeline for 16S rRNA gene profiling using semiconductor-based sequencing, comparing it with commercial, closed-source software; and (ii) validate its effectiveness in a pediatric cancer cohort to examine parental influence on the microbiome and child-caregiver microbial relationships. We generated 16S rRNA profiles from 5 mock communities and 12 child-caregiver fecal sample pairs. Benchmarking against commercial software showed that the multi-region (V2-9) approach produced microbial profiles nearly identical to proprietary outputs, with higher sequencing depth and improved taxonomic resolution compared to single-region analyses. Both approaches demonstrated similar microbial richness, accurate mock community reconstruction, and high reproducibility (R = 0.99, P < 0.0001). These findings were further validated using fecal samples. Application of the pipeline to pediatric samples revealed distinct, differentially abundant Bifidobacterium bifidum and Bifidobacterium adolescentis variants in children whose microbiota closely resembled that of their caregivers. Overall, this study presents a validated, open-source QIIME2 and R pipeline for multi-amplicon sequencing, providing a standardized and reproducible framework for 16S rRNA gene profiling in clinical and research contexts.

IMPORTANCE

Multi-amplicon sequencing comprehensively characterizes microbial communities by targeting multiple regions of the 16S rRNA gene. However, analytical workflows and reference databases provided by commercial library preparation kits frequently rely on proprietary primers and closed-source pipelines, which can limit transparency, reproducibility, and adaptability. To address these limitations, we developed and validated an open-source bioinformatics pipeline utilizing QIIME2 and R. Our pipeline integrates data from all targeted 16S regions, generating microbial profiles comparable to those produced by proprietary software. Validation was performed using mock samples and fecal samples collected from pediatric cancer patients and their caregivers, confirming the pipeline's reliability and broad applicability. Furthermore, our pipeline enables detailed analysis of microbial variants, surpassing traditional genus-level restrictions and fully leveraging the enhanced coverage offered by multi-amplicon sequencing. Our findings highlight the necessity of adopting open-source solutions to ensure scientific reproducibility and adaptability to emerging methodologies.
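The abstract reports agreement between the open-source and commercial profiles as a correlation (R = 0.99) but does not say which coefficient or which abundance tables were compared. As a rough illustration only, the sketch below assumes a Pearson correlation on genus-level relative abundances. The function names and the read counts are hypothetical and are not taken from the study.

```python
from math import sqrt


def relative_abundance(counts):
    """Convert raw read counts per taxon into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def pearson_r(profile_a, profile_b):
    """Pearson correlation over the union of taxa; a missing taxon counts as 0."""
    taxa = sorted(set(profile_a) | set(profile_b))
    xs = [profile_a.get(t, 0.0) for t in taxa]
    ys = [profile_b.get(t, 0.0) for t in taxa]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


# Hypothetical genus-level read counts, invented purely for illustration.
open_source = {"Bifidobacterium": 400, "Bacteroides": 350, "Escherichia": 250}
commercial = {
    "Bifidobacterium": 410,
    "Bacteroides": 340,
    "Escherichia": 240,
    "Lactobacillus": 10,
}

r = pearson_r(relative_abundance(open_source), relative_abundance(commercial))
print(f"Pearson r = {r:.3f}")
```

Taking the union of taxa matters here: a genus reported by only one tool is scored as zero in the other, so disagreements in detection lower the correlation instead of being silently dropped.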

Source journal
Microbiology Spectrum (Biochemistry, Genetics and Molecular Biology: Genetics)
CiteScore: 3.20
Self-citation rate: 5.40%
Articles published: 1800
Journal description: Microbiology Spectrum publishes commissioned review articles on topics in microbiology representing ten content areas: Archaea; Food Microbiology; Bacterial Genetics, Cell Biology, and Physiology; Clinical Microbiology; Environmental Microbiology and Ecology; Eukaryotic Microbes; Genomics, Computational, and Synthetic Microbiology; Immunology; Pathogenesis; and Virology. Reviews are interrelated, with each review linking to other related content. A large board of Microbiology Spectrum editors aids in the development of topics for potential reviews and in the identification of an editor, or editors, who shepherd each collection.