QIIME2 enhances multi-amplicon sequencing data analysis: a standardized and validated open-source pipeline for comprehensive 16S rRNA gene profiling
Armando G Licata, Marica Zoppi, Chiara Dossena, Federico Rossignoli, Davide Rizzo, Manuela Marra, Giorgio Gargari, Giacomo Mantegazza, Simone Guglielmetti, Luca Bergamaschi, Olga Nigro, Stefano Chiaravalli, Maura Massimino, Loris De Cecco
Microbiology Spectrum, e0167325. Published 2025-07-25. DOI: 10.1128/spectrum.01673-25 (https://doi.org/10.1128/spectrum.01673-25)
Abstract
Multi-amplicon sequencing is a cost-effective method for profiling multiple regions of the 16S rRNA gene, offering a more comprehensive view of microbial diversity. However, implementing such pipelines on open-source platforms (e.g., QIIME2) is often hindered by limited documentation and a lack of validation against established tools. This lack of standardization poses challenges for researchers, particularly in clinical and experimental settings. This study aims to: (i) develop and benchmark a standardized, open-source QIIME2- and R-based pipeline for 16S rRNA gene profiling using semiconductor-based sequencing, comparing it with commercial, closed-source software; and (ii) validate its effectiveness in a pediatric cancer cohort to examine parental influence on the microbiome and child-caregiver microbial relationships. We generated 16S rRNA profiles from 5 mock communities and 12 child-caregiver fecal sample pairs. Benchmarking against commercial software showed that the multi-region (V2-9) approach produced microbial profiles nearly identical to proprietary outputs, with higher sequencing depth and improved taxonomic resolution compared with single-region analyses. Both approaches demonstrated similar microbial richness, accurate mock community reconstruction, and high reproducibility (R = 0.99, P < 0.0001). These findings were further validated using fecal samples. Application of the pipeline to pediatric samples revealed distinct, differentially abundant Bifidobacterium bifidum and Bifidobacterium adolescentis variants in children whose microbiota closely resembled that of their caregivers. Overall, this study presents a validated, open-source QIIME2 and R pipeline for multi-amplicon sequencing, providing a standardized and reproducible framework for 16S rRNA gene profiling in clinical and research contexts.

IMPORTANCE
Multi-amplicon sequencing comprehensively characterizes microbial communities by targeting multiple regions of the 16S rRNA gene. However, analytical workflows and reference databases provided by commercial library preparation kits frequently rely on proprietary primers and closed-source pipelines, which can limit transparency, reproducibility, and adaptability. To address these limitations, we developed and validated an open-source bioinformatics pipeline utilizing QIIME2 and R. Our pipeline integrates data from all targeted 16S regions, generating microbial profiles comparable to those produced by proprietary software. Validation was performed using mock samples and fecal samples collected from pediatric cancer patients and their caregivers, confirming the pipeline's reliability and broad applicability. Furthermore, our pipeline enables detailed analysis of microbial variants, surpassing traditional genus-level restrictions and fully leveraging the enhanced coverage offered by multi-amplicon sequencing. Our findings highlight the necessity of adopting open-source solutions to ensure scientific reproducibility and adaptability to emerging methodologies.
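The agreement figure reported in the abstract (R = 0.99) compares profiles from the open-source pipeline with those from the commercial software. The abstract does not say which correlation statistic was used or how the profiles were normalized, so the following is only a minimal sketch of one plausible comparison: Pearson correlation on per-taxon relative abundances. The function names (`relative_abundance`, `pearson_r`) and the read counts are hypothetical, invented for illustration, and are not taken from the study.

```python
import math


def relative_abundance(counts):
    """Convert raw read counts per taxon into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("profile has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def pearson_r(profile_a, profile_b):
    """Pearson correlation over the union of taxa; a missing taxon counts as 0."""
    taxa = sorted(set(profile_a) | set(profile_b))
    xs = [profile_a.get(t, 0.0) for t in taxa]
    ys = [profile_b.get(t, 0.0) for t in taxa]
    n = len(taxa)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


# Toy read counts, invented for illustration only (NOT data from the study).
open_source = {"Bifidobacterium": 480, "Bacteroides": 310,
               "Escherichia": 150, "Lactobacillus": 60}
commercial = {"Bifidobacterium": 470, "Bacteroides": 320,
              "Escherichia": 140, "Lactobacillus": 70}

r = pearson_r(relative_abundance(open_source), relative_abundance(commercial))
print(f"Pearson r = {r:.3f}")
```

Taking the union of taxa means a taxon detected by only one pipeline lowers the correlation instead of being silently dropped, which matters when two tools disagree on low-abundance calls.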
About the journal:
Microbiology Spectrum publishes commissioned review articles on topics in microbiology representing ten content areas: Archaea; Food Microbiology; Bacterial Genetics, Cell Biology, and Physiology; Clinical Microbiology; Environmental Microbiology and Ecology; Eukaryotic Microbes; Genomics, Computational, and Synthetic Microbiology; Immunology; Pathogenesis; and Virology. Reviews are interrelated, with each review linking to other related content. A large board of Microbiology Spectrum editors aids in the development of topics for potential reviews and in the identification of an editor, or editors, who shepherd each collection.