A Practical Comparison of Short- and Long-Read Metabarcoding Sequencing: Challenges and Solutions for Plastid Read Removal and Microbial Community Exploration of Seaweed Samples.
Coralie Rousseau, Nicolas Henry, Sylvie Rousvoal, Gwenn Tanguy, Erwan Legeay, Catherine Leblanc, Simon M Dittami
Molecular Ecology Resources, e14129 (Journal Article). Published 2025-06-04. DOI: 10.1111/1755-0998.14129 (https://doi.org/10.1111/1755-0998.14129)
Journal impact factor (as listed): 5.5; JCR quartile: Q1 (Biochemistry & Molecular Biology)
Citations: 0
Abstract
Short-read metabarcoding analysis is the gold standard for accessing partial 16S and ITS genes with high read quality. With the advent of long-read sequencing, the amplification of full-length target genes is possible, but with low read accuracy. Moreover, 16S rRNA gene amplification in seaweed results in a large proportion of plastid reads, which are directly or indirectly derived from cyanobacteria. Primers designed not to amplify plastid sequences are available for short-read sequencing, while Oxford Nanopore Technologies (ONT) offers adaptive sampling, a unique way to remove reads in real time. In this study, we compare three options to address the issue of plastid reads: deleting plastid reads with adaptive sampling, using optimised primers with Illumina MiSeq technology, and sequencing large numbers of reads with Illumina NovaSeq technology with universal primers. We show that adaptive sampling using the default settings of the MinKNOW software was ineffective for plastid depletion. NovaSeq sequencing with universal primers stood out with its deep coverage, low error rate, and ability to include both eukaryotes and bacteria in the same sequencing run, but it had limitations regarding the identification of fungi. The ONT sequencing helped us explore the fungal diversity and allowed for the retrieval of taxonomic information for genera poorly represented in the sequence databases. We also demonstrated with a mock community that the SAMBA workflow provided more accurate taxonomic assignment at the bacterial genus level than the IDTAXA and KRAKEN2 pipelines, but many false positives were generated at the species level.
About the journal:
Molecular Ecology Resources promotes the creation of comprehensive resources for the scientific community, encompassing computer programs, statistical and molecular advancements, and a diverse array of molecular tools. Serving as a conduit for disseminating these resources, the journal targets a broad audience of researchers in the fields of evolution, ecology, and conservation. Articles in Molecular Ecology Resources are crafted to support investigations tackling significant questions within these disciplines.
In addition to original resource articles, Molecular Ecology Resources features Reviews, Opinions, and Comments relevant to the field. The journal also periodically releases Special Issues focusing on resource development within specific areas.