A Comprehensive Evaluation of Taxonomic Classifiers in Marine Vertebrate eDNA Studies.

IF 5.5 | Zone 1, Biology | JCR Q1, Biochemistry & Molecular Biology
Philipp E Bayer, Adam Bennett, Georgia Nester, Shannon Corrigan, Eric J Raes, Madalyn Cooper, Marcelle E Ayad, Philip McVey, Anya Kardailsky, Jessica Pearce, Matthew W Fraser, Priscila Goncalves, Stephen Burnell, Sebastian Rauschert
Molecular Ecology Resources, e14107. Journal Article. Published 2025-04-17.
DOI: 10.1111/1755-0998.14107 (https://doi.org/10.1111/1755-0998.14107)
Citations: 0

Abstract

Environmental DNA (eDNA) metabarcoding is a widely used tool for surveying marine vertebrate biodiversity. To this end, many computational tools have been released and a plethora of bioinformatic approaches are used for eDNA-based community composition analysis. Simulation studies and careful evaluation of taxonomic classifiers are essential to establish reliable benchmarks to improve the accuracy and reproducibility of eDNA-based findings. Here we present a comprehensive evaluation of nine taxonomic classifiers exploring three widely used mitochondrial markers (12S rDNA, 16S rDNA and COI) in Australian marine vertebrates. Curated reference databases and exclusion database tests were used to simulate diverse species compositions, including three positive control and two negative control datasets. Using these simulated datasets ranging from 36 to 302 marker genes, we were able to identify between 19% and 89% of marine vertebrate species using mitochondrial markers. We show that MMSeqs2 and Metabuli generally outperform BLAST with 10% and 11% higher F1 scores for 12S and 16S rDNA markers, respectively, and that Naive Bayes Classifiers such as Mothur outperform sequence-based classifiers except MMSeqs2 for COI markers by 11%. Database exclusion tests reveal that MMSeqs2 and BLAST are less susceptible to false positives compared to Kraken2 with default parameters. Based on these findings, we recommend that MMSeqs2 is used for taxonomic classification of marine vertebrates given its ability to improve species-level assignments while reducing the number of false positives. Our work contributes to the establishment of best practices in eDNA-based biodiversity analysis to ultimately increase the reliability of this monitoring tool in the context of marine vertebrate conservation.
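
The abstract compares classifiers by F1 score. As a minimal sketch of how such a score can be computed for species-level assignments against a simulated community of known composition, assuming set-based matching of species names (the function name and the placeholder species labels are hypothetical and are not taken from the paper):

```python
def species_f1(expected, predicted):
    """Return (precision, recall, F1) for two collections of species names.

    Assumption: a prediction counts as correct only if the exact species
    name is present in the expected (simulated) community.
    """
    expected, predicted = set(expected), set(predicted)
    tp = len(expected & predicted)   # species correctly recovered
    fp = len(predicted - expected)   # species reported but not present
    fn = len(expected - predicted)   # species present but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: four species simulated, three reported by a classifier.
expected = {"species_a", "species_b", "species_c", "species_d"}
predicted = {"species_a", "species_b", "species_e"}
precision, recall, f1 = species_f1(expected, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Under this definition a false positive lowers precision and a missed species lowers recall, so F1 penalises both kinds of error; the paper's own scoring procedure may differ in detail.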

Source journal
Molecular Ecology Resources (Biology: Evolutionary Biology)
CiteScore: 15.60
Self-citation rate: 5.20%
Articles published: 170
Review time: 3 months
Journal description: Molecular Ecology Resources promotes the creation of comprehensive resources for the scientific community, encompassing computer programs, statistical and molecular advancements, and a diverse array of molecular tools. Serving as a conduit for disseminating these resources, the journal targets a broad audience of researchers in the fields of evolution, ecology, and conservation. Articles in Molecular Ecology Resources are crafted to support investigations tackling significant questions within these disciplines. In addition to original resource articles, Molecular Ecology Resources features Reviews, Opinions, and Comments relevant to the field. The journal also periodically releases Special Issues focusing on resource development within specific areas.