A Confidence Scoring Procedure for eDNA Metabarcoding Records and Its Application to a Global Marine Fish Dataset
Andrea Polanco F., Romane Rozanski, Virginie Marques, Martin Helmkampf, David Mouillot, Stéphanie Manel, Camille Albouy, Oscar Puebla, Loïc Pellissier
Environmental DNA 7(2). Published 2025-03-20. DOI: 10.1002/edn3.70077
https://onlinelibrary.wiley.com/doi/10.1002/edn3.70077
Abstract
Environmental DNA (eDNA) metabarcoding is changing the way biodiversity is surveyed in many types of ecosystems. eDNA surveys are now commonly performed and integrated into biodiversity monitoring programs and public databases. Although it is widely recognized that eDNA records require interpretation in light of taxonomy and biogeography, there remains a range of perceptions about how thoroughly records should be evaluated and which ones should be reported. Here, we present a modular procedure, available as an R script, that uses a set of five steps to assess the confidence of species-level eDNA records by assigning them a score from 0 to 5. This procedure includes evaluations of the known geographic distribution of each taxon, the taxonomic resolution of the marker used, the regional completeness of the reference database, the diversification rate, and the range map of each taxon. We tested the procedure on a large-scale marine fish eDNA dataset (572 samples) covering 15 ecoregions worldwide, from the poles to the tropics, using the teleo marker on the mitochondrial 12S ribosomal gene. Our analysis revealed broad variation in the average confidence score of eDNA records among regions, with the highest scores occurring along the European and Eastern Atlantic coasts. Generalized linear models applied to record covariates highlighted the significant influences of latitude and species richness on low confidence scores (< 2.5). The polar regions notably displayed high proportions of low confidence scores, probably due to the limited completeness of the regional reference databases and the taxonomic resolution of the teleo marker. We conclude that only records with high confidence scores (> 2.5) should be integrated into biodiversity databases. The medium (2.5) to relatively low-confidence (< 2.5) records correspond to species that require further investigation and may be integrated after inspection to ensure high-quality species records.
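The scoring logic described in the abstract can be illustrated with a minimal sketch. The authors' procedure is an R script; the Python below is not their implementation. The abstract does not say how each of the five steps contributes to the total, so this sketch assumes each step yields a value between 0 and 1 and that the values are summed. The step names, the `Record` class, and both function names are hypothetical. Only the 0 to 5 range and the 2.5 cut-off between high, medium, and low confidence come from the text.

```python
from dataclasses import dataclass

# Hypothetical step names paraphrasing the five evaluations in the abstract.
STEPS = (
    "geographic_distribution",
    "marker_resolution",
    "reference_completeness",
    "diversification_rate",
    "range_map",
)


@dataclass(frozen=True)
class Record:
    """One species-level eDNA record with a per-step score (assumed 0..1)."""
    species: str
    step_scores: dict


def confidence_score(record: Record) -> float:
    """Sum the five step scores into a total between 0 and 5.

    Assumption: each step contributes a value in [0, 1]; the real
    weighting used by the authors may differ.
    """
    total = 0.0
    for step in STEPS:
        value = record.step_scores[step]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{step} score must lie in [0, 1], got {value}")
        total += value
    return total


def confidence_class(score: float, threshold: float = 2.5) -> str:
    """Classify a score using the 2.5 cut-off stated in the abstract."""
    if score > threshold:
        return "high"
    if score == threshold:
        return "medium"
    return "low"


# Illustrative record; the species label and values are invented for the example.
example = Record(
    species="Example species",
    step_scores={
        "geographic_distribution": 1.0,
        "marker_resolution": 1.0,
        "reference_completeness": 0.5,
        "diversification_rate": 0.0,
        "range_map": 0.0,
    },
)
example_score = confidence_score(example)
example_class = confidence_class(example_score)
```

Under these assumptions, a record that sums to exactly 2.5 falls in the medium class and would be flagged for inspection rather than integrated directly, which matches the recommendation in the abstract.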