Evaluating the reliability of DNA Barcoding for Central American Pacific shallow water echinoderms identification: a molecular taxonomy and database accuracy analysis

J. Chacón-Monge, J. I. Abarca-Odio, Kaylen González-Sánchez
Revista de Biología Tropical. Journal article, published 2024-03-02. DOI: 10.15517/rev.biol.trop..v72is1.58997 (https://doi.org/10.15517/rev.biol.trop..v72is1.58997)

Abstract

Introduction: Molecular divergence thresholds have been proposed to distinguish recently separated evolutionary units, and they often yield more accurate putative species assignments in taxonomic research than traditional morphological approaches. This makes DNA barcoding an attractive identification tool for a variety of marine invertebrates, especially for cryptic species complexes. Although GenBank and the Barcode of Life Data System (BOLD) are the major sequence repositories worldwide, very few studies have tested their performance in identifying echinoderm sequences.

Objective: We use COI echinoderm sequences from local samples and the molecular identification platforms of GenBank and BOLD to test their accuracy and reliability in DNA barcoding identification of Central American shallow water echinoderms, at genus and species level.

Methods: We conducted sampling, tissue extraction, COI amplification, sequencing, and taxonomic identification for 475 specimens. Each of the 348 sequences obtained was queried individually with BLAST in GenBank and with the Identification System (IDS) in BOLD. Query sequences were classified according to the best match result. McNemar's chi-squared, Kruskal-Wallis and Mann-Whitney U tests were performed to test for differences between the results from the two databases. Additionally, we compiled an updated list of species reported for the shallow waters of the Central American Pacific.

Results: We found 324 echinoderm species reported for Central American Pacific shallow waters. Only 118 and 110 of them were present in the GenBank and BOLD databases, respectively. We proposed 325 resolved morphology-based identifications and 21 provisional identifications in 50 putative taxa. GenBank retrieved 348 molecular-based identifications in 58 species, including twelve provisional identifications in three taxa. BOLD recovered 170 COI identifications in 23 species, with one provisional identification. Nevertheless, 178 sequences (in 34 morphology-based taxa) returned no match. Only 86 sequences (25 %) were correctly identified on both platforms, and 128 (37 %) were misidentified on both. We include 84 sequences for eleven species not represented in GenBank and 65 sequences for ten species not represented in BOLD echinoderm COI databases. Identification accuracy using BLAST (175 correct and 152 incorrect identifications) was greater than with the IDS engine (110 correct and 218 identification errors); GenBank therefore outperforms BOLD (Kruskal-Wallis = 41.625, df = 1, p < 0.001).

Conclusions: Additional echinoderm reference samples are needed to improve the utility of the evaluated DNA barcoding identification tools. Identification discordances between the two databases may stem from the specific parameters used by each search engine and from the sequences available in each. We recommend the use of barcoding as a complementary identification source for Central American Pacific shallow water echinoderm species.
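The percentages quoted in the Results can be reproduced from the raw counts. The following is a minimal sketch, not the authors' analysis code: the function and variable names are illustrative assumptions, and only the counts stated in the abstract are used. The per-platform accuracy is computed here as correct / (correct + incorrect), which is one plausible reading of the abstract rather than a definition given by the authors.

```python
# Counts taken from the abstract; all names below are illustrative only.
TOTAL_SEQUENCES = 348


def share(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


def accuracy(correct: int, incorrect: int) -> float:
    """Percentage of correct identifications among classified sequences."""
    return share(correct, correct + incorrect)


# Sequences identified correctly / incorrectly on both platforms.
correct_both = share(86, TOTAL_SEQUENCES)
wrong_both = share(128, TOTAL_SEQUENCES)

# (correct, incorrect) identification counts per platform.
platforms = {
    "BLAST (GenBank)": (175, 152),
    "IDS (BOLD)": (110, 218),
}

print(f"Correct on both platforms: {correct_both:.1f} %")
print(f"Incorrect on both platforms: {wrong_both:.1f} %")
for name, (correct, incorrect) in platforms.items():
    classified = correct + incorrect
    print(f"{name}: {accuracy(correct, incorrect):.1f} % correct "
          f"of {classified} classified sequences")
```

Note that the per-platform counts sum to 327 and 328 rather than 348; the abstract does not state how the remaining sequences were classified, so the sketch leaves them out of the denominator.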