Simple approaches for evaluation of OTU quality based on dissimilarity arrays

M. Cros, Jean-Marc Frigerio, N. Peyrard, Alain Franc
Metabarcoding and Metagenomics, published 2024-03-14. DOI: 10.3897/mbmg.8.108649 (https://doi.org/10.3897/mbmg.8.108649)

Abstract

An accurate and complete taxonomic description of the diversity present in an environmental sample is currently out of reach. Metabarcoding is used instead, on the expectation that operational taxonomic units (OTUs) are a relevant category for molecular-based biodiversity inventories. However, artefacts can arise at different stages of OTU production and may affect ecological conclusions. We propose to evaluate the quality of the OTUs in a sample by characterising how far each OTU's dissimilarity array deviates from that of an ideal OTU, in which all sequences lie at distances smaller than the barcoding gap. We consider two deviations: composed OTUs, which result from the artificial merging of several OTUs, and noisy OTUs, which contain some sequences that are only loosely associated with the core sequence of the OTU and that do not form a compact subgroup. We propose a simple, automatic 2-step method that first categorises the OTUs of a sample as composed or single and then identifies the OTUs with noise amongst the single ones. The associated code is available at https://forgemia.inra.fr/alain.franc/otu_shape. We applied the method to 32 samples of diatoms from Arcachon Bay (France) that represent contrasting environmental conditions, and we obtained good agreement with expert categorisation of the OTUs. We suggest that single OTUs without noise can be used as they are in further ecological studies, whereas composed OTUs should be post-processed with classical clustering or community detection tools. The quality of single OTUs with noise remains to be tested further through supplementary studies on a diversity of organisms.
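To make the two-step categorisation concrete, here is a minimal Python sketch. It is not the authors' otu_shape code, and the abstract does not specify their decision rules. As an assumption, step 1 is approximated by counting the connected components of the graph that links sequences closer than the barcoding gap, and step 2 checks the ideal-OTU condition stated above (all pairwise distances below the gap). The function names, the `min_group` parameter and the category labels are hypothetical and chosen for illustration only.

```python
from collections import deque


def components(dist, gap):
    """Connected components of the graph linking sequences closer than `gap`.

    `dist` is a symmetric list-of-lists dissimilarity array for one OTU.
    (Stand-in technique: an assumption, not the published method.)
    """
    n = len(dist)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        group = []
        while queue:
            i = queue.popleft()
            group.append(i)
            for j in range(n):
                if not seen[j] and dist[i][j] < gap:
                    seen[j] = True
                    queue.append(j)
        groups.append(group)
    return groups


def classify_otu(dist, gap, min_group=2):
    """Categorise one OTU as composed, single with noise, or single without noise."""
    # Step 1 (assumed rule): two or more compact subgroups means a composed OTU.
    compact = [g for g in components(dist, gap) if len(g) >= min_group]
    if len(compact) >= 2:
        return "composed"
    # Step 2: an ideal OTU has every pairwise distance below the barcoding gap;
    # any violation in a single OTU is treated here as noise.
    n = len(dist)
    ideal = all(dist[i][j] < gap for i in range(n) for j in range(i + 1, n))
    return "single_without_noise" if ideal else "single_with_noise"
```

A call such as `classify_otu(dist, gap=0.03)` takes the dissimilarity array of one OTU; the value 0.03 is only an illustrative threshold, since the appropriate barcoding gap depends on the marker and the taxon. An OTU flagged as composed would then be passed to a clustering or community detection tool, as the abstract suggests.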