MetaBIDx: a new computational approach to bacteria identification in microbiomes
Diem-Trang Pham, Vinhthuy T. Phan
Microbiome Research Reports, published 2024-04-01
DOI: 10.20517/mrr.2024.01 (https://doi.org/10.20517/mrr.2024.01)
Abstract
Objectives: This study introduces MetaBIDx, a computational method designed to improve species prediction in metagenomic samples. The method addresses the challenge of accurately identifying species in complex microbiomes, a challenge that arises from the large number of reads generated and the ever-expanding number of bacterial genomes. Bacterial identification is essential for diagnosing disease and for tracing outbreaks associated with microbial infections.
Methods: MetaBIDx utilizes a modified Bloom filter for efficient indexing of reference genomes and incorporates a novel strategy for reducing false positives by clustering species based on their genomic coverages by identified reads. The approach was evaluated and compared with several well-established tools across various datasets. Precision, recall, and F1-score were used to quantify the accuracy of species prediction.
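The indexing idea in the Methods paragraph can be illustrated with a minimal Python sketch. It uses a plain Bloom filter per species, not the modified variant the abstract refers to, whose details are not given here. Every name in it (`BloomFilter`, `build_index`, `classify_read`), the k-mer length, the filter size, and the `min_fraction` threshold are assumptions made for illustration only.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over strings (NOT the modified variant used by MetaBIDx)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May report false positives, never false negatives.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def kmers(sequence, k):
    """Yield every overlapping substring of length k."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def build_index(genomes, k, num_bits=1 << 16, num_hashes=3):
    """Build one Bloom filter per species from a {species: genome} mapping."""
    index = {}
    for species, genome in genomes.items():
        bloom = BloomFilter(num_bits, num_hashes)
        for kmer in kmers(genome, k):
            bloom.add(kmer)
        index[species] = bloom
    return index


def classify_read(read, index, k, min_fraction=0.5):
    """Return species whose filter contains at least min_fraction of the read's k-mers."""
    read_kmers = list(kmers(read, k))
    if not read_kmers:
        return []
    hits = []
    for species, bloom in index.items():
        matched = sum(1 for kmer in read_kmers if kmer in bloom)
        if matched / len(read_kmers) >= min_fraction:
            hits.append(species)
    return hits
```

Because a Bloom filter can return false positives, a read may occasionally be assigned to a species it did not come from. That is one reason a later false-positive reduction step, such as the coverage-based clustering the abstract describes, is useful.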
Results: MetaBIDx demonstrated superior performance compared to other tools, especially in terms of precision and F1-score. The application of clustering based on approximate coverages significantly improved precision in species identification, effectively minimizing false positives. We further demonstrated that other methods can also benefit from our approach to removing false positives by clustering species based on approximate coverages.
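To make the coverage-clustering idea concrete, here is a hedged Python sketch. The abstract does not specify the clustering algorithm, so this example substitutes a simple one-dimensional two-means split on log-scaled approximate coverage and keeps only the high-coverage cluster. The function names, the coverage formula (assigned reads × read length / genome length), and the toy numbers are all assumptions, not the authors' method or data.

```python
import math


def approximate_coverage(read_counts, genome_lengths, read_length):
    """Rough depth of coverage per species: assigned bases / genome length."""
    return {
        species: read_counts[species] * read_length / genome_lengths[species]
        for species in read_counts
    }


def two_means_threshold(values, iterations=50):
    """1-D two-means: return the midpoint between the low and high centroids."""
    low_c, high_c = min(values), max(values)
    for _ in range(iterations):
        mid = (low_c + high_c) / 2
        low = [v for v in values if v <= mid]
        high = [v for v in values if v > mid]
        new_low, new_high = sum(low) / len(low), sum(high) / len(high)
        if (new_low, new_high) == (low_c, high_c):
            break  # centroids stopped moving
        low_c, high_c = new_low, new_high
    return (low_c + high_c) / 2


def filter_by_coverage(coverages):
    """Keep species in the high-coverage cluster (assumed to be true positives)."""
    logs = {s: math.log10(c) for s, c in coverages.items() if c > 0}
    if len(set(logs.values())) < 2:
        return set(logs)  # nothing to separate
    threshold = two_means_threshold(list(logs.values()))
    return {s for s, v in logs.items() if v > threshold}


def precision_recall_f1(predicted, truth):
    """Species-level precision, recall, and F1 for two sets of species names."""
    true_pos = len(predicted & truth)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up toy numbers: A and B are truly present, C and D are spurious hits.
    counts = {"A": 5000, "B": 4200, "C": 12, "D": 7}
    lengths = {"A": 5_000_000, "B": 4_000_000, "C": 4_500_000, "D": 3_000_000}
    coverages = approximate_coverage(counts, lengths, read_length=150)
    kept = filter_by_coverage(coverages)
    print("before:", precision_recall_f1(set(counts), {"A", "B"}))
    print("after: ", precision_recall_f1(kept, {"A", "B"}))
```

In this toy case the spurious species sit several orders of magnitude below the real ones in coverage, so the split removes them and precision rises without any loss of recall. On real data the clusters can overlap, and a low-abundance species that is truly present could be discarded along with the false positives.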
Conclusion: With a novel approach to reducing false positives and the effective use of a modified Bloom filter to index species, MetaBIDx represents an advancement in metagenomic analysis. The findings suggest that the proposed approach could also benefit other metagenomic tools, indicating its potential for broader application in the field. The study lays the groundwork for future improvements in computational efficiency and the expansion of microbial databases.