KARGA: Multi-platform Toolkit for k-mer-based Antibiotic Resistance Gene Analysis of High-throughput Sequencing Data.

Mattia Prosperi, Simone Marini
IEEE-EMBS International Conference on Biomedical and Health Informatics. Published 2021-07-01 (Epub 2021-08-10).
DOI: https://doi.org/10.1109/bhi50953.2021.9508479
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8383893/pdf/nihms-1734284.pdf
Citations: 6

Abstract

High-throughput sequencing is widely used for strain detection and characterization of antibiotic resistance in microbial metagenomic samples. Current analytical tools use curated antibiotic resistance gene (ARG) databases to classify individual sequencing reads or assembled contigs. However, identifying ARGs from raw read data can be time-consuming (especially if assembly or alignment is required) and challenging, due to genome rearrangements and mutations. Here, we present the k-mer-based antibiotic gene resistance analyzer (KARGA), a multi-platform Java toolkit for identifying ARGs from metagenomic short-read data. KARGA does not perform alignment; it uses an efficient double-lookup strategy and statistical filtering of false positives, and it provides individual read classification as well as coverage of the database resistome. On simulated data, KARGA's antibiotic resistance class recall is 99.89% for error/mutation rates within 10% and 83.37% for error/mutation rates between 10% and 25%, while it is 99.92% on ARGs with rearrangements. On empirical data, KARGA provides a higher hit score (≥1.5-fold) than AMRPlusPlus, DeepARG, and MetaMARC. KARGA also has faster runtimes than all other tools (2x faster than AMRPlusPlus, 7x faster than DeepARG, and over 100x faster than MetaMARC). KARGA is available under the MIT license at https://github.com/DataIntellSystLab/KARGA.
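
To make the alignment-free idea concrete, the following is a minimal Java sketch of generic k-mer lookup classification: every k-mer of each reference gene is indexed in a hash map, and a read is assigned to the gene that shares the most k-mers with it. This is not KARGA's code. The abstract does not specify the double-lookup strategy or the statistical filter, so the class name `KmerClassifierSketch`, the methods `buildIndex` and `classify`, the toy sequences, and the `minHitFraction` cutoff (a simple stand-in for false-positive filtering) are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative k-mer lookup classifier; hypothetical, not KARGA's actual implementation. */
final class KmerClassifierSketch {

    /** Index every k-mer of every reference gene: k-mer -> names of the genes containing it. */
    static Map<String, Set<String>> buildIndex(Map<String, String> genes, int k) {
        Map<String, Set<String>> index = new HashMap<>();
        for (Map.Entry<String, String> gene : genes.entrySet()) {
            String seq = gene.getValue();
            for (int i = 0; i + k <= seq.length(); i++) {
                index.computeIfAbsent(seq.substring(i, i + k), key -> new HashSet<>())
                     .add(gene.getKey());
            }
        }
        return index;
    }

    /**
     * Assign a read to the gene sharing the most k-mers with it.
     * Returns null when the best gene matches fewer than minHitFraction of the
     * read's k-mers (an assumed cutoff, not KARGA's statistical filter).
     */
    static String classify(String read, Map<String, Set<String>> index,
                           int k, double minHitFraction) {
        int total = read.length() - k + 1;
        if (total <= 0) {
            return null; // read is shorter than k
        }
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < total; i++) {
            Set<String> matched = index.get(read.substring(i, i + k));
            if (matched == null) {
                continue; // k-mer absent from the reference database
            }
            for (String gene : matched) {
                hits.merge(gene, 1, Integer::sum);
            }
        }
        String best = null;
        int bestHits = 0;
        for (Map.Entry<String, Integer> e : hits.entrySet()) {
            if (e.getValue() > bestHits) {
                best = e.getKey();
                bestHits = e.getValue();
            }
        }
        boolean enough = best != null && (double) bestHits / total >= minHitFraction;
        return enough ? best : null;
    }

    public static void main(String[] args) {
        // Toy reference sequences, invented for this example.
        Map<String, String> genes = new HashMap<>();
        genes.put("geneA", "ACGTACGTTAGCCGATAGGCTA");
        genes.put("geneB", "TTGACCATGGCAAATCCGGTAC");
        Map<String, Set<String>> index = buildIndex(genes, 5);

        System.out.println(classify("CGTTAGCCGATAGG", index, 5, 0.5)); // exact fragment of geneA
        System.out.println(classify("GGGGGGGGGGGGGG", index, 5, 0.5)); // no shared k-mers
    }
}
```

Because a single substitution only disrupts the k-mers that overlap it, a mutated read can still collect enough hits to be classified, which is one reason k-mer matching tolerates moderate error rates without alignment. A real tool would also need to handle reverse complements, ambiguous bases, and ties between genes, all of which this sketch omits.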

