Identification of key drivers of antimicrobial resistance in Enterococcus using machine learning
Jee In Kim, Alexander Manuele, Finlay Maguire, Rahat Zaheer, Tim A McAllister, Robert G Beiko
DOI: 10.1139/cjm-2024-0049 (https://doi.org/10.1139/cjm-2024-0049). Published 2024-10-01 (Epub 2024-07-30).
Abstract
With antimicrobial resistance (AMR) rapidly evolving in pathogens, quick and accurate identification of the genetic determinants of phenotypic resistance is essential for improving surveillance, stewardship, and clinical mitigation. Machine learning (ML) models show promise for AMR prediction in diagnostics, but using them effectively requires a deep understanding of their internal processes. Our study used AMR gene, pangenomic, and predicted plasmid features from 647 Enterococcus faecium and Enterococcus faecalis genomes across the One Health continuum, along with corresponding resistance phenotypes, to develop interpretive ML classifiers. Vancomycin resistance could be predicted with 99% accuracy using AMR gene features, 98% using pangenome features, and 96% using plasmid clusters. Top pangenome features overlapped with the resistance genes of the vanA operon, which are often laterally transmitted via plasmids. Doxycycline resistance prediction achieved approximately 92% accuracy with pangenome features, with the top feature being elements of the Tn916 conjugative transposon, a tet(M) carrier. Erythromycin resistance prediction models achieved about 90% accuracy, but their top features were negatively correlated with resistance because of the confounding effect of population structure. This work demonstrates the importance of reviewing the features of ML models to discern biological relevance, even when the models achieve high performance metrics. Our workflow offers the potential to propose hypotheses for experimental testing, enhancing the understanding of AMR mechanisms, which is crucial for combating the AMR crisis.
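The erythromycin result can be illustrated with a small worked example of confounding by population structure. The sketch below is not the authors' workflow: it uses a plain phi coefficient rather than an ML classifier, and the lineage labels, the feature, and all counts are synthetic values invented for illustration. It shows how a feature that merely marks a lineage can appear negatively correlated with resistance in pooled data while carrying no signal within either lineage.

```python
from math import sqrt


def phi(pairs):
    """Phi coefficient between two binary variables, given (feature, resistant) pairs."""
    a = sum(1 for f, r in pairs if f and r)          # feature present, resistant
    b = sum(1 for f, r in pairs if f and not r)      # feature present, susceptible
    c = sum(1 for f, r in pairs if not f and r)      # feature absent, resistant
    d = sum(1 for f, r in pairs if not f and not r)  # feature absent, susceptible
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def build(lineage, present_res, present_sus, absent_res, absent_sus):
    """Expand contingency counts into (lineage, feature, resistant) records."""
    return (
        [(lineage, 1, 1)] * present_res
        + [(lineage, 1, 0)] * present_sus
        + [(lineage, 0, 1)] * absent_res
        + [(lineage, 0, 0)] * absent_sus
    )


# SYNTHETIC toy counts (hypothetical, not from the study):
# lineage_A is mostly resistant and rarely carries the feature;
# lineage_B is mostly susceptible and usually carries it.
isolates = build("lineage_A", 4, 1, 12, 3) + build("lineage_B", 3, 12, 1, 4)

# Correlation when lineage is ignored.
pooled = phi([(f, r) for _, f, r in isolates])

# Correlation within each lineage separately.
lineages = sorted({lin for lin, _, _ in isolates})
by_lineage = {
    lin: phi([(f, r) for l, f, r in isolates if l == lin]) for lin in lineages
}

print(f"pooled phi: {pooled:+.2f}")
for lin, value in by_lineage.items():
    print(f"{lin} phi: {value:+.2f}")
```

In this toy data the pooled association is negative, yet it vanishes once isolates are stratified by lineage, which is the pattern to look for before reading a top-ranked feature as a resistance mechanism.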