Julian G. Saliba, Wenshu Zheng, Qingbo Shu, Liqiang Li, Chi Wu, Yi Xie, Christopher J. Lyon, Jiuxin Qu, Hairong Huang, Binwu Ying, Tony Ye Hu
Nature Communications. Published 2025-03-25. DOI: 10.1038/s41467-025-58214-6 (https://doi.org/10.1038/s41467-025-58214-6)
Enhanced diagnosis of multi-drug-resistant microbes using group association modeling and machine learning
New solutions are needed to detect genotype-phenotype associations involved in microbial drug resistance. Herein, we describe a Group Association Model (GAM) that accurately identifies genetic variants linked to drug resistance and mitigates false-positive cross-resistance artifacts without prior knowledge. GAM analysis of 7,179 Mycobacterium tuberculosis (Mtb) isolates identifies gene targets for all analyzed drugs, revealing comparable performance but fewer cross-resistance artifacts than the World Health Organization (WHO) mutation catalogue approach, which requires expert rules and precedents. GAM also generalizes to other species, demonstrating high predictive accuracy with 3,942 S. aureus isolates. GAM refinement by machine learning (ML) improves predictive accuracy with small or incomplete datasets. These findings were validated using 427 Mtb isolates from three sites, where GAM inputs were also found to be more suitable than WHO inputs for ML prediction models. GAM + ML could thus address the limitations of current drug resistance prediction methods to improve treatment decisions for drug-resistant microbial infections.
About the journal:
Nature Communications, an open-access journal, publishes high-quality research spanning all areas of the natural sciences. Papers featured in the journal showcase significant advances relevant to specialists in each respective field. With a 2-year impact factor of 16.6 (2022) and a median time of 8 days from submission to the first editorial decision, Nature Communications is committed to rapid dissemination of research findings. As a multidisciplinary journal, it welcomes contributions from biological, health, physical, chemical, Earth, social, mathematical, applied, and engineering sciences, aiming to highlight important breakthroughs within each domain.