Genomic insights into nontyphoidal Salmonella: Prediction of antimicrobial resistance with whole genome-based machine learning

IF 4.6 | Zone 2, Medicine | JCR Q1, Infectious Diseases
Pei Yee Woh, Fadjar Soengkono, Yehao Chen, Zati Hakim Azizul Hasan, Siti Nursheena Mohd Zain, Jose Quiroga, Kevin Wing Hin Kwok
International Journal of Antimicrobial Agents, vol. 66, no. 5, Article 107575. Published 2025-07-15. Journal Article.
DOI: 10.1016/j.ijantimicag.2025.107575
https://www.sciencedirect.com/science/article/pii/S092485792500130X
Citations: 0

Abstract


Genomic insights into nontyphoidal Salmonella: Prediction of antimicrobial resistance with whole genome-based machine learning

Background

Nontyphoidal Salmonella is a leading foodborne pathogen worldwide. It is associated with an increasing rate of antimicrobial resistance (AMR) and remains endemic in Asia. Whole genome sequencing (WGS) could contribute significantly to AMR prediction, from bioinformatic phylogenomic analysis to machine learning (ML), leading towards automated AMR diagnostics.

Methods

We obtained Salmonella WGS data from the National Centre for Biotechnology Information database and analysed the resistance profiles of the isolates. We extracted, transformed, and labelled the resistance data using one-hot encoding for the construction, training, and evaluation of eXtreme Gradient Boosting (XGBoost) and convolutional neural network (CNN) models.
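The abstract does not say exactly which features were one-hot encoded. As a minimal sketch of one plausible reading, the Python below turns per-isolate lists of resistance genes into fixed-length 0/1 indicator vectors, one column per gene. The isolate names, gene calls, and the function `one_hot_encode` are hypothetical illustrations, not the authors' pipeline.

```python
from typing import Dict, List, Tuple


def one_hot_encode(
    profiles: Dict[str, List[str]],
) -> Tuple[List[str], Dict[str, List[int]]]:
    """Encode per-isolate gene lists as 0/1 presence vectors."""
    # Sorted vocabulary of every gene seen in any isolate,
    # so the column order is stable across runs.
    vocabulary = sorted({gene for genes in profiles.values() for gene in genes})
    encoded: Dict[str, List[int]] = {}
    for isolate, genes in profiles.items():
        present = set(genes)
        # 1 if the isolate carries the gene, 0 otherwise.
        encoded[isolate] = [1 if gene in present else 0 for gene in vocabulary]
    return vocabulary, encoded


# Hypothetical isolates and gene calls, for illustration only.
profiles = {
    "isolate_A": ["tet(A)", "sul2"],
    "isolate_B": ["blaTEM-1"],
    "isolate_C": [],
}
vocabulary, encoded = one_hot_encode(profiles)
for isolate, vector in encoded.items():
    print(isolate, dict(zip(vocabulary, vector)))
```

Vectors of this shape could serve as input rows for a tree-based classifier, with the phenotype for each antimicrobial class as the label; whether the study used exactly this layout is an assumption.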

Results

We selected a total of 788 Salmonella isolates with associated resistance genotype and phenotype data. These isolates showed high resistance to aminoglycoside, beta-lactam, phenicol, quinolone, sulphonamide, tetracycline, and trimethoprim. S. Weltevreden ST365 (n = 121) was the most common serovar, with the highest occurrence in food products. Both the XGBoost and CNN models predicted AMR with high accuracy (0.97625 and 0.9904, respectively). Moreover, interpretation of Shapley Additive exPlanations (SHAP) values uncovered the most valuable genomic features and associated genes for each antimicrobial agent tested.
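To illustrate what a Shapley value measures, the sketch below computes exact Shapley values by brute force for a toy scoring function over three features. This is not the SHAP library and not the authors' analysis: the gene names, the scoring function `toy_score`, and its weights are invented for illustration, and practical SHAP tooling relies on faster model-specific algorithms rather than enumerating every subset.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, List


def shapley_values(
    features: List[str],
    value: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Exact Shapley values by enumerating all subsets (small inputs only)."""
    n = len(features)
    result: Dict[str, float] = {}
    for feature in features:
        others = [g for g in features if g != feature]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes `feature`.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = frozenset(subset)
                # Marginal contribution of adding `feature` to the coalition.
                total += weight * (value(coalition | {feature}) - value(coalition))
        result[feature] = total
    return result


def toy_score(present: FrozenSet[str]) -> float:
    """Hypothetical model output: one gene acts alone, two act only together."""
    score = 0.0
    if "tet(A)" in present:
        score += 0.6
    if "sul1" in present and "sul2" in present:
        score += 0.2
    return score


genes = ["tet(A)", "sul1", "sul2"]
phi = shapley_values(genes, toy_score)
for gene, contribution in phi.items():
    print(f"{gene}: {contribution:.3f}")
```

In this toy case the stand-alone gene receives its full 0.6, while the two interacting genes split their joint 0.2 equally, and the values sum to the score of the full gene set. That additivity is what lets per-feature attributions be ranked for each antimicrobial agent.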

Conclusions

Our study provides new knowledge by demonstrating AMR phylogeographical relatedness and by predicting AMR with XGBoost and CNN models that achieve competitive performance. Hence, WGS-based ML prediction and its application could be promoted as a promising tool for AMR work in food safety and public health settings.

Source journal
CiteScore: 21.60
Self-citation rate: 0.90%
Articles published: 176
Review time: 36 days
Journal description: The International Journal of Antimicrobial Agents is a peer-reviewed publication offering comprehensive and current reference information on the physical, pharmacological, in vitro, and clinical properties of individual antimicrobial agents, covering antiviral, antiparasitic, antibacterial, and antifungal agents. The journal not only communicates new trends and developments through authoritative review articles but also addresses the critical issue of antimicrobial resistance, both in hospital and community settings. Published content includes solicited reviews by leading experts and high-quality original research papers in the specified fields.