Pei Yee Woh, Fadjar Soengkono, Yehao Chen, Zati Hakim Azizul Hasan, Siti Nursheena Mohd Zain, Jose Quiroga, Kevin Wing Hin Kwok
International Journal of Antimicrobial Agents, volume 66, issue 5, Article 107575. Published 2025-07-15. DOI: 10.1016/j.ijantimicag.2025.107575. Full text: https://www.sciencedirect.com/science/article/pii/S092485792500130X
Genomic insights into nontyphoidal Salmonella: Prediction of antimicrobial resistance with whole genome-based machine learning
Background
Nontyphoidal Salmonella is a leading foodborne pathogen worldwide, is associated with an increased rate of antimicrobial resistance (AMR), and remains endemic in Asia. Whole genome sequencing (WGS) could contribute significantly to AMR prediction, from bioinformatic phylogenomic analysis to machine learning (ML), leading towards automated AMR diagnostics.
Methods
We obtained Salmonella WGS data from the National Centre for Biotechnology Information database and analysed the isolates' resistance profiles. We extracted, transformed, and labelled the resistance data using one-hot encoding, then constructed, trained, and evaluated eXtreme Gradient Boosting (XGBoost) and convolutional neural network (CNN) models.
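To illustrate the encoding step, here is a minimal sketch of turning per-isolate resistance gene lists into fixed-length 0/1 indicator vectors that a model such as XGBoost could take as input. The abstract does not specify the exact encoding scheme, so the presence/absence layout, the function names, and the toy isolates below are all assumptions made for illustration, not the authors' pipeline or data.

```python
from typing import Dict, List


def build_vocabulary(isolates: Dict[str, List[str]]) -> List[str]:
    """Return a sorted list of every gene seen in any isolate (the feature columns)."""
    return sorted({gene for genes in isolates.values() for gene in genes})


def encode_isolate(genes: List[str], vocabulary: List[str]) -> List[int]:
    """Return 1 where the isolate carries the gene and 0 elsewhere."""
    present = set(genes)
    return [1 if gene in present else 0 for gene in vocabulary]


# Hypothetical toy input, not data from the study.
isolates = {
    "isolate_A": ["blaTEM-1", "tet(A)"],
    "isolate_B": ["sul2"],
    "isolate_C": [],
}

vocabulary = build_vocabulary(isolates)
features = {
    name: encode_isolate(genes, vocabulary) for name, genes in isolates.items()
}

for name, row in features.items():
    print(name, row)
```

Each row has one column per gene in the vocabulary, so every isolate maps to a vector of the same length regardless of how many genes it carries.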
Results
We selected 788 Salmonella isolates with both resistance genotype and phenotype data. These isolates showed high resistance to aminoglycosides, beta-lactams, phenicols, quinolones, sulphonamides, tetracyclines, and trimethoprim. S. Weltevreden ST365 (n = 121) was the most common serovar, with the highest occurrence in food products. Both the XGBoost and CNN models predicted AMR with high accuracy (0.97625 and 0.9904, respectively). Moreover, interpretation of SHapley Additive exPlanations (SHAP) values identified the most informative genomic features and associated genes for each antimicrobial agent tested.
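The idea behind Shapley-value attribution can be sketched with an exact computation on a toy scoring function. This is the textbook Shapley definition evaluated by brute force over feature subsets; it is not the SHAP library's algorithm and not the study's models. The `toy_score` function, its weights, and the gene names used as features are hypothetical.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, List


def shapley_values(
    features: List[str], value: Callable[[FrozenSet[str]], float]
) -> Dict[str, float]:
    """Exact Shapley values: weighted average marginal contribution of each feature."""
    n = len(features)
    result = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes `feature`.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = frozenset(subset)
                total += weight * (value(coalition | {feature}) - value(coalition))
        result[feature] = total
    return result


def toy_score(present: FrozenSet[str]) -> float:
    """Hypothetical resistance score: one main effect plus one pairwise interaction."""
    score = 0.0
    if "tet(A)" in present:
        score += 0.5
    if "sul2" in present and "blaTEM-1" in present:
        score += 0.2
    return score


genes = ["tet(A)", "sul2", "blaTEM-1"]
phi = shapley_values(genes, toy_score)
for gene, contribution in phi.items():
    print(f"{gene}: {contribution:.3f}")
```

In this toy case the interaction term is split equally between the two genes that produce it, and the attributions sum to the difference between the score with all genes present and the score with none.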
Conclusions
Our study demonstrates the phylogeographical relatedness of AMR and shows that XGBoost and CNN models can predict AMR with competitive performance. Hence, WGS-based ML prediction and its application could be promoted as a promising tool for AMR work in food safety and public health settings.
About the journal:
The International Journal of Antimicrobial Agents is a peer-reviewed publication offering comprehensive and current reference information on the physical, pharmacological, in vitro, and clinical properties of individual antimicrobial agents, covering antiviral, antiparasitic, antibacterial, and antifungal agents. The journal not only communicates new trends and developments through authoritative review articles but also addresses the critical issue of antimicrobial resistance, both in hospital and community settings. Published content includes solicited reviews by leading experts and high-quality original research papers in the specified fields.