Yichen He, Xiujuan Zhou, Lida Zhang, Yan Cui, Yiping He, Andrew Gehring, Xiangyu Deng, Xianming Shi
Foodborne Pathogens and Disease. Published 2025-06-13. DOI: 10.1089/fpd.2024.0170 (https://doi.org/10.1089/fpd.2024.0170)
Prediction of Antibiotic Resistance Phenotypes and Minimum Inhibitory Concentrations in Salmonella Using Machine Learning Analysis of Its Pan-Genome and Pan-Resistome Features.
Traditional experimental methods for determining antibiotic resistance phenotypes (ARPs) and minimum inhibitory concentrations (MICs) in bacteria are laborious and time-consuming. This study aims to explore the potential of whole-genome sequencing data combined with machine learning models for robustly predicting ARPs and MICs in Salmonella. Using a training set of 6394 Salmonella genomes alongside antimicrobial susceptibility testing results, we built two machine learning (ML) predictive models based on the pan-genome and pan-resistome. Each model was implemented using three algorithms: random forest, extreme gradient boosting (XGB), and convolutional neural network. Among them, XGB achieved the highest overall accuracy, with the pan-genome and pan-resistome models accurately predicting ARPs (98.51% and 97.77%) and MICs (81.42% and 78.99%) for 15 commonly used antibiotics. Feature extraction from pan-genome and pan-resistome data effectively reduced computational complexity and significantly decreased computation time. Notably, fewer than 10 key genomic features, often linked to known resistance or mobile genes, were sufficient for robust predictions for each antibiotic. This study also identified challenges, including imbalanced resistance classes and imprecise MIC measurements, which impacted prediction accuracy. These findings highlight the importance of using multiple evaluation metrics to assess model performance comprehensively. Overall, our findings demonstrated that ML, utilizing pan-genome or pan-resistome features, was highly effective in predicting antibiotic resistance and identifying correlated genetic features in Salmonella. This approach holds great potential to supplement conventional culture-based methods for routine surveillance of antibiotic-resistant bacteria.
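The abstract's point about imbalanced resistance classes and the need for multiple evaluation metrics can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the helper names (`confusion_counts`, `metrics`), the "R"/"S" labels, and the toy data of 95 susceptible and 5 resistant isolates are hypothetical and are not taken from the study.

```python
def confusion_counts(y_true, y_pred, positive="R"):
    """Count true/false positives and negatives for a binary phenotype."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred == positive:
            fp += 1
        elif truth == positive and pred != positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(y_true, y_pred, positive="R"):
    """Return several metrics so that no single number hides a failure."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    recall = tp / (tp + fn) if (tp + fn) else 0.0          # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": (recall + specificity) / 2,
        "f1": f1,
    }


# Hypothetical toy data: 95 susceptible (S) and 5 resistant (R) isolates.
y_true = ["S"] * 95 + ["R"] * 5
# A trivial classifier that calls every isolate susceptible.
y_pred = ["S"] * 100

scores = metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

On this toy set the trivial classifier reaches high accuracy while detecting no resistant isolates at all, which only recall, balanced accuracy, and F1 reveal. This is one reason accuracy alone can mislead when resistant isolates are rare.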
About the journal:
Foodborne Pathogens and Disease is one of the most inclusive scientific publications on the many disciplines that contribute to food safety. Spanning an array of issues from "farm-to-fork," the Journal bridges the gap between science and policy to reduce the burden of foodborne illness worldwide.
Foodborne Pathogens and Disease coverage includes:
Agroterrorism
Safety of organically grown and genetically modified foods
Emerging pathogens
Emergence of drug resistance
Methods and technology for rapid and accurate detection
Strategies to destroy or control foodborne pathogens
Novel strategies for the prevention and control of plant and animal diseases that impact food safety
Biosecurity issues and the implications of new regulatory guidelines
Impact of changing lifestyles and consumer demands on food safety.