Jun Ren, Jintao Xia, Mengyu Zhang, Chunhong Liu, Yuanyuan Xu, Jianing Wu, Yingzhu Li, Mingming Zhou, Shengjie Li, Wenjun Cao
Journal of Clinical Microbiology, e0003725. Published 2025-07-09 (Epub 2025-06-11). DOI: 10.1128/jcm.00037-25 (https://doi.org/10.1128/jcm.00037-25). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12239726/pdf/. Impact factor: 6.1; JCR Q1 (Microbiology).
Automated identification of Salmonella serotype using MALDI-TOF mass spectrometry and machine learning techniques.
Salmonella serotyping is essential for epidemiological studies and clinical treatment guidance. However, traditional serological agglutination methods are time-consuming, technically complex, and difficult to adopt at scale. Matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) is a rapid and cost-effective microbial identification technique, but it cannot be used to differentiate Salmonella serotypes. This study aims to integrate MALDI-TOF MS with machine learning algorithms to develop and validate a model for Salmonella serotype identification, improving efficiency and simplifying workflows. A total of 692 Salmonella isolates from Children's Hospital, Zhejiang University School of Medicine (ZUCH) and Wanbei Coal-Electricity Group General Hospital (WCGH) were analyzed using MALDI-TOF MS, generating 2,048 spectra. The ZUCH data were randomly divided into training and internal validation sets. The WCGH data were used as an external validation set. Ten machine learning algorithms were evaluated for their ability to identify eight Salmonella serotypes (B, C1, C2/3, D, E, Not A-F, Salmonella Typhimurium, and Salmonella Enteritidis). From 192 initial features, 16 features were selected for the final model construction. XGBoost demonstrated the best discriminative ability (area under the receiver operating characteristic curve [AUC] = 0.9898, sensitivity = 0.88, and specificity = 0.98) for the training set. The streamlined XGBoost model achieved AUCs of 0.9662 and 0.9778 for the internal and external validation sets, respectively, accurately identifying Salmonella serotypes. To enhance usability, the model was deployed as a Streamlit-based application, facilitating interaction and broader application. MALDI-TOF MS combined with XGBoost provides a fast and accurate method for Salmonella serotype identification, offering an efficient solution for laboratory diagnostics and epidemiological studies.
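The abstract reports a single AUC, sensitivity, and specificity for an eight-class problem, but it does not say how the per-class values were combined. The sketch below shows one common convention, assumed here and not confirmed by the source: score each class one-vs-rest, then take the unweighted (macro) average. The function names and the toy labels and scores are hypothetical illustrations, not data or code from the study.

```python
def one_vs_rest_rates(y_true, y_pred, positive):
    """Return (sensitivity, specificity), treating `positive` as the positive class."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def binary_auc(scores, is_positive):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the ROC area).
    """
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    if not pos or not neg:
        return float("nan")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_average(values):
    """Unweighted mean over classes, skipping undefined (NaN) entries."""
    defined = [v for v in values if v == v]  # NaN is the only value not equal to itself
    return sum(defined) / len(defined)


# Toy data (assumption): three of the eight serogroup labels, six spectra.
y_true = ["B", "B", "D", "D", "E", "E"]
y_pred = ["B", "D", "D", "D", "E", "E"]
classes = sorted(set(y_true))

rates = [one_vs_rest_rates(y_true, y_pred, c) for c in classes]
macro_sensitivity = macro_average([r[0] for r in rates])
macro_specificity = macro_average([r[1] for r in rates])

# Hypothetical predicted probabilities for class "B" on the same six spectra.
scores_b = [0.9, 0.2, 0.3, 0.1, 0.2, 0.1]
auc_b = binary_auc(scores_b, [t == "B" for t in y_true])

print(f"macro sensitivity = {macro_sensitivity:.4f}")
print(f"macro specificity = {macro_specificity:.4f}")
print(f"one-vs-rest AUC for class B = {auc_b:.4f}")
```

A macro average gives every serotype equal weight regardless of how many spectra it contributes, so a rare class that is poorly recognized lowers the summary figure as much as a common one would. A sample-weighted or micro average would behave differently, which is why the averaging convention matters when comparing reported values.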
Importance: Salmonella serotyping is vital for outbreak tracking and clinical guidance, but traditional methods are slow and laborious. This study combines matrix-assisted laser desorption ionization-time of flight mass spectrometry with machine learning (XGBoost) to enable rapid, accurate, and cost-effective serotyping. The streamlined model performed excellently in validation and was deployed as a user-friendly Streamlit app, enhancing usability. This innovation simplifies workflows, reduces diagnostic time, and supports scalable use in clinical and public health settings, improving outbreak response and epidemiological research.
About the journal:
The Journal of Clinical Microbiology® disseminates the latest research concerning the laboratory diagnosis of human and animal infections, along with the laboratory's role in epidemiology and the management of infectious diseases.