Comparative Evaluation of Clinical Large Language Models and Machine Learning to Predict Antimicrobial Resistance in Hospital-Onset Sepsis
Scott A Cohen, Ziyi Chen, Jiang Bian, Christina Boucher, Yonghui Wu, Mattia Prosperi
Artificial intelligence in medicine. Conference on Artificial Intelligence in Medicine (2005- ), volume 15734, pages 65-76
Publication date: 2025-06-01 (Epub 2025-06-23)
DOI: 10.1007/978-3-031-95838-0_7
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12433606/pdf/
Abstract
Approaches to guide empiric antimicrobial therapy are needed, especially in critically ill populations with prevalent antimicrobial resistance (AMR). While artificial intelligence shows promise in predicting AMR, scalable and generalizable prediction models are essential for broad clinical adoption. We compared a publicly available clinical large language model (LLM), Gatortron, with traditional machine learning (ML) for predicting AMR and methicillin-resistant Staphylococcus aureus (MRSA)-specific patterns within a hospital-onset sepsis cohort, using electronic health record (EHR) data available at the time of illness onset. EHR data from approximately 150,000 hospitalizations with a documented bacterial infection at a large tertiary care healthcare system between 2010 and 2023 were examined. Among 2,019 eligible hospital-onset sepsis encounters, an AMR pathogen was identified in 911 (45%), and MRSA was isolated in 234 (26% of the AMR encounters). LLMs outperformed traditional models in predicting MRSA, achieving an AUC of 0.73 compared to 0.66 for the best traditional ML model, with a higher F1 score (0.43 vs. 0.16). The negative predictive value of LLM-based MRSA prediction was at least 90% across the majority of infection presentations. The LLM's superior prediction with a relatively simple feature set demonstrates the potential of leveraging EHR data for early resistance prediction, though further refinement is needed to improve sensitivity and clinical applicability.
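The abstract reports three metrics: AUC, F1 score, and negative predictive value (NPV). The sketch below shows how each is conventionally computed for a binary MRSA classifier. It is an illustration under stated assumptions, not the authors' evaluation code: the function names, the 0.5 threshold, and the toy labels and scores are all hypothetical and do not come from the study.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = MRSA isolated."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def negative_predictive_value(tn: int, fn: int) -> float:
    """NPV = TN / (TN + FN): the share of negative predictions that are correct."""
    denom = tn + fn
    return tn / denom if denom else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 8 encounters, 3 with MRSA.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.35, 0.1, 0.6, 0.3, 0.7]
threshold = 0.5  # assumed cut-off for illustration only
y_pred = [1 if s >= threshold else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(f"AUC = {auc(y_true, scores):.3f}")
print(f"F1  = {f1_score(tp, fp, fn):.3f}")
print(f"NPV = {negative_predictive_value(tn, fn):.3f}")
```

AUC is threshold-free, whereas F1 and NPV depend on the chosen cut-off. When the positive class is uncommon, NPV can be high even if sensitivity is modest, which is consistent with the abstract reporting a high NPV alongside a call for better sensitivity.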