Zero-shot large language model application for surgical site infection auditing
Shrirajh Satheakeerthy, Brandon Stretton, James Tsimiklis, Andrew Ec Booth, Sarah Howson, Shaun Evans, Christina Guo, Joshua Kovoor, Aashray Gupta, Christina Gao, Weng Onn Chan, Tim French, Amelia Demopoulos, Alyssa Pradhan, Samuel Gluck, Toby Gilbert, Matthew Blake Roberts, Camille Kotton, Stephen Bacchi
Infection, disease & health. Journal Article. Published 2025-05-21. DOI: 10.1016/j.idh.2025.05.001 (https://doi.org/10.1016/j.idh.2025.05.001)
Abstract
Introduction: Artificial intelligence, in particular large language models (LLM), may be able to assist with monitoring for surgical site infections (SSI).
Method: This retrospective study applied the Llama 3.0 70-billion parameter model to the identification of SSI among all SSI cases recorded at two metropolitan hospitals over a 4-month period. Randomly selected control patients served as comparators. Clinical inpatient and outpatient progress notes were provided to the LLM individually, and each note was classified as indicating an SSI or not. These classifications were then analysed to determine binary performance characteristics and the timing of positive case classification.
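The note-by-note workflow described in the Method can be sketched roughly as follows. This is an illustrative sketch only, not the study's code: the `Note` type, the `first_flag_per_patient` helper, and the `keyword_stub` classifier are hypothetical names, the keyword check merely stands in for the LLM call, and the rule that a patient counts as positive once any single note is flagged is an assumption that the abstract does not state.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, Iterable, Optional


@dataclass(frozen=True)
class Note:
    """Hypothetical container for one progress note."""
    patient_id: str
    written_on: date
    text: str


def first_flag_per_patient(
    notes: Iterable[Note],
    classify: Callable[[str], bool],
) -> Dict[str, Optional[date]]:
    """Map each patient to the date of their earliest flagged note, or None."""
    first_flag: Dict[str, Optional[date]] = {}
    # Process notes chronologically so the first hit per patient is the earliest.
    for note in sorted(notes, key=lambda n: n.written_on):
        first_flag.setdefault(note.patient_id, None)
        # Classify notes one at a time until the patient's first positive note.
        if first_flag[note.patient_id] is None and classify(note.text):
            first_flag[note.patient_id] = note.written_on
    return first_flag


def keyword_stub(text: str) -> bool:
    """Placeholder standing in for the LLM call; NOT the study's classifier."""
    return "purulent" in text.lower()


# Made-up example notes, for illustration only.
notes = [
    Note("A", date(2024, 1, 3), "Wound clean and dry."),
    Note("A", date(2024, 1, 6), "Purulent discharge from incision."),
    Note("B", date(2024, 1, 4), "Mobilising well, wound intact."),
]
flags = first_flag_per_patient(notes, keyword_stub)
```

Under these assumptions, a patient whose value in `flags` is a date would be treated as a positive classification, and that date could then be compared with the clinician-recorded onset date to assess timing.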
Results: There was a total of 28 cases in the study, 14 in the case (SSI) group and 14 in the control group. The operations involved in the SSI cases were caesarean section (12/14, 85.7 %) and arthroplasty (2/14, 14.3 %). The LLM had an overall accuracy at the patient-level of 26/28 (93 %). There was a sensitivity of 100 % and specificity of 86 %. At the note-level, for the first note flagged by the LLM for each case, 13/14 (92.3 %) were on the same day as, or before, the date noted as the onset of infection as identified by infection control clinicians.
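The patient-level figures can be checked with a short calculation. The confusion-matrix counts below are not given explicitly in the abstract; they are inferred from the reported values (100 % sensitivity across 14 SSI cases implies 14 true positives and 0 false negatives, and 26/28 correct overall then implies 12 true negatives and 2 false positives among the 14 controls). The function name `binary_metrics` is illustrative.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary performance characteristics from counts."""
    return {
        "sensitivity": tp / (tp + fn),             # true positive rate
        "specificity": tn / (tn + fp),             # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Counts inferred from the abstract's reported results (an assumption).
metrics = binary_metrics(tp=14, fp=2, tn=12, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
# sensitivity: 100.0%
# specificity: 85.7%
# accuracy: 92.9%
```

These values agree with the reported 100 %, 86 %, and 93 % after rounding to whole percentages.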
Conclusions: The use of LLM for the screening of medical notes for SSI is feasible. Further studies may seek to evaluate the outcomes of LLM when deployed as part of a clinical workflow.