Verity Schaye MD, MHPE, Daniel J. Sartori MD, Lexi Signoriello PhD, Kiran Malhotra MD, Benedict Guzman MS, Bijal Rajput MD, Ilan Reinstein MS, Jesse Burk-Rafel MD, MRes
Title: Large language model-based identification of venous thromboembolism diagnostic delays
Journal: Journal of Hospital Medicine, volume 21, issue 4, pages 391-401
Published: 2026-04-09 (Epub 2025-10-07)
DOI: 10.1002/jhm.70194
URL: https://shmpublications.onlinelibrary.wiley.com/doi/10.1002/jhm.70194
Abstract
Background
Delayed diagnosis of venous thromboembolism (VTE) is prevalent among hospitalized patients, yet case identification is challenging and feedback limited.
Objective
To develop a large language model (LLM)-based electronic-trigger to identify VTE diagnostic delays.
Methods
All admissions to internal medicine (IM) residents at NYU Langone Health between January 2022 and December 2023 (n = 20,843) were included. Using an open-source LLM, prompts were validated to detect (1) whether residents considered VTE in admission notes and (2) VTE confirmation in five types of imaging reports (n = 100 for each prompt validation set). The validated prompts were then applied to identify discordant cases, in which the admission note differential omitted VTE but an imaging report confirmed VTE. Two hospitalists reviewed the discordant cases using a validated tool to identify diagnostic delays. Hospitalizations were labeled as diagnostic delays, in-hospital complications, or false-positives. Based on patterns among the in-hospital complications and false-positives, exclusion criteria were implemented. Positive predictive value (PPV) and negative predictive value (NPV) were calculated.
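The discordance rule described in the Methods can be sketched in a few lines. This is only an illustration under stated assumptions: the `Admission` record and its two boolean fields are hypothetical stand-ins for the outputs of the two validated prompts, and the abstract does not describe how the authors actually stored or combined those outputs.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Admission:
    # Hypothetical record; field names are illustrative, not from the paper.
    admission_id: str
    note_considers_vte: bool    # assumed output of the admission-note prompt
    imaging_confirms_vte: bool  # assumed: True if any imaging-report prompt confirmed VTE


def is_discordant(adm: Admission) -> bool:
    """Flag when imaging confirms VTE but the admission note omitted it."""
    return adm.imaging_confirms_vte and not adm.note_considers_vte


def flag_for_review(admissions: Iterable[Admission]) -> List[str]:
    """Return IDs of discordant admissions to send for manual hospitalist review."""
    return [a.admission_id for a in admissions if is_discordant(a)]


sample = [
    Admission("A1", note_considers_vte=True, imaging_confirms_vte=True),    # concordant
    Admission("A2", note_considers_vte=False, imaging_confirms_vte=True),   # discordant
    Admission("A3", note_considers_vte=False, imaging_confirms_vte=False),  # no VTE
]
flagged_ids = flag_for_review(sample)
```

A flag here only marks a case for review; in the study, two hospitalists then decided whether each flagged case was a diagnostic delay, an in-hospital complication, or a false-positive.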
Results
The LLM prompts classified admission notes and VTE imaging studies with high accuracy (range 98%–100%) and identified 699 VTE cases. Of the 137 cases the LLM-based electronic-trigger flagged as potential diagnostic delays, 31 were true-positives, 60 were in-hospital complications, and 46 were false-positives. Overall, 4.4% of all VTE hospitalizations had a diagnostic delay. With the exclusion criteria applied, the PPV was 48% (95% confidence interval [CI], 35%–62%) and the NPV was 95% (95% CI, 87%–98%).
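The counts reported in the Results allow a quick arithmetic check. The sketch below computes the share of VTE hospitalizations with a confirmed delay (31 of 699) and the PPV of the trigger before any exclusion criteria (31 of 137). The reported 48% PPV applies after the exclusion criteria, and the abstract does not give the post-exclusion counts, so that figure is not reproduced here. The Wilson score interval is an assumption chosen for illustration; the abstract does not state which interval method the authors used.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Counts taken from the Results paragraph.
true_delays = 31
in_hospital_complications = 60
false_positives = 46
flagged = true_delays + in_hospital_complications + false_positives  # 137
vte_cases = 699

delay_rate = true_delays / vte_cases  # about 0.044, matching the reported 4.4%
raw_ppv = true_delays / flagged       # about 0.226, before exclusion criteria
raw_ppv_ci = wilson_interval(true_delays, flagged)

print(f"Delay rate among VTE hospitalizations: {delay_rate:.1%}")
print(f"PPV before exclusions: {raw_ppv:.1%}")
print(f"Wilson interval for PPV before exclusions: {raw_ppv_ci[0]:.1%} to {raw_ppv_ci[1]:.1%}")
```

The gap between the pre-exclusion PPV of roughly 23% and the reported 48% shows how much the exclusion criteria, derived from the in-hospital complication and false-positive patterns, narrowed the flagged set.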
Conclusions
We developed the first LLM-based electronic-trigger to identify VTE diagnostic delays, with higher performance than existing non-LLM electronic-triggers. LLM-based approaches can facilitate diagnostic performance feedback and are scalable to other conditions and institutions.
About the journal:
JHM is a peer-reviewed publication of the Society of Hospital Medicine and is published 12 times per year. JHM publishes manuscripts that address the care of hospitalized adults or children.
Broad areas of interest include (1) Treatments for common inpatient conditions; (2) Approaches to improving perioperative care; (3) Improving care for hospitalized patients with geriatric or pediatric vulnerabilities (such as mobility problems, or those with complex longitudinal care); (4) Evaluation of innovative healthcare delivery or educational models; (5) Approaches to improving the quality, safety, and value of healthcare across the acute- and postacute-continuum of care; and (6) Evaluation of policy and payment changes that affect hospital and postacute care.