Automated Identification of Stroke Thrombolysis Contraindications from Synthetic Clinical Notes: A Proof-of-Concept Study.

Impact factor: 2.1 | JCR quartile: Q3 | JCR category: Peripheral Vascular Disease

Cerebrovascular Diseases Extra | Pub Date: 2025-01-01 | Epub Date: 2025-03-17 | DOI: 10.1159/000545317

Bing Yu Chen, Fares Antaki, Marco Gonzalez, Ken Uchino, Samer Albahra, Scott Robertson, Sidonie Ibrikji, Eric Aube, Andrew Russman, Muhammad Shazam Hussain


Citations: 0

Abstract


Automated Identification of Stroke Thrombolysis Contraindications from Synthetic Clinical Notes: A Proof-of-Concept Study.

Introduction: Timely thrombolytic therapy improves outcomes in acute ischemic stroke. Manual chart review to screen for thrombolysis contraindications may be time-consuming and prone to errors. We developed and tested a large language model (LLM)-based tool to identify thrombolysis contraindications from clinical notes using synthetic data in a proof-of-concept study.

Methods: We generated 150 synthetic clinical notes containing randomly assigned thrombolysis contraindications using LLMs. We then used Llama 3.1 405B with a custom prompt to generate a list of thrombolysis contraindications from each note. Performance was evaluated using sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and F1 score.
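The six metrics named in the Methods can all be derived from a single confusion matrix. The following is a minimal sketch, assuming each (note, candidate contraindication) pair is scored as one binary decision; the abstract does not state the exact unit of analysis, so this scoring scheme, the `CANDIDATES` list, and the toy notes are illustrative assumptions rather than the study's data or code.

```python
from dataclasses import dataclass

# Hypothetical candidate list for illustration only; the study's actual
# list of contraindications is not given in the abstract.
CANDIDATES = [
    "recent surgery",
    "active bleeding",
    "anticoagulant use",
    "prior intracranial hemorrhage",
]


@dataclass
class Counts:
    tp: int = 0
    fp: int = 0
    fn: int = 0
    tn: int = 0


def tally(truth, predicted, candidates=CANDIDATES):
    """Score every (note, candidate) pair as TP, FP, FN, or TN.

    Predicted items outside `candidates` are ignored in this sketch.
    """
    counts = Counts()
    for true_set, pred_set in zip(truth, predicted):
        for item in candidates:
            in_truth, in_pred = item in true_set, item in pred_set
            if in_truth and in_pred:
                counts.tp += 1
            elif in_pred:
                counts.fp += 1
            elif in_truth:
                counts.fn += 1
            else:
                counts.tn += 1
    return counts


def ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def metrics(c):
    """Compute the metrics listed in the Methods from raw counts."""
    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "npv": ratio(c.tn, c.tn + c.fn),
        "accuracy": ratio(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn),
        "f1": ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


if __name__ == "__main__":
    # Toy ground truth and model output for three made-up notes.
    truth = [{"recent surgery"}, {"active bleeding", "anticoagulant use"}, set()]
    predicted = [{"recent surgery"}, {"active bleeding"}, {"anticoagulant use"}]
    for name, value in metrics(tally(truth, predicted)).items():
        print(f"{name}: {value:.3f}")
```

Scoring per pair means that true negatives dominate when most notes carry only a few contraindications, which is one reason specificity and NPV can sit near 99% while sensitivity and PPV are lower.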

Results: A total of 150 synthetic notes were generated using five different models: ChatGPT-4o, Llama 3.1 405B, Llama 3.1 70B, ChatGPT-4o mini, and Gemini 1.5 Flash. On average, each note contained 241.6 words (SD 110.7; range 80-549) and included 1.5 contraindications (SD 1.1; range 0-5). Our tool achieved a sensitivity of 90.9% (95% CI: 86.3%-94.3%), specificity of 99.2% (95% CI: 98.8%-99.5%), PPV of 87.7% (95% CI: 82.7%-91.7%), NPV of 99.4% (95% CI: 99.1%-99.6%), accuracy of 98.7% (95% CI: 98.2%-99.0%), and an F1 score of 0.892. Among the false positives, 24 (86%) were due to the inclusion of irrelevant contraindications, and 4 (14%) resulted from repetitive information. No hallucinations were observed.
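As a quick consistency check on the reported figures, F1 is the harmonic mean of sensitivity and PPV, and the false-positive breakdown should sum to the stated percentages. The sketch below uses only the rounded values quoted in the Results; the function names are illustrative, and the underlying counts are not given in the abstract.

```python
def f1_from_rates(sensitivity, ppv):
    """F1 as the harmonic mean of sensitivity (recall) and PPV (precision)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


def percent_shares(*counts):
    """Return each count as a percentage of the total, rounded to whole numbers."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]


if __name__ == "__main__":
    # Rounded values as reported in the Results.
    f1 = f1_from_rates(sensitivity=0.909, ppv=0.877)
    print(f"F1 from rounded rates: {f1:.4f}")

    # 24 irrelevant + 4 repetitive false positives.
    print("False-positive shares (%):", percent_shares(24, 4))
```

The recomputed F1 lands within about 0.001 of the reported 0.892; the small gap is plausibly due to rounding of the inputs. The 24 and 4 false positives total 28 and reproduce the stated 86% and 14% shares.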

Conclusion: Our LLM-based tool may identify stroke thrombolysis contraindications from synthetic clinical notes with high sensitivity and PPV. Future studies will validate its performance using real EMR data and integrate it into acute stroke workflows to facilitate faster and safer thrombolysis decision-making.


Source journal

Cerebrovascular Diseases Extra (JCR category: Peripheral Vascular Disease)

CiteScore: 3.50

Self-citation rate: 0.00%

Articles published: not listed

Review time: 8 weeks

About the journal: This open access and online-only journal publishes original articles covering the entire spectrum of stroke and cerebrovascular research, drawing from a variety of specialties such as neurology, internal medicine, surgery, radiology, epidemiology, cardiology, hematology, psychology and rehabilitation. Offering an international forum, it meets the growing need for sophisticated, up-to-date scientific information on clinical data, diagnostic testing, and therapeutic issues. The journal publishes original contributions, reviews of selected topics, and clinical investigative studies. All aspects related to clinical advances are considered, while purely experimental work appears only if directly relevant to clinical issues. Cerebrovascular Diseases Extra provides additional content based on reviewed and accepted submissions to the main journal Cerebrovascular Diseases.