Title: Danger, Danger, Gaston Labat! Does zero-shot artificial intelligence correlate with anticoagulation guidelines recommendations for neuraxial anesthesia?
Authors: Nathan C Hurley, Rajnish K Gupta, Kristopher M Schroeder, Aaron S Hess
Journal: Regional Anesthesia and Pain Medicine (Q1, Anesthesiology; impact factor 5.1)
DOI: 10.1136/rapm-2023-104868
Publication date: 2024-09-02
Pages: 661-667
Citations: 0
Abstract
Introduction: Artificial intelligence and large language models (LLMs) have emerged as potentially disruptive technologies in healthcare. In this study, GPT-3.5, an accessible LLM, was assessed for its accuracy and reliability in performing guideline-based evaluation of neuraxial bleeding risk in hypothetical patients on anticoagulation medication. The study also explored the impact of structured prompt guidance on the LLM's performance.
Methods: A dataset of 10 hypothetical patient stems and 26 anticoagulation profiles (260 unique combinations) was developed based on American Society of Regional Anesthesia and Pain Medicine guidelines. Five prompts were created for the LLM, ranging from minimal guidance to explicit instructions. The model's responses were compared with a "truth table" based on the guidelines. Performance metrics, including accuracy and area under the receiver operating characteristic curve (AUC), were used.
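The evaluation described in the Methods amounts to scoring model outputs against a guideline-derived truth table. A minimal sketch of that comparison is below; the six-row truth table and risk scores are invented for illustration, and `roc_auc_score`/`accuracy_score` from scikit-learn stand in for whatever tooling the authors actually used.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# Miniature stand-in for the study's truth table:
# 1 = guidelines contraindicate neuraxial block (bleeding risk), 0 = acceptable.
truth = np.array([1, 1, 0, 0, 1, 0])

# Hypothetical risk scores for the same patient/anticoagulant combinations,
# e.g. parsed from the LLM's graded responses.
scores = np.array([0.9, 0.6, 0.3, 0.2, 0.7, 0.8])

acc = accuracy_score(truth, scores >= 0.5)  # accuracy at a 0.5 threshold
auc = roc_auc_score(truth, scores)          # threshold-free ranking metric

print(f"accuracy={acc:.3f}  AUC={auc:.3f}")
```

AUC is the natural headline metric here because it summarizes how well the model ranks contraindicated cases above acceptable ones across all thresholds, rather than at a single cutoff.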
Results: Baseline performance of GPT-3.5 was slightly above chance. With detailed prompts and explicit guidelines, performance improved significantly (AUC 0.70, 95% CI 0.64 to 0.77). Performance varied among medication classes.
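A 95% confidence interval like the one reported can be obtained by bootstrap resampling of the evaluation set; this sketch reuses invented data and is not the authors' stated procedure.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Invented labels and model scores for illustration only.
truth = np.array([1, 1, 0, 0, 1, 0])
scores = np.array([0.9, 0.6, 0.3, 0.2, 0.7, 0.8])

rng = np.random.default_rng(42)
boot_aucs = []
for _ in range(2000):
    idx = rng.integers(0, len(truth), len(truth))  # resample with replacement
    if truth[idx].min() == truth[idx].max():
        continue  # AUC is undefined when a resample contains only one class
    boot_aucs.append(roc_auc_score(truth[idx], scores[idx]))

ci_lo, ci_hi = np.percentile(boot_aucs, [2.5, 97.5])
print(f"95% CI: {ci_lo:.2f} to {ci_hi:.2f}")
```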
Discussion: LLMs show potential for assisting in clinical decision making, but their performance depends on accurate, relevant, and precisely worded prompts. The tested LLM demonstrates potential in assessing neuraxial bleeding risk, yet integration into clinical workflows should be approached cautiously, with attention to safety, privacy, and the model's limitations. Future research should focus on optimizing performance, addressing complex scenarios, and better understanding LLM capabilities and limitations in healthcare.
Journal overview:
Regional Anesthesia & Pain Medicine, the official publication of the American Society of Regional Anesthesia and Pain Medicine (ASRA), is a monthly journal that publishes peer-reviewed scientific and clinical studies to advance the understanding and clinical application of regional techniques for surgical anesthesia and postoperative analgesia. Coverage includes intraoperative regional techniques, perioperative pain, chronic pain, obstetric anesthesia, pediatric anesthesia, outcome studies, and complications.
Published for over thirty years, this respected journal also serves as the official publication of the European Society of Regional Anaesthesia and Pain Therapy (ESRA), the Asian and Oceanic Society of Regional Anesthesia (AOSRA), the Latin American Society of Regional Anesthesia (LASRA), the African Society for Regional Anesthesia (AFSRA), and the Academy of Regional Anaesthesia of India (AORA).