Enhancing Large Language Models for Clinical Decision Support by Incorporating Clinical Practice Guidelines.

IEEE International Conference on Healthcare Informatics. Pub Date: 2024-06-01. Epub Date: 2024-08-22. DOI: 10.1109/ichi61247.2024.00111

David Oniani, Xizhi Wu, Shyam Visweswaran, Sumit Kapoor, Shravan Kooragayalu, Katelyn Polanska, Yanshan Wang

Volume 2024, pages 694-702. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11909794/pdf/

Abstract

Large Language Models (LLMs), enhanced with Clinical Practice Guidelines (CPGs), can significantly improve Clinical Decision Support (CDS). However, approaches for incorporating CPGs into LLMs are not well studied. In this study, we develop three distinct methods for incorporating CPGs into LLMs: Binary Decision Tree (BDT), Program-Aided Graph Construction (PAGC), and Chain-of-Thought-Few-Shot Prompting (CoT-FSP), and focus on CDS for COVID-19 outpatient treatment as the case study. Zero-Shot Prompting (ZSP) is our baseline method. To evaluate the effectiveness of the proposed methods, we create a set of synthetic patient descriptions and conduct both automatic and human evaluation of the responses generated by four LLMs: GPT-4, GPT-3.5 Turbo, LLaMA, and PaLM 2. All four LLMs exhibit improved performance when enhanced with CPGs compared to the baseline ZSP. BDT outperformed both CoT-FSP and PAGC in automatic evaluation. All of the proposed methods demonstrate high performance in human evaluation. LLMs enhanced with CPGs outperform plain LLMs with ZSP in providing accurate recommendations for COVID-19 outpatient treatment, highlighting the potential for broader applications beyond the case study.
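
The abstract names the methods but does not describe how they are implemented. As a rough illustration of the general idea only, the sketch below contrasts a zero-shot prompt with a prompt that carries a guideline encoded as a binary decision tree. Everything in it is an assumption made for illustration: the `Node` class, the `render`, `zero_shot_prompt`, and `guideline_prompt` helpers, the prompt wording, and the placeholder criteria and options are hypothetical and are not taken from the paper or from any actual guideline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of a binary decision tree (hypothetical structure).

    An internal node holds a yes/no question and must have both children;
    a leaf holds a recommendation and has none.
    """
    text: str
    yes: Optional["Node"] = None
    no: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.yes is None and self.no is None


def render(node: Node, depth: int = 0) -> list[str]:
    """Flatten the tree into indented IF/ELSE lines that can sit in a prompt."""
    pad = "  " * depth
    if node.is_leaf:
        return [f"{pad}RECOMMEND: {node.text}"]
    lines = [f"{pad}IF {node.text}:"]
    lines += render(node.yes, depth + 1)
    lines.append(f"{pad}ELSE:")
    lines += render(node.no, depth + 1)
    return lines


def zero_shot_prompt(patient: str) -> str:
    """Baseline: the patient description with no guideline attached."""
    return f"Patient description:\n{patient}\n\nWhat treatment do you recommend?"


def guideline_prompt(tree: Node, patient: str) -> str:
    """Guideline-augmented: the rendered tree precedes the same question."""
    guideline = "\n".join(render(tree))
    return (
        "Follow this guideline decision tree when answering.\n"
        f"{guideline}\n\n"
        f"{zero_shot_prompt(patient)}"
    )


# Placeholder tree: criteria and options are deliberately abstract,
# not clinical content.
TREE = Node(
    "criterion A holds",
    yes=Node("criterion B holds", yes=Node("option X"), no=Node("option Y")),
    no=Node("option Z"),
)

if __name__ == "__main__":
    print(guideline_prompt(TREE, "synthetic patient description goes here"))
```

The two prompt builders share the same question, so any difference in a model's response can be attributed to the guideline text that precedes it, which mirrors the baseline-versus-enhanced comparison the abstract reports.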
