The Impact of Access to Clinical Guidelines on LLM-Based Treatment Recommendations for Chronic Hepatitis B
Robert Siepmann, Carolin Victoria Schneider, Marc Sebastian von der Stueck, Iakovos Amygdalos, Karsten Große, Kai Markus Schneider, Maike Rebecca Pollmanns, Mohamad Murad, Joel Joy, Elena Kabak, Marcella Ricardis May, Jan Clusmann, Christiane Kuhl, Sven Nebelung, Jakob Nikolas Kather, Daniel Truhn
Liver International, 45(10), published 2025-09-02. DOI: 10.1111/liv.70324 (https://onlinelibrary.wiley.com/doi/10.1111/liv.70324)
Abstract
Background and Aims
Large language models (LLMs) can potentially support clinicians in their daily routine by providing easy access to information. Yet they are prone to stating incorrect facts and hallucinating when queried. Providing external databases as additional context when prompting LLMs may decrease the risk of misinformation. This study compares the influence of such added context on the coherence of LLM-based treatment recommendations with the recently updated WHO guidelines for the treatment of chronic hepatitis B (CHB).
Methods
GPT-4 was queried with five clinical case vignettes in two configurations: with and without additional context. The vignettes were explicitly constructed so that the recommended treatment differed between the formerly applicable 2015 WHO guidelines and the updated 2024 guidelines. GPT-4 with context was given access to the updated guidelines, while GPT-4 without context had to rely on its internal knowledge. GPT-4 was accessed only a few days after the release of the new WHO guidelines. Seven physicians compared the treatment recommendations with regard to guideline coherence, inclusion of the vignette information, textual errors, and clarity and precision of wording.
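The abstract does not specify how the guideline text was supplied to the model. Purely as an illustration, the following minimal sketch shows one plausible way to run the two configurations via the OpenAI chat API; the model identifier, prompt wording, the helper recommend_treatment, the guideline_text parameter, and the example vignette are assumptions for illustration, not the authors' reported setup.

```python
# Illustrative sketch only: query GPT-4 with a clinical vignette in two
# configurations, without and with guideline text as additional context.
# Model name, prompt wording, and the guideline excerpt are assumptions,
# not the authors' exact protocol.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment


def recommend_treatment(vignette: str, guideline_text: str | None = None) -> str:
    """Ask the model for a CHB treatment recommendation, optionally with guideline context."""
    system = "You are a hepatology assistant. Recommend treatment for chronic hepatitis B."
    if guideline_text is not None:
        # "With context" configuration: prepend the updated guideline excerpt.
        system += "\n\nBase your recommendation strictly on these guidelines:\n" + guideline_text

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": vignette},
        ],
        temperature=0,  # deterministic output eases comparison between configurations
    )
    return response.choices[0].message.content


# Hypothetical usage (vignette and file name are placeholders):
# vignette = "54-year-old male, HBeAg-negative, HBV DNA 15,000 IU/mL, ALT 1.5x ULN, F2 fibrosis."
# baseline = recommend_treatment(vignette)
# with_context = recommend_treatment(vignette, guideline_text=open("who_chb_2024_excerpt.txt").read())
```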
Results
Using GPT-4 with context increased the coherence of the treatment recommendations with the new 2024 guidelines from 51% to 91% compared with GPT-4 without context. Similar trends were observed for all other categories, with increases of 54% in precision and clarity, 24% in completeness of incorporating the case vignette information, and 12% in textual correctness.
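The abstract does not describe the rating scale or how the percentages were derived. As a sketch only, the snippet below shows one way ratings from seven physicians across five vignettes could be aggregated into percentage scores of this kind; the 1-5 scale, the min-max rescaling, and the placeholder ratings are assumptions.

```python
# Illustration only: aggregate physician ratings into a single percentage score.
# The 1-5 rating scale and the normalisation are assumptions; the abstract does
# not report the actual scoring or aggregation method.
import numpy as np


def percentage_score(ratings: np.ndarray, scale_min: int = 1, scale_max: int = 5) -> float:
    """Mean rating across raters and vignettes, rescaled to 0-100%."""
    return 100.0 * (ratings.mean() - scale_min) / (scale_max - scale_min)


# Hypothetical ratings: rows = 7 physicians, columns = 5 case vignettes.
rng = np.random.default_rng(0)
without_context = rng.integers(2, 5, size=(7, 5))  # placeholder values
with_context = rng.integers(4, 6, size=(7, 5))     # placeholder values

print(f"coherence without context: {percentage_score(without_context):.0f}%")
print(f"coherence with context:    {percentage_score(with_context):.0f}%")
```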
Conclusions
If clinicians consult LLMs for medical advice, the models should be given access to external data sources to increase the chance of receiving factually correct advice.