Retrieving Evidence from EHRs with LLMs: Possibilities and Challenges.

Proceedings of machine learning research Pub Date : 2024-06-01

Hiba Ahsan, Denis Jered McInerney, Jisoo Kim, Christopher Potter, Geoffrey Young, Silvio Amir, Byron C Wallace

{"title":"Retrieving Evidence from EHRs with LLMs: Possibilities and Challenges.","authors":"Hiba Ahsan, Denis Jered McInerney, Jisoo Kim, Christopher Potter, Geoffrey Young, Silvio Amir, Byron C Wallace","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>Unstructured data in Electronic Health Records (EHRs) often contains critical information-complementary to imaging-that could inform radiologists' diagnoses. But the large volume of notes often associated with patients together with time constraints renders manually identifying relevant evidence practically infeasible. In this work we propose and evaluate a zero-shot strategy for using LLMs as a mechanism to efficiently retrieve and summarize unstructured evidence in patient EHR relevant to a given query. Our method entails tasking an LLM to infer whether a patient has, or is at risk of, a particular condition on the basis of associated notes; if so, we ask the model to summarize the supporting evidence. Under expert evaluation, we find that this LLM-based approach provides outputs consistently preferred to a pre-LLM information retrieval baseline. Manual evaluation is expensive, so we also propose and validate a method using an LLM to evaluate (other) LLM outputs for this task, allowing us to scale up evaluation. Our findings indicate the promise of LLMs as interfaces to EHR, but also highlight the outstanding challenge posed by \"hallucinations\". In this setting, however, we show that model confidence in outputs strongly correlates with faithful summaries, offering a practical means to limit confabulations.</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"248 ","pages":"489-505"},"PeriodicalIF":0.0000,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11368037/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of machine learning research","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

Abstract

Unstructured data in Electronic Health Records (EHRs) often contains critical information-complementary to imaging-that could inform radiologists' diagnoses. But the large volume of notes often associated with patients together with time constraints renders manually identifying relevant evidence practically infeasible. In this work we propose and evaluate a zero-shot strategy for using LLMs as a mechanism to efficiently retrieve and summarize unstructured evidence in patient EHR relevant to a given query. Our method entails tasking an LLM to infer whether a patient has, or is at risk of, a particular condition on the basis of associated notes; if so, we ask the model to summarize the supporting evidence. Under expert evaluation, we find that this LLM-based approach provides outputs consistently preferred to a pre-LLM information retrieval baseline. Manual evaluation is expensive, so we also propose and validate a method using an LLM to evaluate (other) LLM outputs for this task, allowing us to scale up evaluation. Our findings indicate the promise of LLMs as interfaces to EHR, but also highlight the outstanding challenge posed by "hallucinations". In this setting, however, we show that model confidence in outputs strongly correlates with faithful summaries, offering a practical means to limit confabulations.

本刊更多论文

利用 LLM 从电子病历中检索证据：可能性与挑战。

电子健康记录（EHR）中的非结构化数据通常包含重要的信息--与影像资料互为补充--可为放射科医生的诊断提供依据。但是，由于患者的笔记数量庞大，再加上时间限制，人工识别相关证据实际上是不可行的。在这项工作中，我们提出并评估了一种零点策略，利用 LLM 作为一种机制，有效检索和总结病人电子病历中与给定查询相关的非结构化证据。我们的方法要求 LLM 根据相关笔记推断病人是否患有某种疾病或是否有患病风险；如果是，我们要求模型总结支持性证据。通过专家评估，我们发现这种基于 LLM 的方法所提供的输出结果始终优于 LLM 前的信息检索基线。人工评估的成本很高，因此我们还提出并验证了一种使用 LLM 评估（其他）LLM 输出的方法，使我们能够扩大评估范围。我们的研究结果表明了 LLM 作为电子病历接口的前景，但也强调了 "幻觉 "带来的巨大挑战。不过，在这种情况下，我们发现模型对输出结果的信心与忠实摘要密切相关，这为限制幻觉提供了一种切实可行的方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Proceedings of machine learning research

自引率

0.00%

发文量