Generating medically-accurate summaries of patient-provider dialogue: A multi-stage approach using large language models

Clinical Natural Language Processing Workshop | Pub Date: 2023-05-10 | DOI: 10.48550/arXiv.2305.05982

Varun Nair, Elliot Schumacher, Anitha Kannan


Citations: 1

Abstract


A medical provider’s summary of a patient visit serves several critical purposes, including clinical decision-making, facilitating hand-offs between providers, and as a reference for the patient. An effective summary is required to be coherent and accurately capture all the medically relevant information in the dialogue, despite the complexity of patient-generated language. Even minor inaccuracies in visit summaries (for example, summarizing “patient does not have a fever” when a fever is present) can be detrimental to the outcome of care for the patient. This paper tackles the problem of medical conversation summarization by discretizing the task into several smaller dialogue-understanding tasks that are sequentially built upon. First, we identify medical entities and their affirmations within the conversation to serve as building blocks. We study dynamically constructing few-shot prompts for tasks by conditioning on relevant patient information and use GPT-3 as the backbone for our experiments. We also develop GPT-derived summarization metrics to measure performance against reference summaries quantitatively. Both our human evaluation study and metrics for medical correctness show that summaries generated using this approach are clinically accurate and outperform the baseline approach of summarizing the dialogue in a zero-shot, single-prompt setting.
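The staged idea in the abstract (extract entities with their affirmation status first, then build a few-shot prompt conditioned on them) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `Example` record, the Jaccard-overlap selection rule, the prompt wording, and the `extract` and `complete` callables, which stand in for whatever model calls a real system would make.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Example:
    """A labelled dialogue available for few-shot prompting (hypothetical record)."""
    dialogue: str
    entities: Set[str]  # medical concepts mentioned in the dialogue
    summary: str        # reference summary for the dialogue


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap in [0, 1]; returns 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_examples(pool: List[Example], entities: Set[str], k: int = 2) -> List[Example]:
    """Pick the k pool examples whose entities overlap most with the current patient's.

    Assumed selection rule; the paper only says prompts are conditioned on
    relevant patient information.
    """
    return sorted(pool, key=lambda ex: jaccard(ex.entities, entities), reverse=True)[:k]


def build_prompt(examples: List[Example], dialogue: str, entities: Dict[str, str]) -> str:
    """Assemble a few-shot prompt that lists each entity with its affirmation status."""
    parts = ["Summarize the dialogue. Use only the listed entities and their status."]
    for ex in examples:
        parts.append(f"Dialogue: {ex.dialogue}\nSummary: {ex.summary}")
    listed = "\n".join(f"- {name}: {status}" for name, status in sorted(entities.items()))
    parts.append(f"Dialogue: {dialogue}\nEntities:\n{listed}\nSummary:")
    return "\n\n".join(parts)


def summarize(
    dialogue: str,
    pool: List[Example],
    extract: Callable[[str], Dict[str, str]],  # stage 1: entity -> "present"/"absent"/"unknown"
    complete: Callable[[str], str],            # stage 2: any text-completion function
    k: int = 2,
) -> str:
    """Run the two stages in sequence: extract entities, then summarize from them."""
    entities = extract(dialogue)
    shots = select_examples(pool, set(entities), k)
    return complete(build_prompt(shots, dialogue, entities))
```

Passing `extract` and `complete` as plain callables keeps the sketch independent of any particular model API, so each stage can be stubbed and checked on its own before a real model is plugged in.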
