Generating Synthetic Healthcare Dialogues in Emergency Medicine Using Large Language Models.

Denis Moser, Matthias Bender, Murat Sariyar
{"title":"Generating Synthetic Healthcare Dialogues in Emergency Medicine Using Large Language Models.","authors":"Denis Moser, Matthias Bender, Murat Sariyar","doi":"10.3233/SHTI241099","DOIUrl":null,"url":null,"abstract":"<p><p>Natural Language Processing (NLP) has shown promise in fields like radiology for converting unstructured into structured data, but acquiring suitable datasets poses several challenges, including privacy concerns. Specifically, we aim to utilize Large Language Models (LLMs) to extract medical information from dialogues between ambulance staff and patients to populate emergency protocol forms. However, we currently lack dialogues with known content that can serve as a gold standard for an evaluation. We designed a pipeline using the quantized LLM \"Zephyr-7b-beta\" for initial dialogue generation, followed by refinement and translation using OpenAI's GPT-4 Turbo. The MIMIC-IV database provided relevant medical data. The evaluation involved accuracy assessment via Retrieval-Augmented Generation (RAG) and sentiment analysis using multilingual models. Initial results showed a high accuracy of 94% with \"Zephyr-7b-beta,\" slightly decreasing to 87% after refinement with GPT-4 Turbo. Sentiment analysis indicated a qualitative shift towards more positive sentiment post-refinement. These findings highlight the potential and challenges of using LLMs for generating synthetic medical dialogues, informing future NLP system development in healthcare.</p>","PeriodicalId":94357,"journal":{"name":"Studies in health technology and informatics","volume":"321 ","pages":"235-239"},"PeriodicalIF":0.0000,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Studies in health technology and informatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3233/SHTI241099","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Natural Language Processing (NLP) has shown promise in fields like radiology for converting unstructured data into structured data, but acquiring suitable datasets poses several challenges, including privacy concerns. Specifically, we aim to utilize Large Language Models (LLMs) to extract medical information from dialogues between ambulance staff and patients in order to populate emergency protocol forms. However, we currently lack dialogues with known content that can serve as a gold standard for evaluation. We designed a pipeline using the quantized LLM "Zephyr-7b-beta" for initial dialogue generation, followed by refinement and translation using OpenAI's GPT-4 Turbo. The MIMIC-IV database provided the relevant medical data. The evaluation involved accuracy assessment via Retrieval-Augmented Generation (RAG) and sentiment analysis using multilingual models. Initial results showed high accuracy (94%) with "Zephyr-7b-beta," which decreased to 87% after refinement with GPT-4 Turbo. Sentiment analysis indicated a qualitative shift towards more positive sentiment post-refinement. These findings highlight the potential and challenges of using LLMs to generate synthetic medical dialogues, informing future NLP system development in healthcare.
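To make the described pipeline concrete, the following is a minimal sketch of the two-stage generation step: a quantized Zephyr-7b-beta drafts a dialogue from a case summary (as would be derived from MIMIC-IV), and GPT-4 Turbo then refines and translates the draft. The Hugging Face model id, the 4-bit quantization settings, the prompt wording, the OpenAI model name, and the target language are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of the two-stage pipeline (assumptions noted above):
# stage 1 drafts a dialogue with a quantized Zephyr-7b-beta checkpoint,
# stage 2 refines and translates the draft with GPT-4 Turbo.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from openai import OpenAI


def generate_draft_dialogue(case_summary: str) -> str:
    """Stage 1: draft an ambulance-staff/patient dialogue from a case summary."""
    model_id = "HuggingFaceH4/zephyr-7b-beta"  # assumed checkpoint; 4-bit quantization is illustrative
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),
        device_map="auto",
    )
    messages = [
        {"role": "system",
         "content": "You write realistic dialogues between ambulance staff and a patient."},
        {"role": "user",
         "content": f"Write a short emergency dialogue based on this case:\n{case_summary}"},
    ]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.7)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)


def refine_dialogue(draft: str, target_language: str = "German") -> str:
    """Stage 2: refine and translate the draft; the target language is an assumption."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system",
             "content": "You polish medical dialogues and translate them faithfully."},
            {"role": "user",
             "content": f"Refine this dialogue and translate it into {target_language}:\n{draft}"},
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    # Toy case summary; in the study, summaries would come from MIMIC-IV records.
    draft = generate_draft_dialogue("68-year-old male, chest pain radiating to the left arm, diaphoresis.")
    print(refine_dialogue(draft))
```

In the evaluation described in the abstract, the generated dialogues would then be checked against the known MIMIC-IV source data (via a RAG-based accuracy assessment) and scored with multilingual sentiment models; those steps are not shown in this sketch.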
