Zero- and few-shot Named Entity Recognition and Text Expansion in medication prescriptions using large language models

Impact factor 6.2 · Zone 2 (Medicine) · JCR Q1 (Computer Science, Artificial Intelligence)
Natthanaphop Isaradech, Andrea Riedel, Wachiranun Sirikul, Markus Kreuzthaler, Stefan Schulz
Journal: Artificial Intelligence in Medicine, volume 167, Article 103165
Publication type: Journal Article
Publication date: 2025-06-20
DOI: 10.1016/j.artmed.2025.103165
URL: https://www.sciencedirect.com/science/article/pii/S0933365725001009
Open access: no
Citations: 0

Abstract

Medication prescriptions in electronic health records (EHR) are often written as free text and may include a mix of languages, local brand names, and a wide range of idiosyncratic formats and abbreviations. Large language models (LLMs) have shown a promising ability to generate text in response to input prompts. We use ChatGPT3.5 to automatically structure and expand medication statements in discharge summaries and thus make them easier for people and machines to interpret. Named Entity Recognition (NER) and Text Expansion (EX) are performed with different prompt strategies in zero- and few-shot settings. A total of 100 medication statements were manually annotated and curated. NER performance was measured using strict and partial matching. For the EX task, two experts assessed semantic equivalence between the original and expanded statements. Model performance was measured by precision, recall, and F1 score. For NER, the best-performing prompt reached an average F1 score of 0.94 on the test set. For EX, the few-shot prompt outperformed the other prompts, with an average F1 score of 0.87. Our study demonstrates good performance on NER and EX tasks in free-text medication statements using ChatGPT3.5. Compared to a zero-shot baseline, a few-shot approach prevented the system from hallucinating, which is essential when processing safety-relevant medication data. We also tested the ChatGPT3.5-tuned prompts on other LLMs, including ChatGPT4o, Gemini 2.0 Flash, MedLM-1.5-Large, and DeepSeekV3. The findings showed that most of these models outperformed ChatGPT3.5 on both the NER and EX tasks.
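The abstract reports NER scores under strict and partial matching without spelling out the matching rules. The sketch below shows one common way such scores can be computed; it is not the authors' evaluation code. The `Entity` type, the label names (`DRUG`, `DOSE`, `FREQ`), the character-offset convention, the "overlap by at least one character" rule for partial matches, and the toy prescription are all assumptions made for illustration.

```python
from typing import Callable, NamedTuple


class Entity(NamedTuple):
    label: str   # entity type; label names here are hypothetical
    start: int   # start character offset, inclusive
    end: int     # end character offset, exclusive


def strict_match(gold: Entity, pred: Entity) -> bool:
    """Strict: same label and exactly the same span."""
    return gold == pred


def partial_match(gold: Entity, pred: Entity) -> bool:
    """Partial (assumed rule): same label and spans overlap by >= 1 character."""
    return (gold.label == pred.label
            and gold.start < pred.end
            and pred.start < gold.end)


def prf1(gold: list[Entity], pred: list[Entity],
         match: Callable[[Entity, Entity], bool]) -> tuple[float, float, float]:
    """Precision, recall and F1; each gold entity can be matched at most once."""
    unmatched = list(gold)
    true_positives = 0
    for p in pred:
        for g in unmatched:
            if match(g, p):
                unmatched.remove(g)  # safe: we leave the inner loop right away
                true_positives += 1
                break
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented toy statement: "ASS 100 mg 1-0-0"
gold = [Entity("DRUG", 0, 3), Entity("DOSE", 4, 10), Entity("FREQ", 11, 16)]
# Hypothetical model output: dose span cut short, frequency missed.
pred = [Entity("DRUG", 0, 3), Entity("DOSE", 4, 7)]

strict = prf1(gold, pred, strict_match)
partial = prf1(gold, pred, partial_match)
print("strict  P/R/F1:", strict)
print("partial P/R/F1:", partial)
```

On this toy input the truncated dose span counts as an error under strict matching but as a hit under partial matching, so the partial F1 (about 0.8) is higher than the strict F1 (about 0.4). The greedy one-to-one pairing is a simplification; an evaluation with many overlapping predictions might need a more careful alignment.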
Source journal
Artificial Intelligence in Medicine (Engineering and Technology: Biomedical Engineering)
CiteScore: 15.00
Self-citation rate: 2.70%
Articles published: 143
Review time: 6.3 months
About the journal: Artificial Intelligence in Medicine publishes original articles from a wide variety of interdisciplinary perspectives concerning the theory and practice of artificial intelligence (AI) in medicine, medically-oriented human biology, and health care. Artificial intelligence in medicine may be characterized as the scientific discipline pertaining to research studies, projects, and applications that aim at supporting decision-based medical tasks through knowledge- and/or data-intensive computer-based solutions that ultimately support and improve the performance of a human care provider.