From Free-text Drug Labels to Structured Medication Terminology with BERT and GPT.

AMIA ... Annual Symposium proceedings. AMIA Symposium Pub Date : 2024-01-11 eCollection Date: 2023-01-01
Duy-Hoa Ngo, Bevan Koopman
Volume 2023, pages 540-549. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785872/pdf/
Citations: 0

Abstract


We present a method to enrich controlled medication terminology from free-text drug labels. This is important because, while controlled medication terminologies capture well-structured medication information, much of the information pertaining to medications is still found in free text. First, we compared different Named Entity Recognition (NER) models, including rule-based, feature-based, and deep learning-based models with Transformers, as well as ChatGPT and few-shot and fine-tuned GPT-3, to find the most suitable model for accurately extracting medication entities (ingredients, brand, dose, etc.) from free text. Then, a rule-based Relation Extraction algorithm transforms the NER results into a well-structured medication knowledge graph. Finally, a Medication Searching method takes the knowledge graph and matches it to relevant medications in the terminology server. An empirical evaluation on real-world drug labels shows that BERT-CRF was the most effective NER model, with an F-measure of 95%. After term normalization, Medication Searching achieved an accuracy of 77% when matching a label to the relevant medication in the terminology server. The NER and Medication Searching models could be deployed as a web service that accepts free-text queries and returns structured medication information, thus providing a useful means of better managing medication information found in different health systems.
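The three stages described in the abstract (entity extraction, rule-based relation extraction into a graph, and search against a terminology) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the entity tagger is a toy dictionary-and-regex stand-in for the BERT-CRF model the paper reports as most effective, the "attach each dose to the preceding ingredient" rule is an assumed example of a relation rule, and the vocabularies, the `DEMO-…` codes, and all function names are hypothetical.

```python
import re

# Hypothetical toy vocabularies; a real system would use a trained NER model.
INGREDIENTS = {"paracetamol", "ibuprofen", "codeine"}
BRANDS = {"panadol", "nurofen"}
FORMS = {"tablet", "capsule", "syrup"}

# A dose such as "500 mg" or "8mg", or a plain lowercase word.
TOKEN_RE = re.compile(r"\d+(?:\.\d+)?\s?(?:mg|mcg|ml|g)\b|[a-z]+")


def tag_entities(label):
    """Stage 1 (stand-in for NER): return (text, entity_type) pairs."""
    entities = []
    for tok in TOKEN_RE.findall(label.lower()):
        if tok[0].isdigit():
            # Normalize "500 mg" and "500mg" to the same surface form.
            entities.append((tok.replace(" ", ""), "DOSE"))
        elif tok in INGREDIENTS:
            entities.append((tok, "INGREDIENT"))
        elif tok in BRANDS:
            entities.append((tok, "BRAND"))
        elif tok in FORMS:
            entities.append((tok, "FORM"))
    return entities


def build_graph(entities):
    """Stage 2 (assumed rule): link each dose to the nearest preceding ingredient."""
    graph = {"brand": None, "form": None, "ingredients": []}
    for text, kind in entities:
        if kind == "INGREDIENT":
            graph["ingredients"].append({"name": text, "strength": None})
        elif kind == "DOSE":
            if graph["ingredients"] and graph["ingredients"][-1]["strength"] is None:
                graph["ingredients"][-1]["strength"] = text
        elif kind == "BRAND":
            graph["brand"] = text
        elif kind == "FORM":
            graph["form"] = text
    return graph


# Hypothetical in-memory terminology; the paper queries a terminology server.
TERMINOLOGY = [
    {"code": "DEMO-001", "form": "tablet",
     "ingredients": {("paracetamol", "500mg")}},
    {"code": "DEMO-002", "form": "tablet",
     "ingredients": {("paracetamol", "500mg"), ("codeine", "8mg")}},
    {"code": "DEMO-003", "form": "capsule",
     "ingredients": {("ibuprofen", "200mg")}},
]


def search_medication(graph, terminology=TERMINOLOGY):
    """Stage 3: return codes whose ingredients, strengths, and form all match."""
    wanted = {(i["name"], i["strength"]) for i in graph["ingredients"]}
    return [
        entry["code"]
        for entry in terminology
        if entry["ingredients"] == wanted and entry["form"] == graph["form"]
    ]


graph = build_graph(tag_entities("Panadol Paracetamol 500 mg + Codeine 8mg tablet"))
codes = search_medication(graph)
```

Exact set matching is used here only to keep the sketch short. The 77% accuracy reported after term normalization suggests that real labels need considerably more tolerant matching than this.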
