Natural Language Processing for Enhanced Clinical Decision Support in Allergy Verification for Medication Prescriptions

Mayo Clinic Proceedings. Digital health. Pub Date: 2025-06-10. DOI: 10.1016/j.mcpdig.2025.100244

Juan Pablo Botero-Aguirre MS , Michael Andrés García-Rivera MS

Volume 3, Issue 3, Article 100244. Full text: https://www.sciencedirect.com/science/article/pii/S2949761225000513


Abstract

Objective

To develop and validate a named entity recognition (NER) model, built on a BERT-based model trained on Spanish-language corpora, for extracting allergy-related information from unstructured electronic health records.

Patients and Methods

The model was fine-tuned using 16,176 manually annotated allergy-related entities from anonymized records of patients hospitalized between January 1, 2021, and June 30, 2024. The data set was divided into training (80%) and testing (20%) subsets, and model performance was evaluated using accuracy, recall, and F1 score. The validated model was then applied to a separate data set of 80,917 medication prescriptions from 5859 patients hospitalized during August and September 2024 who had at least one prescribed medication, to detect potential prescription errors. Sensitivity, specificity, and Cohen κ were calculated using manual expert review as the gold standard.
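The validation metrics named above can all be derived from a 2×2 table of model flags against expert review. The following is a minimal sketch of that arithmetic, not the authors' code: the function name and the counts in the usage example are hypothetical, and the abstract does not report the underlying confusion matrix.

```python
def binary_agreement_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute sensitivity, specificity, NPV, and Cohen's kappa.

    tp/fp/fn/tn are counts of model flags versus the expert gold standard.
    """
    total = tp + fp + fn + tn

    sensitivity = tp / (tp + fn)  # share of expert-confirmed errors the model flagged
    specificity = tn / (tn + fp)  # share of expert-cleared cases the model cleared
    npv = tn / (tn + fn)          # share of model-cleared cases that were truly clear

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / total
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    kappa = (observed - expected) / (1 - expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "npv": npv,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not taken from the study.
    for name, value in binary_agreement_metrics(tp=40, fp=10, fn=10, tn=940).items():
        print(f"{name}: {value:.4f}")
```

When positives are rare, as with prescription errors, specificity and negative predictive value tend to be very high even at moderate sensitivity, so the chance-corrected κ is a useful complement.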

Results

The model achieved an accuracy of 87.28% and an F1 score of 0.80. It effectively identified medication names (F1=0.91) and adverse reactions (F1=0.85) but struggled with recommendation-related entities (F1=0.29). The model detected prescription errors in 0.96% of cases, with a sensitivity of 75.73% and specificity of 99.98%. The weighted κ score (0.7797) indicated substantial agreement with expert annotations.
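Per-entity F1 scores such as those reported above are commonly computed by comparing predicted entity spans with annotated ones, one label at a time. The sketch below assumes exact-match scoring over (start, end, label) spans; the abstract does not state which matching criterion was used, and the label names and spans in the example are hypothetical.

```python
from collections import defaultdict

# A span is (start offset, end offset, entity label); an assumed representation.
Span = tuple[int, int, str]


def per_label_f1(gold: set[Span], predicted: set[Span]) -> dict[str, float]:
    """Exact-match F1 per entity label."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})

    # A predicted span is a true positive only if an identical gold span exists.
    for span in predicted:
        counts[span[2]]["tp" if span in gold else "fp"] += 1

    # Gold spans the model did not reproduce exactly are false negatives.
    for span in gold - predicted:
        counts[span[2]]["fn"] += 1

    scores = {}
    for label, c in counts.items():
        denominator = 2 * c["tp"] + c["fp"] + c["fn"]
        scores[label] = 2 * c["tp"] / denominator if denominator else 0.0
    return scores


if __name__ == "__main__":
    # Hypothetical annotations for illustration only.
    gold = {(0, 10, "MEDICATION"), (20, 28, "REACTION"), (40, 50, "MEDICATION")}
    predicted = {
        (0, 10, "MEDICATION"),
        (20, 27, "REACTION"),  # boundary off by one, so no exact match
        (40, 50, "MEDICATION"),
        (60, 65, "MEDICATION"),  # spurious prediction
    }
    print(per_label_f1(gold, predicted))
```

Under exact matching, a boundary error counts as both a false positive and a false negative, which is one reason loosely delimited entity types can score much lower than short, well-bounded ones such as medication names.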

Conclusion

The NER model, built on a BERT-based model trained on Spanish-language corpora, demonstrated strong performance in identifying nonallergic cases (specificity, 99.98%; negative predictive value, 99.97%) and showed promise for clinical decision support. Despite moderate sensitivity (75.73%), these results highlight the feasibility of using Spanish-language NER models to enhance medication safety.

