A refined set of RxNorm drug names for enhancing unstructured data analysis in drug safety surveillance.

IF 2.8, Zone 4 (Medicine), JCR Q2 MEDICINE, RESEARCH & EXPERIMENTAL
Experimental Biology and Medicine Pub Date : 2025-05-02 eCollection Date: 2025-01-01 DOI:10.3389/ebm.2025.10374
Wenjing Guo, Fan Dong, Jie Liu, Aasma Aslam, Tucker A Patterson, Huixiao Hong
{"title":"A refined set of RxNorm drug names for enhancing unstructured data analysis in drug safety surveillance.","authors":"Wenjing Guo, Fan Dong, Jie Liu, Aasma Aslam, Tucker A Patterson, Huixiao Hong","doi":"10.3389/ebm.2025.10374","DOIUrl":null,"url":null,"abstract":"<p><p>Adverse drug events are harms associated with drug use, whether the drug is used correctly or incorrectly. Identifying adverse drug events is vital in pharmacovigilance to safeguard public health. Drug safety surveillance can be performed using unstructured data. A comprehensive and accurate list of drug names is essential for effective identification of adverse drug events. While there are numerous sources for drug names, RxNorm is widely recognized as a leading resource. However, its effectiveness for unstructured data analysis in drug safety surveillance has not been thoroughly assessed. To address this, we evaluated the drug names in RxNorm for their suitability in unstructured data analysis and developed a refined set of drug names. Initially, we removed duplicates, the names exceeding 199 characters, and those that only describe administrative details. Drug names with four or fewer characters were analyzed using 18,000 drug-related PubMed abstracts to remove names which rarely appear in unstructured data. The remaining names, which ranged from five to 199 characters, were further refined to exclude those that could lead to inaccurate drug counts in unstructured data analysis. We compared the efficiency and accuracy of the refined set with the original RxNorm set by testing both on the 18,000 drug-related PubMed abstracts. The results showed a decrease in both computational cost and the number of false drug names identified. Further analysis of the removed names revealed that most originated from only one of the 14 sources. 
Our findings suggest that the refined set can enhance drug identification in unstructured data analysis, thereby improving pharmacovigilance.</p>","PeriodicalId":12163,"journal":{"name":"Experimental Biology and Medicine","volume":"250 ","pages":"10374"},"PeriodicalIF":2.8000,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12083459/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Experimental Biology and Medicine","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.3389/ebm.2025.10374","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"MEDICINE, RESEARCH & EXPERIMENTAL","Score":null,"Total":0}

Abstract

Adverse drug events are harms associated with drug use, whether the drug is used correctly or incorrectly. Identifying adverse drug events is vital in pharmacovigilance to safeguard public health. Drug safety surveillance can be performed using unstructured data, and a comprehensive and accurate list of drug names is essential for identifying adverse drug events in such data. While there are numerous sources for drug names, RxNorm is widely recognized as a leading resource. However, its effectiveness for unstructured data analysis in drug safety surveillance has not been thoroughly assessed. To address this, we evaluated the drug names in RxNorm for their suitability in unstructured data analysis and developed a refined set of drug names. First, we removed duplicates, names exceeding 199 characters, and names that only describe administrative details. Drug names with four or fewer characters were then checked against 18,000 drug-related PubMed abstracts to remove names that rarely appear in unstructured data. The remaining names, which ranged from five to 199 characters, were further refined to exclude those that could lead to inaccurate drug counts in unstructured data analysis. We compared the efficiency and accuracy of the refined set with the original RxNorm set by testing both on the same 18,000 drug-related PubMed abstracts. The results showed a decrease in both computational cost and the number of false drug names identified. Further analysis of the removed names revealed that most originated from only one of the 14 sources. Our findings suggest that the refined set can enhance drug identification in unstructured data analysis, thereby improving pharmacovigilance.
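
The first two refinement steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the case-insensitive duplicate check, the whole-word matching rule, and the `MIN_HITS` threshold are all assumptions made for illustration, since the abstract states only the 199-character and four-character cutoffs. The removal of administrative-only names and of names that distort drug counts is not shown, because the abstract does not give the criteria.

```python
import re
from collections import Counter

MAX_LEN = 199    # from the abstract: longer names are removed
SHORT_LEN = 4    # from the abstract: names this short get a corpus check
MIN_HITS = 2     # hypothetical cutoff for "rarely appears"; not from the paper


def dedupe_and_trim(names):
    """Drop empty names, duplicates, and names longer than MAX_LEN.

    Duplicates are detected case-insensitively, which is an assumption.
    """
    seen, kept = set(), []
    for name in names:
        cleaned = name.strip()
        key = cleaned.lower()
        if not key or key in seen or len(cleaned) > MAX_LEN:
            continue
        seen.add(key)
        kept.append(cleaned)
    return kept


def count_hits(names, abstracts):
    """Count whole-word, case-insensitive occurrences of each name."""
    counts = Counter()
    for name in names:
        # Lookarounds keep "ASA" from matching inside a longer word.
        pattern = re.compile(
            r"(?<!\w)" + re.escape(name) + r"(?!\w)", re.IGNORECASE
        )
        counts[name] = sum(len(pattern.findall(text)) for text in abstracts)
    return counts


def refine(names, abstracts, min_hits=MIN_HITS):
    """Apply the length and duplicate filters, then the short-name check."""
    kept = dedupe_and_trim(names)
    short = [n for n in kept if len(n) <= SHORT_LEN]
    hits = count_hits(short, abstracts)
    # Long names pass through; short names must appear often enough.
    return [n for n in kept if len(n) > SHORT_LEN or hits[n] >= min_hits]
```

Calling `refine(names, abstracts)` with a list of candidate names and a list of abstract strings returns the retained names in their original order. In practice the threshold and the matching rule would need to be tuned against the corpus, and a case-insensitive match may be too permissive for short abbreviations.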

Source journal: Experimental Biology and Medicine (Medicine: Research & Experimental)
CiteScore: 6.00
Self-citation rate: 0.00%
Articles published: 157
Review time: 1 month
About the journal: Experimental Biology and Medicine (EBM) is a global, peer-reviewed journal dedicated to the publication of multidisciplinary and interdisciplinary research in the biomedical sciences. EBM provides both research and review articles as well as meeting symposia and brief communications. Articles in EBM represent cutting-edge research at the overlapping junctions of the biological, physical and engineering sciences that impact upon the health and welfare of the world's population. Topics covered in EBM include: Anatomy/Pathology; Biochemistry and Molecular Biology; Bioimaging; Biomedical Engineering; Bionanoscience; Cell and Developmental Biology; Endocrinology and Nutrition; Environmental Health/Biomarkers/Precision Medicine; Genomics, Proteomics, and Bioinformatics; Immunology/Microbiology/Virology; Mechanisms of Aging; Neuroscience; Pharmacology and Toxicology; Physiology; Stem Cell Biology; Structural Biology; Systems Biology and Microphysiological Systems; and Translational Research.