Classification of Use Status for Dietary Supplements in Clinical Notes.

Yadan Fan, Lu He, Rui Zhang
Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, vol. 2016, pp. 1054-1061. Published 2016-12-01 (Epub 2017-01-19). DOI: 10.1109/BIBM.2016.7822668 (https://doi.org/10.1109/BIBM.2016.7822668)
Citations: 7

Abstract

Clinical notes contain rich information about dietary supplements, which is critical for detecting signals of dietary supplement side effects and interactions between drugs and supplements. One important element of supplement documentation is usage status, such as whether a supplement was started or discontinued. Such information is usually stored in unstructured clinical notes. We developed a rule-based classifier to identify supplement usage status in clinical notes. The patient's status of supplement use was classified into four classes: Continuing (C), Discontinued (D), Started (S), and Unclassified (U). Clinical notes mentioning 10 of the most commonly consumed supplements (i.e., alfalfa, echinacea, fish oil, garlic, ginger, ginkgo, ginseng, melatonin, St. John's Wort, and Vitamin E) were retrieved from the University of Minnesota Clinical Data Repository. The gold standard was defined by manually annotating 1000 randomly selected sentences or statements mentioning at least one of these 10 supplements. The rules in the classifier were initially developed on two-thirds of the data for a set of 7 supplements (i.e., alfalfa, garlic, ginger, ginkgo, ginseng, St. John's Wort, and Vitamin E); performance was evaluated on the remaining one-third of this set. To evaluate the generalizability of the rules, we further validated them on a second test set covering the other 3 supplements (i.e., echinacea, fish oil, and melatonin). The classifier achieved F-measures of 0.95, 0.97, 0.96, and 0.96 for status C, D, S, and U, respectively, on the 7 supplements. The classifier also showed good generalizability when applied to the other 3 supplements, with F-measures of 0.96 for C, 0.96 for D, 0.95 for S, and 0.89 for U. This study demonstrated that the classifier can accurately classify supplement usage status and can be integrated as a module into an existing natural language processing pipeline to support dietary supplement knowledge discovery.
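
To make the approach concrete, here is a minimal sketch of what a rule-based status classifier and a per-class F-measure might look like. The abstract does not list the actual rules, so the trigger patterns, the rule ordering, and the names `RULES`, `classify_status`, and `f_measure` are all hypothetical illustrations and not the authors' implementation.

```python
import re

# Hypothetical trigger patterns, checked in order; the first match wins.
# The rules used in the paper are not given in the abstract.
RULES = [
    ("D", re.compile(
        r"\b(?:stop(?:ped|ping|s)?|discontinu(?:e|ed|es|ing)|quit|"
        r"no longer (?:taking|using))\b", re.IGNORECASE)),
    ("S", re.compile(
        r"\b(?:start(?:ed|ing|s)?|began|begin|initiat(?:e|ed|es|ing))\b",
        re.IGNORECASE)),
    ("C", re.compile(
        r"\b(?:continu(?:e|es|ed|ing)|takes|taking|currently)\b",
        re.IGNORECASE)),
]


def classify_status(sentence: str) -> str:
    """Return C, D, S, or U for a sentence that mentions a supplement."""
    for label, pattern in RULES:
        if pattern.search(sentence):
            return label
    return "U"  # Unclassified: no rule fired


def f_measure(gold: list[str], pred: list[str], label: str) -> float:
    """Balanced F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Invented example sentences, not taken from the study's data.
    sentences = [
        "Patient stopped taking ginkgo last month.",
        "Started fish oil 1000 mg daily.",
        "She continues to take melatonin at bedtime.",
        "Discussed risks of St. John's Wort.",
    ]
    for s in sentences:
        print(classify_status(s), "|", s)
```

Checking the Discontinued patterns first is a design choice in this sketch: a phrase such as "stopped taking" contains a Continuing cue ("taking"), so the more specific status has to win. Scoring each class separately with `f_measure` mirrors how the abstract reports one F-measure per status.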
