FoodIE: A Rule-based Named-entity Recognition Method for Food Information Extraction

Gorjan Popovski, S. Kochev, B. Korousic-Seljak, T. Eftimov
{"title":"FoodIE: A Rule-based Named-entity Recognition Method for Food Information Extraction","authors":"Gorjan Popovski, S. Kochev, B. Korousic-Seljak, T. Eftimov","doi":"10.5220/0007686309150922","DOIUrl":null,"url":null,"abstract":"The application of Natural Language Processing (NLP) methods and resources to biomedical textual data has received growing attention over the past years. Previously organized biomedical NLP-shared tasks (such as, for example, BioNLP Shared Tasks) are related to extracting different biomedical entities (like genes, phenotypes, drugs, diseases, chemical entities) and finding relations between them. However, to the best of our knowledge there are limited NLP methods that can be used for information extraction of entities related to food concepts. For this reason, to extract food entities from unstructured textual data, we propose a rule-based named-entity recognition method for food information extraction, called FoodIE. It is comprised of a small number of rules based on computational linguistics and semantic information that describe the food entities. Experimental results from the evaluation performed using two different datasets showed that very promising results can be achieved. The proposed method achieved 97% precision, 94% recall, and 96% F1 score.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"40","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Pattern Recognition Applications and Methods","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5220/0007686309150922","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 40

Abstract

The application of Natural Language Processing (NLP) methods and resources to biomedical textual data has received growing attention over the past years. Previously organized biomedical NLP-shared tasks (such as, for example, BioNLP Shared Tasks) are related to extracting different biomedical entities (like genes, phenotypes, drugs, diseases, chemical entities) and finding relations between them. However, to the best of our knowledge there are limited NLP methods that can be used for information extraction of entities related to food concepts. For this reason, to extract food entities from unstructured textual data, we propose a rule-based named-entity recognition method for food information extraction, called FoodIE. It is comprised of a small number of rules based on computational linguistics and semantic information that describe the food entities. Experimental results from the evaluation performed using two different datasets showed that very promising results can be achieved. The proposed method achieved 97% precision, 94% recall, and 96% F1 score.
FoodIE:一种基于规则的食品信息抽取命名实体识别方法
自然语言处理(NLP)方法和资源在生物医学文本数据中的应用近年来受到越来越多的关注。以前组织的生物医学nlp共享任务(如BioNLP共享任务)涉及提取不同的生物医学实体(如基因、表型、药物、疾病、化学实体)并找到它们之间的关系。然而,据我们所知,可以用于与食物概念相关的实体信息提取的NLP方法有限。为此,为了从非结构化文本数据中提取食品实体,我们提出了一种基于规则的食品信息抽取命名实体识别方法FoodIE。它由基于计算语言学和描述食物实体的语义信息的少量规则组成。使用两个不同的数据集进行评估的实验结果表明,可以获得非常有希望的结果。该方法的准确率为97%,召回率为94%,F1得分为96%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信