名称实体识别使用归纳逻辑编程

H. T. Le, Thien Huu Nguyen
{"title":"名称实体识别使用归纳逻辑编程","authors":"H. T. Le, Thien Huu Nguyen","doi":"10.1145/1852611.1852626","DOIUrl":null,"url":null,"abstract":"Named entity recognition (NER) is the process of seeking to locate atomic elements in text into predefined categories such as the names of persons, organizations, locations, expressions of times, quantities, and percentages. It is useful in applying NER to other natural language tasks such as question-answering, text summarization, building semantic web, etc. This paper presents a system, called BKIE, that uses SRV -- an inductive logic program - to extract name entities in Vietnamese text. New predicates and features are added to SRV to deal with characteristics of Vietnamese language. Also, several strategies are proposed in this paper to improve the efficiency of the SRV algorithm. The data set using in experiments is 80 homepages of scientists in Vietnamese language that were tagged manually. The experiments give us the best F-score of 83% for extracting the \"name\" entity. It shows that SRV is an efficient NER algorithm given its advantages of generality and flexibility. In order to increase the system's performance, our future work includes (i) building a larger set of training data to improve system's performance; (ii) implementing BKIE using parallel programming to increase system efficiency; and (iii) testing BKIE with other application domains to get a more accurate evaluation of the system.","PeriodicalId":388053,"journal":{"name":"Proceedings of the 1st Symposium on Information and Communication Technology","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2010-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Name entity recognition using inductive logic programming\",\"authors\":\"H. T. Le, Thien Huu Nguyen\",\"doi\":\"10.1145/1852611.1852626\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Named entity recognition (NER) is the process of seeking to locate atomic elements in text into predefined categories such as the names of persons, organizations, locations, expressions of times, quantities, and percentages. It is useful in applying NER to other natural language tasks such as question-answering, text summarization, building semantic web, etc. This paper presents a system, called BKIE, that uses SRV -- an inductive logic program - to extract name entities in Vietnamese text. New predicates and features are added to SRV to deal with characteristics of Vietnamese language. Also, several strategies are proposed in this paper to improve the efficiency of the SRV algorithm. The data set using in experiments is 80 homepages of scientists in Vietnamese language that were tagged manually. The experiments give us the best F-score of 83% for extracting the \\\"name\\\" entity. It shows that SRV is an efficient NER algorithm given its advantages of generality and flexibility. In order to increase the system's performance, our future work includes (i) building a larger set of training data to improve system's performance; (ii) implementing BKIE using parallel programming to increase system efficiency; and (iii) testing BKIE with other application domains to get a more accurate evaluation of the system.\",\"PeriodicalId\":388053,\"journal\":{\"name\":\"Proceedings of the 1st Symposium on Information and Communication Technology\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-08-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 1st Symposium on Information and Communication Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1852611.1852626\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 1st Symposium on Information and Communication Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1852611.1852626","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

摘要

命名实体识别(NER)是将文本中的原子元素定位到预定义类别(如人名、组织、地点、时间表达式、数量和百分比)的过程。它可以应用于其他自然语言任务,如问答、文本摘要、构建语义网等。本文提出了一个名为BKIE的系统,该系统使用SRV(一种归纳逻辑程序)来提取越南文文本中的名称实体。针对越南语的特点,在SRV中增加了新的谓词和特征。此外,本文还提出了几种提高SRV算法效率的策略。实验中使用的数据集是80个人工标记的越南语科学家的主页。实验给出了提取“name”实体的最佳f值83%。结果表明,SRV算法具有通用性和灵活性等优点,是一种有效的NER算法。为了提高系统的性能,我们未来的工作包括(i)建立更大的训练数据集来提高系统的性能;(ii)采用并行编程方式实施BKIE,以提高系统效率;(iii)将BKIE与其他应用领域进行测试,以获得更准确的系统评估。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Name entity recognition using inductive logic programming
Named entity recognition (NER) is the process of seeking to locate atomic elements in text into predefined categories such as the names of persons, organizations, locations, expressions of times, quantities, and percentages. It is useful in applying NER to other natural language tasks such as question-answering, text summarization, building semantic web, etc. This paper presents a system, called BKIE, that uses SRV -- an inductive logic program - to extract name entities in Vietnamese text. New predicates and features are added to SRV to deal with characteristics of Vietnamese language. Also, several strategies are proposed in this paper to improve the efficiency of the SRV algorithm. The data set using in experiments is 80 homepages of scientists in Vietnamese language that were tagged manually. The experiments give us the best F-score of 83% for extracting the "name" entity. It shows that SRV is an efficient NER algorithm given its advantages of generality and flexibility. In order to increase the system's performance, our future work includes (i) building a larger set of training data to improve system's performance; (ii) implementing BKIE using parallel programming to increase system efficiency; and (iii) testing BKIE with other application domains to get a more accurate evaluation of the system.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信