Learning Term Taxonomy Relationship from a Large Collection of Plain Text

Shang-En Yang, Hung-Yuan Chen, Vorakit Vorakitphan, Yao-Chung Fan
2016 International Computer Symposium (ICS), December 2016
DOI: 10.1109/ICS.2016.0061
Citations: 1

Abstract

In this paper, we present a heuristic for labeling a given term with a taxonomy label. Specifically, for a given term, our goal is to construct a model for determining an "is-a" relationship between the given term and an inferred concept. The term-labeling problem is not new, but existing solutions either require semi-supervised training, e.g., supervised LDA, or rely on lexicographers, e.g., WordNet. The model construction cost becomes a burden when employing such semantic understanding capability in various emerging applications. To address these issues, we present a lightweight approach with the following features. First, the proposed approach is unsupervised and takes only plain text as input. Second, it allows incremental model construction. Third, it is simple yet effective, and computationally efficient in comparison with existing solutions. We demonstrate these results through experiments comparing our approach against DBpedia, using popular search terms as the test set. The experimental results show that the proposed approach achieves a 30 percent improvement in accuracy.
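The abstract does not disclose the paper's exact heuristic, but the general task it describes, extracting "is-a" (term, concept) pairs from plain text without supervision, can be sketched with classic Hearst-style lexico-syntactic patterns. The pattern set and function names below are illustrative assumptions, not the authors' method:

```python
import re

# Illustrative stand-in for unsupervised "is-a" extraction from plain text.
# The paper's own heuristic is not specified in the abstract; Hearst-style
# patterns are used here purely as an assumed example of the task.

PATTERNS = [
    # "X such as Y"  ->  (Y, is-a, X)
    (re.compile(r"\b(\w+) such as (\w+)"), lambda m: (m.group(2), m.group(1))),
    # "Y and other X"  ->  (Y, is-a, X)
    (re.compile(r"\b(\w+) and other (\w+)"), lambda m: (m.group(1), m.group(2))),
]

def extract_isa(text):
    """Return (hyponym, hypernym) pairs matched by the patterns above."""
    pairs = []
    for pattern, to_pair in PATTERNS:
        for match in pattern.finditer(text):
            pairs.append(to_pair(match))
    return pairs

sample = "We tested fruits such as apples. Dogs and other animals appeared."
print(extract_isa(sample))  # [('apples', 'fruits'), ('Dogs', 'animals')]
```

Because each matched pair is independent of the others, such pattern counts can simply be accumulated as new text arrives, which is consistent with the incremental model construction the abstract claims.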