Self-Teaching Semantic Annotation Method for Knowledge Discovery from Text

Kaiquan Xu, S. Liao, Raymond Y. K. Lau, L. Liao, Heng Tang
Published in: 2009 42nd Hawaii International Conference on System Sciences
Publication date: 2009-01-20
DOI: 10.1109/HICSS.2009.898 (https://doi.org/10.1109/HICSS.2009.898)
Citations: 3

Abstract

Much valuable domain knowledge is hidden in enterprises' text repositories (e.g., email archives and digital libraries), so it is desirable to develop effective knowledge management tools that process this unstructured data and extract domain knowledge for business decision making. Ontology-based semantic annotation of documents is one promising approach to knowledge discovery from text repositories. Existing semantic annotation methods usually require many labeled training examples before they can operate effectively, and this bottleneck holds back their wide application. In this paper, we propose a semi-supervised semantic annotation method, self-teaching SVM-struct, which uses fewer labeled examples to improve annotation performance. The key to the self-teaching method is identifying reliably predicted examples for retraining. Two novel confidence measures are developed to estimate prediction confidence. The experimental results show that the prediction performance of our self-teaching semantic annotation method is promising.
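
The general idea of retraining on reliably predicted examples can be illustrated with a generic self-training loop. The following is only a minimal sketch under stated assumptions, not the paper's method: it swaps in a toy nearest-centroid classifier for SVM-struct, and it uses the distance gap between the two closest classes as the confidence score, since the abstract does not specify the paper's two confidence measures. All function names and the `threshold` parameter are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Vector = Sequence[float]
Example = Tuple[Vector, Hashable]


def train_centroids(examples: List[Example]) -> Dict[Hashable, List[float]]:
    """Toy learner (assumption, stands in for SVM-struct): one mean vector per label."""
    sums: Dict[Hashable, List[float]] = {}
    counts: Dict[Hashable, int] = {}
    for x, y in examples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict_with_margin(model: Dict[Hashable, List[float]], x: Vector):
    """Return (label, margin). The margin is a hypothetical confidence score:
    the gap between the distances to the two closest centroids."""
    dists = sorted(
        ((sum((a - b) ** 2 for a, b in zip(c, x)) ** 0.5, y) for y, c in model.items()),
        key=lambda pair: pair[0],
    )
    best_dist, best_label = dists[0]
    margin = dists[1][0] - best_dist if len(dists) > 1 else float("inf")
    return best_label, margin


def self_train(labeled: List[Example], unlabeled: List[Vector],
               threshold: float, max_rounds: int = 10):
    """Repeatedly move confidently predicted examples into the training set and retrain."""
    labeled = list(labeled)
    pool = list(unlabeled)
    model = train_centroids(labeled)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for x in pool:
            label, margin = predict_with_margin(model, x)
            if margin >= threshold:
                confident.append((x, label))  # trusted: reuse as a training example
            else:
                remaining.append(x)           # too uncertain: keep unlabeled
        if not confident:
            break                             # nothing new to learn from
        labeled.extend(confident)
        pool = remaining
        model = train_centroids(labeled)
    return model, labeled, pool
```

Calling `self_train` with a small labeled seed set and a pool of unlabeled vectors returns the retrained model, the enlarged training set, and the examples that never passed the confidence threshold. The threshold controls the trade-off that the abstract points to: set too low, wrong predictions pollute the training set; set too high, few unlabeled examples are ever used.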