Self-Teaching Semantic Annotation Method for Knowledge Discovery from Text

2009 42nd Hawaii International Conference on System Sciences | Pub Date: 2009-01-20 | DOI: 10.1109/HICSS.2009.898

Kaiquan Xu, S. Liao, Raymond Y. K. Lau, L. Liao, Heng Tang


Citations: 3

Abstract



Because much valuable domain knowledge is hidden in enterprises' text repositories (e.g., email archives and digital libraries), it is desirable to develop effective knowledge management tools that process this unstructured data and extract domain knowledge for business decision making. Ontology-based semantic annotation of documents is one promising approach to knowledge discovery from text repositories. Existing semantic annotation methods usually require many labeled training examples before they can operate effectively, and this bottleneck holds back their wide application. In this paper, we propose a semi-supervised semantic annotation method, self-teaching SVM-struct, which uses fewer labeled examples to improve annotation performance. The key to the self-teaching method is identifying reliably predicted examples for retraining. Two novel confidence measures are developed to estimate prediction confidence. The experimental results show that the prediction performance of our self-teaching semantic annotation method is promising.
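The abstract does not give the algorithm or the two confidence measures, so the following is only a rough sketch of the general self-teaching (self-training) loop it describes: train on a small labeled set, predict the unlabeled examples, keep only the predictions whose confidence passes a threshold, and retrain. Everything concrete here is an assumption made for illustration: a nearest-centroid classifier stands in for SVM-struct, the relative distance margin stands in for the paper's confidence measures, and the names `fit`, `predict_with_confidence`, `self_teach`, `threshold`, and `max_rounds` are hypothetical.

```python
import math


def fit(examples):
    """Train a nearest-centroid classifier: label -> mean feature vector.

    Stand-in for SVM-struct (assumption); examples is a list of (vector, label).
    """
    sums, counts = {}, {}
    for x, y in examples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_with_confidence(model, x):
    """Return (label, confidence) for one feature vector.

    Confidence is the relative margin between the nearest and second-nearest
    centroid, in [0, 1]. This is an illustrative measure, not the paper's.
    Requires a model with at least two classes.
    """
    dists = sorted((math.dist(x, c), y) for y, c in model.items())
    (d1, best), (d2, _) = dists[0], dists[1]
    conf = (d2 - d1) / (d2 + d1) if d2 + d1 > 0 else 0.0
    return best, conf


def self_teach(labeled, unlabeled, threshold=0.5, max_rounds=10):
    """Grow the training set with confidently predicted examples and retrain."""
    labeled = list(labeled)
    pool = list(unlabeled)
    model = fit(labeled)
    for _ in range(max_rounds):
        confident, rest = [], []
        for x in pool:
            y, conf = predict_with_confidence(model, x)
            (confident if conf >= threshold else rest).append((x, y))
        if not confident:
            break  # nothing reliable left to learn from
        labeled.extend(confident)       # treat confident predictions as labels
        pool = [x for x, _ in rest]     # uncertain examples stay unlabeled
        model = fit(labeled)            # retrain on the enlarged set
    return model, labeled, pool
```

Calling `self_teach([((0.0, 0.0), "A"), ((10.0, 10.0), "B")], [(1.0, 1.0), (9.0, 9.0), (5.0, 5.0)])` would absorb the two points near the seed examples and leave the equidistant point `(5.0, 5.0)` in the unlabeled pool. The threshold controls the trade-off the abstract hints at: set it too low and wrong predictions are fed back as training data, too high and few unlabeled examples are ever used.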

