Detect Incorrect Triples in Knowledge Base Based on Triple Confidence Evaluation

Haihua Xie, Xiaoqing Lu, Zhi Tang, Xiaojun Huang
Proceedings of the 3rd International Conference on Industrial and Business Engineering, published 2017-08-17
DOI: 10.1145/3133811.3133829 (https://doi.org/10.1145/3133811.3133829)
Citations: 0

Abstract

A knowledge base is an important form of data storage and organization in knowledge services, and it is the basis of knowledge representation learning. The accuracy of its contents determines the effectiveness of knowledge service applications. This study proposes a generic computational methodology for evaluating the confidence of triples in a knowledge base and detecting potentially incorrect ones for further verification. The confidence of a triple is evaluated from weighted feature words that characterize the subject-object relation embedded in the triple; the feature words are extracted from a corpus of natural language sentences using statistical and natural language processing techniques. Based on the calculated confidence values, incorrect triples are detected using machine-learning-based classification. An experiment on an industry application data set demonstrates the workflow of evaluating triple confidence and detecting incorrect triples with this methodology.
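The workflow summarized in the abstract (weight feature words for a relation, score each triple against a sentence corpus, then classify by confidence) can be illustrated with a small sketch. The abstract does not give the weighting formula, the confidence formula, or the classifier, so everything below is an assumption made for illustration: the smoothed log-ratio weighting, the best-sentence confidence score, the learned cut-off, and all function names are hypothetical and are not the authors' method.

```python
import math
from collections import Counter


def feature_word_weights(relation_sentences, background_sentences):
    """Weight words that are more frequent in sentences expressing a relation
    than in background sentences (add-one smoothed log-ratio; assumed scheme).
    Words with a non-positive log-ratio are dropped."""
    rel = Counter(w for s in relation_sentences for w in s.lower().split())
    bg = Counter(w for s in background_sentences for w in s.lower().split())
    rel_total, bg_total = sum(rel.values()), sum(bg.values())
    vocab = len(set(rel) | set(bg))
    weights = {}
    for word, count in rel.items():
        p_rel = (count + 1) / (rel_total + vocab)
        p_bg = (bg[word] + 1) / (bg_total + vocab)
        score = math.log(p_rel / p_bg)
        if score > 0:
            weights[word] = score
    return weights


def triple_confidence(subject, obj, weights, corpus):
    """Confidence in [0, 1]: over sentences that mention both subject and
    object, the largest share of total feature-word weight that is matched.
    Returns 0.0 when no sentence mentions both (assumed scoring rule)."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    best = 0.0
    for sentence in corpus:
        words = set(sentence.lower().split())
        if subject in words and obj in words:
            matched = sum(weights[w] for w in words if w in weights)
            best = max(best, matched / total)
    return best


def fit_threshold(confidences, is_correct):
    """Learn a cut-off on confidence that maximizes training accuracy.
    This one-feature decision stump stands in for the unspecified classifier."""
    values = sorted(set(confidences))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or values

    def accuracy(t):
        hits = sum((c >= t) == y for c, y in zip(confidences, is_correct))
        return hits / len(confidences)

    return max(candidates, key=accuracy)


# Toy data for a hypothetical "capital_of" relation.
relation_sentences = [
    "paris is the capital of france",
    "berlin is the capital city of germany",
]
background_sentences = [
    "paris is a large city",
    "berlin hosts many museums",
    "france borders germany",
]
corpus = relation_sentences + background_sentences + ["lyon is a city in france"]

weights = feature_word_weights(relation_sentences, background_sentences)
good = triple_confidence("paris", "france", weights, corpus)
bad = triple_confidence("lyon", "france", weights, corpus)
threshold = fit_threshold([good, bad, 0.0], [True, False, False])
print(good > threshold, bad > threshold)
```

On this toy corpus the triple linking "paris" and "france" scores above the learned cut-off and the one linking "lyon" and "france" falls below it, so the latter would be flagged for further verification. The sketch also gives weight to function words such as "the" and "of", which a real pipeline would likely filter out with the natural language processing steps the abstract mentions.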