A Method to Discover Truth with Two Source Quality Metrics
Dong Yu, Derong Shen, Mingdong Zhu, Tiezheng Nie, Yue Kou, Ge Yu
2015 12th Web Information System and Application Conference (WISA), published 2015-09-11
DOI: 10.1109/WISA.2015.76 (https://doi.org/10.1109/WISA.2015.76)
Citations: 0
Abstract
In many web integration applications, multiple sources describe the same entity with different values, which leads to many conflicts. Resolving these conflicts and finding the truth can improve integration quality or help build a high-quality knowledge base. In the single-truth conflict scenario, existing methods are limited in their ability to distinguish false negatives (also called missing data) from false positives, so their source quality measurements are inadequate. In this paper, we therefore use recall and false positive rate to measure source quality and present a truth discovery method based on them. Experimental results on three real-world data sets show that the proposed algorithm can effectively distinguish missing data from false positives and improves the precision of truth discovery.
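To make the two metrics concrete, the following is a minimal sketch of how recall and false positive rate could be computed per source when the true values are already known. It is an illustration only, not the authors' algorithm, which has to estimate the truth and the source quality together. The function name `source_quality`, the dictionary layout, and the toy data are all assumptions introduced here.

```python
def source_quality(claims, truth):
    """Return {source: (recall, false_positive_rate)}.

    claims: {source: {entity: value}}  (hypothetical layout; a missing
            entity means the source provides no value for it)
    truth:  {entity: true_value}       (single truth per entity)
    """
    true_pairs = set(truth.items())
    # Candidate (entity, value) pairs: everything any source asserted, plus the truths.
    candidates = {(e, v) for c in claims.values() for e, v in c.items()}
    candidates |= true_pairs
    false_pairs = candidates - true_pairs

    quality = {}
    for source, c in claims.items():
        asserted = set(c.items())
        tp = len(asserted & true_pairs)   # true values the source provides
        fp = len(asserted & false_pairs)  # false values the source provides
        # Recall drops for both missing and wrong values;
        # false positive rate rises only for wrong values.
        recall = tp / len(true_pairs) if true_pairs else 0.0
        fpr = fp / len(false_pairs) if false_pairs else 0.0
        quality[source] = (recall, fpr)
    return quality


# Toy data (assumed): s2 omits e2, s3 gives a wrong value for e1.
truth = {"e1": "a", "e2": "b"}
claims = {
    "s1": {"e1": "a", "e2": "b"},
    "s2": {"e1": "a"},
    "s3": {"e1": "x", "e2": "b"},
}
quality = source_quality(claims, truth)
for source, (recall, fpr) in sorted(quality.items()):
    print(source, recall, fpr)
```

In this toy example, `s2` and `s3` have the same recall (0.5), but only `s3` has a nonzero false positive rate. A single accuracy-style score that counts a missing value as an error would rate both sources the same, which is the kind of confusion between missing data and false positives that the abstract describes.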