Dong Yu, Derong Shen, Mingdong Zhu, Tiezheng Nie, Yue Kou, Ge Yu
2015 12th Web Information System and Application Conference (WISA), published 2015-09-11. DOI: 10.1109/WISA.2015.76 (https://doi.org/10.1109/WISA.2015.76)
A Method to Discover Truth with Two Source Quality Metrics
In many web integration applications, multiple sources describe the same entity with different values, which leads to many conflicts. Resolving these conflicts and finding the truth can improve the quality of integration or help build a high-quality knowledge base. In the single-truth conflict scenario, existing methods have difficulty distinguishing false negatives (also called missing data) from false positives, so their source quality measurements are inadequate. In this paper, we therefore use recall and false positive rate to measure source quality and present a method to discover the truth. Experimental results on three real-world data sets show that the proposed algorithm can effectively distinguish missing data from false positives and improves the precision of truth discovery.
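To make the two metrics concrete, the following is a minimal sketch of how recall and false positive rate could be computed per source when the true values are already known. It is not the paper's algorithm, which the abstract does not spell out; it only applies the standard confusion-matrix definitions to (item, value) pairs. The function name `source_quality`, the data layout, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Tuple

# Hypothetical layout: source name -> {data item: value claimed by that source}.
Claims = Dict[str, Dict[Hashable, Hashable]]


def source_quality(
    claims: Claims, truth: Dict[Hashable, Hashable]
) -> Dict[str, Tuple[float, float]]:
    """Return {source: (recall, false_positive_rate)} against a known truth.

    Assumption: single-truth setting, so each item in `truth` has exactly
    one correct value. Items absent from `truth` are ignored.
    """
    # Candidate (item, value) pairs: every value any source claimed for a
    # labeled item, plus the true pairs themselves.
    candidates = {
        (item, value)
        for per_source in claims.values()
        for item, value in per_source.items()
        if item in truth
    }
    true_pairs = set(truth.items())
    false_pairs = candidates - true_pairs

    quality = {}
    for source, per_source in claims.items():
        claimed = {(i, v) for i, v in per_source.items() if i in truth}
        tp = len(claimed & true_pairs)   # true values the source provides
        fp = len(claimed & false_pairs)  # false values the source provides
        # Recall drops when a source misses a true value (false negative).
        recall = tp / len(true_pairs) if true_pairs else 0.0
        # False positive rate rises only when a source asserts a false value.
        fpr = fp / len(false_pairs) if false_pairs else 0.0
        quality[source] = (recall, fpr)
    return quality


# Toy data (invented for illustration).
truth = {"a": 1, "b": 2, "c": 3}
claims = {
    "s1": {"a": 1, "b": 2},          # omits "c": missing data, no wrong value
    "s2": {"a": 1, "b": 9, "c": 8},  # covers every item, but two values are wrong
}
for name, (recall, fpr) in source_quality(claims, truth).items():
    print(f"{name}: recall={recall:.2f}, fpr={fpr:.2f}")
```

In this toy example, `s1` loses recall for the omitted item but keeps a false positive rate of zero, while `s2` is penalized on both metrics. A single accuracy-style score could not separate the two kinds of error in this way.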