{"title":"语义Web实例匹配的候选生成","authors":"B. Vijaya, P. Gharpure","doi":"10.1109/ICCIDS.2019.8862131","DOIUrl":null,"url":null,"abstract":"The growth of semantic web has given rise to proliferation of data sources wherein the task of recognizing real world entities and identifying multiple references of the same real world entity becomes an essential task in order to facilitate sharing and integration of data. Due to the heterogeneous nature of data on the semantic web, entities belonging to different sources are compared by assessing the similarity of features that are common in order to identify matches. With the increasing size of data sets Candidate generation methods are generally employed to avoid quadratic time complexity that would otherwise be incurred if pairwise similarity of all entities are computed. Here we propose a novel index based approach for candidate generation and reduction. The evaluation shows that the proposed method scales well and improves recall significantly.","PeriodicalId":196915,"journal":{"name":"2019 International Conference on Computational Intelligence in Data Science (ICCIDS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Candidate Generation for Instance Matching on Semantic Web\",\"authors\":\"B. Vijaya, P. Gharpure\",\"doi\":\"10.1109/ICCIDS.2019.8862131\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The growth of semantic web has given rise to proliferation of data sources wherein the task of recognizing real world entities and identifying multiple references of the same real world entity becomes an essential task in order to facilitate sharing and integration of data. Due to the heterogeneous nature of data on the semantic web, entities belonging to different sources are compared by assessing the similarity of features that are common in order to identify matches. With the increasing size of data sets Candidate generation methods are generally employed to avoid quadratic time complexity that would otherwise be incurred if pairwise similarity of all entities are computed. Here we propose a novel index based approach for candidate generation and reduction. The evaluation shows that the proposed method scales well and improves recall significantly.\",\"PeriodicalId\":196915,\"journal\":{\"name\":\"2019 International Conference on Computational Intelligence in Data Science (ICCIDS)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-02-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Conference on Computational Intelligence in Data Science (ICCIDS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCIDS.2019.8862131\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on Computational Intelligence in Data Science (ICCIDS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCIDS.2019.8862131","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Candidate Generation for Instance Matching on Semantic Web
The growth of semantic web has given rise to proliferation of data sources wherein the task of recognizing real world entities and identifying multiple references of the same real world entity becomes an essential task in order to facilitate sharing and integration of data. Due to the heterogeneous nature of data on the semantic web, entities belonging to different sources are compared by assessing the similarity of features that are common in order to identify matches. With the increasing size of data sets Candidate generation methods are generally employed to avoid quadratic time complexity that would otherwise be incurred if pairwise similarity of all entities are computed. Here we propose a novel index based approach for candidate generation and reduction. The evaluation shows that the proposed method scales well and improves recall significantly.