{"title":"Enforcing strictness in integration of dimensions: beyond instance matching","authors":"D. Riazati, J. Thom, Xiuzhen Zhang","doi":"10.1145/2064676.2064679","DOIUrl":null,"url":null,"abstract":"Maintaining strictness in dimensions is important in integration of data warehouses. A dimension that satisfies all of its roll-up constraints is said to be strict, a property that is required for correct aggregation. Existing work on instance matching does not address the problem of enforcing the strictness of roll-up constraints. In this paper, we use a graph matching-based approach to dimension instance matching and propose an algorithm that enforces strictness and reduces false positives. Making use of similarity flooding, the graph matching algorithm can be greedy in identifying matching members, we propose heuristics to further reduce false positive matches and reduce false strictness. Experiments on real-world data demonstrates the effectiveness of our proposed approach.","PeriodicalId":335396,"journal":{"name":"International Workshop on Data Warehousing and OLAP","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Workshop on Data Warehousing and OLAP","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2064676.2064679","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5
Abstract
Maintaining strictness in dimensions is important in integration of data warehouses. A dimension that satisfies all of its roll-up constraints is said to be strict, a property that is required for correct aggregation. Existing work on instance matching does not address the problem of enforcing the strictness of roll-up constraints. In this paper, we use a graph matching-based approach to dimension instance matching and propose an algorithm that enforces strictness and reduces false positives. Making use of similarity flooding, the graph matching algorithm can be greedy in identifying matching members, we propose heuristics to further reduce false positive matches and reduce false strictness. Experiments on real-world data demonstrates the effectiveness of our proposed approach.