Extended dimensions for cleaning and querying inconsistent data warehouses

J. Ramírez, Loreto Bravo, Mónica Caniupán Marileo
International Workshop on Data Warehousing and OLAP, published 2013-10-28
DOI: 10.1145/2513190.2513193 (https://doi.org/10.1145/2513190.2513193)
Citations: 2

Abstract

A dimension in a data warehouse (DW) is an abstract concept that groups data sharing a common semantic meaning. Dimensions are modeled using a hierarchical schema of categories. A dimension is called strict if every element of each category has exactly one ancestor in each parent category, and covering if each element of a category has at least one ancestor in each parent category. If a dimension is strict and covering, pre-computed results at lower levels can be used to answer queries at higher levels. This ability to compute summaries is vital for efficiency. When dimensions are not strict or covering, it is important to know their strictness and covering constraints in order to keep the ability to obtain correct summarizations. Real-world dimensions might fail to satisfy these constraints, and in these cases it is important either to fix (correct) the dimensions or to find ways to get correct answers to queries posed on inconsistent dimensions. A minimal repair is a new dimension that satisfies the strictness and covering constraints and is obtained from the original dimension through a minimum number of changes. The set of minimal repairs can be used as a tool to compute answers to aggregate queries in the presence of inconsistencies. However, computing all of them is NP-hard. In this paper, instead of trying to find all possible minimal repairs, we define a single compatible repair that is consistent with respect to both strictness and covering constraints, is close to the inconsistent dimension, can be computed efficiently, and can be used to compute approximate answers to aggregate queries. To define the compatible repair, we define the notion of an extended dimension, which supports sets of elements in categories.
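The strictness and covering conditions, and the reason they matter for reusing lower-level summaries, can be illustrated with a small sketch. It is not the paper's repair algorithm: the City/State/Country schema, the element names, the sales figures, and the helper functions are all hypothetical and chosen only for illustration. The strictness check flags elements with more than one ancestor in a category, and the covering check flags elements with no parent in some parent category.

```python
from collections import defaultdict

# Hypothetical toy dimension: City -> State -> Country (schema edges child -> parent).
SCHEMA = {"City": ["State"], "State": ["Country"], "Country": []}

# Category of every element (illustrative data only).
CATEGORY = {
    "Austin": "City", "Kansas City": "City", "Monaco": "City",
    "Texas": "State", "Missouri": "State", "Kansas": "State",
    "USA": "Country",
}

# Child -> parent links between elements.
LINKS = {
    "Austin": ["Texas"],
    "Kansas City": ["Missouri", "Kansas"],  # two states: breaks strictness
    "Monaco": [],                           # no state: breaks covering
    "Texas": ["USA"], "Missouri": ["USA"], "Kansas": ["USA"],
    "USA": [],
}


def ancestors(element, links):
    """Return all elements reachable from `element` by following parent links."""
    seen, stack = set(), list(links[element])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(links[node])
    return seen


def strictness_violations(category, links):
    """Elements that reach more than one ancestor within a single category."""
    bad = []
    for element in sorted(links):
        per_category = defaultdict(set)
        for anc in ancestors(element, links):
            per_category[category[anc]].add(anc)
        if any(len(group) > 1 for group in per_category.values()):
            bad.append(element)
    return bad


def covering_violations(schema, category, links):
    """Elements lacking a parent in some parent category of their own category."""
    bad = []
    for element in sorted(links):
        parent_cats = {category[p] for p in links[element]}
        if any(pc not in parent_cats for pc in schema[category[element]]):
            bad.append(element)
    return bad


def roll_up(values, links):
    """Sum child values into each direct parent (reusing lower-level results)."""
    totals = defaultdict(int)
    for child, amount in values.items():
        for parent in links[child]:
            totals[parent] += amount
    return dict(totals)


city_sales = {"Austin": 10, "Kansas City": 5, "Monaco": 7}
state_sales = roll_up(city_sales, LINKS)       # pre-computed lower-level summary
country_sales = roll_up(state_sales, LINKS)    # higher level derived from it
```

On this toy data the checks flag "Kansas City" (strictness) and "Monaco" (covering). Rolling up through the state level gives 20 for "USA" even though the city sales sum to 22: Kansas City's 5 is counted twice and Monaco's 7 is lost, which is the kind of incorrect summarization that motivates repairing the dimension.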