Condensed cube: an effective approach to reducing data cube size

Proceedings 18th International Conference on Data Engineering. Pub Date: 2002-08-07. DOI: 10.1109/ICDE.2002.994705

Wei Wang, Hongjun Lu, Jianlin Feng, J. Yu

Citations: 179

Abstract

A pre-computed data cube facilitates OLAP (on-line analytical processing). It is well known that data cube computation is an expensive operation. While most algorithms have been devoted to optimizing memory management and reducing computation costs, less work has addressed a fundamental issue: a data cube becomes huge when it is built over a large base relation with a large number of attributes. In this paper, we propose a new concept, called a condensed data cube. The condensed cube is much smaller than a complete, non-condensed cube. More importantly, it is a fully pre-computed cube without compression, and hence it requires neither decompression nor further aggregation when answering queries. Several algorithms for computing a condensed cube are proposed. Results of experiments on the effectiveness of the condensed data cube are presented, using both synthetic and real-world data. The results indicate that the proposed condensed cube reduces the cube size and, as a consequence, its computation time.
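To make the size problem concrete, the following is a minimal sketch, not the algorithms proposed in the paper. It materializes a complete cube (one group-by per subset of dimensions, so 2^n group-bys for n dimensions) over a small, made-up base relation and counts its cells. It also counts how many cells are aggregated from exactly one base tuple. Treating such cells as one source of redundancy is an assumption made here for illustration only; the abstract does not say how condensing works. The names `full_cube`, `BASE` and `ALL` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

ALL = "*"  # placeholder for a dimension that has been aggregated away


def full_cube(rows, n_dims):
    """Return {cell: (sum_of_measure, n_base_tuples)} for all 2**n_dims group-bys.

    Each row is a tuple of n_dims dimension values followed by one numeric measure.
    """
    cube = defaultdict(lambda: [0, 0])
    for k in range(n_dims + 1):
        for dims in combinations(range(n_dims), k):
            keep = set(dims)
            for row in rows:
                # Keep the grouped dimensions, replace the rest with ALL.
                cell = tuple(row[i] if i in keep else ALL for i in range(n_dims))
                cube[cell][0] += row[n_dims]
                cube[cell][1] += 1
    return {cell: tuple(v) for cell, v in cube.items()}


# Made-up base relation with dimensions (A, B, C) and one measure.
BASE = [
    (0, 1, 1, 50),
    (1, 1, 1, 100),
    (2, 3, 1, 60),
    (4, 5, 1, 70),
    (6, 5, 2, 80),
]

cube = full_cube(BASE, 3)
# Cells whose aggregate was computed from a single base tuple.
single = sum(1 for _, n in cube.values() if n == 1)
print(f"base tuples: {len(BASE)}, cube cells: {len(cube)}, single-tuple cells: {single}")
```

On this toy input, 5 base tuples expand into 30 cube cells, 25 of which are aggregated from a single base tuple. The gap widens quickly as dimensions are added, because the number of group-bys doubles with each one.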
