SUPRA: a sampling-query optimization method for large-scale OLAP

Kazutomo Ushijima, S. Fujiwara, I. Nishizawa, Nobutoshi Sagawa
Proceedings Ninth International Workshop on Database and Expert Systems Applications (Cat. No.98EX130)
Published: 1998-08-26
DOI: 10.1109/DEXA.1998.707408 (https://doi.org/10.1109/DEXA.1998.707408)

Abstract

Relational online analytical processing (ROLAP) reduces the storage required to maintain data cubes of various sizes by materializing only parts of them, in a lazy-evaluation manner. In ROLAP, however, cube creation queries must be issued repeatedly to search for useful features (i.e., rules or patterns) within large-scale databases, so the cost of cube creation can become a bottleneck for ROLAP processing as a whole. This cost can be reduced effectively by estimating query results from samples. To maintain the accuracy of ROLAP when samples are used, the samples must be extracted in an appropriate unit. However, conventional query optimization methods support only record-based sampling and cannot be applied to complex queries that have other sampling units, such as queries that include grouping aggregate operations. We develop a query optimization method named SUPRA that preserves both the sampling unit used in random data extraction and the randomness of the sampling operation. Using this method, typical ROLAP queries can be transformed into more efficient ones than those obtained through conventional methods.
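
The difference between record-based sampling and sampling in a larger unit can be illustrated with a small sketch. This is not the SUPRA method itself, whose transformation rules are not given in the abstract; it only shows why the sampling unit matters for a grouping aggregate. The table layout (customer, amount), the function names, and the sample sizes are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def make_sales(num_customers=50, rows_per_customer=20, seed=0):
    """Build a hypothetical fact table of (customer_id, amount) rows."""
    rng = random.Random(seed)
    return [
        (customer, rng.randint(1, 100))
        for customer in range(num_customers)
        for _ in range(rows_per_customer)
    ]


def totals_by_customer(rows):
    """Grouping aggregate: SUM(amount) GROUP BY customer_id."""
    totals = defaultdict(int)
    for customer, amount in rows:
        totals[customer] += amount
    return dict(totals)


def sample_records(rows, k, rng):
    """Record-based sampling: draw k individual rows at random."""
    return rng.sample(rows, k)


def sample_groups(rows, k, rng):
    """Group-unit sampling: draw k customers, keep all of their rows."""
    customers = sorted({customer for customer, _ in rows})
    chosen = set(rng.sample(customers, k))
    return [row for row in rows if row[0] in chosen]


rows = make_sales()
true_totals = totals_by_customer(rows)
rng = random.Random(1)

# Sampling rows before grouping leaves most groups incomplete,
# so the per-customer sums are underestimated.
record_totals = totals_by_customer(sample_records(rows, 150, rng))

# Sampling whole customers keeps every sampled group complete,
# so each per-customer sum in the sample is exact.
group_totals = totals_by_customer(sample_groups(rows, 8, rng))

exact_record = sum(record_totals[c] == true_totals[c] for c in record_totals)
exact_group = sum(group_totals[c] == true_totals[c] for c in group_totals)
print(f"record-based: {exact_record}/{len(record_totals)} group sums exact")
print(f"group-unit:   {exact_group}/{len(group_totals)} group sums exact")
```

In this sketch, any analysis of the per-customer sums (for example, their distribution) stays meaningful only under group-unit sampling, which is the kind of sampling unit that, according to the abstract, record-based optimization methods do not preserve.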