{"title":"I/O-efficient algorithms for answering pattern-based aggregate queries in a sequence OLAP system","authors":"C. Chui, B. Kao, Eric Lo, Reynold Cheng","doi":"10.1145/2063576.2063812","DOIUrl":null,"url":null,"abstract":"Many kinds of real-life data exhibit logical ordering among their data items and are thus sequential in nature. In recent years, the concept of Sequence OLAP (S-OLAP) has been proposed. The biggest distinguishing feature of SOLAP from traditional OLAP is that data sequences managed by an S-OLAP system are characterized by the subsequence/substring patterns they possess. An S-OLAP system thus supports pattern-based grouping and aggregation. Conceptually, an S-OLAP system maintains a sequence data cube which is composed of sequence cuboids. Each sequence cuboid presents the answer of a pattern-based aggregate (PBA) query. This paper focuses on the I/O aspects of evaluating PBA queries. We study the problems of joining plan selection and execution planning, which are the core issues in the design of I/O-efficient cuboid materialization algorithms. Through an empirical study, we show that our algorithms lead to a very I/O-efficient strategy for sequence cuboid materialization.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"6 1","pages":"1619-1628"},"PeriodicalIF":0.0000,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2063576.2063812","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6
Abstract
Many kinds of real-life data exhibit logical ordering among their data items and are thus sequential in nature. In recent years, the concept of Sequence OLAP (S-OLAP) has been proposed. The biggest distinguishing feature of SOLAP from traditional OLAP is that data sequences managed by an S-OLAP system are characterized by the subsequence/substring patterns they possess. An S-OLAP system thus supports pattern-based grouping and aggregation. Conceptually, an S-OLAP system maintains a sequence data cube which is composed of sequence cuboids. Each sequence cuboid presents the answer of a pattern-based aggregate (PBA) query. This paper focuses on the I/O aspects of evaluating PBA queries. We study the problems of joining plan selection and execution planning, which are the core issues in the design of I/O-efficient cuboid materialization algorithms. Through an empirical study, we show that our algorithms lead to a very I/O-efficient strategy for sequence cuboid materialization.