Transformation of Continuous Aggregation Join Queries over Data Streams

Q3 Engineering

Journal of Computing Science and Engineering Pub Date : 2007-07-16 DOI:10.5626/jcse.2009.3.1.027

T. Tran, B. Lee

Citations: 4

Abstract

We address the continuous processing of aggregation join queries over data streams. Queries of this type involve both join and aggregation operations, with windows specified on the join input streams. To our knowledge, existing research addresses join query optimization and aggregation query optimization as separate problems. Our observation, however, is that placing them within the same scope of query optimization lets us generate more efficient query execution plans. This is achieved through more versatile query transformations, the key idea of which is to perform aggregation before the join so that join execution time may be reduced. The idea itself is not new (it has already been proposed in the database area), but developing the query transformation rules for streams faces a completely new set of challenges. In this paper, we first propose a query processing model for an aggregation join query with two key stream operators: (1) aggregation set update, which produces an aggregation set of tuples (one tuple per group) and updates it incrementally as new tuples arrive, and (2) aggregation set join, i.e., a join between a stream and an aggregation set of tuples. We then introduce concrete query transformation rules specialized to work with streams. These rules are far more compact, yet more general, than the rules proposed in the database area. Finally, we present a query processing algorithm generic to all alternative query execution plans that can be generated through the transformations, and study the performance of the alternative plans through extensive experiments.
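The two operators named in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or its transformation rules. It assumes a count-based sliding window, a SUM/COUNT aggregate, and a grouping attribute that coincides with the join attribute. The names `AggregationSet`, `update`, `join`, and `join_then_aggregate` are hypothetical and chosen only for illustration.

```python
from collections import deque


class AggregationSet:
    """Hypothetical sketch: one (sum, count) entry per group over a sliding window."""

    def __init__(self, window_size):
        # Assumption: a count-based window holding the last `window_size` tuples.
        self.window_size = window_size
        self.window = deque()   # tuples currently inside the window
        self.groups = {}        # group key -> [running sum, running count]

    def update(self, key, value):
        """Aggregation set update: fold in a new tuple, expire the oldest if needed."""
        self.window.append((key, value))
        agg = self.groups.setdefault(key, [0, 0])
        agg[0] += value
        agg[1] += 1
        if len(self.window) > self.window_size:
            old_key, old_value = self.window.popleft()
            old = self.groups[old_key]
            old[0] -= old_value
            old[1] -= 1
            if old[1] == 0:          # the group has left the window entirely
                del self.groups[old_key]

    def join(self, key):
        """Aggregation set join: probe with a stream tuple's join key (one lookup)."""
        agg = self.groups.get(key)
        return None if agg is None else (key, agg[0], agg[1])


def join_then_aggregate(window, key):
    """Baseline for comparison: join with every window tuple, then aggregate."""
    matches = [v for k, v in window if k == key]
    return None if not matches else (key, sum(matches), len(matches))


aset = AggregationSet(window_size=3)
for k, v in [("a", 1), ("b", 2), ("a", 3), ("b", 4)]:
    aset.update(k, v)

# Both plans give the same answer for a probing tuple with key "b".
print(aset.join("b"))
print(join_then_aggregate(aset.window, "b"))
```

In this restricted setting, the probe against the aggregation set touches one entry per group instead of every matching tuple in the window, which is the intuition behind performing aggregation before the join.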

Source journal: Journal of Computing Science and Engineering (Engineering: Engineering (all))

CiteScore: 1.00

Self-citation rate: 0.00%

Journal description: Journal of Computing Science and Engineering (JCSE) is a peer-reviewed quarterly journal that publishes high-quality papers on all aspects of computing science and engineering. The primary objective of JCSE is to be an authoritative international forum for both theoretical and innovative applied research in the field. JCSE publishes original research contributions, surveys, and experimental studies that present scientific advances. The scope of JCSE covers all topics related to computing science and engineering, with a special emphasis on the following areas: Embedded Computing, Ubiquitous Computing, Convergence Computing, Green Computing, Smart and Intelligent Computing, Human Computing.