Rectangle-efficient aggregation in spatial data streams
S. Tirthapura, David P. Woodruff
Proceedings of the ... ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 283-294
Published: 2012-05-21
DOI: 10.1145/2213556.2213595 (https://doi.org/10.1145/2213556.2213595)
Citations: 19
Abstract
We consider the estimation of aggregates over a data stream of multidimensional axis-aligned rectangles. Rectangles are a basic primitive object in spatial databases, and efficient aggregation of rectangles is a fundamental task. The data stream model has emerged as a de facto model for processing massive databases in which the data resides in external memory or the cloud and is streamed through main memory. For a point $p$, let $n(p)$ denote the sum of the weights of all rectangles in the stream that contain $p$. We give near-optimal solutions for basic problems, including (1) the $k$-th frequency moment $F_k = \sum_{\text{points } p} |n(p)|^k$, (2) the counting version of stabbing queries, which seeks an estimate of $n(p)$ given $p$, and (3) identification of heavy-hitters, i.e., points $p$ for which $n(p)$ is large. An important special case of $F_k$ is $F_0$, which corresponds to the volume of the union of the rectangles. This is a celebrated problem in computational geometry known as "Klee's measure problem", and our work yields the first solution in the streaming model for dimensions greater than one.
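To make the three quantities in the abstract concrete, here is a minimal brute-force sketch. It is not the paper's streaming algorithm: it stores every covered point explicitly, so its memory grows with the total volume of the rectangles. It assumes points lie on an integer grid and that each rectangle is given as inclusive lower and upper corners plus a weight. The function names (`point_weights`, `frequency_moment`, `stabbing_count`, `heavy_hitters`) and the threshold parameter are hypothetical and chosen only for illustration.

```python
from itertools import product


def point_weights(rects):
    """Return exact n(p) for every integer grid point p covered by some rectangle.

    Each rectangle is (lo, hi, weight), where lo and hi are inclusive corner
    tuples of equal length. Brute force: enumerates every covered point.
    """
    n = {}
    for lo, hi, weight in rects:
        axes = [range(a, b + 1) for a, b in zip(lo, hi)]
        for p in product(*axes):
            n[p] = n.get(p, 0) + weight
    return n


def frequency_moment(n, k):
    """F_k = sum over points p of |n(p)|^k; F_0 counts points with n(p) != 0."""
    if k == 0:
        return sum(1 for v in n.values() if v != 0)
    return sum(abs(v) ** k for v in n.values())


def stabbing_count(n, p):
    """Counting version of a stabbing query: the exact value of n(p)."""
    return n.get(p, 0)


def heavy_hitters(n, threshold):
    """Points p whose |n(p)| is at least the given (assumed) threshold."""
    return sorted(p for p, v in n.items() if abs(v) >= threshold)


# Two unit-weight 2x2 squares that overlap only at the grid point (1, 1).
rects = [((0, 0), (1, 1), 1), ((1, 1), (2, 2), 1)]
n = point_weights(rects)

print(frequency_moment(n, 0))   # 7: grid points in the union of the two squares
print(frequency_moment(n, 1))   # 8: total weighted coverage
print(frequency_moment(n, 2))   # 10: six points with n(p)=1, one with n(p)=2
print(stabbing_count(n, (1, 1)))  # 2
print(heavy_hitters(n, 2))      # [(1, 1)]
```

In this discrete setting $F_0$ is the number of grid points in the union, the analogue of the volume in Klee's measure problem. The point of the streaming setting described in the abstract is to estimate these quantities without materializing a table like `n`.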