Efficient Incremental Computation of Aggregations over Sliding Windows
Authors: Chao Zhang, Reza Akbarinia, F. Toumani
Venue: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
Published: 2021-08-14
DOI: https://doi.org/10.1145/3447548.3467360
Computing aggregations over sliding windows, i.e., finite subsets of an unbounded stream, is a core operation in streaming analytics. We propose PBA (Parallel Boundary Aggregator), a novel parallel algorithm that groups continuous slices of streaming values into chunks and exploits two buffers, cumulative slice aggregations and left cumulative slice aggregations, to compute sliding window aggregations efficiently. PBA runs in O(1) time per slide, performing at most 3 merging operations, while consuming O(n) space for windows with n partial aggregations. Our experiments demonstrate that PBA can improve throughput by up to 4x while reducing latency, compared to state-of-the-art algorithms.
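
The chunk-and-two-buffer idea can be illustrated with a minimal sequential sketch. This is not the authors' PBA implementation: the class name `ChunkedWindowAggregator`, the choice of a chunk size equal to the window length n, and the eager rebuild of the suffix buffer at each chunk boundary are all assumptions made for illustration. The abstract describes PBA as parallel and gives a bound of 3 merges per slide; this sketch does neither, and its per-slide cost is O(1) only in the amortized sense. It assumes an associative merge function and a window that advances by one slice at a time.

```python
from typing import Callable, Generic, List, Optional, TypeVar

T = TypeVar("T")


class ChunkedWindowAggregator(Generic[T]):
    """Aggregate over the last n slice aggregates (hypothetical sketch).

    The slice stream is cut into chunks of n slices. A window ending inside
    the current chunk is the suffix of the previous chunk merged with the
    prefix of the current chunk.
    """

    def __init__(self, n: int, merge: Callable[[T, T], T]) -> None:
        if n < 1:
            raise ValueError("window length n must be at least 1")
        self.n = n
        self.merge = merge              # must be associative
        self.chunk: List[T] = []        # slices of the chunk being filled
        self.csa: Optional[T] = None    # running prefix aggregate of that chunk
        self.lcs: List[T] = []          # suffix aggregates of the previous chunk

    def _suffixes(self, chunk: List[T]) -> List[T]:
        # out[k] = chunk[k] merged with everything to its right, built right to left.
        out = list(chunk)
        for k in range(len(out) - 2, -1, -1):
            out[k] = self.merge(out[k], out[k + 1])
        return out

    def push(self, slice_agg: T) -> Optional[T]:
        """Add one slice aggregate; return the window aggregate, or None during warm-up."""
        if len(self.chunk) == self.n:
            # Chunk boundary: freeze the full chunk into suffix aggregates.
            # Done eagerly here (n - 1 merges at once), which is an assumption
            # of this sketch rather than something the abstract specifies.
            self.lcs = self._suffixes(self.chunk)
            self.chunk = []
            self.csa = None
        self.chunk.append(slice_agg)
        # One merge to extend the prefix aggregate of the current chunk.
        self.csa = slice_agg if self.csa is None else self.merge(self.csa, slice_agg)
        j = len(self.chunk)
        if j == self.n:
            return self.csa             # window coincides with the current chunk
        if not self.lcs:
            return None                 # fewer than n slices seen so far
        # One merge to join the older part (previous chunk, offsets j..n-1)
        # with the newer part (current chunk, offsets 0..j-1), older first.
        return self.merge(self.lcs[j], self.csa)


agg = ChunkedWindowAggregator(3, max)
print([agg.push(x) for x in [5, 1, 4, 2, 7, 3, 0]])
# [None, None, 5, 4, 7, 7, 7]
```

Because the older part is always merged on the left, the sketch also works for non-commutative associative operations such as string concatenation. Both buffers hold at most n entries, which matches the O(n) space figure stated in the abstract.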