Processing Exact Results for Sliding Window Joins over Time-Sequence, Streaming Data Using a Disk Archive
Abhirup Chakraborty, Ajit Singh
2009 First Asian Conference on Intelligent Information and Database Systems, published 2009-04-01
DOI: 10.1109/ACIIDS.2009.64 (https://doi.org/10.1109/ACIIDS.2009.64)
Citation count: 3
Abstract
We consider the problem of producing exact results for sliding window joins over data streams with limited memory. Existing approaches handle memory limitations by shedding load, and therefore cannot provide exact, or even highly accurate, results for sliding window joins over data streams with time-varying arrival rates. We present an Exact Window Join (EWJ) algorithm that incorporates disk storage as an archive. The algorithm periodically spills window data to disk, refines the output by retrieving the disk-resident data, and maximizes the output rate by employing techniques to manage the memory blocks. The problem of managing the window blocks in memory, which is similar in nature to caching, captures both the temporal and the frequency-related properties of the stream arrivals. At the same time, we improve I/O efficiency by amortizing each disk scan over a large number of input tuples. We provide experimental results demonstrating the performance and effectiveness of the proposed algorithm.
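The abstract's three ideas (spilling window data to an archive, refining the output from the archived data, and amortizing one disk scan over many input tuples) can be illustrated with a minimal sketch. This is not the authors' EWJ algorithm: the class `ArchivedWindowJoin`, its parameters, and the batching rule are hypothetical, a Python list stands in for the disk, tuples are assumed to arrive in non-decreasing timestamp order, and the block-management policy and purging of expired tuples are omitted.

```python
from collections import defaultdict, deque


class ArchivedWindowJoin:
    """Toy sliding-window equi-join over two streams named "R" and "S".

    Assumptions (not from the paper): tuples are (timestamp, key) pairs,
    timestamps never decrease, and a list stands in for the disk archive.
    """

    def __init__(self, window, mem_capacity, batch_size):
        self.window = window              # join window length, in time units
        self.mem_capacity = mem_capacity  # tuples kept in memory per stream
        self.batch_size = batch_size      # deferred probes per archive scan
        self.mem = {"R": deque(), "S": deque()}
        self.disk = {"R": [], "S": []}    # append-only archive (simulated)
        self.pending = {"R": [], "S": []}
        self.disk_scans = 0
        self.results = []

    def _emit(self, stream, new, old):
        # Always report pairs as (R tuple, S tuple).
        self.results.append((new, old) if stream == "R" else (old, new))

    def insert(self, stream, ts, key):
        other = "S" if stream == "R" else "R"
        new = (ts, key)
        # 1. Probe the memory-resident part of the opposite window at once.
        for old in self.mem[other]:
            if old[1] == key and ts - old[0] <= self.window:
                self._emit(stream, new, old)
        # 2. Defer the archive probe. The recorded archive length keeps
        #    tuples spilled after this arrival from being matched twice.
        self.pending[stream].append((new, len(self.disk[other])))
        if len(self.pending[stream]) >= self.batch_size:
            self.flush(stream)
        # 3. Store the tuple, spilling the oldest one when memory is full.
        self.mem[stream].append(new)
        if len(self.mem[stream]) > self.mem_capacity:
            self.disk[stream].append(self.mem[stream].popleft())

    def flush(self, stream):
        """Resolve all deferred probes of `stream` with one archive scan."""
        if not self.pending[stream]:
            return
        other = "S" if stream == "R" else "R"
        by_key = defaultdict(list)
        for new, mark in self.pending[stream]:
            by_key[new[1]].append((new, mark))
        self.disk_scans += 1
        for pos, old in enumerate(self.disk[other]):
            for new, mark in by_key.get(old[1], ()):
                if pos < mark and new[0] - old[0] <= self.window:
                    self._emit(stream, new, old)
        self.pending[stream].clear()
```

Calling `flush("R")` and `flush("S")` after the last `insert` drains the remaining deferred probes. The `disk_scans` counter then grows roughly with the number of tuples divided by `batch_size` rather than with the number of tuples, which is the amortization effect the abstract refers to.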