Learning-Augmented Frequency Estimation in Sliding Windows
Rana Shahout, Ibrahim Sabek, Michael Mitzenmacher
arXiv - CS - Data Structures and Algorithms, 2024-09-17
arXiv:2409.11516 (https://doi.org/arxiv-2409.11516)
Abstract
We show how to use machine learning to improve sliding window algorithms for
approximate frequency estimation problems under the "algorithms with
predictions" framework. In this dynamic environment,
previous learning-augmented algorithms are less effective, since the properties
of a sliding window can differ significantly from the properties of the
entire stream. Our focus is on the benefits of predicting and filtering out
items with large next arrival times (that is, items with a large gap until
their next appearance) from the stream, which we show significantly improves
the memory-accuracy tradeoff. We provide theorems that give
insight into how and by how much our technique can improve the sliding window
algorithm, as well as experimental results using real-world data sets. Our work
demonstrates that predictors can be useful in the challenging sliding window
setting.
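
The filtering idea described above can be illustrated with a minimal sketch. This is not the authors' algorithm or data structure: it uses exact counts rather than an approximate summary, and `FilteredWindowCounter`, `predict_gap`, and the toy `gaps` table are hypothetical names introduced only for illustration. It assumes a predictor that returns, for each arriving item, a guess at the gap until that item's next appearance; items whose predicted gap exceeds the window size are not tracked.

```python
from collections import Counter, deque


class FilteredWindowCounter:
    """Exact counts over the last `window` stream positions, with an optional filter.

    Hypothetical illustration: `predict_gap` stands in for a learned
    next-arrival predictor; it is not the paper's model or summary structure.
    """

    def __init__(self, window, predict_gap=None):
        self.window = window
        self.predict_gap = predict_gap
        self.slots = deque()     # one entry per stream position; None = filtered out
        self.counts = Counter()  # counts of tracked (unfiltered) items

    def add(self, item):
        # Do not track items predicted to reappear only after the window has passed.
        stored = item
        if self.predict_gap is not None and self.predict_gap(item) > self.window:
            stored = None
        self.slots.append(stored)
        if stored is not None:
            self.counts[stored] += 1
        # Expire the position that just slid out of the window.
        if len(self.slots) > self.window:
            old = self.slots.popleft()
            if old is not None:
                self.counts[old] -= 1
                if self.counts[old] == 0:
                    del self.counts[old]

    def estimate(self, item):
        return self.counts.get(item, 0)

    def tracked_items(self):
        return len(self.counts)


stream = ["a", "b", "a", "x1", "a", "b", "x2", "a"]
gaps = {"a": 2, "b": 3}  # assumed predictions; any other item gets a huge gap


def predictor(item):
    return gaps.get(item, 10**9)


plain = FilteredWindowCounter(window=5)
filtered = FilteredWindowCounter(window=5, predict_gap=predictor)
for item in stream:
    plain.add(item)
    filtered.add(item)

print(plain.tracked_items(), filtered.tracked_items())  # 4 2
print(plain.estimate("a"), filtered.estimate("a"))      # 2 2
print(plain.estimate("x1"), filtered.estimate("x1"))    # 1 0
```

In this toy run the filtered counter tracks half as many distinct items while still reporting the same count for the frequent item; the filtered one-off item is underestimated by one. How such savings and errors behave for approximate summaries and imperfect predictors is what the paper's theorems and experiments address.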