{"title":"Optimal Quantile Approximation in Streams","authors":"Zohar S. Karnin, Kevin J. Lang, Edo Liberty","doi":"10.1109/FOCS.2016.17","DOIUrl":null,"url":null,"abstract":"This paper resolves one of the longest standing basic problems in the streaming computational model. Namely, optimal construction of quantile sketches. An ε approximate quantile sketch receives a stream of items x1,⋯,xn and allows one to approximate the rank of any query item up to additive error ε n with probability at least 1-δ.The rank of a query x is the number of stream items such that xi ≤ x. The minimal sketch size required for this task is trivially at least 1/ε.Felber and Ostrovsky obtain a O((1/ε)log(1/ε)) space sketch for a fixed δ.Without restrictions on the nature of the stream or the ratio between ε and n, no better upper or lower bounds were known to date. This paper obtains an O((1/ε)log log (1/δ)) space sketch and a matching lower bound. This resolves the open problem and proves a qualitative gap between randomized and deterministic quantile sketching for which an Ω((1/ε)log(1/ε)) lower bound is known. One of our contributions is a novel representation and modification of the widely used merge-and-reduce construction. This modification allows for an analysis which is both tight and extremely simple. The same technique was reported, in private communications, to be useful for improving other sketching objectives and geometric coreset constructions.","PeriodicalId":414001,"journal":{"name":"2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS)","volume":"373 2","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"81","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FOCS.2016.17","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 81
Abstract
This paper resolves one of the longest standing basic problems in the streaming computational model. Namely, optimal construction of quantile sketches. An ε approximate quantile sketch receives a stream of items x1,⋯,xn and allows one to approximate the rank of any query item up to additive error ε n with probability at least 1-δ.The rank of a query x is the number of stream items such that xi ≤ x. The minimal sketch size required for this task is trivially at least 1/ε.Felber and Ostrovsky obtain a O((1/ε)log(1/ε)) space sketch for a fixed δ.Without restrictions on the nature of the stream or the ratio between ε and n, no better upper or lower bounds were known to date. This paper obtains an O((1/ε)log log (1/δ)) space sketch and a matching lower bound. This resolves the open problem and proves a qualitative gap between randomized and deterministic quantile sketching for which an Ω((1/ε)log(1/ε)) lower bound is known. One of our contributions is a novel representation and modification of the widely used merge-and-reduce construction. This modification allows for an analysis which is both tight and extremely simple. The same technique was reported, in private communications, to be useful for improving other sketching objectives and geometric coreset constructions.