Algorithmic Techniques for Independent Query Sampling

Yufei Tao
Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, published 2022-06-12. DOI: 10.1145/3517804.3526068 (https://doi.org/10.1145/3517804.3526068)
Citations: 6

Abstract

Unlike a reporting query, which returns all the elements satisfying a predicate, query sampling returns only a sample set of those elements and has long been recognized as an important method in database systems. PODS'14 saw the introduction of independent query sampling (IQS), which extends traditional query sampling with the requirement that the sample outputs of all the queries be mutually independent. The new requirement improves the precision of query estimation, facilitates the execution of randomized algorithms, and enhances the fairness and diversity of query answers. IQS calls for new index structures because conventional indexes are designed to report complete query answers and thus become too expensive for extracting only a few random samples. This has created an exciting opportunity to revisit the structure for every reporting query known in computer science, and there has been considerable progress in this direction since 2014. This paper distills the existing solutions into several generic techniques that, when put together, can be utilized to solve a great variety of IQS problems with attractive performance guarantees.
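
To make the IQS requirement concrete, the following is a minimal sketch of the simplest setting one might consider: sampling with replacement from a one-dimensional range over a static, sorted array. It is an illustration only and is not taken from the paper; the class name `SortedArraySampler` and its parameters are hypothetical. The point it shows is that the query never materializes the full answer: two binary searches locate the range, and each sample is then drawn with fresh randomness, so the outputs of repeated queries are independent of one another.

```python
import bisect
import random


class SortedArraySampler:
    """Hypothetical helper: range sampling over a static sorted array."""

    def __init__(self, values, seed=None):
        # Sort once up front; queries never modify the array.
        self._values = sorted(values)
        # A private generator; every draw consumes fresh randomness.
        self._rng = random.Random(seed)

    def sample(self, lo, hi, s):
        """Return s samples, drawn with replacement, from values in [lo, hi]."""
        # Two binary searches find the index range without scanning it.
        left = bisect.bisect_left(self._values, lo)
        right = bisect.bisect_right(self._values, hi)
        if left == right:
            return []  # no element satisfies the predicate
        # Each draw picks a uniformly random position inside the range.
        return [self._values[self._rng.randrange(left, right)] for _ in range(s)]


if __name__ == "__main__":
    sampler = SortedArraySampler([5, 1, 9, 3, 7, 11])
    # Repeating the same query draws new, independent samples each time.
    print(sampler.sample(3, 9, 4))
    print(sampler.sample(3, 9, 4))
```

Under these assumptions a query costs O(log n + s) time for n elements and s samples, whereas a reporting query would pay for every element in the range. Richer predicates and dynamic data need more elaborate structures, which is the subject of the paper.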