Efficient Top-k Indexing via General Reductions

Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems; Pub Date: 2016-06-15; DOI: 10.1145/2902251.2902290

S. Rahul, Yufei Tao


Citations: 10

Abstract

Let D be a set of n elements, each associated with a real-valued weight, and let Q be the set of all possible predicates allowed on those elements. Given a predicate q ∈ Q and an integer k, a top-k query returns the k elements with the largest weights among the elements of D satisfying q. The corresponding data structure problem aims to store D in small space so that every query can be answered efficiently. It is already known that, before settling the problem, one must be able to solve two degenerate accompanying problems: (i) prioritized reporting: given a predicate q ∈ Q and a real value τ, return all the elements of D satisfying q and having weights at least τ; (ii) max reporting: top-k queries with k fixed to 1. In this paper we prove general reductions in external memory that explore the opposite direction. Our first reduction shows that, under mild conditions, any prioritized reporting structure yields a static top-k structure whose query time is slower by only a factor of O(log_B n), where B is the block size. Our second reduction shows that if one additionally has a max reporting structure, then combining the two structures yields a top-k structure with no performance slowdown (in space, query, and update) in expectation. These reductions significantly simplify the design of top-k structures, as we showcase on numerous problems including halfspace reporting, circular reporting, interval stabbing, point enclosure, and 3D dominance. All the proposed techniques work directly in the RAM model as well.
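To make the three query types in the abstract concrete, the following is a minimal in-memory sketch. It does not reproduce the paper's reductions or their bounds: the linear scans stand in for real index structures, the `(label, weight)` element representation and all function names are assumptions made for illustration, and `top_k_via_prioritized` is a naive binary search over thresholds τ rather than the authors' construction. It only shows that a top-k answer can be recovered from prioritized reporting alone.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical representation: an element is a (label, weight) pair.
Element = Tuple[str, float]
Predicate = Callable[[Element], bool]


def prioritized_report(D: List[Element], q: Predicate, tau: float) -> List[Element]:
    """Prioritized reporting: all elements satisfying q with weight >= tau."""
    return [e for e in D if q(e) and e[1] >= tau]


def max_report(D: List[Element], q: Predicate) -> Optional[Element]:
    """Max reporting: the top-1 element satisfying q, or None if there is none."""
    matches = [e for e in D if q(e)]
    return max(matches, key=lambda e: e[1]) if matches else None


def top_k_via_prioritized(D: List[Element], q: Predicate, k: int) -> List[Element]:
    """Naive top-k built only from prioritized_report (illustrative, not the paper's method).

    Binary-searches the sorted distinct weights for the largest threshold tau
    at which at least k elements are reported, then keeps the k heaviest.
    """
    if k <= 0:
        return []
    weights = sorted({w for _, w in D})
    if not weights:
        return []
    # With the smallest weight as threshold, every element satisfying q is reported.
    result = prioritized_report(D, q, weights[0])
    if len(result) >= k:
        lo, hi = 1, len(weights) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            candidate = prioritized_report(D, q, weights[mid])
            if len(candidate) >= k:
                result = candidate  # still enough elements: try a higher threshold
                lo = mid + 1
            else:
                hi = mid - 1
    # Ties at the threshold can yield more than k elements, so trim after sorting.
    return sorted(result, key=lambda e: e[1], reverse=True)[:k]


if __name__ == "__main__":
    D = [("a", 5.0), ("b", 3.0), ("c", 9.0), ("d", 1.0), ("e", 7.0)]
    q = lambda e: e[0] != "c"  # hypothetical predicate: exclude element "c"
    print(top_k_via_prioritized(D, q, 2))
    print(max_report(D, q))
```

Each probe in this sketch may report many more than k elements, so its cost is not output-sensitive; avoiding that kind of overhead is what the reductions in the paper are about.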



