Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems: Latest Articles

Anti-Persistence on Persistent Storage: History-Independent Sparse Tables and Dictionaries
M. A. Bender, Jonathan W. Berry, Rob Johnson, Thomas M. Kroeger, Samuel McCauley, C. Phillips, B. Simon, Shikha Singh, David Zage
{"title":"Anti-Persistence on Persistent Storage: History-Independent Sparse Tables and Dictionaries","authors":"M. A. Bender, Jonathan W. Berry, Rob Johnson, Thomas M. Kroeger, Samuel McCauley, C. Phillips, B. Simon, Shikha Singh, David Zage","doi":"10.1145/2902251.2902276","DOIUrl":"https://doi.org/10.1145/2902251.2902276","url":null,"abstract":"We present history-independent alternatives to a B-tree, the primary indexing data structure used in databases. A data structure is history independent (HI) if it is impossible to deduce any information by examining the bit representation of the data structure that is not already available through the API. We show how to build a history-independent cache-oblivious B-tree and a history-independent external-memory skip list. One of the main contributions is a data structure we build on the way: a history-independent packed-memory array (PMA). The PMA supports efficient range queries, one of the most important operations for answering database queries. Our HI PMA matches the asymptotic bounds of prior non-HI packed-memory arrays and sparse tables. Specifically, a PMA maintains a dynamic set of elements in sorted order in a linear-sized array. Inserts and deletes take an amortized O(log^2 N) element moves with high probability. Simple experiments with our implementation of HI PMAs corroborate our theoretical analysis. Comparisons to regular PMAs give preliminary indications that the practical cost of adding history-independence is not too large. Our HI cache-oblivious B-tree bounds match those of prior non-HI cache-oblivious B-trees. Searches take O(log_B N) I/Os; inserts and deletes take O((log^2 N)/B + log_B N) amortized I/Os with high probability; and range queries returning k elements take O(log_B N + k/B) I/Os. Our HI external-memory skip list achieves optimal bounds with high probability, analogous to in-memory skip lists: O(log_B N) I/Os for point queries and amortized O(log_B N) I/Os for inserts/deletes. Range queries returning k elements run in O(log_B N + k/B) I/Os. In contrast, the best possible high-probability bound for inserting into the folklore B-skip list, which promotes elements with probability 1/B, is just Θ(log N) I/Os. This is no better than the bounds one gets from running an in-memory skip list in external memory.","PeriodicalId":158471,"journal":{"name":"Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125787227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Session details: Session 3: Data Streams and Indexes
Sudeepa Roy
{"title":"Session details: Session 3: Data Streams and Indexes","authors":"Sudeepa Roy","doi":"10.1145/3252639","DOIUrl":"https://doi.org/10.1145/3252639","url":null,"abstract":"","PeriodicalId":158471,"journal":{"name":"Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems","volume":"11 suppl_1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124093181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Efficient Top-k Indexing via General Reductions
S. Rahul, Yufei Tao
{"title":"Efficient Top-k Indexing via General Reductions","authors":"S. Rahul, Yufei Tao","doi":"10.1145/2902251.2902290","DOIUrl":"https://doi.org/10.1145/2902251.2902290","url":null,"abstract":"Let D be a set of n elements each associated with a real-valued weight, and Q be the set of all possible predicates allowed on those elements. Given a predicate q ∈ Q and an integer k, a top-k query returns the k elements with the largest weights among the elements of D satisfying q. The corresponding data structure problem aims to store D in small space to allow every query to be answered efficiently. It is already known that, before settling the problem, one must be able to solve two degenerated accompanying problems: (i) prioritized reporting: given a predicate q ∈ Q and a real value τ, return all the elements of D satisfying q and having weights at least τ; (ii) max reporting: top-k queries with k fixed to 1. In this paper we prove general reductions in external memory that explore the opposite direction. Our first reduction shows that (under mild conditions) any prioritized reporting structure yields a static top-k structure with only a slow-down in query time by a factor of O(log_B n), where B is the block size. Our second reduction shows that if one additionally has a max reporting structure, then combining the two structures yields a top-k structure with no performance slow-down (in space, query, and update) in expectation. These reductions significantly simplify the design of top-k structures, as we showcase on numerous problems including halfspace reporting, circular reporting, interval stabbing, point enclosure, and 3d dominance. All the techniques proposed work directly in the RAM model as well.","PeriodicalId":158471,"journal":{"name":"Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems","volume":"263 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126973145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Session details: Session 4: Query Evaluation
Paris Koutris
{"title":"Session details: Session 4: Query Evaluation","authors":"Paris Koutris","doi":"10.1145/3252641","DOIUrl":"https://doi.org/10.1145/3252641","url":null,"abstract":"","PeriodicalId":158471,"journal":{"name":"Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems","volume":"283 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116854976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Hypertree Decompositions: Questions and Answers
G. Gottlob, G. Greco, N. Leone, Francesco Scarcello
{"title":"Hypertree Decompositions: Questions and Answers","authors":"G. Gottlob, G. Greco, N. Leone, Francesco Scarcello","doi":"10.1145/2902251.2902309","DOIUrl":"https://doi.org/10.1145/2902251.2902309","url":null,"abstract":"In the database context, the hypertree decomposition method is used for query optimization, whereby conjunctive queries having a low degree of cyclicity can be recognized and decomposed automatically, and efficiently evaluated. Hypertree decompositions were introduced at ACM PODS 1999. The present paper reviews, in the form of questions and answers, the main relevant concepts and algorithms and surveys selected related work including applications and test results.","PeriodicalId":158471,"journal":{"name":"Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127159241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 69
Designing a Query Language for RDF: Marrying Open and Closed Worlds
M. Arenas, M. Ugarte
{"title":"Designing a Query Language for RDF: Marrying Open and Closed Worlds","authors":"M. Arenas, M. Ugarte","doi":"10.1145/2902251.2902298","DOIUrl":"https://doi.org/10.1145/2902251.2902298","url":null,"abstract":"When querying an RDF graph, a prominent feature is the possibility of extending the answer to a query with optional information. However, the definition of this feature in SPARQL, the standard RDF query language, has raised some important issues. Most notably, the use of this feature increases the complexity of the evaluation problem, and its closed-world semantics is in conflict with the underlying open-world semantics of RDF. Many approaches for fixing such problems have been proposed, the most prominent being the introduction of the semantic notion of weakly-monotone SPARQL query. Weakly-monotone SPARQL queries have shaped the class of queries that conform to the open-world semantics of RDF. Unfortunately, finding an effective way of restricting SPARQL to the fragment of weakly-monotone queries has proven to be an elusive problem. In practice, the most widely adopted fragment for writing SPARQL queries is based on the syntactic notion of well-designedness. This notion has proven to be a good approach for writing SPARQL queries, but its expressive power has yet to be fully understood. The starting point of this paper is to understand the relation between well-designed queries and the semantic notion of weak monotonicity. It is known that every well-designed SPARQL query is weakly monotone; as our first contribution we prove that the converse does not hold, even if an extension of this notion based on the use of disjunction is considered. Given this negative result, we embark on the task of defining syntactic fragments that are weakly-monotone, and have higher expressive power than the fragment of well-designed queries. To this end, we move to a more general scenario where infinite RDF graphs are also allowed, so that interpolation techniques studied for first-order logic can be applied. With the use of these techniques, we are able to define a new operator for SPARQL that gives rise to a query language with the desired properties (over finite and infinite RDF graphs). It should be noticed that every query in this fragment is weakly monotone if we restrict to the case of finite RDF graphs. Moreover, we use this result to provide a simple characterization of the class of monotone CONSTRUCT queries, that is, the class of SPARQL queries that produce RDF graphs as output. Finally, we pinpoint the complexity of the evaluation problem for the query languages identified in the paper.","PeriodicalId":158471,"journal":{"name":"Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132913641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Range-Max Queries on Uncertain Data
P. Agarwal, Nirman Kumar, Stavros Sintos, S. Suri
{"title":"Range-Max Queries on Uncertain Data","authors":"P. Agarwal, Nirman Kumar, Stavros Sintos, S. Suri","doi":"10.1145/2902251.2902281","DOIUrl":"https://doi.org/10.1145/2902251.2902281","url":null,"abstract":"Let P be a set of n uncertain points in ℝ^d, where each point p_i ∈ P is associated with a real value v_i and a probability α_i ∈ (0,1] of existence, i.e., each p_i exists with an independent probability α_i. We present algorithms for building an index on P so that for a d-dimensional query rectangle ρ, the expected maximum value or the most-likely maximum value in ρ can be computed quickly. The specific contributions of our paper include the following: (i) The first index of sub-quadratic size to achieve a sub-linear query time in any dimension d ≥ 1. It also provides a trade-off between query time and size of the index. (ii) A conditional lower bound for the most-likely range-max queries, based on the conjectured hardness of the set-intersection problem, which suggests that in the worst case the product (query time)^2 × (index size) is Ω(n^2/polylog(n)). (iii) A linear-size index for estimating the expected range-max value within approximation factor 1/2 in O(log^c n) time, for some constant c > 0; that is, if the expected maximum value is μ then the query procedure returns a value μ' with μ/2 ≤ μ' ≤ μ. (iv) Extensions of our algorithm to more general uncertainty models and for computing the top-k values of the range-max.","PeriodicalId":158471,"journal":{"name":"Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133883368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
Data Management for Social Networking
Sara Cohen
{"title":"Data Management for Social Networking","authors":"Sara Cohen","doi":"10.1145/2902251.2902306","DOIUrl":"https://doi.org/10.1145/2902251.2902306","url":null,"abstract":"Social networks are fascinating and valuable datasets, which can be leveraged to better understand society, and to make inter-personal choices. This tutorial explores the fundamental issues that arise when storing and querying social data. The discussion is divided into three main parts. First, we consider some of the key computational problems that arise over the social graph structure, such as node centrality, link prediction, community detection and information diffusion. Second, we consider algorithmic challenges that leverage both the textual content and the graph structure of a social network, e.g., social search and querying, and team formation. Finally, we consider critical aspects of implementing a social network database management system, and discuss existing systems. In this tutorial, we also point out gaps between the state-of-the-art and desired features of a data management system for social networking, and discuss open research challenges.","PeriodicalId":158471,"journal":{"name":"Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115115091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
2016 ACM PODS Alberto O. Mendelzon Test-of-Time Award
M. Arenas, P. Buneman, J. V. D. Bussche
{"title":"2016 ACM PODS Alberto O. Mendelzon Test-of-Time Award","authors":"M. Arenas, P. Buneman, J. V. D. Bussche","doi":"10.1145/2902251.2935710","DOIUrl":"https://doi.org/10.1145/2902251.2935710","url":null,"abstract":"Motivated by reasoning tasks in the context of XML languages, the satisfiability problem of logics on data trees is investigated. The nodes of a data tree have a label from a finite set and a data value from a possibly infinite set. It is shown that satisfiability for two-variable first-order logic is decidable if the tree structure can be accessed only through the child and the next sibling predicates and the access to data values is restricted to equality tests. From this main result, decidability of satisfiability and containment for a data-aware fragment of XPath and of the implication problem for unary key and inclusion constraints is concluded.","PeriodicalId":158471,"journal":{"name":"Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126703358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Are Few Bins Enough: Testing Histogram Distributions
C. Canonne
{"title":"Are Few Bins Enough: Testing Histogram Distributions","authors":"C. Canonne","doi":"10.1145/2902251.2902274","DOIUrl":"https://doi.org/10.1145/2902251.2902274","url":null,"abstract":"A probability distribution over an ordered universe [n]={1,...,n} is said to be a k-histogram if it can be represented as a piecewise-constant function over at most k contiguous intervals. We study the following question: given samples from an arbitrary distribution D over [n], one must decide whether D is a k-histogram, or is far in L_1 distance from any such succinct representation. We obtain a sample- and time-efficient algorithm for this problem, complemented by a nearly-matching information-theoretic lower bound on the number of samples required for this task. Our results significantly improve on the previous state-of-the-art, due to Indyk, Levi, and Rubinfeld (2012) and Canonne, Diakonikolas, Gouleakis, and Rubinfeld (2016).","PeriodicalId":158471,"journal":{"name":"Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130032229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 22