Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory: Latest Articles

Generalizing Greenwald-Khanna Streaming Quantile Summaries for Weighted Inputs
Sepehr Assadi, Nirmit Joshi, M. Prabhu, Vihan Shah
Abstract: Estimating quantiles, like the median or percentiles, is a fundamental task in data mining and data science. A (streaming) quantile summary is a data structure that can process a set S of n elements in a streaming fashion and at the end, for any phi in (0,1], return a phi-quantile of S up to an eps error, i.e., return a phi'-quantile with phi' = phi ± eps. We are particularly interested in comparison-based summaries that only compare elements of the universe under a total ordering and are otherwise completely oblivious of the universe. The best known deterministic quantile summary is the 20-year-old Greenwald-Khanna (GK) summary that uses O((1/eps) log(eps n)) space [SIGMOD'01]. This bound was recently proved to be optimal for all deterministic comparison-based summaries by Cormode and Veselý [PODS'20]. In this paper, we study weighted quantiles, a generalization of the quantiles problem, where each element arrives with a positive integer weight which denotes the number of copies of that element being inserted. The only known method of handling weighted inputs via GK summaries is the naive approach of breaking each weighted element into multiple unweighted items and feeding them one by one to the summary, which results in a prohibitively large update time (proportional to the maximum weight of input elements). We give the first non-trivial extension of GK summaries for weighted inputs and show that it takes O((1/eps) log(eps n)) space and O(log(1/eps) + log log(eps n)) update time per element to process a stream of length n (under some quite mild assumptions on the range of weights and eps). En route to this, we also simplify the original GK summaries for unweighted quantiles.
DOI: https://doi.org/10.48550/arXiv.2303.06288 | Published: 2023-03-11
Citations: 0
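To make the weighted-quantile semantics in the abstract above concrete, here is a minimal offline sketch. It is not a GK summary and not the paper's algorithm: it only checks that treating a weight w as "w copies" (the naive expansion the abstract describes) gives the same exact quantile as working with cumulative weights directly. The function names and the rank convention ceil(phi * n) are assumptions made for illustration.

```python
import math
from bisect import bisect_left
from itertools import accumulate


def quantile_by_expansion(stream, phi):
    """Naive baseline: expand each (value, weight) pair into `weight` copies.

    Time and space grow with the total weight, which is the cost the
    abstract calls prohibitive.
    """
    expanded = sorted(v for v, w in stream for _ in range(w))
    rank = max(1, math.ceil(phi * len(expanded)))  # assumed rank convention
    return expanded[rank - 1]


def quantile_weighted(stream, phi):
    """Exact offline weighted quantile using cumulative weights (no expansion)."""
    items = sorted(stream)                            # sort by value
    prefix = list(accumulate(w for _, w in items))    # cumulative weights
    rank = max(1, math.ceil(phi * prefix[-1]))
    # First item whose cumulative weight reaches the target rank.
    return items[bisect_left(prefix, rank)][0]


stream = [(5, 3), (1, 1), (9, 4)]   # (value, positive integer weight)
print(quantile_by_expansion(stream, 0.5), quantile_weighted(stream, 0.5))
```

Both functions store the whole input, so they only illustrate what a correct answer looks like; a streaming summary has to approximate this in small space.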
A Simple Algorithm for Consistent Query Answering under Primary Keys
Diego Figueira, A. Padmanabha, L. Segoufin, C. Sirangelo
Abstract: We consider the dichotomy conjecture for consistent query answering under primary key constraints stating that for every fixed Boolean conjunctive query $q$, testing whether it is certain over all repairs of a given inconsistent database is either polynomial time or coNP-complete. This conjecture has been verified for self-join-free and path queries. We propose a simple inflationary fixpoint algorithm for consistent query answering which, for a given database, naively computes a set $\Delta$ of subsets of database repairs with at most $k$ facts, where $k$ is the size of the query $q$. The algorithm runs in polynomial time and can be formally defined as:
1. Initialize $\Delta$ with all sets $S$ of at most $k$ facts such that $S$ satisfies $q$.
2. Add any set $S$ of at most $k$ facts to $\Delta$ if there exists a block $B$ (i.e., a maximal set of facts sharing the same key) such that for every fact $a$ of $B$ there is a set $S' \in \Delta$ contained in $S \cup \{a\}$.
The algorithm answers "$q$ is certain" iff $\Delta$ eventually contains the empty set. The algorithm correctly computes certain answers when the query $q$ falls in the polynomial time cases for self-join-free queries and path queries. For arbitrary queries, the algorithm is an under-approximation: the query is guaranteed to be certain if the algorithm claims so. However, there are polynomial time certain queries (with self-joins) which are not identified as such by the algorithm.
DOI: https://doi.org/10.48550/arXiv.2301.08482 | Published: 2023-01-20
Citations: 3
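The two-step fixpoint described in the abstract above can be sketched directly. This is a minimal illustration written from the abstract alone, not the authors' implementation: the fact encoding `(relation, key, value)`, the helper names, and the example query $q = R(x, y) \wedge S(y, z)$ (so $k = 2$) are all assumptions. Candidate sets are restricted to consistent sets, following the phrase "subsets of database repairs".

```python
from itertools import combinations


def block_of(fact):
    """A block is identified by (relation, key); facts are (relation, key, value)."""
    return fact[:2]


def is_consistent(facts):
    """True if no two facts share a block, i.e. the set fits inside some repair."""
    keys = [block_of(f) for f in facts]
    return len(set(keys)) == len(keys)


def is_certain(db, k, satisfies):
    """Inflationary fixpoint from the abstract; returns True iff the empty set enters Delta."""
    facts = list(db)
    candidates = [frozenset(c) for r in range(k + 1)
                  for c in combinations(facts, r) if is_consistent(c)]
    blocks = {}
    for f in facts:
        blocks.setdefault(block_of(f), []).append(f)

    delta = {s for s in candidates if satisfies(s)}            # step 1
    changed = True
    while changed:                                             # step 2, to fixpoint
        changed = False
        for s in candidates:
            if s in delta:
                continue
            if any(all(any(t <= s | {a} for t in delta) for a in block)
                   for block in blocks.values()):
                delta.add(s)
                changed = True
    return frozenset() in delta


def q_holds(facts):
    """Hypothetical query q = R(x, y) AND S(y, z), evaluated on a small fact set."""
    return any(r[0] == "R" and s[0] == "S" and r[2] == s[1]
               for r in facts for s in facts)


db_yes = {("R", "1", "a"), ("R", "1", "b"), ("S", "a", "c"), ("S", "b", "d")}
db_no = {("R", "1", "a"), ("R", "1", "b"), ("S", "a", "c")}
print(is_certain(db_yes, 2, q_holds), is_certain(db_no, 2, q_holds))
```

In `db_yes` both repairs join an R-fact with an S-fact, so the empty set is derived; in `db_no` the repair that keeps `R(1, b)` has no matching S-fact, and the fixpoint stops without the empty set.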
Compact Data Structures Meet Databases (Invited Talk)
Gonzalo Navarro
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2023.2 | Published: 2023-01-01
Citations: 0
Some Vignettes on Subgraph Counting Using Graph Orientations (Invited Talk)
C. Seshadhri, Floris Geerts, Brecht Vandevoort
Abstract: Subgraph counting is a fundamental problem that spans many areas in computer science: database theory, logic, network science, data mining, and complexity theory. Given a large input graph G and a small pattern graph H, we wish to count the number of occurrences of H in G. In recent times, there has been a resurgence in using an old (maybe overlooked?) technique of orienting the edges of G and H, and then using a combination of brute-force enumeration and indexing. These orientation techniques appear to give the best of both worlds. There is a rigorous theoretical explanation behind these techniques, and they also have excellent empirical behavior (on large real-world graphs). Time and again, graph orientations help solve subgraph counting problems in various computational models, be it sampling, streaming, distributed, etc. In this paper, we give some short vignettes on how the orientation technique solves a variety of algorithmic problems.
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2023.3 | Published: 2023-01-01
Citations: 1
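One well-known instance of the orientation idea in the abstract above is triangle counting with a degree ordering: orient every edge from its lower-ranked endpoint to its higher-ranked one, then intersect out-neighbourhoods. The sketch below is a generic illustration of that classic technique, not code from the talk; the function name and the tie-breaking rule `(degree, label)` are assumptions, and vertex labels are assumed to be mutually comparable.

```python
from collections import defaultdict


def count_triangles(edges):
    """Count triangles in an undirected simple graph via a degree orientation."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:                       # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    adj = dict(adj)                      # freeze: no accidental key creation below

    def rank(v):
        # Order by degree, breaking ties by label so the order is total.
        return (len(adj[v]), v)

    # Orient each edge from lower rank to higher rank.
    out = {v: {w for w in adj[v] if rank(w) > rank(v)} for v in adj}

    # A triangle a < b < c (by rank) is found exactly once, at the edge a -> b,
    # because only there is c a common out-neighbour of both endpoints.
    return sum(len(out[u] & out[v]) for u in out for v in out[u])


k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(count_triangles(k4))
```

The orientation keeps out-degrees small for high-degree vertices, which is the usual reason this brute-force intersection stays fast on skewed real-world graphs.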
A Researcher's Digest of GQL (Invited Talk)
Nadime Francis, Amélie Gheerbrant, P. Guagliardo, L. Libkin, Victor Marsault, W. Martens, Filip Murlak, L. Peterfreund, Alexandra Rogova, D. Vrgoc
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2023.1 | Published: 2023-01-01
Citations: 6
Enumerating Subgraphs of Constant Sizes in External Memory
Shiyuan Deng, Francesco Silvestri, Yufei Tao
Abstract: We present an indivisible I/O-efficient algorithm for subgraph enumeration, where the objective is to list all the subgraphs of a massive graph $G := (V, E)$ that are isomorphic to a pattern graph $Q$ having $k = O(1)$ vertices. Our algorithm performs $O\left(\frac{|E|^{k/2}}{M^{k/2-1} B} \log_{M/B} \frac{|E|}{B} + \frac{|E|^{\rho}}{M^{\rho-1} B}\right)$ I/Os with high probability, where $\rho$ is the fractional edge covering number of $Q$ (it always holds $\rho \ge k/2$, regardless of $Q$), $M$ is the number of words in (internal) memory, and $B$ is the number of words in a disk block. Our solution is optimal in the class of indivisible algorithms for all pattern graphs with $\rho > k/2$. When $\rho = k/2$, our algorithm is still optimal as long as $M/B \ge (|E|/B)^{\epsilon}$ for any constant $\epsilon > 0$.
2012 ACM Subject Classification: Theory of computation → Graph algorithms analysis; Information systems → Join algorithms
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2023.4 | Published: 2023-01-01
Citations: 1
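As a worked instance of the I/O bound in the abstract above (assuming the bound reads with exponents $k/2$ and $\rho$ on $|E|$, and $k/2 - 1$ and $\rho - 1$ on $M$), take the pattern $Q$ to be a triangle. Then $k = 3$, and putting weight $1/2$ on each of the three edges covers every vertex, so $\rho = 3/2 = k/2$. The two terms then differ only by the logarithmic factor, and for $|E| \ge M$ the first term dominates:

```latex
\[
O\!\left(\frac{|E|^{3/2}}{M^{1/2} B}\,\log_{M/B}\frac{|E|}{B}
      \;+\; \frac{|E|^{3/2}}{M^{1/2} B}\right)
= O\!\left(\frac{|E|^{3/2}}{\sqrt{M}\,B}\,\log_{M/B}\frac{|E|}{B}\right)
\qquad (|E| \ge M).
\]
```

This is the $\rho = k/2$ case, where the abstract states optimality only under the extra condition on $M/B$.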
Size Bounds and Algorithms for Conjunctive Regular Path Queries
Tamara Cucumides, Juan L. Reutter, D. Vrgoc
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2023.13 | Published: 2023-01-01
Citations: 0
An Optimal Algorithm for Sliding Window Order Statistics
Pavel Raykov
Abstract: Assume there is a data stream of elements and a window of size m. Sliding window algorithms compute various statistic functions over the last m elements of the data stream seen so far. The time complexity of a sliding window algorithm is measured as the time required to output an updated statistic function value every time a new element is read. For example, it is well known that computing the sliding window maximum/minimum has time complexity O(1) while computing the sliding window median has time complexity O(log m). In this paper we close the gap between these two cases by (1) presenting an algorithm for computing the sliding window k-th smallest element in O(log k) time and (2) proving that this time complexity is optimal.
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2023.5 | Published: 2023-01-01
Citations: 0
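The O(1) sliding window maximum that the abstract above cites as well known is usually obtained with a monotonic deque. The sketch below shows that textbook baseline only, with O(1) amortized time per element; it is not the paper's O(log k) algorithm for the k-th smallest element, and the function name is an assumption.

```python
from collections import deque


def sliding_max(stream, m):
    """Yield the maximum of the last m elements, once the window is full."""
    window = deque()                     # (index, value) pairs, values decreasing
    for i, x in enumerate(stream):
        # Elements no larger than x can never be the maximum again.
        while window and window[-1][1] <= x:
            window.pop()
        window.append((i, x))
        # Drop the front if it has slid out of the window.
        if window[0][0] <= i - m:
            window.popleft()
        if i >= m - 1:
            yield window[0][1]


print(list(sliding_max([1, 3, -1, -3, 5, 3, 6, 7], 3)))
```

Each element is pushed and popped at most once, which gives the amortized constant cost; the minimum is symmetric, with the comparison reversed.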
The Consistency of Probabilistic Databases with Independent Cells
Amir Gilad, Aviram Imber, B. Kimelfeld
Abstract: A probabilistic database with attribute-level uncertainty consists of relations where cells of some attributes may hold probability distributions rather than deterministic content. Such databases arise, implicitly or explicitly, in the context of noisy operations such as missing data imputation, where we automatically fill in missing values; column prediction, where we predict unknown attributes; and database cleaning (and repairing), where we replace the original values due to detected errors or violation of integrity constraints. We study the computational complexity of problems that regard the selection of cell values in the presence of integrity constraints. More precisely, we focus on functional dependencies and study three problems: (1) deciding whether the constraints can be satisfied by any choice of values, (2) finding a most probable such choice, and (3) calculating the probability of satisfying the constraints. The data complexity of these problems is determined by the combination of the set of functional dependencies and the collection of uncertain attributes. We give full classifications into tractable and intractable complexities for several classes of constraints, including a single dependency, matching constraints, and unary functional dependencies.
DOI: https://doi.org/10.48550/arXiv.2212.12104 | Published: 2022-12-23
Citations: 1
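The three problems listed in the abstract above can be stated precisely with a brute-force sketch that enumerates every possible world. This is exponential in the number of uncertain cells and is meant only to pin down the definitions, not to reflect the paper's algorithms or its complexity classification. The two-column relation, the single dependency A → B, and all names below are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product


def possible_worlds(rows):
    """Enumerate (world, probability) pairs; each cell is a dict value -> probability.

    Cells are independent, so a world's probability is the product of its choices.
    """
    width = len(rows[0])
    cells = [cell for row in rows for cell in row]
    for choice in product(*(cell.items() for cell in cells)):
        prob = Fraction(1)
        for _, p in choice:
            prob *= p
        values = [v for v, _ in choice]
        yield [tuple(values[i:i + width]) for i in range(0, len(values), width)], prob


def satisfies_fd(world):
    """Check the functional dependency A -> B on a two-column relation."""
    seen = {}
    return all(seen.setdefault(a, b) == b for a, b in world)


def analyse(rows):
    consistent = [(w, p) for w, p in possible_worlds(rows) if satisfies_fd(w)]
    possible = bool(consistent)                               # problem (1)
    best = max(consistent, key=lambda wp: wp[1], default=None)  # problem (2)
    total = sum((p for _, p in consistent), Fraction(0))      # problem (3)
    return possible, best, total


half = Fraction(1, 2)
rows = [
    [{"x": Fraction(1)}, {1: Fraction(1, 4), 2: Fraction(3, 4)}],  # row 1: A certain, B uncertain
    [{"x": half, "y": half}, {1: Fraction(1)}],                    # row 2: A uncertain, B certain
]
print(analyse(rows))
```

In this example the only violating world sets both A-values to `x` while the B-values differ (probability 3/8), so the constraint holds with probability 5/8.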
Approximation and Semantic Tree-width of Conjunctive Regular Path Queries
Diego Figueira, Rémi Morvan
Abstract: We show that the problem of whether a query is equivalent to a query of tree-width $k$ is decidable, for the class of Unions of Conjunctive Regular Path Queries with two-way navigation (UC2RPQs). A previous result by Barceló, Romero, and Vardi has shown decidability for the case $k=1$, and here we show that decidability in fact holds for arbitrary $k>1$. The algorithm is in 2ExpSpace, but for the restricted but practically relevant case where all regular expressions of the query are of the form $a^*$ or $(a_1 + \dotsb + a_n)$ we show that the complexity of the problem drops to $\Pi_2^p$. We also investigate the related problem of approximating a UC2RPQ by queries of small tree-width. We exhibit an algorithm which, for any fixed number $k$, builds the maximal under-approximation of tree-width $k$ of a UC2RPQ. The maximal under-approximation of tree-width $k$ of a query $q$ is a query $q'$ of tree-width $k$ which is contained in $q$ in a maximal and unique way, that is, such that for every query $q''$ of tree-width $k$, if $q''$ is contained in $q$ then $q''$ is also contained in $q'$.
DOI: https://doi.org/10.48550/arXiv.2212.01679 | Published: 2022-12-03
Citations: 2