Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory: Latest Articles

Generalizing Greenwald-Khanna Streaming Quantile Summaries for Weighted Inputs

Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory Pub Date : 2023-03-11 DOI: 10.48550/arXiv.2303.06288

Sepehr Assadi, Nirmit Joshi, M. Prabhu, Vihan Shah

Abstract: Estimating quantiles, like the median or percentiles, is a fundamental task in data mining and data science. A (streaming) quantile summary is a data structure that can process a set $S$ of $n$ elements in a streaming fashion and at the end, for any $\phi \in (0,1]$, return a $\phi$-quantile of $S$ up to an $\varepsilon$ error, i.e., return a $\phi'$-quantile with $\phi' = \phi \pm \varepsilon$. We are particularly interested in comparison-based summaries that only compare elements of the universe under a total ordering and are otherwise completely oblivious of the universe. The best known deterministic quantile summary is the 20-year-old Greenwald-Khanna (GK) summary that uses $O((1/\varepsilon) \log(\varepsilon n))$ space [SIGMOD'01]. This bound was recently proved to be optimal for all deterministic comparison-based summaries by Cormode and Veselý [PODS'20]. In this paper, we study weighted quantiles, a generalization of the quantiles problem, where each element arrives with a positive integer weight which denotes the number of copies of that element being inserted. The only known method of handling weighted inputs via GK summaries is the naive approach of breaking each weighted element into multiple unweighted items and feeding them one by one to the summary, which results in a prohibitively large update time (proportional to the maximum weight of input elements). We give the first non-trivial extension of GK summaries for weighted inputs and show that it takes $O((1/\varepsilon) \log(\varepsilon n))$ space and $O(\log(1/\varepsilon) + \log \log(\varepsilon n))$ update time per element to process a stream of length $n$ (under some quite mild assumptions on the range of weights and $\varepsilon$). En route to this, we also simplify the original GK summaries for unweighted quantiles.

Pages: 19:1-19:19
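To make the weighted-quantile setting in the abstract above concrete, the following is a minimal offline Python sketch. It is not the GK summary or the paper's extension: it computes an exact weighted quantile by sorting, and compares it with the naive expansion into unweighted copies that the abstract describes. The function names (`weighted_quantile`, `expand`, `unweighted_quantile`) and the rank convention (the smallest element whose cumulative weight reaches `ceil(phi * W)`) are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def weighted_quantile(items: List[Tuple[float, int]], phi: float) -> float:
    """Exact offline phi-quantile of (value, weight) pairs (assumed rank convention)."""
    if not 0 < phi <= 1:
        raise ValueError("phi must lie in (0, 1]")
    total = sum(weight for _, weight in items)
    target = math.ceil(phi * total)  # 1-indexed rank we need to reach
    running = 0
    for value, weight in sorted(items):
        running += weight  # one O(1) update per weighted element
        if running >= target:
            return value
    raise ValueError("input must be non-empty")


def expand(items: List[Tuple[float, int]]) -> List[float]:
    """Naive approach: one unweighted copy per unit of weight."""
    return [value for value, weight in items for _ in range(weight)]


def unweighted_quantile(values: List[float], phi: float) -> float:
    """Exact phi-quantile of an unweighted list, same rank convention."""
    ordered = sorted(values)
    return ordered[math.ceil(phi * len(ordered)) - 1]


if __name__ == "__main__":
    stream = [(10, 3), (20, 1), (5, 2)]
    print(weighted_quantile(stream, 0.5))
    print(unweighted_quantile(expand(stream), 0.5))
```

Both functions return the same element, but `expand` produces as many items as the total weight, which is why feeding the copies one by one costs time proportional to the weight of each element.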

Citations: 0

A Simple Algorithm for Consistent Query Answering under Primary Keys

Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory Pub Date : 2023-01-20 DOI: 10.48550/arXiv.2301.08482

Diego Figueira, A. Padmanabha, L. Segoufin, C. Sirangelo

Abstract: We consider the dichotomy conjecture for consistent query answering under primary key constraints, stating that for every fixed Boolean conjunctive query $q$, testing whether it is certain over all repairs of a given inconsistent database is either polynomial time or coNP-complete. This conjecture has been verified for self-join-free and path queries. We propose a simple inflationary fixpoint algorithm for consistent query answering which, for a given database, naively computes a set $\Delta$ of subsets of database repairs with at most $k$ facts, where $k$ is the size of the query $q$. The algorithm runs in polynomial time and can be formally defined as:
1. Initialize $\Delta$ with all sets $S$ of at most $k$ facts such that $S$ satisfies $q$.
2. Add any set $S$ of at most $k$ facts to $\Delta$ if there exists a block $B$ (i.e., a maximal set of facts sharing the same key) such that for every fact $a$ of $B$ there is a set $S' \in \Delta$ contained in $S \cup \{a\}$.
The algorithm answers "$q$ is certain" iff $\Delta$ eventually contains the empty set. The algorithm correctly computes certain answers when the query $q$ falls in the polynomial time cases for self-join-free queries and path queries. For arbitrary queries, the algorithm is an under-approximation: the query is guaranteed to be certain if the algorithm claims so. However, there are polynomial time certain queries (with self-joins) which are not identified as such by the algorithm.

Pages: 24:1-24:18
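The two steps stated in the abstract above can be sketched in Python as follows. This is an unoptimized illustration under stated assumptions, not the authors' implementation: facts are arbitrary hashable objects, `key_of` (hypothetical) maps a fact to its block identifier, and `satisfies` (hypothetical) is a caller-supplied predicate deciding whether a set of facts satisfies $q$. Candidate sets are restricted to those with at most one fact per block, matching "subsets of database repairs".

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Hashable, Iterable, List


def certain_by_fixpoint(
    facts: Iterable[Hashable],
    key_of: Callable[[Hashable], Hashable],
    satisfies: Callable[[FrozenSet[Hashable]], bool],
    k: int,
) -> bool:
    """Return True iff the fixpoint set Delta ends up containing the empty set."""
    facts = list(facts)

    # Group facts into blocks: maximal sets of facts sharing the same key.
    blocks: Dict[Hashable, List[Hashable]] = {}
    for fact in facts:
        blocks.setdefault(key_of(fact), []).append(fact)

    def is_partial_repair(subset) -> bool:
        keys = [key_of(fact) for fact in subset]
        return len(keys) == len(set(keys))  # at most one fact per block

    # All candidate sets S with at most k facts that fit inside some repair.
    candidates = [
        frozenset(combo)
        for size in range(k + 1)
        for combo in combinations(facts, size)
        if is_partial_repair(combo)
    ]

    # Step 1: start with the candidates that already satisfy the query.
    delta = {s for s in candidates if satisfies(s)}

    # Step 2: inflationary loop, repeated until nothing new can be added.
    changed = True
    while changed:
        changed = False
        for s in candidates:
            if s in delta:
                continue
            for block in blocks.values():
                # Every choice a in the block must lead to a set already in Delta.
                if all(any(t <= s | {a} for t in delta) for a in block):
                    delta.add(s)
                    changed = True
                    break

    return frozenset() in delta
```

As a usage illustration with facts encoded as `(relation, key, value)` tuples and the query "some R-fact has value `'a'`" (so $k = 1$): calling the function with `key_of=lambda f: (f[0], f[1])` and `satisfies=lambda s: any(f[0] == "R" and f[2] == "a" for f in s)` reports the query as certain on `{R(1,a), R(2,a), R(2,b)}` and as not certain on `{R(1,a), R(1,b)}`.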

Citations: 3

Compact Data Structures Meet Databases (Invited Talk)

Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory Pub Date : 2023-01-01 DOI: 10.4230/LIPIcs.ICDT.2023.2

Gonzalo Navarro

Citations: 0

Some Vignettes on Subgraph Counting Using Graph Orientations (Invited Talk)

Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory Pub Date : 2023-01-01 DOI: 10.4230/LIPIcs.ICDT.2023.3

C. Seshadhri, Floris Geerts, Brecht Vandevoort

Citations: 1

A Researcher's Digest of GQL (Invited Talk)

Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory Pub Date : 2023-01-01 DOI: 10.4230/LIPIcs.ICDT.2023.1

Nadime Francis, Amélie Gheerbrant, P. Guagliardo, L. Libkin, Victor Marsault, W. Martens, Filip Murlak, L. Peterfreund, Alexandra Rogova, D. Vrgoc

Citations: 6

Enumerating Subgraphs of Constant Sizes in External Memory

Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory Pub Date : 2023-01-01 DOI: 10.4230/LIPIcs.ICDT.2023.4

Shiyuan Deng, Francesco Silvestri, Yufei Tao

Citations: 1

Size Bounds and Algorithms for Conjunctive Regular Path Queries

Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory Pub Date : 2023-01-01 DOI: 10.4230/LIPIcs.ICDT.2023.13

Tamara Cucumides, Juan L. Reutter, D. Vrgoc

Citations: 0

An Optimal Algorithm for Sliding Window Order Statistics

Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory Pub Date : 2023-01-01 DOI: 10.4230/LIPIcs.ICDT.2023.5

Pavel Raykov

Citations: 0

The Consistency of Probabilistic Databases with Independent Cells

Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory Pub Date : 2022-12-23 DOI: 10.48550/arXiv.2212.12104

Amir Gilad, Aviram Imber, B. Kimelfeld

Abstract: A probabilistic database with attribute-level uncertainty consists of relations where cells of some attributes may hold probability distributions rather than deterministic content. Such databases arise, implicitly or explicitly, in the context of noisy operations such as missing data imputation, where we automatically fill in missing values, column prediction, where we predict unknown attributes, and database cleaning (and repairing), where we replace the original values due to detected errors or violation of integrity constraints. We study the computational complexity of problems that regard the selection of cell values in the presence of integrity constraints. More precisely, we focus on functional dependencies and study three problems: (1) deciding whether the constraints can be satisfied by any choice of values, (2) finding a most probable such choice, and (3) calculating the probability of satisfying the constraints. The data complexity of these problems is determined by the combination of the set of functional dependencies and the collection of uncertain attributes. We give full classifications into tractable and intractable complexities for several classes of constraints, including a single dependency, matching constraints, and unary functional dependencies.

Pages: 22:1-22:19
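The three problems listed in the abstract above can be illustrated with a brute-force Python sketch for a single functional dependency A → B over a two-column relation. This enumeration is exponential in the number of uncertain cells and is only meant to pin down what is being computed; it is not one of the paper's algorithms. The representation (each cell as a dictionary from value to probability, with deterministic cells given probability 1.0) and the name `analyze_fd` are assumptions.

```python
from itertools import product
from typing import Dict, Hashable, List, Optional, Tuple

Cell = Dict[Hashable, float]   # value -> probability, independent across cells
Row = Tuple[Cell, Cell]        # (cell of attribute A, cell of attribute B)
World = List[Tuple[Hashable, Hashable]]


def analyze_fd(rows: List[Row]) -> Tuple[bool, Optional[World], float]:
    """Enumerate every choice of cell values and test the dependency A -> B.

    Returns (satisfiable, a most probable satisfying choice, probability of satisfaction).
    """
    # One list of (value, probability) options per cell, in row order: A, B, A, B, ...
    options = [list(cell.items()) for row in rows for cell in row]

    best_world: Optional[World] = None
    best_prob = 0.0
    total = 0.0
    for choice in product(*options):
        prob = 1.0
        for _, p in choice:
            prob *= p  # cells are independent, so probabilities multiply
        values = [v for v, _ in choice]
        world = list(zip(values[0::2], values[1::2]))

        # A -> B holds iff each A-value is paired with exactly one B-value.
        seen: Dict[Hashable, Hashable] = {}
        holds = all(seen.setdefault(a, b) == b for a, b in world)

        if holds and prob > 0:
            total += prob                      # problem (3)
            if prob > best_prob:               # problem (2)
                best_world, best_prob = world, prob

    return best_world is not None, best_world, total  # first item: problem (1)
```

For example, two rows that both have A fixed to `"x"` and B uniformly distributed over `{"1", "2"}` satisfy the dependency in two of the four equally likely choices, so the returned probability is 0.5.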

Citations: 1

Approximation and Semantic Tree-width of Conjunctive Regular Path Queries

Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory Pub Date : 2022-12-03 DOI: 10.48550/arXiv.2212.01679

Diego Figueira, Rémi Morvan

Abstract: We show that the problem of whether a query is equivalent to a query of tree-width $k$ is decidable, for the class of Unions of Conjunctive Regular Path Queries with two-way navigation (UC2RPQs). A previous result by Barceló, Romero, and Vardi has shown decidability for the case $k=1$, and here we show that decidability in fact holds for any arbitrary $k>1$. The algorithm is in 2ExpSpace, but for the restricted but practically relevant case where all regular expressions of the query are of the form $a^*$ or $(a_1 + \dotsb + a_n)$ we show that the complexity of the problem drops to $\Pi_2^p$. We also investigate the related problem of approximating a UC2RPQ by queries of small tree-width. We exhibit an algorithm which, for any fixed number $k$, builds the maximal under-approximation of tree-width $k$ of a UC2RPQ. The maximal under-approximation of tree-width $k$ of a query $q$ is a query $q'$ of tree-width $k$ which is contained in $q$ in a maximal and unique way, that is, such that for every query $q''$ of tree-width $k$, if $q''$ is contained in $q$ then $q''$ is also contained in $q'$.

Pages: 15:1-15:19

Citations: 2