ACM SIGMOD Record: Latest Articles

Graph Theory for Consent Management: A New Approach for Complex Data Flows
ACM SIGMOD Record Pub Date : 2024-05-14 DOI: 10.1145/3665252.3665265
Dorota Filipczuk, Enrico H. Gerding, George Konstantinidis
Abstract: Through legislation and technical advances users gain more control over how their data is processed, and they expect online services to respect their privacy choices and preferences. However, data may be processed for many different purposes by several layers of algorithms that create complex data workflows. To date, there is no existing approach to automatically satisfy fine-grained privacy constraints of a user in a way which optimises the service provider's gains from processing. In this article, we propose a solution to this problem by modelling a data flow as a graph. User constraints and processing purposes are pairs of vertices which need to be disconnected in this graph. We show that, in general, this problem is NP-hard and we propose several heuristics and algorithms. We discuss the optimality versus efficiency of our algorithms and evaluate them using synthetically generated data. On the practical side, our algorithms can provide nearly optimal solutions for tens of constraints and graphs of thousands of nodes, in a few seconds.
Citations: 0
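Illustration (not taken from the paper): the abstract above frames consent constraints as pairs of vertices that must be disconnected in a data-flow graph. The sketch below only checks whether a candidate set of removed edges satisfies such constraints; it does not attempt the NP-hard optimisation the authors study. The function names and the toy graph are hypothetical.

```python
from collections import deque


def reachable(edges, source, target):
    """Return True if `target` can be reached from `source` along directed edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def satisfies_constraints(edges, removed, constraints):
    """Check that every (data, purpose) pair is disconnected once `removed` edges are dropped."""
    kept = [edge for edge in edges if edge not in removed]
    return not any(reachable(kept, s, t) for s, t in constraints)


# Hypothetical toy data flow: data items feed intermediate steps, which feed purposes.
edges = [
    ("location", "profile"),
    ("profile", "ads"),
    ("profile", "analytics"),
    ("email", "analytics"),
]
# The user forbids location data from reaching the "ads" purpose.
constraints = [("location", "ads")]

before = satisfies_constraints(edges, set(), constraints)
after = satisfies_constraints(edges, {("profile", "ads")}, constraints)
```

Here removing the single edge ("profile", "ads") satisfies the constraint while leaving the analytics flow intact; choosing which edges to remove so that the provider loses as little as possible is the hard part the paper addresses.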
Better Differentially Private Approximate Histograms and Heavy Hitters using the Misra-Gries Sketch
ACM SIGMOD Record Pub Date : 2024-05-14 DOI: 10.1145/3665252.3665255
Christian Janos Lebeda, Jakub Tetek
Abstract: We consider the problem of computing differentially private approximate histograms and heavy hitters in a stream of elements. In the non-private setting, this is often done using the sketch of Misra and Gries [Science of Computer Programming, 1982]. Chan, Li, Shi, and Xu [PETS 2012] describe a differentially private version of the Misra-Gries sketch, but the amount of noise it adds can be large and scales linearly with the size of the sketch; the more accurate the sketch is, the more noise this approach has to add. We present a better mechanism for releasing a Misra-Gries sketch under (ε, δ)-differential privacy. It adds noise with magnitude independent of the size of the sketch; in fact, the maximum error coming from the noise is the same as the best known in the private non-streaming setting, up to a constant factor. Our mechanism is simple and likely to be practical. In the full version of the paper we also give a simple post-processing step of the Misra-Gries sketch that does not increase the worst-case error guarantee. It is sufficient to add noise to this new sketch with less than twice the magnitude of the non-streaming setting. This improves on the previous result for ε-differential privacy where the noise scales linearly to the size of the sketch.
Citations: 0
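Illustration (not taken from the paper): for readers unfamiliar with the non-private Misra-Gries sketch mentioned in the abstract above, here is a minimal sketch of the classic algorithm with k counters. It contains no privacy mechanism at all; the function name and the toy stream are hypothetical.

```python
def misra_gries(stream, k):
    """Non-private Misra-Gries sketch that keeps at most k counters.

    Each stored count underestimates the true frequency by at most
    n / (k + 1), where n is the length of the stream.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop those that
            # reach zero. The incoming item itself is not stored.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Hypothetical toy stream of length 10 with one clear heavy hitter.
stream = ["a"] * 5 + ["b"] * 3 + ["c", "d"]
sketch = misra_gries(stream, k=2)
```

The counts in the sketch never exceed the true frequencies. A differentially private release would add noise to these counters, which is the step the paper is concerned with.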
Epistemic Parity: Reproducibility as an Evaluation Metric for Differential Privacy
ACM SIGMOD Record Pub Date : 2024-05-14 DOI: 10.1145/3665252.3665267
Lucas Rosenblatt, Bernease Herman, Anastasia Holovenko, Wonkwon Lee, Joshua Loftus, Elizabeth McKinnie, Taras Rumezhak, Andrii Stadnik, Bill Howe, Julia Stoyanovich
Abstract: Differential privacy (DP) data synthesizers are increasingly proposed to afford public release of sensitive information, offering theoretical guarantees for privacy (and, in some cases, utility), but limited empirical evidence of utility in practical settings. Utility is typically measured as the error on representative proxy tasks, such as descriptive statistics, multivariate correlations, the accuracy of trained classifiers, or performance over a query workload. The ability for these results to generalize to practitioners' experience has been questioned in a number of settings, including the U.S. Census. In this paper, we propose an evaluation methodology for synthetic data that avoids assumptions about the representativeness of proxy tasks, instead measuring the likelihood that published conclusions would change had the authors used synthetic data, a condition we call epistemic parity. Our methodology consists of reproducing empirical conclusions of peer-reviewed papers on real, publicly available data, then re-running these experiments a second time on DP synthetic data and comparing the results.
Citations: 0
Allocating Isolation Levels to Transactions in a Multiversion Setting
ACM SIGMOD Record Pub Date : 2024-05-14 DOI: 10.1145/3665252.3665257
Brecht Vandevoort, Bas Ketsman, Frank Neven
Abstract: A serializable concurrency control mechanism ensures consistency for OLTP systems at the expense of a reduced transaction throughput. A DBMS therefore usually offers the possibility to allocate lower isolation levels for some transactions when it is safe to do so. However, such trading of consistency for efficiency does not come with any safety guarantees. In this paper, we study the mixed robustness problem which asks whether, for a given set of transactions and a given allocation of isolation levels, every possible interleaved execution of those transactions that is allowed under the provided allocation is always serializable. That is, whether the given allocation is indeed safe. While robustness has already been studied in the literature for the homogeneous setting where all transactions are allocated the same isolation level, the heterogeneous setting that we consider in this paper, despite its practical relevance, has largely been ignored. We focus on multiversion concurrency control and consider the isolation levels that are available in Postgres and Oracle: read committed (RC), snapshot isolation (SI) and serializable snapshot isolation (SSI). We show that the mixed robustness problem can be decided in polynomial time. In addition, we provide a polynomial time algorithm for computing the optimal robust allocation for a given set of transactions, prioritizing lower over higher isolation levels. The present results therefore establish the groundwork to automate isolation level allocation within existing databases supporting multiversion concurrency control.
Citations: 0
Accurate Summary-based Cardinality Estimation Through the Lens of Cardinality Estimation Graphs
ACM SIGMOD Record Pub Date : 2023-06-08 DOI: 10.1145/3604437.3604458
Jeremy Chen, Yuqing Huang, Mushi Wang, Semih Salihoglu, Kenneth Salem
Abstract: We study two classes of summary-based cardinality estimators that use statistics about input relations and small-size joins: (i) optimistic estimators, which were defined in the context of graph database management systems, that make uniformity and conditional independence assumptions; and (ii) the recent pessimistic estimators that use information theoretic linear programs (LPs). We show that optimistic estimators can be modeled as picking bottom-to-top paths in a cardinality estimation graph (CEG), which contains subqueries as nodes and edges whose weights are average degree statistics. We show that existing optimistic estimators have either undefined or fixed choices for picking CEG paths as their estimates and ignore alternative choices. Instead, we outline a space of optimistic estimators to make an estimate on CEGs, which subsumes existing estimators. We show, using an extensive empirical analysis, that effective paths depend on the structure of the queries. We next show that optimistic estimators and seemingly disparate LP-based pessimistic estimators are in fact connected. Specifically, we show that CEGs can also model some recent pessimistic estimators. This connection allows us to provide insights into the pessimistic estimators, such as showing that they have combinatorial solutions.
Citations: 0
Technical Perspective: (Pre-) Semirings Come to the Recursion Party
ACM SIGMOD Record Pub Date : 2023-06-08 DOI: 10.1145/3604437.3604453
Atri Rudra
Abstract: (This article is an imagined conversation with my U. at Buffalo UG algorithms class students.)
Citations: 0
Technical Perspective for Skeena: Efficient and Consistent Cross-Engine Transactions
ACM SIGMOD Record Pub Date : 2023-06-08 DOI: 10.1145/3604437.3604443
Carsten Binnig
Abstract: The paper proposes a solution to the problem of inadequate support for transactions in multi-engine database systems. Multi-engine database systems are databases that integrate new (fast) memory-optimized storage engines with (slow) traditional engines, allowing the application to use tables in both engines. Multi-engine database systems are particularly interesting for traditional database systems that are extended over time. Being able to store tables in slow and fast storage engines and to execute transactions across engines reduces overall cost, since less performance-critical tables can be placed in slow (and thus cheaper) storage. As
Citations: 0
Technical Perspective: Query Answers - Fewer is Faster
ACM SIGMOD Record Pub Date : 2023-06-08 DOI: 10.1145/3604437.3604451
Leonid Libkin
Abstract: We often write queries using LIMIT k, indicating that only k answers are to be returned. This feature is present in most query languages, for different data models: SQL, SPARQL, Cypher, etc. For example, in a repository of about 250M SPARQL queries, about 15M queries are of this form. Not surprisingly, the database research community has studied such queries extensively. The dominant setting is this: there is an ordering on tuples that can be returned by a query. Then the answer is limited to the first k tuples in this ordering.
Citations: 0
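Illustration (not taken from the article): the "dominant setting" described in the abstract above, where an ordering on tuples is given and only the first k are returned, can be sketched in a few lines. The helper name and the sample rows are hypothetical; `heapq.nsmallest` is used so that the full result need not be sorted.

```python
import heapq


def limit_k(rows, k, key):
    """Return the first k rows under the ordering induced by `key`."""
    return heapq.nsmallest(k, rows, key=key)


# Hypothetical query answers as (name, score) tuples, ordered by score.
rows = [("q3", 30), ("q1", 10), ("q4", 40), ("q2", 20)]
first_two = limit_k(rows, 2, key=lambda row: row[1])
```

This is the in-memory analogue of ORDER BY score LIMIT 2; it gives the same result as sorting all rows and slicing the first k, while tracking only k candidates at a time.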
Technical Perspective: Conjunctive Queries with Comparisons
ACM SIGMOD Record Pub Date : 2023-06-08 DOI: 10.1145/3604437.3604449
Stijn Vansummeren
Abstract: Query processing, the art of efficiently executing a relational query on a given database, is a foundational and core area in data management research. Established at the dawn of relational database systems in the 1970s, relational query processing remains a highly relevant and vibrant research topic today, as recent work shows that, apart from its application in traditional database scenarios, it is also highly effective in optimizing machine learning workloads [1].
Citations: 0
Technical Perspective: Optimal Algorithms for Multiway Search on Partial Orders
ACM SIGMOD Record Pub Date : 2023-06-08 DOI: 10.1145/3604437.3604455
Rajesh Jayaram
Abstract: Given a list of comparable items A = {a1, ..., an}, sorted so that a1 < a2 < ... < an, a canonical problem is locating a target item q within A if it exists. The canonical algorithm for this problem, of course, is binary search, which locates q using at most O(log n) comparisons between q and elements of A. Binary search is an indispensable tool for totally ordered datasets. However, many naturally occurring datasets are only partially ordered (posets), meaning that not all pairs of elements are comparable. Every such poset can be expressed as a directed acyclic graph (DAG), with edges (x,y) representing the relation x < y.
Citations: 0
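Illustration (not taken from the article): the totally ordered baseline described in the abstract above, locating q in a sorted list with O(log n) comparisons, can be sketched with Python's standard `bisect` module. The function name and sample list are hypothetical, and this does not cover the partial-order (DAG) setting the article is about.

```python
from bisect import bisect_left


def binary_search(items, q):
    """Return the index of q in the sorted list `items`, or None if q is absent."""
    i = bisect_left(items, q)  # leftmost position where q could be inserted
    if i < len(items) and items[i] == q:
        return i
    return None


# Hypothetical sorted input with distinct items, as in a1 < a2 < ... < an.
items = [1, 3, 5, 7, 9]
found = binary_search(items, 7)
missing = binary_search(items, 4)
```

On a poset there is no single sorted order to bisect, which is why the multiway search problem on DAGs needs different algorithms.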