ACM SIGMOD Record | Pub Date: 2023-06-08 | DOI: https://dl.acm.org/doi/10.1145/3604437.3604462
Wei Dong, Juanru Fang, Ke Yi, Yuchao Tao, Ashwin Machanavajjhala
{"title":"R2T: Instance-optimal Truncation for Differentially Private Query Evaluation with Foreign Keys","authors":"Wei Dong, Juanru Fang, Ke Yi, Yuchao Tao, Ashwin Machanavajjhala","doi":"https://dl.acm.org/doi/10.1145/3604437.3604462","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3604437.3604462","url":null,"abstract":"<p>Answering SPJA queries under differential privacy (DP), including graph pattern counting under node-DP as an important special case, has received considerable attention in recent years. The dual challenge of foreign-key constraints and self-joins is particularly tricky to deal with, and no existing DP mechanisms can correctly handle both. For the special case of graph pattern counting under node-DP, the existing mechanisms are correct (i.e., satisfy DP), but they do not offer nontrivial utility guarantees or are very complicated and costly. In this paper, we propose the first DP mechanism for answering arbitrary SPJA queries in a database with foreign-key constraints. Meanwhile, it achieves a fairly strong notion of optimality, which can be considered as a small and natural relaxation of instance optimality. Finally, our mechanism is simple enough that it can be easily implemented on top of any RDBMS and an LP solver. Experimental results show that it offers order-of-magnitude improvements in terms of utility over existing techniques, even those specifically designed for graph pattern counting.</p>","PeriodicalId":501169,"journal":{"name":"ACM SIGMOD Record","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138510364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ACM SIGMOD Record | Pub Date: 2023-06-08 | DOI: https://dl.acm.org/doi/10.1145/3604437.3604460
Christina Pavlopoulou, Michael J. Carey, Vassilis J. Tsotras
{"title":"Revisiting Runtime Dynamic Optimization for Join Queries in Big Data Management Systems","authors":"Christina Pavlopoulou, Michael J. Carey, Vassilis J. Tsotras","doi":"https://dl.acm.org/doi/10.1145/3604437.3604460","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3604437.3604460","url":null,"abstract":"<p>Effective query optimization remains an open problem for Big Data Management Systems. In this work, we revisit an old idea, runtime dynamic optimization, and adapt it to a big data management system, AsterixDB. The approach runs in stages (re-optimization points), starting by first executing all predicates local to a single dataset. The intermediate result created by a stage is then used to re-optimize the remaining query. This re-optimization approach avoids inaccurate intermediate result cardinality estimates, thus leading to much better execution plans. While it introduces overhead for materializing intermediate results, experiments show that this overhead is relatively small and is an acceptable price to pay given the optimization benefits.</p>","PeriodicalId":501169,"journal":{"name":"ACM SIGMOD Record","volume":"254 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138510366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ACM SIGMOD Record | Pub Date: 2023-06-08 | DOI: https://dl.acm.org/doi/10.1145/3604437.3604447
Tim Kraska
{"title":"Technical Perspective for Sherman: A Write-Optimized Distributed B+Tree Index on Disaggregated Memory","authors":"Tim Kraska","doi":"https://dl.acm.org/doi/10.1145/3604437.3604447","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3604437.3604447","url":null,"abstract":"<p>Separation of compute and storage has become the de facto standard for cloud database systems. First proposed in 2007 for database systems [2], it is now widely adopted by all major cloud providers such as Amazon Redshift, Google BigQuery, and Snowflake. Separation of compute and storage adds enormous value for the customer. Users can scale storage independently of compute, which enables them to only pay for what they really use. Consider a scenario in which data grows linearly over time, but most queries only access the last month of data, which remains relatively stable. Without the separation of compute and storage, the user would gradually be forced to significantly increase the database cluster capacity. In contrast, modern cloud database systems allow scaling the storage separately from compute; the compute cluster stays the same over time, whereas the data is stored on cheap cloud storage services, like Amazon S3.</p>","PeriodicalId":501169,"journal":{"name":"ACM SIGMOD Record","volume":"250 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138510379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ACM SIGMOD Record | Pub Date: 2023-06-08 | DOI: https://dl.acm.org/doi/10.1145/3604437.3604439
Kenneth Salem
{"title":"TECHNICAL PERSPECTIVE: Ad Hoc Transactions: What They Are and Why We Should Care","authors":"Kenneth Salem","doi":"https://dl.acm.org/doi/10.1145/3604437.3604439","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3604437.3604439","url":null,"abstract":"<p>Most database research papers are prescriptive. They identify a technical problem and show us how to solve it. They present new algorithms, theorems, and evaluations of prototypes. Other papers follow a different path: descriptive rather than prescriptive. They tell us how data systems behave in practice, and how they are actually used. They employ a different set of tools, such as surveys, software analyses or user studies. These papers are much rarer at database research conferences, and they're all the more valuable for that.</p>","PeriodicalId":501169,"journal":{"name":"ACM SIGMOD Record","volume":"249 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138510387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ACM SIGMOD Record | Pub Date: 2023-06-08 | DOI: https://dl.acm.org/doi/10.1145/3604437.3604454
Mahmoud Abo Khamis, Hung Q. Ngo, Reinhard Pichler, Dan Suciu, Yisu Remy Wang
{"title":"Convergence of Datalog over (Pre-) Semirings","authors":"Mahmoud Abo Khamis, Hung Q. Ngo, Reinhard Pichler, Dan Suciu, Yisu Remy Wang","doi":"https://dl.acm.org/doi/10.1145/3604437.3604454","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3604437.3604454","url":null,"abstract":"<p>Recursive queries have been traditionally studied in the framework of datalog, a language that restricts recursion to monotone queries over sets, which is guaranteed to converge in polynomial time in the size of the input. But modern big data systems require recursive computations beyond the Boolean space. In this paper we study the convergence of datalog when it is interpreted over an arbitrary semiring. We consider an ordered semiring, define the semantics of a datalog program as a least fixpoint in this semiring, and study the number of steps required to reach that fixpoint, if ever. We identify algebraic properties of the semiring that correspond to certain convergence properties of datalog programs. Finally, we describe a class of ordered semirings on which one can generalize the semi-naïve evaluation algorithm to compute their minimal fixpoints.</p>","PeriodicalId":501169,"journal":{"name":"ACM SIGMOD Record","volume":"252 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138510372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
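The idea of interpreting a datalog program over a semiring and iterating to a least fixpoint can be illustrated with a small sketch. It is not taken from the paper: it assumes the tropical (min, +) semiring, a single hypothetical rule `dist(y) :- dist(x) * edge(x, y)`, and plain naive evaluation (not the semi-naïve generalization the abstract mentions). The function name `naive_fixpoint` and the `max_steps` cutoff are illustrative choices.

```python
import math


def naive_fixpoint(edges, source, nodes, max_steps=1000):
    """Naive evaluation of dist(y) :- dist(x) * edge(x, y) over (min, +).

    In the tropical semiring, "plus" is min, "times" is +, the zero
    element is +inf and the one element is 0. Returns the fixpoint and
    the number of iterations needed to reach it.
    """
    # Start from the bottom element: every fact has the semiring zero.
    dist = {v: math.inf for v in nodes}
    for step in range(1, max_steps + 1):
        # Apply the immediate-consequence operator once.
        new = {v: math.inf for v in nodes}
        new[source] = 0.0  # base fact dist(source) = semiring one
        for (x, y), w in edges.items():
            new[y] = min(new[y], dist[x] + w)
        if new == dist:
            # Nothing changed: dist was already the fixpoint.
            return dist, step - 1
        dist = new
    # Convergence is not guaranteed in general (e.g. negative cycles).
    raise RuntimeError("no fixpoint within max_steps")


edges = {("a", "b"): 1, ("b", "c"): 2, ("a", "c"): 5}
dist, steps = naive_fixpoint(edges, "a", ["a", "b", "c"])
print(dist, steps)
```

With non-negative weights the iteration stabilizes at the shortest-path distances; with a negative cycle it never does, which is the kind of "if ever" behaviour the abstract refers to.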
ACM SIGMOD Record | Pub Date: 2023-06-08 | DOI: https://dl.acm.org/doi/10.1145/3604437.3604444
Jianqiu Zhang, Kaisong Huang, Tianzheng Wang, King Lv
{"title":"Efficiently Making Cross-Engine Transactions Consistent","authors":"Jianqiu Zhang, Kaisong Huang, Tianzheng Wang, King Lv","doi":"https://dl.acm.org/doi/10.1145/3604437.3604444","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3604437.3604444","url":null,"abstract":"<p>Database systems are becoming increasingly multi-engine. In particular, a main-memory engine may coexist with a traditional storage-centric engine in a system to support various applications. It is desirable to allow applications to access data in both engines using cross-engine transactions. But existing systems are either only designed for single-engine accesses, or impose many restrictions by limiting cross-engine transactions to certain isolation levels and operations. The result is inadequate cross-engine support in terms of correctness, performance and programmability.</p>","PeriodicalId":501169,"journal":{"name":"ACM SIGMOD Record","volume":"250 S1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138510382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ACM SIGMOD Record | Pub Date: 2023-06-08 | DOI: https://dl.acm.org/doi/10.1145/3604437.3604442
Per Fuchs, Domagoj Margan, Jana Giceva
{"title":"Sortledton: a Universal Graph Data Structure","authors":"Per Fuchs, Domagoj Margan, Jana Giceva","doi":"https://dl.acm.org/doi/10.1145/3604437.3604442","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3604437.3604442","url":null,"abstract":"<p>Despite the wide adoption of graph processing across many different application domains, there is no underlying data structure that can serve a variety of graph workloads (analytics, traversals, and pattern matching) on dynamic graphs with single edge updates.</p>","PeriodicalId":501169,"journal":{"name":"ACM SIGMOD Record","volume":"250 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138510384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ACM SIGMOD Record | Pub Date: 2023-06-08 | DOI: https://dl.acm.org/doi/10.1145/3604437.3604441
Angela Bonifati
{"title":"Technical Perspective: Sortledton: a Universal Graph Data Structure","authors":"Angela Bonifati","doi":"https://dl.acm.org/doi/10.1145/3604437.3604441","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3604437.3604441","url":null,"abstract":"<p>Graph processing is becoming ubiquitous due to the proliferation of interconnected data in several domains, including life sciences, social networks, cybersecurity, finance and logistics, to name a few. In parallel with the growth of the underlying graph data sources, a plethora of graph workloads have appeared, ranging from graph analytics to graph traversals and graph pattern matching. Graph systems executing both complex and simple graph workloads need to leverage adequate data structures for efficiently processing heterogeneous graph data. While the underlying graph data structures have been extensively studied for the static case, they are less understood for the dynamic case, with the data undergoing several updates per second. Moreover, the existing solutions suffer from a lack of generality, as they focus on one specific requirement and workload type at a time. Designing a universal data structure that adapts to several kinds of graph workloads in a dynamic setting and achieves significant efficiency on all of them is far from trivial.</p>","PeriodicalId":501169,"journal":{"name":"ACM SIGMOD Record","volume":"249 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138510385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ACM SIGMOD Record | Pub Date: 2023-06-08 | DOI: https://dl.acm.org/doi/10.1145/3604437.3604452
Angela Bonifati, Stefania Dumbrava, George Fletcher, Jan Hidders, Matthias Hofer, Wim Martens, Filip Murlak, Joshua Shinavier, Slawek Staworko, Dominik Tomaszuk
{"title":"Threshold Queries","authors":"Angela Bonifati, Stefania Dumbrava, George Fletcher, Jan Hidders, Matthias Hofer, Wim Martens, Filip Murlak, Joshua Shinavier, Slawek Staworko, Dominik Tomaszuk","doi":"https://dl.acm.org/doi/10.1145/3604437.3604452","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3604437.3604452","url":null,"abstract":"<p>Threshold queries are an important class of queries that only require computing or counting answers up to a specified threshold value. To the best of our knowledge, threshold queries have been largely disregarded in the research literature, which is surprising considering how common they are in practice. We explore how such queries appear in practice and present a method that can be used to significantly improve the asymptotic bounds of their state-of-the-art evaluation algorithms. Our experimental evaluation of these methods shows order-of-magnitude performance improvements.</p>","PeriodicalId":501169,"journal":{"name":"ACM SIGMOD Record","volume":"252 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138510374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
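To make the notion of "counting answers up to a specified threshold" concrete, here is a minimal sketch of the semantics only. It is not the evaluation method of the paper: it simply assumes answers can be produced lazily and stops enumerating once the threshold is reached. The helper names `join_pairs` and `count_up_to` and the toy relations are hypothetical.

```python
from itertools import islice


def join_pairs(r, s):
    """Lazily enumerate the join of r(a, b) and s(b, c) on b."""
    index = {}
    for b, c in s:
        index.setdefault(b, []).append(c)
    for a, b in r:
        for c in index.get(b, ()):
            yield (a, b, c)


def count_up_to(answers, threshold):
    """Count answers, but never consume more than `threshold` of them."""
    return sum(1 for _ in islice(answers, threshold))


r = [(1, 10), (2, 10), (3, 20)]
s = [(10, "x"), (10, "y"), (20, "z")]

# The full join has 5 answers; a threshold of 3 stops after 3 of them.
print(count_up_to(join_pairs(r, s), 3))
print(count_up_to(join_pairs(r, s), 10))
```

Because `join_pairs` is a generator, the first call stops producing join results as soon as three answers have been seen, which is the intuition behind why a threshold can make evaluation cheaper than computing the full answer set.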
ACM SIGMOD Record | Pub Date: 2023-06-08 | DOI: https://dl.acm.org/doi/10.1145/3604437.3604450
Qichen Wang, Ke Yi
{"title":"Conjunctive Queries with Comparisons","authors":"Qichen Wang, Ke Yi","doi":"https://dl.acm.org/doi/10.1145/3604437.3604450","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3604437.3604450","url":null,"abstract":"<p>Conjunctive queries with predicates in the form of comparisons that span multiple relations have regained interest recently, due to their relevance in OLAP queries, spatiotemporal databases, and machine learning over relational data. The standard technique, predicate pushdown, has limited efficacy on such comparisons. A technique by Willard can be used to process short comparisons that are adjacent in the join tree in time linear in the input size plus output size. In this paper, we describe a new algorithm for evaluating conjunctive queries with both short and long comparisons, and identify an acyclic condition under which linear time can be achieved. We have also implemented the new algorithm on top of Spark, and our experimental results demonstrate order-of-magnitude speedups over SparkSQL on a variety of graph patterns and analytical queries.</p>","PeriodicalId":501169,"journal":{"name":"ACM SIGMOD Record","volume":"251 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138510376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}