{"title":"Technical Perspective: (Pre-) Semirings Come to the Recursion Party","authors":"A. Rudra","doi":"10.1145/3604437.3604453","DOIUrl":"https://doi.org/10.1145/3604437.3604453","url":null,"abstract":"(This article is an imagined conversation with my U. at Buffalo UG algorithms class students.)","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124937015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Technical Perspective: When is it safe to run a transactional workload under Read Committed?","authors":"A. Fekete","doi":"10.1145/3604437.3604445","DOIUrl":"https://doi.org/10.1145/3604437.3604445","url":null,"abstract":"A data management platform provides many capabilities to assist the data owner, application coder, or end-user. For example, it should support an expressive query language, schema definition, and sophisticated access control. Another way many platforms add value is through a transaction mechanism, which allows the application programmer to indicate that a stretch of code, including multiple accesses to data, represents a single real-world activity and so all these steps should happen as if a single step, despite really being interleaved with other programs, or perhaps cancelled after partial execution. If the platform perfectly hides interleaving of different activities, the execution is called serializable, and this is a great aid to protecting data quality. Any integrity constraint over the data (whether explicitly declared in schema or not) which is preserved by each transaction running alone, is also valid at the end of any serializable execution of several transactions.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128985257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Optimal Algorithm for Partial Order Multiway Search","authors":"Shangqi Lu, W. Martens, Matthias Niewerth, Yufei Tao","doi":"10.1145/3604437.3604456","DOIUrl":"https://doi.org/10.1145/3604437.3604456","url":null,"abstract":"Partial order multiway search (POMS) is an important problem that finds use in crowdsourcing, distributed file systems, software testing, etc. In this problem, a game is played between an algorithm A and an oracle, based on a directed acyclic graph G known to both parties. First, the oracle picks a vertex t in G called the target; then, A aims to figure out which vertex is t by probing reachability. In each probe, A selects a set Q of vertices in G whose size is bounded by a pre-agreed value k, and the oracle then reveals, for each vertex q ∈ Q, whether q can reach the target in G. The objective of A is to minimize the number of probes. This article presents an algorithm to solve POMS in O(log_{1+k} n + (d/k) log_{1+d} n) probes, where n is the number of vertices in G, and d is the largest out-degree of the vertices in G. The probing complexity is asymptotically optimal.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115197016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Technical Perspective: Sortledton: a Universal Graph Data Structure","authors":"A. Bonifati","doi":"10.1145/3604437.3604441","DOIUrl":"https://doi.org/10.1145/3604437.3604441","url":null,"abstract":"Graph processing is becoming ubiquitous due to the proliferation of interconnected data in several domains, including life sciences, social networks, cybersecurity, finance and logistics, to name a few. In parallel with the growth of the underlying graph data sources, a plethora of graph workloads have appeared, ranging from graph analytics to graph traversals and graph pattern matching. Graph systems executing both complex and simple graph workloads need to leverage adequate data structures for efficiently processing heterogeneous graph data. While the underlying graph data structures have been extensively studied for the static case, they are less understood for the dynamic case, with the data undergoing several updates per second. Moreover, the existing solutions suffer from a lack of generality, as they focus on one specific requirement and workload type at a time. Designing a universal data structure that adapts to several kinds of graph workloads in a dynamic setting and achieves significant efficiency on all of them is far from trivial.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125067800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Technical Perspective for Sherman: A Write-Optimized Distributed B+Tree Index on Disaggregated Memory","authors":"Tim Kraska","doi":"10.1145/3604437.3604447","DOIUrl":"https://doi.org/10.1145/3604437.3604447","url":null,"abstract":"Separation of compute and storage has become the de facto standard for cloud database systems. First proposed in 2007 for database systems [2], it is now widely adopted by all major cloud providers such as Amazon Redshift, Google BigQuery, and Snowflake. Separation of compute and storage adds enormous value for the customer. Users can scale storage independently of compute, which enables them to only pay for what they really use. Consider a scenario in which data grows linearly over time, but most queries only access the last month of data, which remains relatively stable. Without the separation of compute and storage, the user would gradually be forced to significantly increase the database cluster capacity. In contrast, modern cloud database systems allow scaling the storage separately from compute; the compute cluster stays the same over time, whereas the data is stored on cheap cloud storage services, like Amazon S3.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129742433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Technical Perspective: Query Answers - Fewer is Faster","authors":"L. Libkin","doi":"10.1145/3604437.3604451","DOIUrl":"https://doi.org/10.1145/3604437.3604451","url":null,"abstract":"We often write queries using LIMIT k, indicating that only k answers are to be returned. This feature is present in most query languages, for different data models: SQL, SPARQL, Cypher etc. For example, in a repository of about 250M SPARQL queries, about 15M queries are of this form. Not surprisingly of course, the database research community studied such queries extensively. The dominant setting is this: there is an ordering on tuples that can be returned by a query. Then the answer is limited to the first k tuples in this ordering.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134313150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conjunctive Queries with Comparisons","authors":"Qichen Wang, K. Yi","doi":"10.1145/3604437.3604450","DOIUrl":"https://doi.org/10.1145/3604437.3604450","url":null,"abstract":"Conjunctive queries with predicates in the form of comparisons that span multiple relations have regained interest recently, due to their relevance in OLAP queries, spatiotemporal databases, and machine learning over relational data. The standard technique, predicate pushdown, has limited efficacy on such comparisons. A technique by Willard can be used to process short comparisons that are adjacent in the join tree in time linear in the input size plus output size. In this paper, we describe a new algorithm for evaluating conjunctive queries with both short and long comparisons, and identify an acyclic condition under which linear time can be achieved. We have also implemented the new algorithm on top of Spark, and our experimental results demonstrate order-of-magnitude speedups over SparkSQL on a variety of graph patterns and analytical queries.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129836803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficiently Making Cross-Engine Transactions Consistent","authors":"Jianqiu Zhang, Kaisong Huang, Tianzheng Wang, King Lv","doi":"10.1145/3604437.3604444","DOIUrl":"https://doi.org/10.1145/3604437.3604444","url":null,"abstract":"Database systems are becoming increasingly multi-engine. In particular, a main-memory engine may coexist with a traditional storage-centric engine in a system to support various applications. It is desirable to allow applications to access data in both engines using cross-engine transactions. But existing systems are either only designed for single-engine accesses, or impose many restrictions by limiting cross-engine transactions to certain isolation levels and operations. The result is inadequate cross-engine support in terms of correctness, performance and programmability.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116942549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Technical Perspective: Accurate Summary-based Cardinality Estimation Through the Lens of Cardinality Estimation Graphs","authors":"Dan Suciu","doi":"10.1145/3604437.3604457","DOIUrl":"https://doi.org/10.1145/3604437.3604457","url":null,"abstract":"Query engines are really good at choosing an efficient query plan. Users don't need to worry about how they write their query, since the optimizer makes all the right choices for executing the query, while taking into account all aspects of data, such as its size, the characteristics of the storage device, the distribution pattern, the availability of indexes, and so on. The query optimizer always makes the best choice, no matter how complex the query is, or how contrived it was written. Or, this is what we expect today from a modern query optimizer. Unfortunately, reality is not as nice.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"130 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133538883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Technical Perspective: Optimal Algorithms for Multiway Search on Partial Orders","authors":"Rajesh Jayaram","doi":"10.1145/3604437.3604455","DOIUrl":"https://doi.org/10.1145/3604437.3604455","url":null,"abstract":"Given a list of comparable items A = {a1, . . . , an} sorted so that a1 < a2 < . . . < an, a canonical problem is locating a target item q within A if it exists. The canonical algorithm for this problem, of course, is binary search, which locates q using at most O(log n) comparisons between q and elements of A. Binary search is an indispensable tool for totally ordered datasets. However, many naturally occurring datasets are only partially ordered (posets), meaning that not all pairs of elements are comparable. Every such poset can be expressed as a directed acyclic graph (DAG), with edges (x,y) representing the relation x < y.","PeriodicalId":346332,"journal":{"name":"ACM SIGMOD Record","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125088601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}