SIGMOD Rec. Pub Date: 2017-10-31 DOI: 10.1145/3156655.3156663
M. Winslett, V. Braganholo
{"title":"Ron Fagin Speaks Out on His Trajectory as a Database Theoretician","authors":"M. Winslett, V. Braganholo","doi":"10.1145/3156655.3156663","DOIUrl":"https://doi.org/10.1145/3156655.3156663","url":null,"abstract":"Welcome to ACM SIGMOD Record's series of interviews with distinguished members of the database community. I'm Marianne Winslett, and today we are in Snowbird, Utah, USA, site of the 2014 SIGMOD and PODS conference. I have here with me Ron Fagin, who has spent many years as a researcher at IBM. He is an IBM Fellow. He is a Fellow of ACM, IEEE, and the American Association for the Advancement of Science. He was elected to the National Academy of Engineering and the American Academy of Arts and Sciences. He has won the IEEE McDowell Award (the highest award of the IEEE Computer Society), the IEEE Technical Achievement Award, and the SIGMOD Edgar F. Codd Innovations Award, and he has won a bunch of Best Paper and Test-of-Time Awards. He was named Docteur Honoris Causa by the University of Paris. Most recently, he won the Gödel Prize in 2014. Ron's Ph.D. is in mathematics, from Berkeley","PeriodicalId":21740,"journal":{"name":"SIGMOD Rec.","volume":"20 1","pages":"29-35"},"PeriodicalIF":0.0,"publicationDate":"2017-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76162385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIGMOD Rec. Pub Date: 2017-10-31 DOI: 10.1145/3156655.3156659
Thao N. Pham, Nikos R. Katsipoulakis, Panos K. Chrysanthis, Alexandros Labrinidis
{"title":"Uninterruptible Migration of Continuous Queries without Operator State Migration","authors":"Thao N. Pham, Nikos R. Katsipoulakis, Panos K. Chrysanthis, Alexandros Labrinidis","doi":"10.1145/3156655.3156659","DOIUrl":"https://doi.org/10.1145/3156655.3156659","url":null,"abstract":"The elasticity brought by cloud infrastructure provides a promising solution for a data stream management system to handle its incoming workload, which can be highly variable: the system can scale out when heavily loaded, and scale in otherwise. In such a solution, the efficiency of the mechanism used to migrate a query from one node to another is very important. Generally, a stream application requires real-time outputs for its continuous queries, and downtime is not acceptable. Moreover, the migration should not add considerable processing cost to a node that could have been already overloaded. In this paper, we present our migration protocol, named UniMiCo, which satisfies those requirements. We implemented UniMiCo in a DSMS prototype and experimentally show that the protocol preserves correctness, while introducing no noticeable changes in the response time of the continuous query being migrated.","PeriodicalId":21740,"journal":{"name":"SIGMOD Rec.","volume":"38 1","pages":"17-22"},"PeriodicalIF":0.0,"publicationDate":"2017-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88861834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIGMOD Rec. Pub Date: 2017-10-31 DOI: 10.1145/3156655.3156665
Wolfgang Lehner
{"title":"The Dresden Database Systems Group","authors":"Wolfgang Lehner","doi":"10.1145/3156655.3156665","DOIUrl":"https://doi.org/10.1145/3156655.3156665","url":null,"abstract":"The Dresden Database Systems Group focuses on the advancement of data management techniques from a system level as well as information management perspective. With more than 15 PhD students the research group is involved in a variety of larger research projects ranging from activities to exploit modern hardware for scalable storage engines to advancing statistical methods for large-scale time series management. The group is visible at an international level as well as actively involved in cooperations with national and regional research partners","PeriodicalId":21740,"journal":{"name":"SIGMOD Rec.","volume":"62 1","pages":"36-41"},"PeriodicalIF":0.0,"publicationDate":"2017-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83927293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIGMOD Rec. Pub Date: 2017-09-01 DOI: 10.1145/3137586.3137588
P. Barceló, Andreas Pieris, M. Romero
{"title":"Semantic Optimization in Tractable Classes of Conjunctive Queries","authors":"P. Barceló, Andreas Pieris, M. Romero","doi":"10.1145/3137586.3137588","DOIUrl":"https://doi.org/10.1145/3137586.3137588","url":null,"abstract":"This paper reports on recent advances in semantic query optimization. We focus on the core class of conjunctive queries (CQs). Since CQ evaluation is NP-complete, a long line of research has concentrated on identifying fragments of CQs that can be efficiently evaluated. One of the most general such restrictions corresponds to bounded generalized hypertreewidth, which extends the notion of acyclicity. Here we discuss the problem of reformulating a CQ into one of bounded generalized hypertreewidth. Furthermore, we study whether knowing that such a reformulation exists alleviates the cost of CQ evaluation. In case a CQ cannot be reformulated as one of bounded generalized hypertreewidth, we discuss how it can be approximated in an optimal way. All the above issues are examined both for the constraint-free case, and the case where constraints, in fact, tuple-generating and equality-generating dependencies, are present","PeriodicalId":21740,"journal":{"name":"SIGMOD Rec.","volume":"1 1","pages":"5-17"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88095606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIGMOD Rec. Pub Date: 2017-09-01 DOI: 10.1145/3137586.3137594
M. Winslett, V. Braganholo
{"title":"Beng Chin Ooi Speaks Out on Building a Strong Database Group","authors":"M. Winslett, V. Braganholo","doi":"10.1145/3137586.3137594","DOIUrl":"https://doi.org/10.1145/3137586.3137594","url":null,"abstract":"Welcome to ACM SIGMOD Record's series of interviews with distinguished members of the database community. I'm Marianne Winslett, and today we are at my office at the Advanced Digital Sciences Center in Singapore, an outpost of the University of Illinois. I have here with me today Beng Chin Ooi who is the dean of the school of computing at the National University of Singapore where he's been a professor of computer science for many years. Beng Chin is editor-in-chief for IEEE Transactions on Knowledge and Data Engineering. He is the recipient of the 2009 SIGMOD Contributions Award, and he is an IEEE and ACM Fellow and Fellow of Singapore National Academy of Science. He is the co-founder of two startups and his Ph.D. is from Monash University. So Beng Chin, welcome! (Please note that this interview took place in 2011).","PeriodicalId":21740,"journal":{"name":"SIGMOD Rec.","volume":"278 1","pages":"36-42"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82306972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIGMOD Rec. Pub Date: 2017-09-01 DOI: 10.1145/3137586.3137590
Hari Singh, S. Bawa
{"title":"A Survey of Traditional and MapReduce-Based Spatial Query Processing Approaches","authors":"Hari Singh, S. Bawa","doi":"10.1145/3137586.3137590","DOIUrl":"https://doi.org/10.1145/3137586.3137590","url":null,"abstract":"Various indexing methods of spatial data have come out after rigorous efforts put by many researchers for fast processing of spatial queries. Parallelizing spatial index building and query processing has become very popular for improving efficiency. The MapReduce framework provides a modern way of parallel processing. MapReduce-based works for spatial queries consider the existing traditional spatial indexing for building spatial indexes in parallel. The majority of the spatial indexes implemented in MapReduce use R-Tree and its variants. Therefore, R-Tree and its variant-based traditional spatial indexes are thoroughly surveyed in the paper. The objective is to search for still less explored spatial indexing approaches, having the potential for parallelism in MapReduce. The review work also provides a detailed survey of MapReduce-based spatial query processing approaches - hierarchical indexed and packed key-value storage based spatial dataset. Both approaches use different data partitioning strategies for distributing data among cluster nodes and managing the partitioned dataset through different indexing. Finally, a number of parameters are selected for comparison and analysis of all the existing approaches in the literature.","PeriodicalId":21740,"journal":{"name":"SIGMOD Rec.","volume":"3 1","pages":"18-29"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77465679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIGMOD Rec. Pub Date: 2017-09-01 DOI: 10.1145/3137586.3137592
Yang Chen, Xiaofeng Zhou, Kun Li, Daisy Zhe Wang
{"title":"Archimedes: Efficient Query Processing over Probabilistic Knowledge Bases","authors":"Yang Chen, Xiaofeng Zhou, Kun Li, Daisy Zhe Wang","doi":"10.1145/3137586.3137592","DOIUrl":"https://doi.org/10.1145/3137586.3137592","url":null,"abstract":"We present the ARCHIMEDES system for efficient query processing over probabilistic knowledge bases. We design ARCHIMEDES for knowledge bases containing incomplete and uncertain information due to limitations of information sources and human knowledge. Answering queries over these knowledge bases requires efficient probabilistic inference. In this paper, we describe ARCHIMEDES's efficient knowledge expansion and query-driven inference over UDA-GIST, an in-database unified data- and graph-parallel computation framework. With an efficient inference engine, ARCHIMEDES produces reasonable results for queries over large uncertain knowledge bases. We use the Reverb-Sherlock and Wikilinks knowledge bases to show ARCHIMEDES achieves satisfactory quality with real-time performance.","PeriodicalId":21740,"journal":{"name":"SIGMOD Rec.","volume":"79 1","pages":"30-35"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75820038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIGMOD Rec. Pub Date: 2017-09-01 DOI: 10.1145/3137586.3137596
F. Afrati, J. Hidders, C. Ré, J. Sroka, J. Ullman
{"title":"Report from the third workshop on Algorithms and Systems for MapReduce and Beyond (BeyondMR'16)","authors":"F. Afrati, J. Hidders, C. Ré, J. Sroka, J. Ullman","doi":"10.1145/3137586.3137596","DOIUrl":"https://doi.org/10.1145/3137586.3137596","url":null,"abstract":"This report summarizes the presentations and discussions of the third workshop on Algorithms and Systems for MapReduce and Beyond (BeyondMR'16). The BeyondMR workshop was held in conjunction with the 2016 SIGMOD conference in San Francisco, California, USA on July 1, 2016. The goal of the workshop was to bring together researchers and practitioners to explore algorithms, computational models, architectures, languages and interfaces for systems that need large-scale parallelization and systems designed to support efficient parallelization and fault tolerance. These include specialized programming and data-management systems based on MapReduce and extensions, graph processing systems, data-intensive workflow and dataflow systems. The program featured two very well attended invited talks by Ion Stoica from AMPLab, University of California Berkeley and Carlos Guestrin from the University of Washington.","PeriodicalId":21740,"journal":{"name":"SIGMOD Rec.","volume":"209 1","pages":"43-48"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89107432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIGMOD Rec. Pub Date: 2017-05-12 DOI: 10.1145/3093754.3093762
J. Naughton
{"title":"Technical Perspective: Optimized Wandering for Online Aggregation","authors":"J. Naughton","doi":"10.1145/3093754.3093762","DOIUrl":"https://doi.org/10.1145/3093754.3093762","url":null,"abstract":"There is a rich history in the DBMS research literature involving sampling to estimate the results of queries faster than they can be computed exactly. A particularly interesting example of this is “Online Aggregation” proposed by Hellerstein et al. in 1997 [2]. There the idea is to combine sampling with a creative and intuitive user interface. Briefly, when a query starts to run, Online Aggregation will quickly present an estimate of the result of the query (based on data sampled up to that point) and will also present a confidence interval around the estimate. As query execution continues, the estimate is refined, and the confidence interval shrinks. Hidden in this attractive idea, however, are some difficult challenges. As an example, for queries that involve joins, the sampling process is in general slow, especially if most of the tuples from one relation participating in the join “match” with only a few tuples in the other relation. For 20 years the state of the art approach to this problem has been the “Ripple Join” [1]. The following paper by Li, Wu, Yi, and Zhao presents a highly effective alternative. The main idea behind the wander join is to use the presence of indexes to speed the sampling, effectively making a random walk through the data join graph. The details of doing this efficiently (both computationally and statistically) are not obvious. The authors of this paper use a clever combination of sampling strategies from the statistical literature and an on-line optimization process to order the paths chosen for the random walk, in the process achieving much better computational and statistical properties than the previously state of the art algorithm. The authors convincingly prove this through experimentation with an open-source implementation in the Postgres database management system.","PeriodicalId":21740,"journal":{"name":"SIGMOD Rec.","volume":"17 1","pages":"32"},"PeriodicalIF":0.0,"publicationDate":"2017-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84967150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIGMOD Rec. Pub Date: 2017-05-12 DOI: 10.1145/3093754.3093765
Ahmed Elgohary, Matthias Boehm, P. Haas, Frederick Reiss, B. Reinwald
{"title":"Scaling Machine Learning via Compressed Linear Algebra","authors":"Ahmed Elgohary, Matthias Boehm, P. Haas, Frederick Reiss, B. Reinwald","doi":"10.1145/3093754.3093765","DOIUrl":"https://doi.org/10.1145/3093754.3093765","url":null,"abstract":"Large-scale machine learning (ML) algorithms are often iterative, using repeated read-only data access and I/O-bound matrix-vector multiplications to converge to an optimal model. It is crucial for performance to fit the data into single-node or distributed main memory and enable very fast matrix-vector operations on in-memory data. General-purpose, heavy- and lightweight compression techniques struggle to achieve both good compression ratios and fast decompression speed to enable block-wise uncompressed operations. Compressed linear algebra (CLA) avoids these problems by applying lightweight lossless database compression techniques to matrices and then executing linear algebra operations such as matrix-vector multiplication directly on the compressed representations. The key ingredients are effective column compression schemes, cache-conscious operations, and an efficient sampling-based compression algorithm. Experiments on an initial implementation in SystemML show in-memory operations performance close to the uncompressed case and good compression ratios. We thereby obtain significant end-to-end performance improvements up to 26x or reduced memory requirements.","PeriodicalId":21740,"journal":{"name":"SIGMOD Rec.","volume":"1 1","pages":"42-49"},"PeriodicalIF":0.0,"publicationDate":"2017-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87036495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}