Proceedings of the 2016 International Conference on Management of Data: Latest Articles (Page 5)

An Experimental Comparison of Thirteen Relational Equi-Joins in Main Memory

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2882917

Stefan Schuh, Xiao Chen, J. Dittrich

Abstract: Relational equi-joins are at the heart of almost every query plan. They have been studied, improved, and reexamined on a regular basis since the existence of the database community. In the past four years several new join algorithms have been proposed and experimentally evaluated. Some of those papers contradict each other in their experimental findings. This makes it surprisingly hard to answer a very simple question: what is the fastest join algorithm in 2015? In this paper we will try to develop an answer. We start with an end-to-end black box comparison of the most important methods. Afterwards, we inspect the internals of these algorithms in a white box comparison. We derive improved variants of state-of-the-art join algorithms by applying optimizations like software write-combine buffers, various hash table implementations, as well as NUMA-awareness in terms of data placement and scheduling. We also inspect various radix partitioning strategies. Eventually, we are in the position to perform a comprehensive comparison of thirteen different join algorithms. We factor in scaling effects in terms of size of the input datasets, the number of threads, different page sizes, and data distributions. Furthermore, we analyze the impact of various joins on an (unchanged) TPC-H query. Finally, we conclude with a list of major lessons learned from our study and a guideline for practitioners implementing massive main-memory joins. As is the case with almost all algorithms in databases, we will learn that there is no single best join algorithm. Each algorithm has its strengths and weaknesses and shines in different areas of the parameter space.
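The abstract mentions radix partitioning combined with hash tables. As a rough illustration of the general idea only (not any of the thirteen variants evaluated in the paper), the following single-threaded Python sketch partitions both inputs on the low-order bits of the join key and then hash-joins each pair of partitions. The function names, the tuple layout, and the `bits` parameter are assumptions made for this example; it also assumes integer join keys.

```python
from collections import defaultdict


def radix_partition(rows, key, bits):
    """Split rows into 2**bits partitions using the low-order bits of the join key."""
    mask = (1 << bits) - 1
    partitions = [[] for _ in range(1 << bits)]
    for row in rows:
        partitions[row[key] & mask].append(row)
    return partitions


def radix_hash_join(build, probe, build_key, probe_key, bits=2):
    """Equi-join two lists of tuples: partition both sides, then hash-join each pair."""
    result = []
    build_parts = radix_partition(build, build_key, bits)
    probe_parts = radix_partition(probe, probe_key, bits)
    # Matching keys always land in partitions with the same index,
    # so each pair can be joined independently with a small hash table.
    for b_part, p_part in zip(build_parts, probe_parts):
        table = defaultdict(list)
        for row in b_part:
            table[row[build_key]].append(row)
        for row in p_part:
            for match in table.get(row[probe_key], ()):
                result.append(match + row)
    return result


R = [(1, "a"), (2, "b"), (5, "c")]
S = [(5, "x"), (1, "y"), (7, "z"), (1, "w")]
joined = radix_hash_join(R, S, build_key=0, probe_key=0)
```

The point of partitioning first is that each per-partition hash table is small; in a real main-memory system the partition count would be tuned so that a table fits in cache, which this sketch does not model.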

Citations: 91

Emma in Action: Declarative Dataflows for Scalable Data Analysis

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899396

Alexander B. Alexandrov, Andreas Salzmann, Georgi Krastev, Asterios Katsifodimos, V. Markl

Abstract: Parallel dataflow APIs based on second-order functions were originally seen as a flexible alternative to SQL. Over time, however, their complexity increased due to the number of physical aspects that had to be exposed by the underlying engines in order to facilitate efficient execution. To retain a sufficient level of abstraction and lower the barrier to entry for data scientists, projects like Spark and Flink currently offer domain-specific APIs on top of their parallel collection abstractions. This demonstration highlights the benefits of an alternative design based on deep language embedding. We showcase Emma, a programming language embedded in Scala. Emma promotes parallel collection processing through native constructs like Scala's for-comprehensions, a declarative syntax akin to SQL. In addition, Emma also advocates quasi-quoting the entire data analysis algorithm rather than its individual dataflow expressions. This allows for decomposing the quoted code into (sequential) control flow and (parallel) dataflow fragments, optimizing the dataflows in context, and transparently offloading them to an engine like Spark or Flink. The proposed design promises increased programmer productivity due to avoiding an impedance mismatch, thereby reducing the lag times and cost of data analysis.
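To make the "comprehension syntax akin to SQL" idea concrete, here is a loose analogy in Python rather than Scala. It is not Emma's API and involves no parallel engine; the `Customer` and `Order` types and the sample data are invented for this illustration. The comprehension plays the role of a select-join-where query written in the host language's own syntax.

```python
from collections import namedtuple

# Hypothetical record types, used only for this illustration.
Customer = namedtuple("Customer", "id name")
Order = namedtuple("Order", "customer_id amount")

customers = [Customer(1, "Ada"), Customer(2, "Grace"), Customer(3, "Linus")]
orders = [Order(1, 250), Order(2, 80), Order(2, 120), Order(4, 500)]

# Roughly: SELECT c.name, o.amount FROM customers c, orders o
#          WHERE c.id = o.customer_id AND o.amount > 100
big_spenders = [
    (c.name, o.amount)
    for c in customers
    for o in orders
    if c.id == o.customer_id and o.amount > 100
]
```

Because the comprehension states what to compute rather than how, a system that can inspect it is free to choose a join strategy; plain Python simply evaluates it as nested loops.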

Citations: 10

Main Memory Adaptive Denormalization

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2914835

Zezhou Liu, Stratos Idreos

Citations: 11

Graph Summarization for Geo-correlated Trends Detection in Social Networks

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2914832

Colin Biafore, Faisal Nawab

Citations: 0

Design Tradeoffs of Data Access Methods

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2912569

Manos Athanassoulis, Stratos Idreos

Citations: 27

REACT: Context-Sensitive Recommendations for Data Analysis

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899392

T. Milo, Amit Somech

Citations: 19

RxSpatial: Reactive Spatial Library for Real-Time Location Tracking and Processing

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899411

Youying Shi, Abdeltawab M. Hendawi, H. Fattah, Mohamed H. Ali

Citations: 7

QUEPA: QUerying and Exploring a Polystore by Augmentation

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899393

A. Maccioni, E. Basili, Riccardo Torlone

Abstract: Polystore systems (or simply polystores) have been recently proposed to support a common scenario in which enterprise data are stored in a variety of database technologies relying on different data models and languages. Polystores provide a loosely coupled integration of data sources and support direct access, with the local language, to each specific storage engine to exploit its distinctive features. Given the absence of a global schema, new challenges for accessing data arise in these environments. In fact, it is usually hard to know in advance if a query to a specific data store can be satisfied with data stored elsewhere in the polystore. QUEPA addresses these issues by introducing augmented search and augmented exploration in a polystore, two access methods based on the automatic enrichment of the result of a query over a storage system with related data in the rest of the polystore. These features do not impact the applications running on top of the polystore and are compatible with the most common database systems. QUEPA implements in this way a lightweight mechanism for data integration in the polystore and operates in a plug-and-play mode, thus reducing the need for ad-hoc configurations and for middleware layers involving standard APIs, unified query languages or shared data models. In our demonstration, the audience can try out the augmentation construct by using the native query languages of the database systems available in the polystore.

Citations: 12

Transaction Healing: Scaling Optimistic Concurrency Control on Multicores

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2915202

Yingjun Wu, C. Chan, K. Tan

Abstract: Today's main-memory databases can support very high transaction rates for OLTP applications. However, when a large number of concurrent transactions contend on the same data records, the system performance can deteriorate significantly. This is especially the case when scaling transaction processing with optimistic concurrency control (OCC) on multicore machines. In this paper, we propose a new concurrency-control mechanism, called transaction healing, that exploits program semantics to scale the conventional OCC towards dozens of cores even under highly contended workloads. Transaction healing captures the dependencies across operations within a transaction prior to its execution. Instead of blindly rejecting a transaction once its validation fails, the proposed mechanism judiciously restores any non-serializable operation and heals inconsistent transaction states as well as query results according to the extracted dependencies. Transaction healing can partially update the membership of read/write sets when processing dependent transactions. Such overhead, however, is largely reduced by carefully avoiding false aborts and rearranging validation orders. We implemented the idea of transaction healing in TheDB, a main-memory database prototype that provides full ACID guarantees with a scalable commit protocol. By evaluating TheDB on a 48-core machine with two widely-used benchmarks, we confirm that transaction healing can scale near-linearly, yielding a significantly higher transaction rate than the state-of-the-art OCC implementations.
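For context, the conventional OCC baseline that the abstract contrasts against can be sketched as follows. This is a deliberately simplified, single-threaded Python model with invented class names (`Store`, `Transaction`); it shows only the validate-then-abort behavior under contention and does not implement transaction healing or TheDB's commit protocol.

```python
class Store:
    """Versioned key-value store: key -> (value, version)."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key, (None, 0))


class Transaction:
    """Buffers writes locally and validates its read set at commit time."""

    def __init__(self, store):
        self.store = store
        self.read_set = {}   # key -> version observed at first read
        self.write_set = {}  # key -> value to install on commit

    def read(self, key):
        if key in self.write_set:          # read-your-own-writes
            return self.write_set[key]
        value, version = self.store.read(key)
        self.read_set.setdefault(key, version)
        return value

    def write(self, key, value):
        self.write_set[key] = value

    def commit(self):
        # Validation: abort if any record read has changed since it was read.
        for key, seen in self.read_set.items():
            if self.store.read(key)[1] != seen:
                return False
        # Install buffered writes and bump versions.
        for key, value in self.write_set.items():
            _, version = self.store.read(key)
            self.store.data[key] = (value, version + 1)
        return True


store = Store()
store.data["x"] = (10, 1)
t1, t2 = Transaction(store), Transaction(store)
t1.write("x", t1.read("x") + 1)
t2.write("x", t2.read("x") + 1)
first = t1.commit()    # validation succeeds
second = t2.commit()   # validation fails: "x" changed after t2 read it
```

In this baseline the second transaction is rejected outright and would have to be re-executed from scratch; the abstract's proposal is to repair only the affected operations instead.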

Citations: 43

QFix: Demonstrating Error Diagnosis in Query Histories

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899388

Xiaolan Wang, A. Meliou, Eugene Wu

Abstract: An increasing number of applications in all aspects of society rely on data. Despite the long line of research in data cleaning and repairs, data correctness has been an elusive goal. Errors in the data can be extremely disruptive, and are detrimental to the effectiveness and proper function of data-driven applications. Even when data is cleaned, new errors can be introduced by applications and users who interact with the data. Subsequent valid updates can obscure these errors and propagate them through the dataset causing more discrepancies. Any discovered errors tend to be corrected superficially, on a case-by-case basis, further obscuring the true underlying cause, and making detection of the remaining errors harder. In this demo proposal, we outline the design of QFix, a query-centric framework that derives explanations and repairs for discrepancies in relational data based on potential errors in the queries that operated on the data. This is a marked departure from traditional data-centric techniques that directly fix the data. We then describe how users will use QFix in a demonstration scenario. Participants will be able to select from a number of transactional benchmarks, introduce errors into the queries that are executed, and compare the fixes to the queries proposed by QFix as well as existing alternative algorithms such as decision trees.

Citations: 6