Proceedings of the 2016 International Conference on Management of Data: Latest Papers (Page 2)

Energy Elasticity on Heterogeneous Hardware using Adaptive Resource Reconfiguration LIVE

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899390

A. Ungethüm, T. Kissinger, Willi-Wolfram Mentzel, Eric Mier, Dirk Habich, Wolfgang Lehner

Citations: 8

Rheem: Enabling Multi-Platform Task Execution

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899414

D. Agrawal, M. Ba, Laure Berti-Équille, S. Chawla, A. Elmagarmid, Hossam M. Hammady, Yasser Idris, Zoi Kaoudi, Zuhair Khayyat, Sebastian Kruse, M. Ouzzani, Paolo Papotti, Jorge-Arnulfo Quiané-Ruiz, N. Tang, Mohammed J. Zaki

Citations: 43

Exploring Visualization of Data Transforms

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2914837

Larry Xu

Citations: 0

ERMIA: Fast Memory-Optimized Database System for Heterogeneous Workloads

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2882905

Kangnyeon Kim, Tianzheng Wang, Ryan Johnson, I. Pandis

Abstract: Large main memories and massively parallel processors have triggered not only a resurgence of high-performance transaction processing systems optimized for large main-memory and massively parallel processors, but also an increasing demand for processing heterogeneous workloads that include read-mostly transactions. Many modern transaction processing systems adopt a lightweight optimistic concurrency control (OCC) scheme to leverage its low overhead in low contention workloads. However, we observe that the lightweight OCC is not suitable for heterogeneous workloads, causing significant starvation of read-mostly transactions and overall performance degradation. In this paper, we present ERMIA, a memory-optimized database system built from scratch to cater to the need of handling heterogeneous workloads. ERMIA adopts snapshot isolation concurrency control to coordinate heterogeneous transactions and provides serializability when desired. Its physical layer supports the concurrency control schemes in a scalable way. Experimental results show that ERMIA delivers comparable or superior performance and near-linear scalability in a variety of workloads, compared to a recent lightweight OCC-based system. At the same time, ERMIA maintains high throughput on read-mostly transactions when the performance of the OCC-based system drops by orders of magnitude.
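
The starvation effect mentioned in the ERMIA abstract can be illustrated with a toy model. The sketch below is not ERMIA's implementation; it is a textbook-style simplification in which every name (`VersionedStore`, `run_reader`, `demo`) is hypothetical. It contrasts a read-only transaction under commit-time OCC validation with the same transaction reading from a snapshot taken at its begin timestamp, while a writer overwrites one of the keys mid-transaction.

```python
class VersionedStore:
    """Toy multi-version store; every write commits at once with a new timestamp."""

    def __init__(self, initial):
        self.clock = 0
        # key -> list of (commit_ts, value), in ascending commit order
        self.versions = {k: [(0, v)] for k, v in initial.items()}

    def write(self, key, value):
        self.clock += 1
        self.versions[key].append((self.clock, value))

    def latest(self, key):
        return self.versions[key][-1]

    def as_of(self, key, ts):
        # Newest version whose commit timestamp is not after ts.
        return [ver for ver in self.versions[key] if ver[0] <= ts][-1]


def run_reader(store, keys, mode, interfere):
    """Run one read-only transaction; `interfere` fires after the first read.

    Returns True if the transaction commits, False if it must abort.
    """
    begin_ts = store.clock
    seen = {}
    for i, key in enumerate(keys):
        if mode == "occ":
            seen[key] = store.latest(key)
        else:  # snapshot read as of the begin timestamp
            seen[key] = store.as_of(key, begin_ts)
        if i == 0:
            interfere()
    if mode == "occ":
        # Commit-time validation: abort if any version read was overwritten.
        return all(store.latest(k)[0] == ts for k, (ts, _) in seen.items())
    return True  # snapshot reads need no validation in this toy model


def demo(mode):
    store = VersionedStore({"a": 1, "b": 2})
    return run_reader(store, ["a", "b"], mode, lambda: store.write("a", 10))


occ_committed = demo("occ")  # reader aborts: key "a" changed after it was read
si_committed = demo("si")    # reader commits on its begin-time snapshot
```

In this model the OCC reader aborts whenever any key it read is overwritten before commit, which becomes more likely the longer the transaction runs, whereas the snapshot reader is unaffected by concurrent writers. Serializability on top of snapshot isolation, which the abstract says ERMIA provides when desired, is not modeled here.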

Citations: 112

CLAMS: Bringing Quality to Data Lakes

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899391

Mina H. Farid, Alexandra Roatis, I. Ilyas, H. Hoffmann, Xu Chu

Abstract: With the increasing incentive of enterprises to ingest as much data as they can in what is commonly referred to as "data lakes", and with the recent development of multiple technologies to support this "load-first" paradigm, the new environment presents serious data management challenges. Among them, the assessment of data quality and cleaning large volumes of heterogeneous data sources become essential tasks in unveiling the value of big data. The coveted use of unstructured and semi-structured data in large volumes makes current data cleaning tools (primarily designed for relational data) not directly adoptable. We present CLAMS, a system to discover and enforce expressive integrity constraints from large amounts of lake data with very limited schema information (e.g., represented as RDF triples). This demonstration shows how CLAMS is able to discover the constraints and the schemas they are defined on simultaneously. CLAMS also introduces a scale-out solution to efficiently detect errors in the raw data. CLAMS interacts with human experts to both validate the discovered constraints and to suggest data repairs. CLAMS has been deployed in a real large-scale enterprise data lake and was experimented with a real data set of 1.2 billion triples. It has been able to spot multiple obscure data inconsistencies and errors early in the data processing stack, providing huge value to the enterprise.

Citations: 72

Functional Dependencies for Graphs

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2915232

W. Fan, Yinghui Wu, Jingbo Xu

Citations: 101

Making the Case for Query-by-Voice with EchoQuery

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899394

Gabriel Lyons, Vinh Q. Tran, Carsten Binnig, U. Çetintemel, Tim Kraska

Abstract: Recent advances in automatic speech recognition and natural language processing have led to a new generation of robust voice-based interfaces. Yet, there is very little work on using voice-based interfaces to query database systems. In fact, one might even wonder who in her right mind would want to query a database system using voice commands! With this demonstration, we make the case for querying database systems using a voice-based interface, a new querying and interaction paradigm we call Query-by-Voice (QbV). We will demonstrate the practicality and utility of QbV for relational DBMSs using a proof-of-concept system called EchoQuery. To achieve a smooth and intuitive interaction, the query interface of EchoQuery is inspired by casual human-to-human conversations. Our demo will show that voice-based interfaces present an intuitive means of querying and consuming data in a database. It will also highlight the unique advantages of QbV over the more traditional approaches, text-based or visual interfaces, for applications where context switching is too expensive, too risky or even not possible at all.

Citations: 35

The CloudMdsQL Multistore System

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899400

B. Kolev, Carlyna Bondiombouy, P. Valduriez, R. Jiménez-Peris, Raquel Pau, José Pereira

Abstract: The blooming of different cloud data management infrastructures has turned multistore systems into a major topic in today's cloud landscape. In this demonstration, we present a Cloud Multidatastore Query Language (CloudMdsQL), and its query engine. CloudMdsQL is a functional SQL-like language, capable of querying multiple heterogeneous data stores (relational and NoSQL) within a single query that may contain embedded invocations to each data store's native query interface. The major innovation is that a CloudMdsQL query can exploit the full power of local data stores, by simply allowing some local data store native queries (e.g. a breadth-first search query against a graph database) to be called as functions, and at the same time be optimized. Within our demonstration, we focus on two use cases, each involving four diverse data stores (graph, document, relational, and key-value) with their corresponding CloudMdsQL queries. The query execution flows are visualized by an embedded real-time monitoring subsystem. The users can also try out different ad-hoc queries, not necessarily in the context of the use cases.

Citations: 47

ActiveClean: An Interactive Data Cleaning Framework For Modern Machine Learning

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899409

S. Krishnan, M. Franklin, Ken Goldberg, Jiannan Wang, Eugene Wu

Citations: 51

T-Part: Partitioning of Transactions for Forward-Pushing in Deterministic Database Systems

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2915227

Shan-Hung Wu, Tsai-Yu Feng, Meng-Kai Liao, Shao-Kan Pi, Yu-Shan Lin

Abstract: Deterministic database systems have been shown to yield high throughput on a cluster of commodity machines while ensuring the strong consistency between replicas, provided that the data can be well-partitioned on these machines. However, data partitioning can be suboptimal for many reasons in real-world applications. In this paper, we present T-Part, a transaction execution engine that partitions transactions in a deterministic database system to deal with the unforeseeable workloads or workloads whose data are hard to partition. By modeling the dependency between transactions as a T-graph and continuously partitioning that graph, T-Part allows each transaction to know which later transactions on other machines will read its writes so that it can push forward the writes to those later transactions immediately after committing. This forward-pushing reduces the chance that the later transactions stall due to the unavailability of remote data. We implement a prototype for T-Part. Extensive experiments are conducted and the results demonstrate the effectiveness of T-Part.
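
The forward-pushing idea in the T-Part abstract can be sketched in a few lines. The code below is an illustrative simplification, not the authors' algorithm: it assumes the transaction order and machine assignment are already fixed (the graph partitioning step is not modeled), and the names `Txn` and `forward_push_plan` are hypothetical. For each transaction it computes which later transactions on other machines read a key that it was the most recent writer of.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    tid: int        # position in the predetermined total order
    machine: str    # machine the transaction is assigned to (assumed given)
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def forward_push_plan(txns):
    """Map each transaction id to a list of (reader_tid, reader_machine, key).

    An entry means: after committing, push the value written for `key`
    to the later transaction `reader_tid` running on `reader_machine`.
    """
    last_writer = {}                      # key -> most recent writer so far
    plan = {t.tid: [] for t in txns}
    for t in sorted(txns, key=lambda t: t.tid):
        for key in sorted(t.reads):
            w = last_writer.get(key)
            # Only remote readers need a push; local reads are already available.
            if w is not None and w.machine != t.machine:
                plan[w.tid].append((t.tid, t.machine, key))
        for key in t.writes:
            last_writer[key] = t
    return plan


txns = [
    Txn(1, "A", reads=set(), writes={"x"}),
    Txn(2, "B", reads={"x"}, writes={"y"}),
    Txn(3, "A", reads={"x", "y"}, writes=set()),
]
plan = forward_push_plan(txns)
```

Here transaction 1 pushes `x` to transaction 2 on machine B, and transaction 2 pushes `y` to transaction 3 on machine A; transaction 3 reads `x` locally, so no push is planned for it. Knowing the readers in advance is possible because a deterministic system fixes the transaction order before execution.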

Citations: 11