Proceedings of the 2018 International Conference on Management of Data: Latest Articles

Modern Recommender Systems: from Computing Matrices to Thinking with Neurons

Proceedings of the 2018 International Conference on Management of Data Pub Date : 2018-05-27 DOI: 10.1145/3183713.3197389

G. Koutrika

Citations: 13

Session details: Research 1: Data Integration & Cleaning

Proceedings of the 2018 International Conference on Management of Data Pub Date : 2018-05-27 DOI: 10.1145/3258004

E. Rahm

Citations: 0

SSD as SQLite Engine

Proceedings of the 2018 International Conference on Management of Data Pub Date : 2018-05-27 DOI: 10.1145/3183713.3183720

Soyee Choi

Citations: 2

Carousel: Low-Latency Transaction Processing for Globally-Distributed Data

Proceedings of the 2018 International Conference on Management of Data Pub Date : 2018-05-27 DOI: 10.1145/3183713.3196912

Xinan Yan, Linguan Yang, Hongbo Zhang, X. Lin, B. Wong, K. Salem, Tim Brecht

Abstract: The trend towards global applications and services has created an increasing demand for transaction processing on globally-distributed data. Many database systems, such as Spanner and CockroachDB, support distributed transactions but require a large number of wide-area network roundtrips to commit each transaction and ensure the transaction's state is durably replicated across multiple datacenters. This can significantly increase transaction completion time, resulting in developers replacing database-level transactions with their own error-prone application-level solutions. This paper introduces Carousel, a distributed database system that provides low-latency transaction processing for multi-partition globally-distributed transactions. Carousel shortens transaction processing time by reducing the number of sequential wide-area network round trips required to commit a transaction and replicate its results while maintaining serializability. This is possible in part by using information about a transaction's potential write set to enable transaction processing, including any necessary remote read operations, to overlap with 2PC and state replication. Carousel further reduces transaction completion time by introducing a consensus protocol that can perform state replication in parallel with 2PC. For a multi-partition 2-round Fixed-set Interactive (2FI) transaction, Carousel requires at most two wide-area network roundtrips to commit the transaction when there are no failures, and only one round trip in the common case if local replicas are available.

Citations: 34

Amazon Aurora: On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes

Proceedings of the 2018 International Conference on Management of Data Pub Date : 2018-05-27 DOI: 10.1145/3183713.3196937

Alexandre Verbitski, Anurag Gupta, D. Saha, James Corey, K. Gupta, Murali Brahmadesam, Raman Mittal, S. Krishnamurthy, Sandor Maurice, T. Kharatishvili, Xiaofeng Bao

Citations: 38

Session details: Research 7: Tuning, Monitoring & Query Optimization

Proceedings of the 2018 International Conference on Management of Data Pub Date : 2018-05-27 DOI: 10.1145/3258013

Sudipto Das

Citations: 0

EPUI: Experimental Platform for Urban Informatics

Proceedings of the 2018 International Conference on Management of Data Pub Date : 2018-05-27 DOI: 10.1145/3183713.3193560

Xiaoyu Ge, Panos K. Chrysanthis, K. Pelechrinis, D. Zeinalipour-Yazti

Citations: 4

SQuID: Semantic Similarity-Aware Query Intent Discovery

Proceedings of the 2018 International Conference on Management of Data Pub Date : 2018-05-27 DOI: 10.1145/3183713.3193548

Anna Fariha, Sheikh Muhammad Sarwar, A. Meliou

Abstract: Recent expansion of database technology demands a convenient framework for non-expert users to explore datasets. Several approaches exist to assist these non-expert users where they can express their query intent by providing example tuples for their intended query output. However, these approaches treat the structural similarity among the example tuples as the only factor specifying query intent and ignore the richer context present in the data. In this demo, we present SQuID, a system for Semantic similarity-aware Query Intent Discovery. SQuID takes a few example tuples from the user as input, through a simple interface, and consults the database to discover deeper associations among these examples. These data-driven associations reveal the semantic context of the provided examples, allowing SQuID to infer the user's intended query precisely and effectively. SQuID further explains its inference, by displaying the discovered semantic context to the user, who can then provide feedback and tune the result. We demonstrate how SQuID can capture even esoteric and complex semantic contexts, alleviating the need for constructing complex SQL queries, while not requiring the user to have any schema or query language knowledge.

Citations: 16

Maverick: Discovering Exceptional Facts from Knowledge Graphs

Proceedings of the 2018 International Conference on Management of Data Pub Date : 2018-05-27 DOI: 10.1145/3183713.3183730

Gensheng Zhang, Damian Jimenez, Chengkai Li

Abstract: We present Maverick, a general, extensible framework that discovers exceptional facts about entities in knowledge graphs. To the best of our knowledge, there was no previous study of the problem. We model an exceptional fact about an entity of interest as a context-subspace pair, in which a subspace is a set of attributes and a context is defined by a graph query pattern of which the entity is a match. The entity is exceptional among the entities in the context, with regard to the subspace. The search spaces of both patterns and subspaces are exponentially large. Maverick conducts beam search on the patterns which uses a match-based pattern construction method to evade the evaluation of invalid patterns. It applies two heuristics to select promising patterns to form the beam in each iteration. Maverick traverses and prunes the subspaces organized as a set enumeration tree by exploiting the upper bound properties of exceptionality scoring functions. Results of experiments and user studies using real-world datasets demonstrated substantial performance improvement of the proposed framework over the baselines as well as its effectiveness in discovering exceptional facts.

Citations: 13

Random Sampling over Joins Revisited

Proceedings of the 2018 International Conference on Management of Data Pub Date : 2018-05-27 DOI: 10.1145/3183713.3183739

Zhuoyue Zhao, Robert Christensen, Feifei Li, Xiao Hu, K. Yi

Abstract: Joins are expensive, especially on large data and/or multiple relations. One promising approach in mitigating their high costs is to just return a simple random sample of the full join results, which is sufficient for many tasks. Indeed, in as early as 1999, Chaudhuri et al. posed the problem of sampling over joins as a fundamental challenge in large database systems. They also pointed out a fundamental barrier for this problem, that the sampling operator cannot be pushed through a join, i.e., sample(R ⋈ S) ≠ sample(R) ⋈ sample(S). To overcome this barrier, they used precomputed statistics to guide the sampling process, but only showed how this works for two-relation joins. This paper revisits this classic problem for both acyclic and cyclic multi-way joins. We build upon the idea of Chaudhuri et al., but extend it in several nontrivial directions. First, we propose a general framework for random sampling over multi-way joins, which includes the algorithm of Chaudhuri et al. as a special case. Second, we explore several ways to instantiate this framework, depending on what prior information is available about the underlying data, and offer different tradeoffs between sample generation latency and throughput. We analyze the properties of different instantiations and evaluate them against the baseline methods; the results clearly demonstrate the superiority of our new techniques.

Citations: 93
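
Note on the sampling barrier in the Random Sampling over Joins Revisited abstract: for a two-relation equi-join, one common way to use precomputed frequency statistics is to pick a tuple of R with probability proportional to the number of S tuples it joins with, then pick one of those S tuples uniformly. Every join result is then drawn with the same probability. The Python sketch below illustrates only this general idea and is not the paper's framework. The names `build_index` and `sample_join`, the dict-based tuples, and the in-memory index standing in for "precomputed statistics" are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_index(S, key):
    """Group the tuples of S by join-key value.

    Hypothetical stand-in for precomputed statistics: the length of each
    group is the frequency of that key value in S.
    """
    index = defaultdict(list)
    for s in S:
        index[s[key]].append(s)
    return index


def sample_join(R, S_index, key, k, rng=random):
    """Draw k independent uniform samples (with replacement) from R join S."""
    # Weight each r by the number of S tuples it joins with.
    weights = [len(S_index.get(r[key], ())) for r in R]
    if sum(weights) == 0:
        return []  # the join is empty, so there is nothing to sample
    samples = []
    # P(r) = m(r) / W and P(s | r) = 1 / m(r), hence P(r, s) = 1 / W,
    # where m(r) is the match count of r and W is the join size.
    for r in rng.choices(R, weights=weights, k=k):
        s = rng.choice(S_index[r[key]])
        samples.append((r, s))
    return samples
```

Sampling R and S independently and joining the two samples would, by contrast, often return few or no results and would not be uniform over the join, which is the barrier the abstract refers to.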