Proceedings of the 2016 International Conference on Management of Data最新文献_第6页

SourceSight: Enabling Effective Source Selection SourceSight:启用有效选源功能

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899403

Theodoros Rekatsinas, A. Deshpande, X. Dong, L. Getoor, D. Srivastava

引用次数: 17

PerNav: A Route Summarization Framework for Personalized Navigation PerNav:一个用于个性化导航的路线汇总框架

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899384

Yaguang Li, Han Su, Ugur Demiryurek, Bolong Zheng, Kai Zeng, C. Shahabi

引用次数: 5

What Makes a Good Physical plan?: Experiencing Hardware-Conscious Query Optimization with Candomblé 什么是好的健身计划?:使用candomblaise体验基于硬件的查询优化

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899410

H. Pirk, Oscar Moll, S. Madden

引用次数: 5

Similarity Join over Array Data 数组数据上的相似性连接

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2915247

Weijie Zhao, Florin Rusu, Bin Dong, Kesheng Wu

引用次数: 27

Local Similarity Search for Unstructured Text 非结构化文本的局部相似度搜索

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2915211

Pei Wang, Chuan Xiao, Jianbin Qin, Wei Wang, Xiaoyan Zhang, Y. Ishikawa

{"title":"Local Similarity Search for Unstructured Text","authors":"Pei Wang, Chuan Xiao, Jianbin Qin, Wei Wang, Xiaoyan Zhang, Y. Ishikawa","doi":"10.1145/2882903.2915211","DOIUrl":"https://doi.org/10.1145/2882903.2915211","url":null,"abstract":"With the growing popularity of electronic documents, replication can occur for many reasons. People may copy text segments from various sources and make modifications. In this paper, we study the problem of local similarity search to find partially replicated text. Unlike existing studies on similarity search which find entirely duplicated documents, our target is to identify documents that approximately share a pair of sliding windows which differ by no more than τ tokens. Our problem is technically challenging because for sliding windows the tokens to be indexed are less selective than entire documents, rendering set similarity join-based algorithms less efficient. Our proposed method is based on enumerating token combinations to obtain signatures with high selectivity. In order to strike a balance between signature and candidate generation, we partition the token universe and for different partitions we generate combinations composed of different numbers of tokens. A cost-aware algorithm is devised to find a good partitioning of the token universe. We also propose to leverage the overlap between adjacent windows to share computation and thus speed up query processing. In addition, we develop the techniques to support the large thresholds. Experiments on real datasets demonstrate the efficiency of our method against alternative solutions.","PeriodicalId":20483,"journal":{"name":"Proceedings of the 2016 International Conference on Management of Data","volume":"26 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76357666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 25

Wander Join: Online Aggregation for Joins Wander Join:连接的在线聚合

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2899413

Feifei Li, Bin Wu, K. Yi, Zhuoyue Zhao

引用次数: 9

Towards a Non-2PC Transaction Management in Distributed Database Systems 面向分布式数据库系统的非2pc事务管理

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2882923

Qian Lin, Pengfei Chang, Gang Chen, B. Ooi, K. Tan, Zhengkui Wang

{"title":"Towards a Non-2PC Transaction Management in Distributed Database Systems","authors":"Qian Lin, Pengfei Chang, Gang Chen, B. Ooi, K. Tan, Zhengkui Wang","doi":"10.1145/2882903.2882923","DOIUrl":"https://doi.org/10.1145/2882903.2882923","url":null,"abstract":"Shared-nothing architecture has been widely used in distributed databases to achieve good scalability. While it offers superior performance for local transactions, the overhead of processing distributed transactions can degrade the system performance significantly. The key contributor to the degradation is the expensive two-phase commit (2PC) protocol used to ensure atomic commitment of distributed transactions. In this paper, we propose a transaction management scheme called LEAP to avoid the 2PC protocol within distributed transaction processing. Instead of processing a distributed transaction across multiple nodes, LEAP converts the distributed transaction into a local transaction. This benefits the processing locality and facilitates adaptive data repartitioning when there is a change in data access pattern. Based on LEAP, we develop an online transaction processing (OLTP) system, L-Store, and compare it with the state-of-the-art distributed in-memory OLTP system, H-Store, which relies on the 2PC protocol for distributed transaction processing, and H^L-Store, a H-Store that has been modified to make use of LEAP. Results of an extensive experimental evaluation show that our LEAP-based engines are superior over H-Store by a wide margin, especially for workloads that exhibit locality-based data accesses.","PeriodicalId":20483,"journal":{"name":"Proceedings of the 2016 International Conference on Management of Data","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86541578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 53

LazyLSH: Approximate Nearest Neighbor Search for Multiple Distance Functions with a Single Index 用一个索引对多个距离函数进行近似最近邻搜索

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2882930

Yuxin Zheng, Qi Guo, A. Tung, Sai Wu

{"title":"LazyLSH: Approximate Nearest Neighbor Search for Multiple Distance Functions with a Single Index","authors":"Yuxin Zheng, Qi Guo, A. Tung, Sai Wu","doi":"10.1145/2882903.2882930","DOIUrl":"https://doi.org/10.1145/2882903.2882930","url":null,"abstract":"Due to the \"curse of dimensionality\" problem, it is very expensive to process the nearest neighbor (NN) query in high-dimensional spaces; and hence, approximate approaches, such as Locality-Sensitive Hashing (LSH), are widely used for their theoretical guarantees and empirical performance. Current LSH-based approaches target at the L1 and L2 spaces, while as shown in previous work, the fractional distance metrics (Lp metrics with 0 < p < 1) can provide more insightful results than the usual L1 and L2 metrics for data mining and multimedia applications. However, none of the existing work can support multiple fractional distance metrics using one index. In this paper, we propose LazyLSH that answers approximate nearest neighbor queries for multiple Lp metrics with theoretical guarantees. Different from previous LSH approaches which need to build one dedicated index for every query space, LazyLSH uses a single base index to support the computations in multiple Lp spaces, significantly reducing the maintenance overhead. Extensive experiments show that LazyLSH provides more accurate results for approximate kNN search under fractional distance metrics.","PeriodicalId":20483,"journal":{"name":"Proceedings of the 2016 International Conference on Management of Data","volume":"60 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90201548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 63

Research Contribution as a Measure of Influence 研究贡献作为影响的衡量标准

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2914834

Lais M. A. Rocha, Mirella M. Moro

引用次数: 4

Scaling Multicore Databases via Constrained Parallel Execution 通过约束并行执行扩展多核数据库

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-26 DOI: 10.1145/2882903.2882934

Zhaoguo Wang, Shuai Mu, Yang Cui, Han Yi, Haibo Chen, Jinyang Li

{"title":"Scaling Multicore Databases via Constrained Parallel Execution","authors":"Zhaoguo Wang, Shuai Mu, Yang Cui, Han Yi, Haibo Chen, Jinyang Li","doi":"10.1145/2882903.2882934","DOIUrl":"https://doi.org/10.1145/2882903.2882934","url":null,"abstract":"Multicore in-memory databases often rely on traditional con- currency control schemes such as two-phase-locking (2PL) or optimistic concurrency control (OCC). Unfortunately, when the workload exhibits a non-trivial amount of contention, both 2PL and OCC sacrifice much parallel execution op- portunity. In this paper, we describe a new concurrency control scheme, interleaving constrained concurrency con- trol (IC3), which provides serializability while allowing for parallel execution of certain conflicting transactions. IC3 combines the static analysis of the transaction workload with runtime techniques that track and enforce dependencies among concurrent transactions. The use of static analysis simplifies IC3's runtime design, allowing it to scale to many cores. Evaluations on a 64-core machine using the TPC- C benchmark show that IC3 outperforms traditional con- currency control schemes under contention. It achieves the throughput of 434K transactions/sec on the TPC-C bench- mark configured with only one warehouse. It also scales better than several recent concurrent control schemes that also target contended workloads.","PeriodicalId":20483,"journal":{"name":"Proceedings of the 2016 International Conference on Management of Data","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79574668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 60