{"title":"Session details: Research 5: Graph Data Management","authors":"S. Bhowmick","doi":"10.1145/3258009","DOIUrl":"https://doi.org/10.1145/3258009","url":null,"abstract":"","PeriodicalId":20430,"journal":{"name":"Proceedings of the 2018 International Conference on Management of Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78300931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FastQRE: Fast Query Reverse Engineering","authors":"D. Kalashnikov, L. Lakshmanan, D. Srivastava","doi":"10.1145/3183713.3183727","DOIUrl":"https://doi.org/10.1145/3183713.3183727","url":null,"abstract":"We study the problem of Query Reverse Engineering (QRE), where given a database and an output table, the task is to find a simple project-join SQL query that generates that table when applied on the database. This problem is known for its efficiency challenge due to mainly two reasons. First, the problem has a very large search space and its various variants are known to be NP-hard. Second, executing even a single candidate SQL query can be very computationally expensive. In this work we propose a novel approach for solving the QRE problem efficiently. Our solution outperforms the existing state of the art by 2-3 orders of magnitude for complex queries, resolving those queries in seconds rather than days, thus making our approach more practical in real-life settings.","PeriodicalId":20430,"journal":{"name":"Proceedings of the 2018 International Conference on Management of Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75454344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DITA: A Distributed In-Memory Trajectory Analytics System","authors":"Zeyuan Shang, Guoliang Li, Z. Bao","doi":"10.1145/3183713.3193553","DOIUrl":"https://doi.org/10.1145/3183713.3193553","url":null,"abstract":"Trajectory analytics can benefit many real-world applications, e.g., frequent trajectory based navigation systems, road planning, car pooling, and transportation optimizations. In this paper, we demonstrate a distributed in-memory trajectory analytics system DITA to support large-scale trajectory data analytics. DITA exhibit three unique features. First, DITA supports threshold-based and KNN-based trajectory similarity search and join operations, as well as range queries (i.e., space and time). Second, DITA is versatile to support most existing similarity functions to cater for different analytic purposes and scenarios. Last, DITA is seamlessly integrated into Spark SQL to support easy-to-use SQL and DataFrame API interfaces. Technically, DITA proposes an effective partitioning method, global index and local index, to address the data locality problem. It also devises cost-based techniques to balance the workload, and develops a filter-verification framework for efficient and scalable search and join.","PeriodicalId":20430,"journal":{"name":"Proceedings of the 2018 International Conference on Management of Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73491422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Research 3: Transactions and Indexing","authors":"Pınar Tözün","doi":"10.1145/3258007","DOIUrl":"https://doi.org/10.1145/3258007","url":null,"abstract":"","PeriodicalId":20430,"journal":{"name":"Proceedings of the 2018 International Conference on Management of Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74467015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Research 2: Usability and Security/Privacy","authors":"Ashwin Machanavajjhala","doi":"10.1145/3258005","DOIUrl":"https://doi.org/10.1145/3258005","url":null,"abstract":"","PeriodicalId":20430,"journal":{"name":"Proceedings of the 2018 International Conference on Management of Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76021761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Qetch: Time Series Querying with Expressive Sketches","authors":"M. Mannino, A. Abouzeid","doi":"10.1145/3183713.3193547","DOIUrl":"https://doi.org/10.1145/3183713.3193547","url":null,"abstract":"Query-by-sketch tools allow users to sketch a pattern to search a time series database for matches. Prior work adopts a bottom-up design approach: the sketching interface is built to reflect the inner workings of popular matching algorithms like Dynamic time warping (DTW) or Euclidean distance (ED). We design Qetch, a query-by-sketch tool for time series data, top-down. Users freely sketch patterns on a scale-less canvas. By studying how humans sketch time series patterns we develop a matching algorithm that accounts for human sketching errors. Qetch's top-down design and novel matching algorithm enable the easy construction of expressive queries that include regular expressions over sketches and queries over multiple time series. Our demonstration showcases Qetch and summarizes results from our evaluation of Qetch's effectiveness.","PeriodicalId":20430,"journal":{"name":"Proceedings of the 2018 International Conference on Management of Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75961849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sameera Ghayyur, Yan Chen, Roberto Yus, Ashwin Machanavajjhala, Michael Hay, G. Miklau, S. Mehrotra
{"title":"IoT-Detective: Analyzing IoT Data Under Differential Privacy","authors":"Sameera Ghayyur, Yan Chen, Roberto Yus, Ashwin Machanavajjhala, Michael Hay, G. Miklau, S. Mehrotra","doi":"10.1145/3183713.3193571","DOIUrl":"https://doi.org/10.1145/3183713.3193571","url":null,"abstract":"Emerging IoT technologies promise to bring revolutionary changes to many domains including health, transportation, and building management. However, continuous monitoring of individuals threatens privacy. The success of IoT thus depends on integrating privacy protections into IoT infrastructures. This demonstration adapts a recently-proposed system, PeGaSus, which releases streaming data under the formal guarantee of differential privacy, with a state-of-the-art IoT testbed (TIPPERS) located at UC Irvine. PeGaSus protects individuals' data by introducing distortion into the output stream. While PeGaSuS has been shown to offer lower numerical error compared to competing methods, assessing the usefulness of the output is application dependent. The goal of the demonstration is to assess the usefulness of private streaming data in a real-world IoT application setting. The demo consists of a game, IoT-Detective, in which participants carry out visual data analysis tasks on private data streams, earning points when they achieve results similar to those on the true data stream. The demo will educate participants about the impact of privacy mechanisms on IoT data while at the same time generating insights into privacy-utility trade-offs in IoT applications.","PeriodicalId":20430,"journal":{"name":"Proceedings of the 2018 International Conference on Management of Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73807204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yongxin Tong, Libin Wang, Zimu Zhou, Lei Chen, Bowen Du, Jieping Ye
{"title":"Dynamic Pricing in Spatial Crowdsourcing: A Matching-Based Approach","authors":"Yongxin Tong, Libin Wang, Zimu Zhou, Lei Chen, Bowen Du, Jieping Ye","doi":"10.1145/3183713.3196929","DOIUrl":"https://doi.org/10.1145/3183713.3196929","url":null,"abstract":"In spatial crowdsourcing, requesters submit their task-related locations and increase the demand of a local area. The platform prices these tasks and assigns spatial workers to serve if the prices are accepted by requesters. There exist mature pricing strategies which specialize in tackling the imbalance between supply and demand in a local market. However, in global optimization, the platform should consider the mobility of workers; that is, any single worker can be the potential supply for several areas, while it can only be the true supply of one area when assigned by the platform. The hardness lies in the uncertainty of the true supply of each area, hence the existing pricing strategies do not work. In the paper, we formally define this Global Dynamic Pricing(GDP) problem in spatial crowdsourcing. And since the objective is concerned with how the platform matches the supply to areas, we let the matching algorithm guide us how to price. We propose a MAtching-based Pricing Strategy (MAPS) with guaranteed bound. Extensive experiments conducted on the synthetic and real datasets demonstrate the effectiveness of MAPS.","PeriodicalId":20430,"journal":{"name":"Proceedings of the 2018 International Conference on Management of Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83957391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GeoFlux: Hands-Off Data Integration Leveraging Join Key Knowledge","authors":"Jie Song, Danai Koutra, Murali Mani, H. Jagadish","doi":"10.1145/3183713.3193546","DOIUrl":"https://doi.org/10.1145/3183713.3193546","url":null,"abstract":"Data integration is frequently required to obtain the full value of data from multiple sources. In spite of extensive research on tools to assist users, data integration remains hard, particularly for users with limited technical proficiency. To address this barrier, we study how much we can do with no user guidance. Our vision is that the user should merely specify two input datasets to be joined and get a meaningful integrated result. It turns out that our vision can be realized if the system can correctly determine the join key, for example based on domain knowledge. We demonstrate this notion by considering a broad domain: socioeconomic data aggregated by geography, a widespread category that accounts for 80% of the data published by government agencies. Intuitively two such datasets can be integrated by joining on the geographic unit column. Although it sounds easy, this task has many challenges: How can we automatically identify columns corresponding to geographic units, other dimension variables and measure variables, respectively? If multiple geographic types exist, which one should be chosen for the join? How to join tables with idiosyncratic schema, different geographic units of aggregation or no aggregation at all? We have developed GeoFlux, a data integration system that handles all these challenges and joins tabular data by automatically aggregating geographic information with a new, advanced crosswalk algorithm. In this demo paper, we overview the architecture of the system and its user-friendly interfaces, and then demonstrate via a real-world example that it is general, fully automatic and easy-to-use. In the demonstration, we invite users to interact with GeoFlux to integrate more sample socioeconomic data from data.ny.gov.","PeriodicalId":20430,"journal":{"name":"Proceedings of the 2018 International Conference on Management of Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84169369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Henning Funke, S. Breß, Stefan Noll, V. Markl, J. Teubner
{"title":"Pipelined Query Processing in Coprocessor Environments","authors":"Henning Funke, S. Breß, Stefan Noll, V. Markl, J. Teubner","doi":"10.1145/3183713.3183734","DOIUrl":"https://doi.org/10.1145/3183713.3183734","url":null,"abstract":"Query processing on GPU-style coprocessors is severely limited by the movement of data. With teraflops of compute throughput in one device, even high-bandwidth memory cannot provision enough data for a reasonable utilization. Query compilation is a proven technique to improve memory efficiency. However, its inherent tuple-at-a-time processing style does not suit the massively parallel execution model of GPU-style coprocessors. This compromises the improvements in efficiency offered by query compilation. In this paper, we show how query compilation and GPU-style parallelism can be made to play in unison nevertheless. We describe a compiler strategy that merges multiple operations into a single GPU kernel, thereby significantly reducing bandwidth demand. Compared to operator-at-a-time, we show reductions of memory access volumes by factors of up to 7.5x resulting in shorter kernel execution times by factors of up to 9.5x.","PeriodicalId":20430,"journal":{"name":"Proceedings of the 2018 International Conference on Management of Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78163453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}