2020 IEEE 36th International Conference on Data Engineering (ICDE)最新文献_第6页

Speed Kit: A Polyglot & GDPR-Compliant Approach For Caching Personalized Content 速度套件:多语言和gdpr兼容的方法缓存个性化内容

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00142

Wolfram Wingerath, Felix Gessert, Erik Witt, Hannes Kuhlmann, Florian Bücklers, Benjamin Wollmer, N. Ritter

{"title":"Speed Kit: A Polyglot & GDPR-Compliant Approach For Caching Personalized Content","authors":"Wolfram Wingerath, Felix Gessert, Erik Witt, Hannes Kuhlmann, Florian Bücklers, Benjamin Wollmer, N. Ritter","doi":"10.1109/ICDE48307.2020.00142","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00142","url":null,"abstract":"Users leave when page loads take too long. This simple fact has complex implications for virtually all modern businesses, because accelerating content delivery through caching is not as simple as it used to be. As a fundamental technical challenge, the high degree of personalization in today’s Web has seemingly outgrown the capabilities of traditional content delivery networks (CDNs) which have been designed for distributing static assets under fixed caching times. As an additional legal challenge for services with personalized content, an increasing number of regional data protection laws constrain the ways in which CDNs can be used in the first place. In this paper, we present Speed Kit as a radically different approach for content distribution that combines (1) a polyglot architecture for efficiently caching personalized content with (2) a natively GDPR-compliant client proxy that handles all sensitive information within the user device. We describe the system design and implementation, explain the custom cache coherence protocol to avoid data staleness and achieve Δ-atomicity, and we share field experiences from over a year of productive use in the e-commerce industry.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"28 1","pages":"1603-1608"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90906910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 14

Automatic View Generation with Deep Learning and Reinforcement Learning 基于深度学习和强化学习的自动视图生成

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00133

Haitao Yuan, Guoliang Li, Ling Feng, Ji Sun, Yue Han

{"title":"Automatic View Generation with Deep Learning and Reinforcement Learning","authors":"Haitao Yuan, Guoliang Li, Ling Feng, Ji Sun, Yue Han","doi":"10.1109/ICDE48307.2020.00133","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00133","url":null,"abstract":"Materializing views is an important method to reduce redundant computations in DBMS, especially for processing large scale analytical queries. However, many existing methods still need DBAs to manually generate materialized views, which are not scalable to a large number of database instances, especially on the cloud database. To address this problem, we propose an automatic view generation method which judiciously selects \"highly beneficial\" subqueries to generate materialized views. However, there are two challenges. (1) How to estimate the benefit of using a materialized view for a queryƒ (2) How to select optimal subqueries to generate materialized viewsƒ To address the first challenge, we propose a neural network based method to estimate the benefit of using a materialized view to answer a query. In particular, we extract significant features from different perspectives and design effective encoding models to transform these features into hidden representations. To address the second challenge, we model this problem to an ILP (Integer Linear Programming) problem, which aims to maximize the utility by selecting optimal subqueries to materialize. We design an iterative optimization method to select subqueries to materialize. However, this method cannot guarantee the convergence of the solution. To address this issue, we model the iterative optimization process as an MDP (Markov Decision Process) and use the deep reinforcement learning model to solve the problem. Extensive experiments show that our method outperforms existing solutions by 28.4%, 8.8% and 31.7% on three real-world datasets.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"37 1","pages":"1501-1512"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91210970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 44

UniKV: Toward High-Performance and Scalable KV Storage in Mixed Workloads via Unified Indexing UniKV:通过统一索引在混合工作负载中实现高性能和可扩展的KV存储

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00034

Qiang Zhang, Yongkun Li, P. Lee, Yinlong Xu, Qiu Cui, L. Tang

引用次数: 11

Task Allocation in Dependency-aware Spatial Crowdsourcing 依赖感知空间众包中的任务分配

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00090

Wangze Ni, Peng Cheng, Lei Chen, Xuemin Lin

{"title":"Task Allocation in Dependency-aware Spatial Crowdsourcing","authors":"Wangze Ni, Peng Cheng, Lei Chen, Xuemin Lin","doi":"10.1109/ICDE48307.2020.00090","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00090","url":null,"abstract":"Ubiquitous smart devices and high-quality wireless networks enable people to participate in spatial crowdsourcing tasks easily, which require workers to physically move to specific locations to conduct their assigned tasks. Spatial crowdsourcing has attracted much attention from both academia and industry. In this paper, we consider a spatial crowdsourcing scenario, where the tasks may have some dependencies among them. Specifically, one task can only be dispatched when its dependent tasks have already been assigned. In fact, task dependencies are quite common in many real-life applications, such as house repairing and holding sports games. We formally define the dependency-aware spatial crowdsourcing (DA-SC), which focuses on finding an optimal worker-and-task assignment under the constraints of dependencies, skills of workers, moving distances and deadlines to maximize the successfully assigned tasks. We prove that the DA-SC problem is NP-hard and thus intractable. Therefore, we propose two approximation algorithms, including a greedy approach and a game-theoretic approach, which can guarantee the approximate bounds of the results in each batch process. Through extensive experiments on both real and synthetic data sets, we demonstrate the efficiency and effectiveness of our DA-SC approaches.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"61 1","pages":"985-996"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90996187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 24

Online Indices for Predictive Top-k Entity and Aggregate Queries on Knowledge Graphs 知识图谱上预测Top-k实体和聚合查询的在线索引

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00096

Yan Li, Tingjian Ge, Cindy X. Chen

引用次数: 5

HBP: Hotness Balanced Partition for Prioritized Iterative Graph Computations 优先迭代图计算的热度平衡划分

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00209

Shufeng Gong, Yanfeng Zhang, Ge Yu

引用次数: 5

Efficiently Answering Span-Reachability Queries in Large Temporal Graphs 大时间图中跨度可达性查询的有效回答

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00104

Dong Wen, Yilun Huang, Ying Zhang, Lu Qin, W. Zhang, Xuemin Lin

引用次数: 13

Cool, a COhort OnLine analytical processing system Cool，一个队列在线分析处理系统

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00056

Zhongle Xie, Hongbin Ying, Cong Yue, Meihui Zhang, Gang Chen, B. Ooi

引用次数: 2

Deciding When to Trade Data Freshness for Performance in MongoDB-as-a-Service 在mongodb即服务中决定何时以数据新鲜度换取性能

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00207

Chenhao Huang, Michael J. Cahill, A. Fekete, Uwe Röhm

引用次数: 5

Crowdsourcing-based Data Extraction from Visualization Charts 基于众包的可视化图表数据提取

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00177

Chengliang Chai, Guoliang Li, Ju Fan, Yuyu Luo

{"title":"Crowdsourcing-based Data Extraction from Visualization Charts","authors":"Chengliang Chai, Guoliang Li, Ju Fan, Yuyu Luo","doi":"10.1109/ICDE48307.2020.00177","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00177","url":null,"abstract":"Visualization charts are widely utilized for presenting structured data. Under many circumstances, people want to explore the data in the charts collected from various sources, such as papers and websites, so as to further analyzing the data or creating new charts. However, the existing automatic and semi-automatic approaches are not always effective due to the variety of charts. In this paper, we introduce a crowdsourcing approach that leverages human ability to extract data from visualization charts. There are several challenges. The first one is how to avoid tedious human interaction with charts and design simple crowdsourcing tasks. Second, it is challenging to evaluate worker’s quality for truth inference, because workers may not only provide inaccurate values but also misalign values to wrong data series. To address the challenges, we design an effective crowdsourcing task scheme that splits a chart into simple micro-tasks. We introduce a novel worker quality model by considering worker’s accuracy and task difficulty. We also devise an effective early-stopping mechanisms to save the cost. We have conducted experiments on a real crowdsourcing platform, and the results show that our framework outperforms state-of-the-art approaches on both cost and quality.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"1 1","pages":"1814-1817"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82983461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 7