2020 IEEE 36th International Conference on Data Engineering (ICDE)最新文献

筛选
英文 中文
HBP: Hotness Balanced Partition for Prioritized Iterative Graph Computations 优先迭代图计算的热度平衡划分
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00209
Shufeng Gong, Yanfeng Zhang, Ge Yu
{"title":"HBP: Hotness Balanced Partition for Prioritized Iterative Graph Computations","authors":"Shufeng Gong, Yanfeng Zhang, Ge Yu","doi":"10.1109/ICDE48307.2020.00209","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00209","url":null,"abstract":"Existing graph partition methods are designed for round-robin synchronous distributed frameworks. They balance workload without discrimination of vertex importance and fail to consider the characteristics of priority-based scheduling, which may limit the benefit of prioritized graph computation. To accelerate prioritized iterative graph computations, we propose Hotness Balanced Partition (HBP) and a stream-based partition algorithm Pb-HBP. Pb-HBP partitions graph by distributing vertices with discrimination according to their hotness rather than blindly distributing vertices with equal weights, which aims to evenly distribute the hot vertices among workers. Our results show that our proposed partition method outperforms the state-of-the-art partition methods, Fennel and HotGraph. Specifically, Pb-HBP can reduce 40–90% runtime of that by hash partition, 5–75% runtime of that by Fennel, and 22–50% runtime of that by HotGraph.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"13 1","pages":"1942-1945"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86640639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
StructSim: Querying Structural Node Similarity at Billion Scale StructSim:在十亿尺度上查询结构节点相似度
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00211
Xiaoshuang Chen, Longbin Lai, Lu Qin, Xuemin Lin
{"title":"StructSim: Querying Structural Node Similarity at Billion Scale","authors":"Xiaoshuang Chen, Longbin Lai, Lu Qin, Xuemin Lin","doi":"10.1109/ICDE48307.2020.00211","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00211","url":null,"abstract":"Structural node similarity is widely used in analyzing complex networks. As one of the structural node similarity metrics, role similarity has the good merit of indicating automorphism (isomorphism). Existing algorithms to compute role similarity (e.g., RoleSim and NED) suffer from severe performance bottlenecks, and thus cannot handle large real-world graphs. In this paper, we propose a new framework StructSim to compute nodes’ role similarity. Under this framework, we prove that StructSim is guaranteed to be an admissible role similarity metric based on the maximum matching. While maximum matching is too costly to scale, we then devise the BinCount matching to speed up the computation. BinCount-based StructSim admits a precomputed index to query one single pair in O(k log D) time, where k is a small user-defined parameter and D is the maximum node degree. Extensive empirical studies show that StructSim is significantly faster than existing works for computing structural node similarities on the real-world graphs, with comparable effectiveness.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"30 1","pages":"1950-1953"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81154043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 15
Indoor Mobility Semantics Annotation Using Coupled Conditional Markov Networks 基于耦合条件马尔可夫网络的室内移动语义标注
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00128
Huan Li, Hua Lu, M. A. Cheema, L. Shou, Gang Chen
{"title":"Indoor Mobility Semantics Annotation Using Coupled Conditional Markov Networks","authors":"Huan Li, Hua Lu, M. A. Cheema, L. Shou, Gang Chen","doi":"10.1109/ICDE48307.2020.00128","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00128","url":null,"abstract":"Indoor mobility semantics analytics can greatly benefit many pertinent applications. Existing semantic annotation methods mainly focus on outdoor space and require extra knowledge such as POI category or human activity regularity. However, these conditions are difficult to meet in indoor venues with relatively small extents but complex topology. This work studies the annotation of indoor mobility semantics that describe an object’s mobility event (what ) at a semantic indoor region (where ) during a time period (when ). A coupled conditional Markov network (C2MN) is proposed with a set of feature functions carefully designed by incorporating indoor topology and mobility behaviors. C2MN is able to capture probabilistic dependencies among positioning records, semantic regions, and mobility events jointly. Nevertheless, the correlation of regions and events hinders the parameters learning. Therefore, we devise an alternate learning algorithm to enable the parameter learning over correlated variables. The extensive experiments demonstrate that our C2MN-based semantic annotation is efficient and effective on both real and synthetic indoor mobility data.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"19 1","pages":"1441-1452"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81988838","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
Contribution Maximization in Probabilistic Datalog 概率数据中的贡献最大化
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00076
T. Milo, Y. Moskovitch, Brit Youngmann
{"title":"Contribution Maximization in Probabilistic Datalog","authors":"T. Milo, Y. Moskovitch, Brit Youngmann","doi":"10.1109/ICDE48307.2020.00076","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00076","url":null,"abstract":"The use of probabilistic datalog programs has been recently advocated for applications that involve recursive computation and uncertainty. While using such programs allows for a flexible knowledge derivation, it makes the analysis of query results a challenging task. Particularly, given a set O of output tuples and a number k, one would like to understand which k-size subset of the input tuples have contributed the most to the derivation of O. This is useful for multiple tasks, such as identifying the critical sources of errors and understanding surprising results. Previous works have mainly focused on the quantification of tuples contribution to a query result in non-recursive SQL queries, very often disregarding probabilistic inference. To quantify the contribution in probabilistic datalog programs, one must account for the recursive relations between input and output data, and the uncertainty. To this end, we formalize the Contribution Maximization (CM) problem. We then reduce CM to the well-studied Influence Maximization (IM) problem, showing that we can harness techniques developed for IM to our setting. However, we show that such naïve adoption results in poor performance. To overcome this, we propose an optimized algorithm which injects a refined variant of the classic Magic Sets technique, integrated with a sampling method, into IM algorithms, achieving a significant saving of space and execution time. Our experiments demonstrate the effectiveness of our algorithm, even where the naïve approach is infeasible.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"72 1","pages":"817-828"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78747207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
vCBIR: A Verifiable Search Engine for Content-Based Image Retrieval 基于内容的图像检索的可验证搜索引擎vCBIR
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00156
Shangwei Guo, Yang Ji, Ce Zhang, Cheng Xu, Jianliang Xu
{"title":"vCBIR: A Verifiable Search Engine for Content-Based Image Retrieval","authors":"Shangwei Guo, Yang Ji, Ce Zhang, Cheng Xu, Jianliang Xu","doi":"10.1109/ICDE48307.2020.00156","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00156","url":null,"abstract":"We demonstrate vCBIR, a verifiable search engine for Content-Based Image Retrieval. vCBIR allows a small or medium-sized enterprise to outsource its image database to a cloud-based service provider and ensures the integrity of query processing. Like other common data-as-a-service (DaaS) systems, vCBIR consists of three parties: (i) the image owner who outsources its database, (ii) the service provider who executes the authenticated query processing, and (iii) the client who issues search queries. By employing a novel query authentication scheme proposed in our prior work [4], the system not only supports cloud-based image retrieval, but also generates a cryptographic proof for each query, by which the client could verify the integrity of query results. During the demonstration, we will showcase the usage of vCBIR and also provide attendees interactive experience of verifying query results against an untrustworthy service provider through graphical user interface (GUI).","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"5 1","pages":"1730-1733"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75899262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Cool, a COhort OnLine analytical processing system Cool,一个队列在线分析处理系统
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00056
Zhongle Xie, Hongbin Ying, Cong Yue, Meihui Zhang, Gang Chen, B. Ooi
{"title":"Cool, a COhort OnLine analytical processing system","authors":"Zhongle Xie, Hongbin Ying, Cong Yue, Meihui Zhang, Gang Chen, B. Ooi","doi":"10.1109/ICDE48307.2020.00056","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00056","url":null,"abstract":"With a huge volume and variety of data accumulated over the years, OnLine Analytical Processing (OLAP) systems are facing challenges in query efficiency. Furthermore, the design of OLAP systems cannot serve modern applications well due to their inefficiency in processing complex queries such as cohort queries with low query latency. In this paper, we present Cool, a cohort online analytical processing system. As an integrated system with the support of several newly proposed operators on top of a sophisticated storage layer, it processes both cohort queries and conventional OLAP queries with superb performance. Its distributed design contains minimal load balancing and fault tolerance support and is scalable. Our evaluation results show that Cool outperforms two state-of-the-art systems, MonetDB and Druid, by a wide margin in single-node setting. The multi-node version of Cool can also beat the distributed Druid, as well as SparkSQL, by one order of magnitude in terms of query latency.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"1 1","pages":"577-588"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85476654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Deciding When to Trade Data Freshness for Performance in MongoDB-as-a-Service 在mongodb即服务中决定何时以数据新鲜度换取性能
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00207
Chenhao Huang, Michael J. Cahill, A. Fekete, Uwe Röhm
{"title":"Deciding When to Trade Data Freshness for Performance in MongoDB-as-a-Service","authors":"Chenhao Huang, Michael J. Cahill, A. Fekete, Uwe Röhm","doi":"10.1109/ICDE48307.2020.00207","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00207","url":null,"abstract":"MongoDB is a popular document store that is also available as a cloud-hosted service. MongoDB internally deploys primary-copy asynchronous replication, and it allows clients to vary the Read Preference, so reads can deliberately be directed to secondaries rather than the primary site. Doing this can sometimes improve performance, but the returned data might be stale, whereas the primary always returns the freshest data value. While state-of-practice is for programmers to decide where to direct the reads at application development time, they do not have full understanding then of workload or hardware capacity. It should be better to choose the appropriate Read Preference setting at runtime, as we describe in this paper.We show how a system can detect when the primary copy is saturated in MongoDB-as-a-Service, and use this to choose where reads should be done to improve overall performance. Our approach is aimed at a cloud-consumer; it assumes access to only the limited diagnostic data provided to clients of the hosted service.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"53 1","pages":"1934-1937"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85713594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
Efficiently Answering Span-Reachability Queries in Large Temporal Graphs 大时间图中跨度可达性查询的有效回答
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00104
Dong Wen, Yilun Huang, Ying Zhang, Lu Qin, W. Zhang, Xuemin Lin
{"title":"Efficiently Answering Span-Reachability Queries in Large Temporal Graphs","authors":"Dong Wen, Yilun Huang, Ying Zhang, Lu Qin, W. Zhang, Xuemin Lin","doi":"10.1109/ICDE48307.2020.00104","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00104","url":null,"abstract":"Reachability is a fundamental problem in graph analysis. In applications such as social networks and collaboration networks, edges are always associated with timestamps. Most existing works on reachability queries in temporal graphs assume that two vertices are related if they are connected by a path with non-decreasing timestamps (time-respecting) of edges. This assumption fails to capture the relationship between entities involved in the same group or activity with no time-respecting path connecting them. In this paper, we define a new reachability model, called span-reachability, designed to relax the time order dependency and identify the relationship between entities in a given time period. We adopt the idea of two-hop cover and propose an index-based method to answer span-reachability queries. Several optimizations are also given to improve the efficiency of index construction and query processing. We conduct extensive experiments on 17 real-world datasets to show the efficiency of our proposed solution.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"109 1","pages":"1153-1164"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87325816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 13
UniKV: Toward High-Performance and Scalable KV Storage in Mixed Workloads via Unified Indexing UniKV:通过统一索引在混合工作负载中实现高性能和可扩展的KV存储
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00034
Qiang Zhang, Yongkun Li, P. Lee, Yinlong Xu, Qiu Cui, L. Tang
{"title":"UniKV: Toward High-Performance and Scalable KV Storage in Mixed Workloads via Unified Indexing","authors":"Qiang Zhang, Yongkun Li, P. Lee, Yinlong Xu, Qiu Cui, L. Tang","doi":"10.1109/ICDE48307.2020.00034","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00034","url":null,"abstract":"Persistent key-value (KV) stores are mainly designed based on the Log-Structured Merge-tree (LSM-tree), which suffer from large read and write amplifications, especially when KV stores grow in size. Existing design optimizations for LSM-tree-based KV stores often make certain trade-offs and fail to simultaneously improve both the read and write performance on large KV stores without sacrificing scan performance. We design UniKV, which unifies the key design ideas of hash indexing and the LSM-tree in a single system. Specifically, UniKV leverages data locality to differentiate the indexing management of KV pairs. It also develops multiple techniques to tackle the issues caused by unifying the indexing techniques, so as to simultaneously improve the performance in reads, writes, and scans. Experiments show that UniKV significantly outperforms several state-of-the-art KV stores (e.g., LevelDB, RocksDB, HyperLevelDB, and PebblesDB) in overall throughput under read-write mixed workloads.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"14 1","pages":"313-324"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88573062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
Task Allocation in Dependency-aware Spatial Crowdsourcing 依赖感知空间众包中的任务分配
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00090
Wangze Ni, Peng Cheng, Lei Chen, Xuemin Lin
{"title":"Task Allocation in Dependency-aware Spatial Crowdsourcing","authors":"Wangze Ni, Peng Cheng, Lei Chen, Xuemin Lin","doi":"10.1109/ICDE48307.2020.00090","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00090","url":null,"abstract":"Ubiquitous smart devices and high-quality wireless networks enable people to participate in spatial crowdsourcing tasks easily, which require workers to physically move to specific locations to conduct their assigned tasks. Spatial crowdsourcing has attracted much attention from both academia and industry. In this paper, we consider a spatial crowdsourcing scenario, where the tasks may have some dependencies among them. Specifically, one task can only be dispatched when its dependent tasks have already been assigned. In fact, task dependencies are quite common in many real-life applications, such as house repairing and holding sports games. We formally define the dependency-aware spatial crowdsourcing (DA-SC), which focuses on finding an optimal worker-and-task assignment under the constraints of dependencies, skills of workers, moving distances and deadlines to maximize the successfully assigned tasks. We prove that the DA-SC problem is NP-hard and thus intractable. Therefore, we propose two approximation algorithms, including a greedy approach and a game-theoretic approach, which can guarantee the approximate bounds of the results in each batch process. Through extensive experiments on both real and synthetic data sets, we demonstrate the efficiency and effectiveness of our DA-SC approaches.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"61 1","pages":"985-996"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90996187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 24
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信