2016 IEEE 32nd International Conference on Data Engineering (ICDE): latest papers, page 5

Mutual benefit aware task assignment in a bipartite labor market

2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498230

Liu Zheng, Lei Chen

Citations: 15

GARNET: A holistic system approach for trending queries in microblogs

2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498329

C. Jonathan, A. Magdy, M. Mokbel, A. Jonathan

Abstract: The recent wide popularity of microblogs (e.g., tweets, online comments) has empowered various important applications, including news delivery, event detection, market analysis, and targeted advertising. A core module in all these applications is a frequent/trending query processor that aims to find topics that are highly frequent or trending in social media through posted microblogs. Unfortunately, current attempts at such a core module suffer from several drawbacks. Most importantly, their scope is narrow, as they focus only on solving trending queries for a very special case of localized and very recent microblogs. This paper presents GARNET, a holistic system equipped with a one-stop, efficient, and scalable solution for supporting a generic form of context-aware frequent and trending queries on microblogs. GARNET supports both frequent and trending queries, over any arbitrary time interval (current, recent, or past) of fixed granularity, with a set of arbitrary filters over contextual attributes. From a system point of view, GARNET is very appealing and industry-friendly, as one needs to realize it only once in the system. Then, a myriad of forms of trending and frequent queries are immediately supported. Experimental evidence based on a real system prototype of GARNET and billions of real tweets shows the scalability and efficiency of GARNET for various query types.

Pages: 1251-1262

Citations: 18

Adaptive noise immune cluster ensemble using affinity propagation

2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498371

Zhiwen Yu, Guoqiang Han, Le Li, Jiming Liu, Jun Zhang

Citations: 7

Cross-layer betweenness centrality in multiplex networks with applications

2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498257

Tanmoy Chakraborty, Ramasuri Narayanam

Abstract: Several real-life social systems witness the presence of multiple interaction types (or layers) among the entities, thus establishing a collection of co-evolving networks, known as multiplex networks. More recently, there has been significant interest in developing centrality measures in multiplex networks to understand the influential power of the entities (referred to as vertices or nodes hereafter). In this paper, we consider the problem of studying how frequently the nodes occur on the shortest paths between other nodes in multiplex networks. As opposed to simplex networks, the shortest paths between nodes can traverse multiple layers in multiplex networks. Motivated by this phenomenon, we propose a new metric to address the above problem, which we call cross-layer betweenness centrality (CBC). Our definition of the CBC measure takes into account the interplay among multiple layers in determining the shortest paths in multiplex networks. We propose an efficient algorithm to compute CBC and show that it runs much faster than the naïve computation of this measure. We show the efficacy of the proposed algorithm using thorough experimentation on two real-world multiplex networks. We further demonstrate the practical utility of CBC by applying it in the following three application contexts: discovering non-overlapping community structure in multiplex networks, identifying interdisciplinary researchers from a multiplex co-authorship network, and selecting initiators for message spreading. In all these application scenarios, the solution methods based on the proposed CBC are found to perform significantly better than the corresponding benchmark approaches.

Pages: 397-408

Citations: 19

Online mobile Micro-Task Allocation in spatial crowdsourcing

2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498228

Yongxin Tong, Jieying She, Bolin Ding, Libin Wang, Lei Chen

Abstract: With the rapid development of smartphones, spatial crowdsourcing platforms are getting popular. A foundational research problem in spatial crowdsourcing is to allocate micro-tasks to suitable crowd workers. Most existing studies focus on offline scenarios, where all the spatiotemporal information of micro-tasks and crowd workers is given. However, they are impractical, since micro-tasks and crowd workers in real applications appear dynamically and their spatiotemporal information cannot be known in advance. In this paper, to address the shortcomings of existing offline approaches, we first identify a more practical micro-task allocation problem, called the Global Online Micro-task Allocation in spatial crowdsourcing (GOMA) problem. We then extend the state-of-the-art algorithm for the online maximum weighted bipartite matching problem to the GOMA problem as the baseline algorithm. Although the baseline algorithm provides a theoretical guarantee for the worst case, its average performance in practice is not good enough, since the worst case happens with a very low probability in the real world. Thus, we consider the average performance of online algorithms, a.k.a. the online random order model. We propose a two-phase-based framework, based on which we present the TGOA algorithm with a competitive ratio of 1/4 under the online random order model. To improve its efficiency, we further design the TGOA-Greedy algorithm following the framework, which runs faster than the TGOA algorithm but has a lower competitive ratio of 1/8. Finally, we verify the effectiveness and efficiency of the proposed methods through extensive experiments on real and synthetic datasets.

Pages: 49-60

Citations: 254
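
To make the online setting in the abstract above concrete, here is a minimal sketch of a plain greedy online assignment. It is not the paper's TGOA or TGOA-Greedy algorithm: the function `greedy_online_assign`, the `utility` callback, and the sample utility table are all hypothetical, and spatial ranges and deadlines are ignored. Each arriving task or worker is matched immediately to the best currently waiting partner, which shows why a purely greedy choice can fall short of the offline optimum.

```python
def greedy_online_assign(arrivals, utility):
    """Greedy online matching (illustrative only, not TGOA).

    arrivals: iterable of (kind, ident) with kind in {"task", "worker"},
              in arrival order.
    utility:  hypothetical callback utility(task, worker) -> float;
              a value <= 0 means the pair cannot be matched.
    Returns (list of (task, worker) pairs, total utility).
    """
    waiting = {"task": [], "worker": []}
    matches, total = [], 0.0
    for kind, ident in arrivals:
        other = "worker" if kind == "task" else "task"
        best, best_u = None, 0.0
        # Scan the partners that are already waiting on the other side.
        for cand in waiting[other]:
            task, worker = (ident, cand) if kind == "task" else (cand, ident)
            u = utility(task, worker)
            if u > best_u:
                best, best_u = cand, u
        if best is None:
            waiting[kind].append(ident)   # nobody suitable yet: wait
        else:
            waiting[other].remove(best)   # commit to the match irrevocably
            matches.append((ident, best) if kind == "task" else (best, ident))
            total += best_u
    return matches, total


# Hypothetical utilities; the offline optimum here is t1-w2 plus t2-w1 = 9.0.
table = {("t1", "w1"): 1.0, ("t1", "w2"): 5.0,
         ("t2", "w1"): 4.0, ("t2", "w2"): 2.0}
order = [("worker", "w1"), ("task", "t1"), ("worker", "w2"), ("task", "t2")]
matches, total = greedy_online_assign(order, lambda t, w: table.get((t, w), 0.0))
print(matches, total)  # [('t1', 'w1'), ('t2', 'w2')] 3.0
```

In this arrival order the greedy rule earns 3.0 while the offline optimum is 9.0, which is the kind of gap that competitive-ratio analysis quantifies.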

Practical private shortest path computation based on Oblivious Storage

2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498254

Dong Xie, Guanru Li, Bin Yao, Xuan Wei, Xiaokui Xiao, Yunjun Gao, M. Guo

Abstract: As location-based services (LBSs) become popular, location-dependent queries have raised serious privacy concerns, since they may disclose sensitive information during query processing. Among typical queries supported by LBSs, shortest path queries may reveal information about not only the current locations of the clients, but also their potential destinations and travel plans. Unfortunately, existing methods for private shortest path computation suffer from weak privacy properties, low performance, or poor scalability. In this paper, we aim at a strong privacy guarantee, where the adversary can infer almost no information about the queries, with better performance and scalability. To achieve this goal, we introduce a general system model based on the concept of Oblivious Storage (OS), which can deal with queries requiring strong privacy properties. Furthermore, we propose a new oblivious shuffle algorithm to optimize an existing OS scheme. By making trade-offs between query performance, scalability, and privacy properties, we design different schemes for private shortest path computation. Finally, we comprehensively evaluate our schemes on real road networks in a practical environment and show their efficiency.

Pages: 361-372

Citations: 25

OLAP over probabilistic data cubes I: Aggregating, materializing, and querying

2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498291

Xike Xie, Xingjun Hao, T. Pedersen, Peiquan Jin, Jinchuan Chen

Abstract: On-Line Analytical Processing (OLAP) enables powerful analytics by quickly computing aggregate values of numerical measures over multiple hierarchical dimensions for massive datasets. However, many types of source data, e.g., from GPS, sensors, and other measurement devices, are intrinsically inaccurate (imprecise and/or uncertain), and thus OLAP cannot be readily applied. In this paper, we address the resulting data veracity problem in OLAP by proposing the concept of probabilistic data cubes. Such a cube comprises a set of probabilistic cuboids, which summarize the aggregated values in the form of probability mass functions (pmfs for short) and thus offer insights into the underlying data quality and enable confidence-aware query evaluation and analysis. However, the probabilistic nature of the data poses computational challenges, as even simple operations are #P-hard under the possible world semantics. Even worse, it is hard to share computations among different cuboids, as aggregation functions that are distributive for traditional data cubes, e.g., SUM and COUNT, become holistic in probabilistic settings. In this paper, we propose a complete set of techniques for probabilistic data cubes, from cuboid aggregation, through cube materialization, to query evaluation. For aggregation, we focus on how to maximize the sharing of computation among cells and cuboids. We present two aggregation methods: convolution-based and sketch-based. The two methods scale down the time complexities of building a probabilistic cuboid to polynomial and linear, respectively. Each of the two supports both full and partial data cube materialization. Then, we devise a cost model which guides how the aggregation methods are deployed and combined during cube materialization. We further provide algorithms for probabilistic slicing and dicing queries on the data cube. Extensive experiments over real and synthetic datasets are conducted to show that the techniques are effective and scalable.

Pages: 799-810

Citations: 15
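
The abstract above names convolution as one aggregation method for pmfs. As a rough illustration of the underlying idea only, the sketch below computes the pmf of a SUM over independent uncertain measures by repeated discrete convolution. The function names, the dictionary representation of a pmf, the independence assumption, and the sample readings are all assumptions made for this example; the paper's actual cuboid algorithms are not reproduced here.

```python
from collections import defaultdict


def convolve_pmfs(pmf_a, pmf_b):
    """pmf of X + Y for independent discrete X and Y.

    Each pmf is a dict mapping value -> probability (assumed representation).
    """
    out = defaultdict(float)
    for va, pa in pmf_a.items():
        for vb, pb in pmf_b.items():
            out[va + vb] += pa * pb
    return dict(out)


def probabilistic_sum(pmfs):
    """pmf of the SUM of independent uncertain measures."""
    total = {0: 1.0}  # the empty sum is 0 with probability 1
    for pmf in pmfs:
        total = convolve_pmfs(total, pmf)
    return total


# Two hypothetical uncertain readings falling into the same cell.
readings = [{1: 0.5, 2: 0.5}, {10: 0.25, 20: 0.75}]
cell_pmf = probabilistic_sum(readings)
print(sorted(cell_pmf.items()))
# [(11, 0.125), (12, 0.125), (21, 0.375), (22, 0.375)]
```

The number of distinct sums can grow with every convolution, which gives some intuition for why sharing computation among cells and cuboids matters.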

Indexing multi-metric data

2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498318

Maximilian Franzke, Tobias Emrich, Andreas Züfle, M. Renz

Abstract: The proliferation of Web 2.0 and the ubiquity of social media yield a huge flood of heterogeneous data that is voluntarily published and shared by billions of individual users all over the world. As a result, the representation of an entity (such as a real person) in this data may consist of various data types, including location and other numeric attributes, textual descriptions, images, videos, social network information, and other types of information. Searching for similar entities in this multi-enriched data by exploiting the information of multiple representations simultaneously promises to yield more interesting and relevant information than searching each data type individually. While efficient similarity search on single representations is a well-studied problem, existing studies lack appropriate solutions for multi-enriched data that take into account the combination of all representations as a whole. In this paper, we address the problem of index-supported similarity search on multi-enriched (a.k.a. multi-represented) objects based on a set of metrics, one metric for each representation. We define multi-metric similarity search queries by employing a user-defined weight function specifying the impact of each metric at query time. Our main contribution is an index structure which combines all metrics into a single multi-dimensional access method that works for arbitrary weight preferences. The experimental evaluation shows that our proposed index structure is more efficient than existing multi-metric access methods under different cost criteria and tremendously outperforms traditional approaches when querying very large sets of multi-enriched objects.

Pages: 1122-1133

Citations: 8
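
The query model described in the abstract above, one metric per representation combined with query-time weights, can be sketched as a brute-force baseline. This is only an assumed illustration: it uses a weighted sum as the combination, a linear scan instead of the paper's index structure, and the helper names, object layout, and sample data are hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    """Metric for the (assumed) location representation."""
    return math.dist(a, b)


def jaccard_distance(a, b):
    """Metric for the (assumed) tag-set representation."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def combined_distance(query, obj, metrics, weights):
    """Weighted sum of per-representation distances (assumed combination)."""
    return sum(w * metric(query[key], obj[key])
               for (key, metric), w in zip(metrics, weights))


def knn(query, objects, metrics, weights, k):
    """Linear-scan k-nearest-neighbour search under query-time weights."""
    return heapq.nsmallest(
        k, objects,
        key=lambda o: combined_distance(query, o, metrics, weights))


metrics = [("location", euclidean), ("tags", jaccard_distance)]
objects = [
    {"name": "far_same_tags", "location": (3.0, 4.0), "tags": {"a", "b"}},
    {"name": "near_other_tags", "location": (0.0, 0.0), "tags": {"c"}},
]
query = {"location": (0.0, 0.0), "tags": {"a", "b"}}

# Changing the weights at query time changes the nearest neighbour.
print(knn(query, objects, metrics, (1.0, 0.0), 1)[0]["name"])  # near_other_tags
print(knn(query, objects, metrics, (0.0, 1.0), 1)[0]["name"])  # far_same_tags
```

A scan like this touches every object for every query, which is the cost an index supporting arbitrary weights aims to avoid.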

Platform-independent robust query processing

2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498251

S. Karthik, J. Haritsa, Sreyash Kenkre, Vinayaka Pandit

Citations: 10

SCouT: Scalable coupled matrix-tensor factorization - algorithm and discoveries

2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498292

Byungsoo Jeon, Inah Jeon, Lee Sael, U. Kang

Citations: 57