2016 IEEE 32nd International Conference on Data Engineering (ICDE)最新文献

筛选
英文 中文
Mutual benefit aware task assignment in a bipartite labor market 二元劳动力市场中具有互利意识的任务分配
2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498230
Liu Zheng, Lei Chen
{"title":"Mutual benefit aware task assignment in a bipartite labor market","authors":"Liu Zheng, Lei Chen","doi":"10.1109/ICDE.2016.7498230","DOIUrl":"https://doi.org/10.1109/ICDE.2016.7498230","url":null,"abstract":"As one of the three major steps (question design, task assignment, answer aggregation) in crowdsourcing, task assignment directly affects the quality of the crowdsourcing result. A good assignment will not only improve the answers' quality, but also boost the workers' willingness to participate. Although a lot of works have been made to produce better assignment, most of them neglected one of its most important properties: the bipartition, which exists widely in real world scenarios. Such ignorance greatly limits their application under general settings.","PeriodicalId":6883,"journal":{"name":"2016 IEEE 32nd International Conference on Data Engineering (ICDE)","volume":"331 1","pages":"73-84"},"PeriodicalIF":0.0,"publicationDate":"2016-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77141120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 15
GARNET: A holistic system approach for trending queries in microblogs 微博趋势查询的整体系统方法
2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498329
C. Jonathan, A. Magdy, M. Mokbel, A. Jonathan
{"title":"GARNET: A holistic system approach for trending queries in microblogs","authors":"C. Jonathan, A. Magdy, M. Mokbel, A. Jonathan","doi":"10.1109/ICDE.2016.7498329","DOIUrl":"https://doi.org/10.1109/ICDE.2016.7498329","url":null,"abstract":"The recent wide popularity of microblogs (e.g., tweets, online comments) has empowered various important applications, including, news delivery, event detection, market analysis, and target advertising. A core module in all these applications is a frequent/trending query processor that aims to find out those topics that are highly frequent or trending in the social media through posted microblogs. Unfortunately current attempts for such core module suffer from several drawbacks. Most importantly, their narrow scope, as they focus only on solving trending queries for a very special case of localized and very recent microblogs. This paper presents GARNET; a holistic system equipped with one-stop efficient and scalable solution for supporting a generic form of context-aware frequent and trending queries on microblogs. GARNET supports both frequent and trending queries, any arbitrary time interval either current, recent, or past, of fixed granularity, and having a set of arbitrary filters over contextual attributes. From a system point of view, GARNET is very appealing and industry-friendly, as one needs to realize it once in the system. Then, a myriad of various forms of trending and frequent queries are immediately supported. Experimental evidence based on a real system prototype of GARNET and billions of real Twitter data show the scalability and efficiency of GARNET for various query types.","PeriodicalId":6883,"journal":{"name":"2016 IEEE 32nd International Conference on Data Engineering (ICDE)","volume":"50 1","pages":"1251-1262"},"PeriodicalIF":0.0,"publicationDate":"2016-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87367829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 18
Adaptive noise immune cluster ensemble using affinity propagation 基于亲和传播的自适应噪声免疫聚类集成
2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498371
Zhiwen Yu, Guoqiang Han, Le Li, Jiming Liu, Jun Zhang
{"title":"Adaptive noise immune cluster ensemble using affinity propagation","authors":"Zhiwen Yu, Guoqiang Han, Le Li, Jiming Liu, Jun Zhang","doi":"10.1109/ICDE.2016.7498371","DOIUrl":"https://doi.org/10.1109/ICDE.2016.7498371","url":null,"abstract":"Cluster ensemble, as one of the important research directions in the ensemble learning area, is gaining more and more attention, due to its powerful capability to integrate multiple clustering solutions and provide a more accurate, stable and robust result. Cluster ensemble has a lot of useful applications in a large number of areas. Although most of traditional cluster ensemble approaches obtain good results, few of them consider how to achieve good performance for noisy datasets. Some noisy datasets have a number of noisy attributes which may degrade the performance of conventional cluster ensemble approaches. Some noisy datasets which contain noisy samples will affect the final results. Other noisy datasets may be sensitive to distance functions.","PeriodicalId":6883,"journal":{"name":"2016 IEEE 32nd International Conference on Data Engineering (ICDE)","volume":"5 1","pages":"1454-1455"},"PeriodicalIF":0.0,"publicationDate":"2016-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80369256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
Cross-layer betweenness centrality in multiplex networks with applications 带应用的多路网络中的跨层中间性中心
2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498257
Tanmoy Chakraborty, Ramasuri Narayanam
{"title":"Cross-layer betweenness centrality in multiplex networks with applications","authors":"Tanmoy Chakraborty, Ramasuri Narayanam","doi":"10.1109/ICDE.2016.7498257","DOIUrl":"https://doi.org/10.1109/ICDE.2016.7498257","url":null,"abstract":"Several real-life social systems witness the presence of multiple interaction types (or layers) among the entities, thus establishing a collection of co-evolving networks, known as multiplex networks. More recently, there has been a significant interest in developing certain centrality measures in multiplex networks to understand the influential power of the entities (to be referred as vertices or nodes hereafter). In this paper, we consider the problem of studying how frequently the nodes occur on the shortest paths between other nodes in the multiplex networks. As opposed to simplex networks, the shortest paths between nodes can possibly traverse through multiple layers in multiplex networks. Motivated by this phenomenon, we propose a new metric to address the above problem and we call this new metric cross-layer betweenness centrality (CBC). Our definition of CBC measure takes into account the interplay among multiple layers in determining the shortest paths in multiplex networks. We propose an efficient algorithm to compute CBC and show that it runs much faster than the naïve computation of this measure. We show the efficacy of the proposed algorithm using thorough experimentation on two real-world multiplex networks. We further demonstrate the practical utility of CBC by applying it in the following three application contexts: discovering non-overlapping community structure in multiplex networks, identifying interdisciplinary researchers from a multiplex co-authorship network, and the initiator selection for message spreading. In all these application scenarios, the respective solution methods based on the proposed CBC are found to be significantly better performing than that of the corresponding benchmark approaches.","PeriodicalId":6883,"journal":{"name":"2016 IEEE 32nd International Conference on Data Engineering (ICDE)","volume":"13 1","pages":"397-408"},"PeriodicalIF":0.0,"publicationDate":"2016-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90810164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 19
Online mobile Micro-Task Allocation in spatial crowdsourcing 空间众包中的在线移动微任务分配
2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498228
Yongxin Tong, Jieying She, Bolin Ding, Libin Wang, Lei Chen
{"title":"Online mobile Micro-Task Allocation in spatial crowdsourcing","authors":"Yongxin Tong, Jieying She, Bolin Ding, Libin Wang, Lei Chen","doi":"10.1109/ICDE.2016.7498228","DOIUrl":"https://doi.org/10.1109/ICDE.2016.7498228","url":null,"abstract":"With the rapid development of smartphones, spatial crowdsourcing platforms are getting popular. A foundational research of spatial crowdsourcing is to allocate micro-tasks to suitable crowd workers. Most existing studies focus on offline scenarios, where all the spatiotemporal information of micro-tasks and crowd workers is given. However, they are impractical since micro-tasks and crowd workers in real applications appear dynamically and their spatiotemporal information cannot be known in advance. In this paper, to address the shortcomings of existing offline approaches, we first identify a more practical micro-task allocation problem, called the Global Online Micro-task Allocation in spatial crowdsourcing (GOMA) problem. We first extend the state-of-art algorithm for the online maximum weighted bipartite matching problem to the GOMA problem as the baseline algorithm. Although the baseline algorithm provides theoretical guarantee for the worst case, its average performance in practice is not good enough since the worst case happens with a very low probability in real world. Thus, we consider the average performance of online algorithms, a.k.a online random order model.We propose a two-phase-based framework, based on which we present the TGOA algorithm with 1 over 4 -competitive ratio under the online random order model. To improve its efficiency, we further design the TGOA-Greedy algorithm following the framework, which runs faster than the TGOA algorithm but has lower competitive ratio of 1 over 8. Finally, we verify the effectiveness and efficiency of the proposed methods through extensive experiments on real and synthetic datasets.","PeriodicalId":6883,"journal":{"name":"2016 IEEE 32nd International Conference on Data Engineering (ICDE)","volume":"32 1","pages":"49-60"},"PeriodicalIF":0.0,"publicationDate":"2016-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81759829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 254
Practical private shortest path computation based on Oblivious Storage 基于遗忘存储的实用私有最短路径计算
2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498254
Dong Xie, Guanru Li, Bin Yao, Xuan Wei, Xiaokui Xiao, Yunjun Gao, M. Guo
{"title":"Practical private shortest path computation based on Oblivious Storage","authors":"Dong Xie, Guanru Li, Bin Yao, Xuan Wei, Xiaokui Xiao, Yunjun Gao, M. Guo","doi":"10.1109/ICDE.2016.7498254","DOIUrl":"https://doi.org/10.1109/ICDE.2016.7498254","url":null,"abstract":"As location-based services (LBSs) become popular, location-dependent queries have raised serious privacy concerns since they may disclose sensitive information in query processing. Among typical queries supported by LBSs, shortest path queries may reveal information about not only current locations of the clients, but also their potential destinations and travel plans. Unfortunately, existing methods for private shortest path computation suffer from issues of weak privacy property, low performance or poor scalability. In this paper, we aim at a strong privacy guarantee, where the adversary cannot infer almost any information about the queries, with better performance and scalability. To achieve this goal, we introduce a general system model based on the concept of Oblivious Storage (OS), which can deal with queries requiring strong privacy properties. Furthermore, we propose a new oblivious shuffle algorithm to optimize an existing OS scheme. By making trade-offs between query performance, scalability and privacy properties, we design different schemes for private shortest path computation. Eventually, we comprehensively evaluate our schemes upon real road networks in a practical environment and show their efficiency.","PeriodicalId":6883,"journal":{"name":"2016 IEEE 32nd International Conference on Data Engineering (ICDE)","volume":"1 1","pages":"361-372"},"PeriodicalIF":0.0,"publicationDate":"2016-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84384233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 25
OLAP over probabilistic data cubes I: Aggregating, materializing, and querying 基于概率数据集的OLAP I:聚合、具体化和查询
2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498291
Xike Xie, Xingjun Hao, T. Pedersen, Peiquan Jin, Jinchuan Chen
{"title":"OLAP over probabilistic data cubes I: Aggregating, materializing, and querying","authors":"Xike Xie, Xingjun Hao, T. Pedersen, Peiquan Jin, Jinchuan Chen","doi":"10.1109/ICDE.2016.7498291","DOIUrl":"https://doi.org/10.1109/ICDE.2016.7498291","url":null,"abstract":"On-Line Analytical Processing (OLAP) enables powerful analytics by quickly computing aggregate values of numerical measures over multiple hierarchical dimensions for massive datasets. However, many types of source data, e.g., from GPS, sensors, and other measurement devices, are intrinsically inaccurate (imprecise and/or uncertain) and thus OLAP cannot be readily applied. In this paper, we address the resulting data veracity problem in OLAP by proposing the concept of probabilistic data cubes. Such a cube is comprised of a set of probabilistic cuboids which summarize the aggregated values in the form of probability mass functions (pmfs in short) and thus offer insights into the underlying data quality and enable confidence-aware query evaluation and analysis. However, the probabilistic nature of data poses computational challenges as even simple operations are #P-hard under the possible world semantics. Even worse, it is hard to share computations among different cuboids, as aggregation functions that are distributive for traditional data cubes, e.g., SUM and COUNT, become holistic in probabilistic settings. In this paper, we propose a complete set of techniques for probabilistic data cubes, from cuboid aggregation, over cube materialization, to query evaluation. For aggregation, we focus on how to maximize the sharing of computation among cells and cuboids. We present two aggregation methods: convolution and sketch-based. The two methods scale down the time complexities of building a probabilistic cuboid to polynomial and linear, respectively. Each of the two supports both full and partial data cube materialization. Then, we devise a cost model which guides the aggregation methods to be deployed and combined during the cube materialization. We further provide algorithms for probabilistic slicing and dicing queries on the data cube. Extensive experiments over real and synthetic datasets are conducted to show that the techniques are effective and scalable.","PeriodicalId":6883,"journal":{"name":"2016 IEEE 32nd International Conference on Data Engineering (ICDE)","volume":"51 1","pages":"799-810"},"PeriodicalIF":0.0,"publicationDate":"2016-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88976090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 15
Indexing multi-metric data 索引多度量数据
2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498318
Maximilian Franzke, Tobias Emrich, Andreas Züfle, M. Renz
{"title":"Indexing multi-metric data","authors":"Maximilian Franzke, Tobias Emrich, Andreas Züfle, M. Renz","doi":"10.1109/ICDE.2016.7498318","DOIUrl":"https://doi.org/10.1109/ICDE.2016.7498318","url":null,"abstract":"The proliferation of the Web 2.0 and the ubiquitousness of social media yield a huge flood of heterogenous data that is voluntarily published and shared by billions of individual users all over the world. As a result, the representation of an entity (such as a real person) in this data may consist of various data types, including location and other numeric attributes, textual descriptions, images, videos, social network information and other types of information. Searching similar entities in this multi-enriched data exploiting the information of multiple representations simultaneously promises to yield more interesting and relevant information than searching among each data type individually. While efficient similarity search on single representations is a well studied problem, existing studies lacks appropriate solutions for multi-enriched data taking into account the combination of all representations as a whole. In this paper, we address the problem of index-supported similarity search on multi-enriched (a.k.a. multi-represented) objects based on a set of metrics, one metric for each representation. We define multimetric similarity search queries by employing user-defined weight function specifying the impact of each metric at query time. Our main contribution is an index structure which combines all metrics into a single multi-dimensional access method that works for arbitrary weights preferences. The experimental evaluation shows that our proposed index structure is more efficient than existing multi-metric access methods considering different cost criteria and tremendously outperforms traditional approaches when querying very large sets of multi-enriched objects.","PeriodicalId":6883,"journal":{"name":"2016 IEEE 32nd International Conference on Data Engineering (ICDE)","volume":"26 1","pages":"1122-1133"},"PeriodicalIF":0.0,"publicationDate":"2016-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81064677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
Platform-independent robust query processing 独立于平台的健壮查询处理
2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498251
S. Karthik, J. Haritsa, Sreyash Kenkre, Vinayaka Pandit
{"title":"Platform-independent robust query processing","authors":"S. Karthik, J. Haritsa, Sreyash Kenkre, Vinayaka Pandit","doi":"10.1109/ICDE.2016.7498251","DOIUrl":"https://doi.org/10.1109/ICDE.2016.7498251","url":null,"abstract":"To address the classical selectivity estimation problem in databases, a radically different approach called PlanBouquet was recently proposed in [3], wherein the estimation process is completely abandoned and replaced with a calibrated discovery mechanism. The beneficial outcome of this new construction is that, for the first time, provable guarantees are obtained on worst-case performance, thereby facilitating robust query processing.","PeriodicalId":6883,"journal":{"name":"2016 IEEE 32nd International Conference on Data Engineering (ICDE)","volume":"4 1","pages":"325-336"},"PeriodicalIF":0.0,"publicationDate":"2016-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76259139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 10
SCouT: Scalable coupled matrix-tensor factorization - algorithm and discoveries 可伸缩耦合矩阵张量分解-算法和发现
2016 IEEE 32nd International Conference on Data Engineering (ICDE) Pub Date : 2016-05-16 DOI: 10.1109/ICDE.2016.7498292
Byungsoo Jeon, Inah Jeon, Lee Sael, U. Kang
{"title":"SCouT: Scalable coupled matrix-tensor factorization - algorithm and discoveries","authors":"Byungsoo Jeon, Inah Jeon, Lee Sael, U. Kang","doi":"10.1109/ICDE.2016.7498292","DOIUrl":"https://doi.org/10.1109/ICDE.2016.7498292","url":null,"abstract":"How can we analyze very large real-world tensors where additional information is coupled with certain modes of tensors? Coupled matrix-tensor factorization is a useful tool to simultaneously analyze matrices and a tensor, and has been used for important applications including collaborative filtering, multi-way clustering, and link prediction. However, existing single machine or distributed algorithms for coupled matrix-tensor factorization do not scale for tensors with billions of elements in each mode. In this paper, we propose SCOUT, a large-scale coupled matrix-tensor factorization algorithm running on the distributed MAPREDUCE platform. By carefully reorganizing operations, and reusing intermediate data, SCOUT decomposes up to 100× larger tensors than existing methods, and shows linear scalability for order and machines while other methods are limited in scalability. We also apply SCOUT on real world tensors and discover interesting hidden patterns like seasonal spike, and steady attentions for healthy food on Yelp dataset containing user-business-yearmonth tensor and two coupled matrices.","PeriodicalId":6883,"journal":{"name":"2016 IEEE 32nd International Conference on Data Engineering (ICDE)","volume":"27 1","pages":"811-822"},"PeriodicalIF":0.0,"publicationDate":"2016-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73465160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 57
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信