2020 IEEE 36th International Conference on Data Engineering (ICDE): Latest Papers, Page 8

A Unified Framework for Multi-view Spectral Clustering

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00187

Guo Zhong, Chi-Man Pun

Citations: 7

SUDAF: Sharing User-Defined Aggregate Functions

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00161

Chao Zhang, F. Toumani, B. Doreau

Citations: 2

Outdated Fact Detection in Knowledge Bases

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00196

Shuang Hao, Chengliang Chai, Guoliang Li, N. Tang, Ning Wang, Xiang Yu

Abstract: Knowledge bases (KBs), which store high-quality information, are crucial for many applications, such as enhancing search results and serving as external sources for data cleaning. Not surprisingly, most KBs contain outdated facts because information changes rapidly, so it is important to keep KBs up to date. Prior work has investigated using reference data (such as new facts extracted from the news) to detect outdated facts in KBs. However, existing approaches can cover only a small percentage of the facts in KBs. In this paper, we propose a novel human-in-the-loop approach for outdated fact detection in KBs. It trains a binary classifier, using features such as the historical update frequency and existence time of a fact, to compute the likelihood that a fact in a KB is outdated. Then, it interacts with humans to verify whether a fact with high likelihood is indeed outdated. In addition, it uses logical rules to detect more outdated facts based on human feedback. The outdated facts detected by the logical rules are also fed back as augmented training data to further train the ML model. Extensive experiments on real-world KBs, such as Yago and DBpedia, show the effectiveness of our solution.

Pages: 1890-1893

Citations: 10

A Class of R*-tree Indexes for Spatial-Visual Search of Geo-tagged Street Images

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00221

Abdullah Alfarrarjeh, S. H. Kim, V. Hegde, Akshansh, C. Shahabi, Q. Xie, S. Ravada

Abstract: Due to the prevalence of GPS-equipped cameras (e.g., smartphones and surveillance cameras), massive amounts of geo-tagged images capturing urban streets are increasingly being collected. Consequently, many smart city applications have emerged that rely on efficient image search. Such searches include spatial-visual queries, in which spatial and visual properties are used in tandem to retrieve images similar to a given query image within a given geographical region. To this end, new index structures that organize images based on both spatial and visual properties are needed to execute such queries efficiently. Based on our observation that street images are typically similar in the same spatial locality, index structures for spatial-visual queries can be effectively built on a spatial index (i.e., R*-tree). Therefore, we propose a class of R*-tree indexes that associate each node with two separate minimum bounding rectangles (MBRs), one for the spatial and the other for the (dimension-reduced) visual properties of the images it contains, and that adapt the R*-tree optimization criteria to both property types.

Pages: 1990-1993
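The dual-MBR idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Image`, `Node`, `make_node`, and `may_contain_match` are hypothetical, the node is a flat leaf, and the pruning test (descend only if both rectangles overlap the query ranges) is an assumed use of the two MBRs. The adapted R*-tree insertion and split criteria mentioned in the abstract are not modelled.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# A box is (lower corner, upper corner); both corners have the same dimension.
Box = Tuple[Tuple[float, ...], Tuple[float, ...]]


def mbr(points: Sequence[Sequence[float]]) -> Box:
    """Minimum bounding rectangle of a non-empty set of equal-dimension points."""
    lows = tuple(min(coords) for coords in zip(*points))
    highs = tuple(max(coords) for coords in zip(*points))
    return lows, highs


def intersects(a: Box, b: Box) -> bool:
    """True if the two boxes overlap in every dimension."""
    return all(
        a_low <= b_high and b_low <= a_high
        for a_low, a_high, b_low, b_high in zip(a[0], a[1], b[0], b[1])
    )


@dataclass
class Image:
    location: Tuple[float, ...]  # hypothetical: e.g. (longitude, latitude)
    feature: Tuple[float, ...]   # hypothetical: dimension-reduced visual descriptor


@dataclass
class Node:
    images: List[Image]
    spatial_mbr: Box  # bounds the locations of the contained images
    visual_mbr: Box   # bounds the reduced visual features of the contained images


def make_node(images: List[Image]) -> Node:
    """Build a leaf node carrying one MBR per property type."""
    return Node(
        images=images,
        spatial_mbr=mbr([img.location for img in images]),
        visual_mbr=mbr([img.feature for img in images]),
    )


def may_contain_match(node: Node, spatial_range: Box, visual_range: Box) -> bool:
    """Assumed pruning rule: visit the node only if both MBRs overlap the query."""
    return intersects(node.spatial_mbr, spatial_range) and intersects(
        node.visual_mbr, visual_range
    )
```

In this sketch a node is skipped as soon as either rectangle misses the query, which is where a second, visual MBR could save work over a purely spatial index.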

Citations: 8

Graph Embeddings for One-pass Processing of Heterogeneous Queries

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00222

Chi Thang Duong, Hongzhi Yin, Dung Hoang, Minn Hung Nguyen, M. Weidlich, Quoc Viet Hung Nguyen, K. Aberer

Abstract: Effective information retrieval (IR) relies on the ability to comprehensively capture a user's information needs. Traditional IR systems are limited to homogeneous queries that define the information to retrieve by a single modality. Support for heterogeneous queries that combine different modalities has been proposed recently. Yet, existing approaches for heterogeneous querying are computationally expensive, as they require several passes over the data to construct a query answer. In this paper, we propose an IR system that overcomes the computational challenges imposed by heterogeneous queries by adopting graph embeddings. Specifically, we propose graph-based models in which both data and queries incorporate information of different modalities. Then, we show how either representation is transformed into a graph embedding in the same space, capturing relations between information of different modalities. By grounding query processing in graph embeddings, we enable processing of heterogeneous queries with a single pass over the data representation. Our experiments on several real-world and synthetic datasets illustrate that our technique is able to return twice the amount of relevant information in comparison with several baselines, while being scalable to large-scale data.

Pages: 1994-1997

Citations: 6

Answering Skyline Queries over Incomplete Data with Crowdsourcing (Extended Abstract)

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00235

Xiaoye Miao, Yunjun Gao, Su Guo, Lu Chen, Jianwei Yin, Qing Li

Citations: 0

PrefixFPM: A Parallel Framework for General-Purpose Frequent Pattern Mining

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00208

Da Yan, Wenwen Qu, Guimu Guo, Xiaoling Wang

Abstract: Frequent pattern mining (FPM) has been a focused theme in data mining research for decades, but a general programming framework that can be easily customized to mine different kinds of frequent patterns is still lacking, and existing solutions to FPM over big transaction databases are IO-bound, leaving CPU cores underutilized even though FPM is NP-hard. This paper presents PrefixFPM, a general-purpose framework for FPM that is able to fully utilize the CPU cores in a multicore machine. PrefixFPM follows the idea of prefix projection to partition the workloads of FPM into independent tasks by divide and conquer. PrefixFPM exposes a unified programming interface to users, who can customize it to mine their desired patterns; the parallel execution engine is transparent to end users and can be reused for mining all kinds of patterns. We have adapted the state-of-the-art serial algorithms for mining frequent patterns, including subsequences, subtrees, and subgraphs, on top of PrefixFPM, and extensive experiments demonstrate an excellent speedup ratio of PrefixFPM with the number of cores. A demo is available at https://youtu.be/PfioC0GDpsw; the code is available at https://github.com/yanlab19870714/PrefixFPM.

Pages: 1938-1941
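Prefix projection, as named in this abstract, can be sketched for the simplest pattern type: subsequences of single items. The Python below is a serial, PrefixSpan-style illustration under stated assumptions, not PrefixFPM's programming interface or its parallel engine; the function name `mine_frequent_subsequences` and the `(sequence id, offset)` representation of a projected database are hypothetical choices made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, ...]


def mine_frequent_subsequences(
    sequences: Sequence[Sequence[str]], min_support: int
) -> Dict[Pattern, int]:
    """Return every subsequence pattern found in at least min_support sequences."""
    results: Dict[Pattern, int] = {}

    def mine(prefix: Pattern, projected: List[Tuple[int, int]]) -> None:
        # projected holds (sequence id, offset): the suffix of each sequence
        # that remains after the first match of the current prefix.
        extensions: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
        for seq_id, start in projected:
            seen = set()
            seq = sequences[seq_id]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:  # count each item once per sequence
                    seen.add(item)
                    extensions[item].append((seq_id, pos + 1))
        for item, child_projection in sorted(extensions.items()):
            if len(child_projection) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(child_projection)
                # Each child projection depends only on its own suffixes, so it
                # is an independent task; here it is simply a recursive call.
                mine(pattern, child_projection)

    mine((), [(seq_id, 0) for seq_id in range(len(sequences))])
    return results
```

Because every recursive call works only on its own projected database, the calls at any level could in principle be handed to separate workers, which is the divide-and-conquer property the abstract refers to. This sketch runs them one after another.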

Citations: 16

Automated Anomaly Detection in Large Sequences

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00182

Paul Boniol, Michele Linardi, Federico Roncallo, Themis Palpanas

Citations: 42

Computing Mutual Information of Big Categorical Data and Its Application to Feature Grouping

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00210

Junli Li, Chaowei Zhang, Jifu Zhang, X. Qin

Citations: 2

Adaptive Network Alignment with Unsupervised and Multi-order Convolutional Networks

2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00015

T. T. Huynh, Vinh Tong, T. Nguyen, Hongzhi Yin, M. Weidlich, Nguyen Quoc Viet Hung

Abstract: Network alignment is the problem of pairing nodes between two graphs such that the paired nodes are structurally and semantically similar. A well-known application of network alignment is to identify which accounts in different social networks belong to the same person. Existing alignment techniques, however, lack scalability, cannot incorporate multi-dimensional information without training data, and are limited in the consistency constraints enforced by an alignment. In this paper, we propose a fully unsupervised network alignment framework based on a multi-order embedding model. The model learns the embeddings of each node using a graph convolutional neural representation, which we prove to satisfy consistency constraints. We further design a data augmentation method and a refinement mechanism to make the model adaptive to consistency violations and noise. Extensive experiments on real and synthetic datasets show that our model outperforms state-of-the-art alignment techniques. We also demonstrate the robustness of our model against adversarial conditions, such as structural noise, attribute noise, graph size imbalance, and hyper-parameter sensitivity.

Pages: 85-96

Citations: 52