2020 IEEE 36th International Conference on Data Engineering (ICDE)最新文献

筛选
英文 中文
On the Integration of Machine Learning and Array Databases 论机器学习与阵列数据库的集成
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00170
Sebastián Villarroya, P. Baumann
{"title":"On the Integration of Machine Learning and Array Databases","authors":"Sebastián Villarroya, P. Baumann","doi":"10.1109/ICDE48307.2020.00170","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00170","url":null,"abstract":"Machine Learning is increasingly being applied to many different application domains. From cancer detection to weather forecast, a large number of different applications leverage machine learning algorithms to get faster and more accurate results over huge datasets. Although many of these datasets are mainly composed of array data, a vast majority of machine learning applications do not use array databases. This tutorial focuses on the integration of machine learning algorithms and array databases. By implementing machine learning algorithms in array databases users can boost the native efficient array data processing with machine learning methods to perform accurate and fast array data analytics.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"24 1","pages":"1786-1789"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74754436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 12
A Novel Approach to Learning Consensus and Complementary Information for Multi-View Data Clustering 一种新的多视图数据聚类共识和互补信息学习方法
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00080
Khanh Luong, R. Nayak
{"title":"A Novel Approach to Learning Consensus and Complementary Information for Multi-View Data Clustering","authors":"Khanh Luong, R. Nayak","doi":"10.1109/ICDE48307.2020.00080","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00080","url":null,"abstract":"Effective methods are required to be developed that can deal with the multi-faceted nature of the multi-view data. We design a factorization-based loss function-based method to simultaneously learn two components encoding the consensus and complementary information present in multi-view data by using the Coupled Matrix Factorization (CMF) and Non-negative Matrix Factorization (NMF). We propose a novel optimal manifold for multi-view data which is the most consensed manifold embedded in the high-dimensional multi-view data. A new complementary enhancing term is added in the loss function to enhance the complementary information inherent in each view. An extensive experiment with diverse datasets, benchmarking the state-of-the-art multi-view clustering methods, has demonstrated the effectiveness of the proposed method in obtaining accurate clustering solution.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"6 1","pages":"865-876"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73425873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 16
Message from the ICDE 2020 Chairs ICDE 2020主席致辞
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/icde48307.2020.00005
{"title":"Message from the ICDE 2020 Chairs","authors":"","doi":"10.1109/icde48307.2020.00005","DOIUrl":"https://doi.org/10.1109/icde48307.2020.00005","url":null,"abstract":"","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"37 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73778172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Two-Level Data Compression using Machine Learning in Time Series Database 时间序列数据库中使用机器学习的两级数据压缩
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00119
Xinyang Yu, Yanqing Peng, Feifei Li, Sheng Wang, Xiaowei Shen, Huijun Mai, Yue Xie
{"title":"Two-Level Data Compression using Machine Learning in Time Series Database","authors":"Xinyang Yu, Yanqing Peng, Feifei Li, Sheng Wang, Xiaowei Shen, Huijun Mai, Yue Xie","doi":"10.1109/ICDE48307.2020.00119","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00119","url":null,"abstract":"The explosion of time series advances the development of time series databases. To reduce storage overhead in these systems, data compression is widely adopted. Most existing compression algorithms utilize the overall characteristics of the entire time series to achieve high compression ratio, but ignore local contexts around individual points. In this way, they are effective for certain data patterns, and may suffer inherent pattern changes in real-world time series. It is therefore strongly desired to have a compression method that can always achieve high compression ratio in the existence of pattern diversity.In this paper, we propose a two-level compression model that selects a proper compression scheme for each individual point, so that diverse patterns can be captured at a fine granularity. Based on this model, we design and implement AMMMO framework, where a set of control parameters is defined to distill and categorize data patterns. At the top level, we evaluate each sub-sequence to fill in these parameters, generating a set of compression scheme candidates (i.e., major mode selection). At the bottom level, we choose the best scheme from these candidates for each data point respectively (i.e., sub-mode selection). To effectively handle diverse data patterns, we introduce a reinforcement learning based approach to learn parameter values automatically. Our experimental evaluation shows that our approach improves compression ratio by up to 120% (with an average of 50%), compared to other time-series compression methods.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"8 1","pages":"1333-1344"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83698489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 13
SFour: A Protocol for Cryptographically Secure Record Linkage at Scale 一种大规模加密安全记录链接协议
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00031
Basit Khurram, F. Kerschbaum
{"title":"SFour: A Protocol for Cryptographically Secure Record Linkage at Scale","authors":"Basit Khurram, F. Kerschbaum","doi":"10.1109/ICDE48307.2020.00031","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00031","url":null,"abstract":"The prevalence of various (and increasingly large) datasets presents the challenging problem of discovering common entities dispersed across disparate datasets. Solutions to the private record linkage problem (PRL) aim to enable such explorations of datasets in a secure manner. A two-party PRL protocol allows two parties to determine for which entities they each possess a record (either an exact matching record or a fuzzy matching record) in their respective datasets — without revealing to one another information about any entities for which they do not both possess records. Although several solutions have been proposed to solve the PRL problem, no current solution offers a fully cryptographic security guarantee while maintaining both high accuracy of output and subquadratic runtime efficiency. To this end, we propose the first known efficient PRL protocol that runs in subquadratic time, provides high accuracy, and guarantees cryptographic security in the semi-honest security model.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"83 1","pages":"277-288"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83065431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
Optimizing Knowledge Graphs through Voting-based User Feedback 通过基于投票的用户反馈优化知识图谱
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00043
Ruida Yang, Xin Lin, Jianliang Xu, Yan Yang, Liang He
{"title":"Optimizing Knowledge Graphs through Voting-based User Feedback","authors":"Ruida Yang, Xin Lin, Jianliang Xu, Yan Yang, Liang He","doi":"10.1109/ICDE48307.2020.00043","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00043","url":null,"abstract":"Knowledge graphs have been used in a wide range of applications to support search, recommendation, and question answering (Q&A). For example, in Q&A systems, given a new question, we may use a knowledge graph to automatically identify the most suitable answers based on similarity evaluation. However, such systems may suffer from two major limitations. First, the knowledge graph constructed based on source data may contain errors. Second, the knowledge graph may become out of date and cannot quickly adapt to new knowledge. To address these issues, in this paper, we propose an interactive framework that refines and optimizes knowledge graphs through user votes. We develop an efficient similarity evaluation notion, called extended inverse P-distance, based on which the graph optimization problem can be formulated as a signomial geometric programming problem. We then propose a basic single-vote solution and a more advanced multi-vote solution for graph optimization. We also propose a split-and-merge optimization strategy to scale up the multi-vote solution. Extensive experiments based on real-life and synthetic graphs demonstrate the effectiveness and efficiency of our proposed framework.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"1 1","pages":"421-432"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77809538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Data Sentinel: A Declarative Production-Scale Data Validation Platform 数据哨兵:声明式生产规模数据验证平台
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00140
A. Swami, Sriram Vasudevan, Joojay Huyn
{"title":"Data Sentinel: A Declarative Production-Scale Data Validation Platform","authors":"A. Swami, Sriram Vasudevan, Joojay Huyn","doi":"10.1109/ICDE48307.2020.00140","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00140","url":null,"abstract":"Many organizations process big data for important business operations and decisions. Hence, data quality greatly affects their success. Data quality problems continue to be widespread, costing US businesses an estimated $600 billion annually. To date, addressing data quality in production environments still poses many challenges: easily defining properties of high-quality data; validating production-scale data in a timely manner; debugging poor quality data; designing data quality solutions to be easy to use, understand, and operate; and designing data quality solutions to easily integrate with other systems. Current data validation solutions do not comprehensively address these challenges. To address data quality in production environments at LinkedIn, we developed Data Sentinel, a declarative production-scale data validation platform. In a simple and well-structured configuration, users declaratively specify the desired data checks. Then, Data Sentinel performs these data checks and writes the results to an easily understandable report. Furthermore, Data Sentinel provides well-defined schemas for the configuration and report. This makes it easy for other systems to interface or integrate with Data Sentinel. To make Data Sentinel even easier to use, understand, and operate in production environments, we provide Data Sentinel Service (DSS), a complementary system to help specify data checks, schedule, deploy, and tune data validation jobs, and understand data checking results. The contributions of this paper include the following: 1) Data Sentinel, a declarative production-scale data validation platform successfully deployed at LinkedIn 2) A generic design to build and deploy similar systems for production environments 3) Experiences and lessons learned that can benefit practitioners with similar objectives.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"13 1","pages":"1579-1590"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83360330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
CrashSim: An Efficient Algorithm for Computing SimRank over Static and Temporal Graphs CrashSim:一个在静态和时间图上计算simmrank的有效算法
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00103
Mo Li, F. Choudhury, Renata Borovica-Gajic, Zhiqiong Wang, Junchang Xin, Jianxin Li
{"title":"CrashSim: An Efficient Algorithm for Computing SimRank over Static and Temporal Graphs","authors":"Mo Li, F. Choudhury, Renata Borovica-Gajic, Zhiqiong Wang, Junchang Xin, Jianxin Li","doi":"10.1109/ICDE48307.2020.00103","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00103","url":null,"abstract":"SimRank is a significant metric to measure the similarity of nodes in graph data analysis. The problem of SimRank computation has been studied extensively, however there is no existing work that can provide one unified algorithm to support the SimRank computation both on static and temporal graphs. In this work, we first propose CrashSim, an index-free algorithm for single-source SimRank computation in static graphs. CrashSim can provide provable approximation guarantees for the computational results in an efficient way. In addition, as the reallife graphs are often represented as temporal graphs, CrashSim enables efficient computation of SimRank in temporal graphs. We formally define two typical SimRank queries in temporal graphs, and then solve them by developing an efficient algorithm based on CrashSim, called CrashSim-T. From the extensive experimental evaluation using five real-life and synthetic datasets, it can be seen that the CrashSim algorithm and CrashSim-T algorithm substantially improve the efficiency of the state-of-the-art SimRank algorithms by about 30%, while achieving the precision of the result set with about 97%.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"25 1","pages":"1141-1152"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89364839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
SVkNN: Efficient Secure and Verifiable k-Nearest Neighbor Query on the Cloud Platform* 云平台上高效、安全、可验证的k近邻查询方法*
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00029
Ningning Cui, Xiaochun Yang, Bin Wang, Jianxin Li, Guoren Wang
{"title":"SVkNN: Efficient Secure and Verifiable k-Nearest Neighbor Query on the Cloud Platform*","authors":"Ningning Cui, Xiaochun Yang, Bin Wang, Jianxin Li, Guoren Wang","doi":"10.1109/ICDE48307.2020.00029","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00029","url":null,"abstract":"With the boom in cloud computing, data outsourcing in location-based services is proliferating and has attracted increasing interest from research communities and commercial applications. Nevertheless, since the cloud server is probably both untrusted and malicious, concerns of data security and result integrity have become on the rise sharply. However, there exist little work that can commendably assure the data security and result integrity using a unified way. In this paper, we study the problem of secure and verifiable k nearest neighbor query (SVkNN). To support SVkNN, we first propose a novel unified structure, called verifiable and secure index (VSI). Based on this, we devise a series of secure protocols to facilitate query processing and develop a compact verification strategy. Given an SVkNN query, our proposed solution can not merely answer the query efficiently while can guarantee: 1) preserving the privacy of data, query, result and access patterns; 2) authenticating the correctness and completeness of the results without leaking the confidentiality. Finally, the formal security analysis and complexity analysis are theoretically proven and the performance and feasibility of our proposed approaches are empirically evaluated and demonstrated.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"7 1","pages":"253-264"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86526204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 28
Practical Anonymous Subscription with Revocation Based on Broadcast Encryption 基于广播加密的实用可撤销匿名订阅
2020 IEEE 36th International Conference on Data Engineering (ICDE) Pub Date : 2020-04-01 DOI: 10.1109/ICDE48307.2020.00028
X. Yi, Russell Paulet, E. Bertino, Fang-Yu Rao
{"title":"Practical Anonymous Subscription with Revocation Based on Broadcast Encryption","authors":"X. Yi, Russell Paulet, E. Bertino, Fang-Yu Rao","doi":"10.1109/ICDE48307.2020.00028","DOIUrl":"https://doi.org/10.1109/ICDE48307.2020.00028","url":null,"abstract":"In this paper we consider the problem where a client wishes to subscribe to some product or service provided by a server, but maintain their anonymity. At the same time, the server must be able to authenticate the client as a genuine user and be able to discontinue (or revoke) the client’s access if the subscription fees are not paid. Current solutions for this problem are typically constructed using some combination of blind signature or zero-knowledge proof techniques, which do not directly support client revocation (that is, revoking a user before expiry of their secret value). In this paper, we present a solution for this problem on the basis of the broadcast encryption scheme, suggested by Boneh et al., by which the server can broadcast a secret to a group of legitimate clients. Our solution allows the registered client to log into the server anonymously and also supports client revocation by the server. Our solution can be used in many applications, such as location-based queries. We formally define a model for our anonymous subscription protocol and prove the security of our solution under this model. In addition, we present experimental results from an implementation of our protocol. These experimental results demonstrate that our protocol is practical.","PeriodicalId":6709,"journal":{"name":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","volume":"3 1","pages":"241-252"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87917424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信