33rd International Conference on Scientific and Statistical Database Management: Latest Publications

Practical Fully-Decentralized Secure Aggregation for Personal Data Management Systems
33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-07-06 DOI: 10.1145/3468791.3468821
Julien Mirval, Luc Bouganim, I. S. Popa
Abstract: Personal Data Management Systems (PDMS) are flourishing, boosted by legal and technical means like smart disclosure, data portability and data altruism. A PDMS allows its owner to easily collect, store and manage data, directly generated by her devices, or resulting from her interactions with companies or administrations. PDMSs unlock innovative usages by crossing multiple data sources from one or many users, thus requiring aggregation primitives. Indeed, aggregation primitives are essential to compute statistics on user data, but are also a fundamental building block for machine learning algorithms. This paper proposes a protocol allowing for secure aggregation in a massively distributed PDMS environment, which adapts to selective participation and PDMS characteristics, and is reliable with respect to failures, with no compromise on accuracy. Preliminary experiments show the effectiveness of our protocol, which can adapt to several contexts with varying PDMS characteristics in terms of communication speed or CPU resources and can adjust the aggregation strategy to the estimated selective participation.
Citations: 2
Local Gaussian Process Model Inference Classification for Time Series Data
33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-07-06 DOI: 10.1145/3468791.3468839
Fabian Berns, Joschka Hannes Strueber, C. Beecks
Abstract: One of the prominent types of time series analytics is classification, which entails identifying expressive class-wise features for determining class labels of time series data. In this paper, we propose a novel approach for time series classification called Local Gaussian Process Model Inference Classification (LOGIC). Our idea consists in (i) approximating the latent, class-wise characteristics of given time series data by means of Gaussian processes and (ii) aggregating these characteristics into a feature representation to (iii) provide a model-agnostic interface for state-of-the-art feature classification mechanisms. By making use of a fully-connected neural network as classification model, we show that the LOGIC model is able to compete with state-of-the-art approaches.
Citations: 0
Unsupervised Anomaly Detection for Time Series with Outlier Exposure
33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-07-06 DOI: 10.1145/3468791.3468793
Jiaming Feng, Zheng Huang, Jie Guo, Weidong Qiu
Abstract: It is of great practical significance to accurately model and analyze abnormal events in time series. For example, the identification of anomaly patterns on infrastructure sensor curves helps locate equipment failures. In this paper, we propose an unsupervised anomaly detection approach for time series, which can comprehensively consider both point anomalies and subsequence anomalies. We introduce an RNN into the architecture of the Adversarial Autoencoder (AAE) to better analyze anomaly events based on the overall relationship of the time series. In addition, we apply the Outlier Exposure technique to optimize the performance of the anomaly detector. Meanwhile, a WGAN-based method is utilized to generate anomaly datasets through normal distribution learning. Finally, we apply the proposed method to fraud detection on a financial statement dataset and intrusion detection on a network traffic dataset. Experimental results demonstrate that our model can comprehensively consider different anomaly types in time series and achieve promising detection performance overall. In the fraud detection experiment, the LSTM-integrated AAE model achieves an F1 score of 0.810, while the Outlier Exposure enhanced model achieves an F1 score of 0.894. This indicates that our method can improve the performance of current audit systems and facilitate discovering malicious behaviors.
Citations: 0
Accelerating Depth-First Traversal by Graph Ordering
33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-07-06 DOI: 10.1145/3468791.3468796
Qiuyi Lyu, M. Sha, Bin Gong, Kuangda Lyu
Abstract: Cache efficiency is an important factor in the performance of graph processing due to the irregular memory access patterns caused by the sparse nature of graphs. To increase the cache hit rate, prior studies proposed a variety of preprocessing approaches based on reordering, which permutes the vertex labels to improve the locality of graph structures. However, the locality enhancement of existing reordering approaches does not bring much performance benefit in depth-first traversal, which is widely adopted in a majority of graph processing applications. Furthermore, the state-of-the-art reordering approach suffers from a considerable preprocessing overhead, which greatly limits its applicability. In this paper, we propose SeqDFS, a depth-first graph traversal method that optimizes cache efficiency by adjusting the order in which vertices are visited and can be further extended to dynamic scenarios. We conduct extensive experiments on 16 real-world datasets and 3 representative depth-first graph applications; the results show that our proposal achieves a significant speed-up on both directed and undirected graphs.
Citations: 0
The Tensor-Relational Algebra, and Other Ideas in Machine Learning System Design
33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-07-06 DOI: 10.1145/3468791.3472262
C. Jermaine
ACM Reference Format: Chris Jermaine. 2021. The Tensor-Relational Algebra, and Other Ideas in Machine Learning System Design. In 33rd International Conference on Scientific and Statistical Database Management, July 06–07, 2021, Tampa, FL, USA. ACM, New York, NY, USA, 1 page. https://doi.org/10.1145/3468791.3472262
Abstract: Systems for machine learning such as TensorFlow and PyTorch have greatly increased the complexity of the models that can be prototyped, tested, and moved into production, as well as reducing the time and effort required to do this. However, the systems have significant limitations. In these systems, a matrix multiplication (or a 2-D convolution, or any of the operations offered by the system) is a black-box operation that must actually be executed somewhere. As such, if there are multiple GPUs available to execute the multiplication, the system cannot "figure out" how to automatically distribute the multiplication over them. It has to run an available matrix multiply somewhere, on some hardware. If there is one GPU available but the inputs are too large to fit in the GPU RAM, the system cannot automatically decompose the operation to perform the computation in stages, moving parts of the matrices on and off the GPU as needed, to stay within the available memory budget. In this talk, I will argue that relations make a compelling implementation abstraction for building ML systems. Modern ML computations often manipulate matrices and tensors. A tensor can be decomposed into a binary relation between (key, payload) pairs, where key identifies the sub-tensor stored in payload (payload could be a scalar value, but more likely, it is a multidimensional array). Such a simple binary relation allows many (or perhaps all) common ML computations to be expressed relationally. For example, consider two 2 × 10^4 by 2 × 10^4 matrices, decomposed into relations having 400 tuples each:
Citations: 1
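The (key, payload) decomposition mentioned in the abstract above can be illustrated with a small sketch. This is not the speaker's system: the function names `to_relation`, `relational_matmul`, and `from_relation` are hypothetical, square blocks are an assumption, and tiny 8 × 8 matrices stand in for the much larger ones the abstract mentions. The idea shown is that a blocked matrix multiplication can be written as a join on the shared block index followed by a grouped sum.

```python
from collections import defaultdict

import numpy as np


def to_relation(matrix, block):
    """Decompose a 2-D array into a relation {(row_block, col_block): sub-matrix}."""
    relation = {}
    for i in range(0, matrix.shape[0], block):
        for j in range(0, matrix.shape[1], block):
            relation[(i // block, j // block)] = matrix[i:i + block, j:j + block]
    return relation


def relational_matmul(rel_a, rel_b):
    """Join on A.col_block == B.row_block, multiply payloads,
    then sum the products grouped by (A.row_block, B.col_block)."""
    # Index B by its row-block key so the join is a simple lookup.
    b_by_row = defaultdict(list)
    for (k, j), payload in rel_b.items():
        b_by_row[k].append((j, payload))

    result = {}
    for (i, k), a_payload in rel_a.items():
        for j, b_payload in b_by_row[k]:
            product = a_payload @ b_payload
            if (i, j) in result:
                result[(i, j)] = result[(i, j)] + product
            else:
                result[(i, j)] = product
    return result


def from_relation(relation):
    """Reassemble a dense matrix from its block relation."""
    n_rows = max(i for i, _ in relation) + 1
    n_cols = max(j for _, j in relation) + 1
    return np.block([[relation[(i, j)] for j in range(n_cols)] for i in range(n_rows)])


rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8))
b = rng.standard_normal((8, 8))

rel_a = to_relation(a, block=2)  # 4 x 4 grid of 2 x 2 blocks
rel_b = to_relation(b, block=2)
c = from_relation(relational_matmul(rel_a, rel_b))
```

Because each tuple is independent, a relational engine is free to place the join and aggregation work on different devices or to process tuples in stages; the sketch itself runs everything in one process.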
Frequent Itemsets Mining with a Guaranteed Local Differential Privacy in Small Datasets
33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-07-06 DOI: 10.1145/3468791.3468807
Sharmin Afrose, T. Hashem, Mohammed Eunus Ali
Abstract: In this paper, we propose an iterative approach to estimate the frequent itemsets with high accuracy while satisfying the local differential privacy (LDP). The key component behind the improved accuracy of the estimated frequent itemsets by our approach is our novel two-level randomization technique for guaranteeing the LDP. Our randomization technique exploits the correlation of the presence of items in a user’s itemset, which has not been considered before. We present a mathematical proof that shows that our approach satisfies the LDP constraint. Extensive experiments are performed to validate the effectiveness and efficiency of our proposed algorithms using real datasets.
Citations: 4
MoParkeR: Multi-objective Parking Recommendation
33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-06-10 DOI: 10.1145/3468791.3468810
M. Rahaman, Wei Shao, F. Salim, A. Turky, A. Song, Jeffrey Chan, Junliang Jiang, D. Bradbrook
Abstract: Existing parking recommendation solutions mainly focus on finding and suggesting parking spaces based on the unoccupied options only. However, there are other factors associated with parking spaces that can influence someone’s choice of parking, such as fare, parking rules, walking distance to the destination, travel time, and the likelihood of being unoccupied at a given time. More importantly, these factors may change over time and conflict with each other, which makes the recommendations produced by current parking recommender systems ineffective. In this paper, we propose a novel problem called multi-objective parking recommendation. We present a solution by designing a multi-objective parking recommendation engine called MoParkeR that considers various conflicting factors together. Specifically, we utilise a non-dominated sorting technique to calculate a set of Pareto-optimal solutions, consisting of recommended trade-off parking spots. We conduct extensive experiments using two real-world datasets to show the applicability of our multi-objective recommendation methodology.
Citations: 2
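The notion of a Pareto-optimal set used in the abstract above can be sketched as follows. This is a generic illustration, not MoParkeR's engine: the spot names, the three objectives (fare, walking distance in metres, probability of being occupied), and their values are invented, all objectives are assumed to be minimised, and only the first non-dominated front is computed rather than a full non-dominated sort.

```python
from typing import Dict, List, Tuple


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(options: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Return the names of options that no other option dominates."""
    return sorted(
        name
        for name, cost in options.items()
        if not any(
            dominates(other_cost, cost)
            for other_name, other_cost in options.items()
            if other_name != name
        )
    )


# Hypothetical parking spots: (fare, walking distance in m, P(occupied)).
spots = {
    "A": (2.0, 300.0, 0.10),
    "B": (4.0, 100.0, 0.20),  # pricier than A but closer: a trade-off
    "C": (5.0, 350.0, 0.30),  # worse than A on every objective
    "D": (2.0, 300.0, 0.40),  # ties A on two objectives, worse on the third
}

front = pareto_front(spots)
print(front)  # ['A', 'B']
```

Spots A and B survive because neither is better than the other on every objective; a recommender would then present such trade-off options to the user instead of a single ranked winner.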
Sub-trajectory Similarity Join with Obfuscation
33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-06-07 DOI: 10.1145/3468791.3468822
Yanchuan Chang, Jianzhong Qi, E. Tanin, Xingjun Ma, H. Samet
Abstract: User trajectory data is becoming increasingly accessible due to the prevalence of GPS-equipped devices such as smartphones. Many existing studies focus on querying trajectories that are similar to each other in their entirety. We observe that trajectories partially similar to each other contain useful information about users’ travel patterns which should not be ignored. Such partially similar trajectories are critical in applications such as epidemic contact tracing. We thus propose to query trajectories that are within a given distance range from each other for a given period of time. We formulate this problem as a sub-trajectory similarity join query named STS-Join. We further propose a distributed index structure and a query algorithm for STS-Join, where users retain their raw location data and only send obfuscated trajectories to a server for query processing. This helps preserve user location privacy, which is vital when dealing with such data. Theoretical analysis and experiments on real data confirm the effectiveness and the efficiency of our proposed index structure and query algorithm.
Citations: 5
Automatic View Selection in Graph Databases
33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-05-19 DOI: 10.1145/3468791.3468794
Chao Zhang, Jiaheng Lu, Qingsong Guo, Xinyong Zhang, Xiaochun Han, Minqi Zhou
Abstract: Recently, several works have studied the problem of view selection in graph databases. However, existing methods cannot fully exploit the graph properties of views, e.g., supergraph views and common subgraph views, which leads to a low view utility and duplicate view content. To address the problem, we propose an extended graph view that persists all the edge-induced subgraphs to answer the subgraph and supergraph queries simultaneously. Furthermore, we present the graph gene algorithm (GGA), which relies on a set of view transformations to reduce the view space and optimize the view benefit. Extensive experiments on real-life and synthetic datasets demonstrated that GGA outperformed other selection methods in both effectiveness and efficiency.
Citations: 0
SCHeMa: Scheduling Scientific Containers on a Cluster of Heterogeneous Machines
33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-03-24 DOI: 10.1145/3468791.3468813
Thanasis Vergoulis, Konstantinos Zagganas, Loukas Kavouras, M. Reczko, S. Sartzetakis, Theodore Dalamagas
Abstract: In the era of data-driven science, conducting computational experiments that involve analysing large datasets using heterogeneous computational clusters is part of the everyday routine for many scientists. Moreover, to ensure the credibility of their results, it is very important for these analyses to be easily reproducible by other researchers. Although various technologies that could facilitate the work of scientists in this direction have been introduced in recent years, there is still a lack of open-source platforms that combine them to this end. In this work, we describe and demonstrate SCHeMa, an open-source platform that facilitates the execution and reproducibility of computational analysis on heterogeneous clusters, leveraging containerization, experiment packaging, workflow management, and machine learning technologies.
Citations: 1