33rd International Conference on Scientific and Statistical Database Management: Latest Papers (Page 3)

Practical Fully-Decentralized Secure Aggregation for Personal Data Management Systems

33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-07-06 DOI: 10.1145/3468791.3468821

Julien Mirval, Luc Bouganim, I. S. Popa

Abstract: Personal Data Management Systems (PDMS) are flourishing, boosted by legal and technical means such as smart disclosure, data portability and data altruism. A PDMS allows its owner to easily collect, store and manage data, directly generated by her devices, or resulting from her interactions with companies or administrations. PDMSs unlock innovative usages by crossing multiple data sources from one or many users, thus requiring aggregation primitives. Indeed, aggregation primitives are essential to compute statistics on user data, but are also a fundamental building block for machine learning algorithms. This paper proposes a protocol allowing for secure aggregation in a massively distributed PDMS environment, which adapts to selective participation and PDMS characteristics, and is reliable with respect to failures, with no compromise on accuracy. Preliminary experiments show the effectiveness of our protocol, which can adapt to several contexts with varying PDMS characteristics in terms of communication speed or CPU resources and can adjust the aggregation strategy to the estimated selective participation.

Citations: 2

Local Gaussian Process Model Inference Classification for Time Series Data

33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-07-06 DOI: 10.1145/3468791.3468839

Fabian Berns, Joschka Hannes Strueber, C. Beecks

Citations: 0

Unsupervised Anomaly Detection for Time Series with Outlier Exposure

33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-07-06 DOI: 10.1145/3468791.3468793

Jiaming Feng, Zheng Huang, Jie Guo, Weidong Qiu

Abstract: It is of great practical significance to accurately model and analyze abnormal events in time series. For example, the identification of anomaly patterns on infrastructure sensor curves helps locate equipment failures. In this paper, we propose an unsupervised anomaly detection approach for time series, which can comprehensively consider both point anomalies and subsequence anomalies. We introduce an RNN into the architecture of the Adversarial Autoencoder (AAE) to better analyze anomaly events based on the overall relationships in the time series. In addition, we apply the Outlier Exposure technique to optimize the performance of the anomaly detector. Meanwhile, a WGAN-based method is utilized to generate anomaly datasets through normal distribution learning. Finally, we apply the proposed method to fraud detection on a financial statement dataset and intrusion detection on a network traffic dataset. Experimental results demonstrate that our model can comprehensively consider different anomaly types in time series, and achieve promising detection performance overall. In the fraud detection experiment, the LSTM-integrated AAE model achieves an F1 score of 0.810, while the Outlier Exposure enhanced model achieves an F1 score of 0.894. This indicates that our method can improve the performance of current audit systems and facilitate discovering malicious behaviors.

Citations: 0

Accelerating Depth-First Traversal by Graph Ordering

33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-07-06 DOI: 10.1145/3468791.3468796

Qiuyi Lyu, M. Sha, Bin Gong, Kuangda Lyu

Abstract: Cache efficiency is an important factor in the performance of graph processing due to the irregular memory access patterns caused by the sparse nature of graphs. To increase the cache hit rate, prior studies proposed a variety of preprocessing approaches based on reordering, which permutes the vertices' labels to improve the locality of graph structures. However, the locality enhancement of existing reordering approaches does not bring much performance benefit in depth-first traversal, which is widely adopted in a majority of graph processing applications. Furthermore, the state-of-the-art reordering approach incurs a substantial preprocessing overhead, which greatly limits its applicability. In this paper, we propose SeqDFS, a depth-first graph traversal method that optimizes cache efficiency by adjusting the order in which vertices are visited and can be further extended to dynamic scenarios. We conduct extensive experiments on 16 real-world datasets and 3 representative depth-first graph applications; the results show that our proposal achieves a significant speed-up on both directed and undirected graphs.

Citations: 0
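
The reordering idea mentioned in the abstract above (permuting vertex labels to improve locality) can be illustrated with a minimal Python sketch. This is not SeqDFS and not any of the prior reordering methods the paper evaluates; it is a generic illustration that relabels vertices in the order a depth-first traversal first reaches them. The function names `dfs_preorder` and `relabel_by_dfs`, the adjacency-list layout, and the toy graph are all assumptions made for this example.

```python
def dfs_preorder(adj, root):
    """Iterative DFS over an adjacency list; returns vertices in first-visit order."""
    order, seen, stack = [], set(), [root]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        order.append(v)
        # Push neighbours in reverse so the first-listed neighbour is visited first.
        stack.extend(reversed(adj[v]))
    return order


def relabel_by_dfs(adj, root=0):
    """Permute vertex labels so that label k is the k-th vertex reached by DFS."""
    order = dfs_preorder(adj, root)
    reached = set(order)
    # Vertices unreachable from the root keep their relative order at the end.
    order += [v for v in range(len(adj)) if v not in reached]
    new_id = {old: new for new, old in enumerate(order)}
    new_adj = [[] for _ in adj]
    for old, neighbours in enumerate(adj):
        new_adj[new_id[old]] = sorted(new_id[n] for n in neighbours)
    return new_adj, new_id


# Toy directed graph (hypothetical data): vertex -> list of out-neighbours.
adj = [[3, 1], [4], [], [2], []]
new_adj, new_id = relabel_by_dfs(adj)
print(dfs_preorder(adj, 0))      # visit order under the original labels
print(dfs_preorder(new_adj, 0))  # visit order under the permuted labels
```

In this toy graph the traversal jumps between scattered labels before relabeling and walks consecutive labels afterwards, which is the kind of locality a label permutation aims for. The sketch does not measure cache behavior and makes no claim about the speed-ups reported in the paper.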

The Tensor-Relational Algebra, and Other Ideas in Machine Learning System Design

33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-07-06 DOI: 10.1145/3468791.3472262

C. Jermaine

ACM Reference Format: Chris Jermaine. 2021. The Tensor-Relational Algebra, and Other Ideas in Machine Learning System Design. In 33rd International Conference on Scientific and Statistical Database Management, July 06–07, 2021, Tampa, FL, USA. ACM, New York, NY, USA, 1 page. https://doi.org/10.1145/3468791.3472262

Abstract: Systems for machine learning such as TensorFlow and PyTorch have greatly increased the complexity of the models that can be prototyped, tested, and moved into production, as well as reducing the time and effort required to do this. However, the systems have significant limitations. In these systems, a matrix multiplication (or a 2-D convolution, or any of the operations offered by the system) is a black-box operation that must actually be executed somewhere. As such, if there are multiple GPUs available to execute the multiplication, the system cannot "figure out" how to automatically distribute the multiplication over them. It has to run an available matrix multiply somewhere, on some hardware. If there is one GPU available but the inputs are too large to fit in the GPU RAM, the system cannot automatically decompose the operation to perform the computation in stages, moving parts of the matrices on and off of the GPU as needed, to stay within the available memory budget. In this talk, I will argue that relations make a compelling implementation abstraction for building ML systems. Modern ML computations often manipulate matrices and tensors. A tensor can be decomposed into a binary relation between (key, payload) pairs, where key identifies the sub-tensor stored in payload (payload could be a scalar value, but more likely, it is a multidimensional array). Such a simple binary relation allows many (or perhaps all) common ML computations to be expressed relationally. For example, consider two 2×10^4 by 2×10^4 matrices, decomposed into relations having 400 tuples each:

Citations: 1
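
The (key, payload) decomposition described in the abstract above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the system presented in the talk: a plain dictionary stands in for a relation, the helper names `to_relation`, `relational_matmul`, and `from_relation` are hypothetical, and the matrices are scaled down to 40×40 with 2×2 blocks so that each relation still has 400 tuples. Matrix multiplication then becomes a join on the shared block index followed by a grouped sum.

```python
from collections import defaultdict

import numpy as np


def to_relation(matrix, block):
    """Decompose a 2-D array into {(row_block, col_block): sub-matrix} tuples."""
    rows, cols = matrix.shape
    return {
        (i // block, j // block): matrix[i:i + block, j:j + block]
        for i in range(0, rows, block)
        for j in range(0, cols, block)
    }


def relational_matmul(rel_a, rel_b):
    """Join A and B on A.col_block == B.row_block, multiply the payloads,
    then group by (A.row_block, B.col_block) and sum the partial products."""
    # Index B by its row-block key so the join is a hash lookup.
    b_by_row = defaultdict(list)
    for (k, j), payload in rel_b.items():
        b_by_row[k].append((j, payload))
    out = {}
    for (i, k), a_payload in rel_a.items():
        for j, b_payload in b_by_row[k]:
            partial = a_payload @ b_payload
            out[(i, j)] = out[(i, j)] + partial if (i, j) in out else partial
    return out


def from_relation(rel):
    """Reassemble a dense matrix from its (key, payload) tuples."""
    n_rows = max(i for i, _ in rel) + 1
    n_cols = max(j for _, j in rel) + 1
    return np.block([[rel[(i, j)] for j in range(n_cols)] for i in range(n_rows)])


# Scaled-down stand-in for the abstract's example: 20 x 20 blocks per matrix.
rng = np.random.default_rng(0)
a = rng.standard_normal((40, 40))
b = rng.standard_normal((40, 40))
rel_a, rel_b = to_relation(a, 2), to_relation(b, 2)
result = from_relation(relational_matmul(rel_a, rel_b))
print(len(rel_a), np.allclose(result, a @ b))
```

Because each partial product only touches two payloads, the join can in principle be evaluated block by block or spread across workers, which is the kind of flexibility the abstract argues a black-box matrix multiply lacks. How the talk's system actually plans and executes such a computation is not shown here.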

Frequent Itemsets Mining with a Guaranteed Local Differential Privacy in Small Datasets

33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-07-06 DOI: 10.1145/3468791.3468807

Sharmin Afrose, T. Hashem, Mohammed Eunus Ali

Citations: 4

MoParkeR: Multi-objective Parking Recommendation

33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-06-10 DOI: 10.1145/3468791.3468810

M. Rahaman, Wei Shao, F. Salim, A. Turky, A. Song, Jeffrey Chan, Junliang Jiang, D. Bradbrook

Citations: 2

Sub-trajectory Similarity Join with Obfuscation

33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-06-07 DOI: 10.1145/3468791.3468822

Yanchuan Chang, Jianzhong Qi, E. Tanin, Xingjun Ma, H. Samet

Citations: 5

Automatic View Selection in Graph Databases

33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-05-19 DOI: 10.1145/3468791.3468794

Chao Zhang, Jiaheng Lu, Qingsong Guo, Xinyong Zhang, Xiaochun Han, Minqi Zhou

Citations: 0

SCHeMa: Scheduling Scientific Containers on a Cluster of Heterogeneous Machines

33rd International Conference on Scientific and Statistical Database Management Pub Date : 2021-03-24 DOI: 10.1145/3468791.3468813

Thanasis Vergoulis, Konstantinos Zagganas, Loukas Kavouras, M. Reczko, S. Sartzetakis, Theodore Dalamagas

Citations: 1