Proceedings of the 30th International Conference on Scientific and Statistical Database Management: Latest Papers, Page 2

Federated database system for scientific data

Proceedings of the 30th International Conference on Scientific and Statistical Database Management Pub Date : 2018-07-09 DOI: 10.1145/3221269.3222332

Sangchul Kim, Bongki Moon

Citations: 6

Crossing an OCEAN of queries: analyzing SQL query logs with OCEANLog

Proceedings of the 30th International Conference on Scientific and Statistical Database Management Pub Date : 2018-07-09 DOI: 10.1145/3221269.3223025

A. Wahl, Gregor Endler, Peter K. Schwab, Sebastian Herbst, Julian Rith, R. Lenz

Abstract: SQL queries encapsulate the knowledge of their authors about the usage of the queried data sources. This knowledge also contains aspects that cannot be inferred by analyzing the contents of the queried data sources alone. Due to the complexity of analytical SQL queries, specialized mechanisms are necessary to enable the user-friendly formulation of meta-queries over an existing query log. Existing approaches do not sufficiently consider syntactic and semantic aspects of queries along with contextual information. During our demonstration, conference participants learn how to use the latest release of OCEANLog, a framework for analyzing SQL query logs. Our demonstration encompasses several scenarios. Participants can explore an existing query log using domain-specific graph traversal expressions, set up continuous subscriptions for changes in the graph, create time-based visualizations of query results, configure an OCEANLog instance, and learn how to decide which specific graph database to use. We also provide them with access to the native meta-query mechanisms of a DBMS to further emphasize the benefits of our graph-based approach.

Citations: 6

Publishing spatial histograms under differential privacy

Proceedings of the 30th International Conference on Scientific and Statistical Database Management Pub Date : 2018-07-09 DOI: 10.1145/3221269.3223039

S. Ghane, L. Kulik, K. Ramamohanarao

Abstract: Studying trajectories of individuals has received growing interest. The aggregated movement behaviour of people provides important insights about their habits, interests, and lifestyles. Understanding and utilizing trajectory data is a crucial part of many applications such as location-based services, urban planning, and traffic monitoring systems. Spatial histograms and spatial range queries are key components in such applications to efficiently store and answer queries on trajectory data. A spatial histogram maintains the sequentiality of location points in a trajectory by a strong sequential dependency among histogram cells. This dependency is an essential property in answering spatial range queries. However, the trajectories of individuals are unique and even aggregating them in spatial histograms cannot completely ensure an individual's privacy. A key technique to ensure privacy in data publishing is ϵ-differential privacy, as it provides a strong guarantee on an individual's provided data. Our work is the first that guarantees ϵ-differential privacy for spatial histograms on trajectories, while ensuring the sequentiality of trajectory data, i.e., its consistency. Consistency is key for any database and our proposed mechanism, PriSH, synthesizes a spatial histogram and ensures the consistency of the published histogram with respect to the strong dependency constraint. In extensive experiments on real and synthetic datasets, we show that (1) PriSH is highly scalable with the dataset size and granularity of the space decomposition, (2) the distribution of aggregate trajectory information in the synthesized histogram accurately preserves the distribution of the original histogram, and (3) the output has high accuracy in answering arbitrary spatial range queries.
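
As a rough illustration of the ϵ-differential privacy guarantee named in the abstract above, the sketch below adds Laplace noise to a grid histogram of points and answers a range query on the noisy counts. This is the standard Laplace mechanism, not PriSH. It assumes that each individual contributes exactly one point in the unit square (so the L1 sensitivity of the histogram is 1), and it makes no attempt to enforce the sequential consistency constraint the abstract describes. All function names are hypothetical.

```python
import random


def grid_histogram(points, cells_per_side):
    """Count points of the unit square on a cells_per_side x cells_per_side grid."""
    counts = [[0] * cells_per_side for _ in range(cells_per_side)]
    for x, y in points:
        # Clamp so that a coordinate of exactly 1.0 falls into the last cell.
        col = min(int(x * cells_per_side), cells_per_side - 1)
        row = min(int(y * cells_per_side), cells_per_side - 1)
        counts[row][col] += 1
    return counts


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(counts, epsilon, rng, sensitivity=1.0):
    """Laplace mechanism: add independent noise of scale sensitivity/epsilon per cell.

    Assumption: one point per individual, hence sensitivity 1 by default.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    return [[c + laplace_noise(scale, rng) for c in row] for row in counts]


def range_count(hist, row_lo, row_hi, col_lo, col_hi):
    """Sum the cells of an inclusive rectangular range."""
    return sum(
        hist[r][c]
        for r in range(row_lo, row_hi + 1)
        for c in range(col_lo, col_hi + 1)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    points = [(0.1, 0.1), (0.9, 0.9), (1.0, 1.0)]
    counts = grid_histogram(points, cells_per_side=2)
    private = noisy_histogram(counts, epsilon=1.0, rng=rng)
    print(counts)                             # exact counts: [[1, 0], [0, 2]]
    print(range_count(private, 0, 1, 0, 1))   # noisy estimate of the total
```

A smaller epsilon gives a larger noise scale and therefore a stronger privacy guarantee at the cost of less accurate range counts.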

Citations: 15

A unified framework of density-based clustering for semi-supervised classification

Proceedings of the 30th International Conference on Scientific and Statistical Database Management Pub Date : 2018-07-09 DOI: 10.1145/3221269.3223037

J. C. Gertrudes, A. Zimek, J. Sander, R. Campello

Citations: 9

Metadata-driven error detection

Proceedings of the 30th International Conference on Scientific and Statistical Database Management Pub Date : 2018-07-09 DOI: 10.1145/3221269.3223028

L. Visengeriyeva, Ziawasch Abedjan

Abstract: Scientific data often originates from multiple sources and human agents. The integration of data from different sources must also resolve data quality problems that might occur because of inconsistency or different quality assurance levels of the sources. To identify various data quality problems in a dataset, it is necessary to use several error detection methods. Existing error detection solutions are usually tailored towards one specific type of data error, such as rule violations or outliers, requiring the application of multiple strategies. Using all possible error detection methods is also not satisfactory, as some systems might perform poorly on a particular dataset by producing a large number of false positives and missing some results. However, it is not trivial to assess the effectiveness of each strategy upfront. We propose two new holistic approaches for effectively combining off-the-shelf error detection systems. Our approaches are learning-based and incorporate metadata extracted from the dataset at hand. We empirically show, using four real-world datasets, that our method of combining error-detecting strategies achieves an average F1 score 15% higher than multiple heuristics-based baselines.
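
To make the F1 comparison in the abstract above concrete, the sketch below computes F1 for a simple heuristic combination of detector outputs: a cell is flagged when at least k detectors flag it. This voting rule is an assumption chosen for illustration; the abstract does not name its baselines, and the sketch is not the authors' learning-based method. The cell identifiers and detector outputs are made-up toy data.

```python
from collections import Counter


def f1_score(predicted, actual):
    """F1 = harmonic mean of precision and recall over sets of flagged cells."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


def min_k_vote(detections, k):
    """Flag a cell if at least k of the detectors flagged it (hypothetical heuristic)."""
    votes = Counter(cell for flagged in detections for cell in set(flagged))
    return {cell for cell, count in votes.items() if count >= k}


if __name__ == "__main__":
    # Toy data: each cell is a (row, column) pair; three detectors, one ground truth.
    detections = [
        {(0, "age"), (1, "zip")},
        {(0, "age"), (2, "name")},
        {(0, "age"), (1, "zip"), (3, "age")},
    ]
    truth = {(0, "age"), (1, "zip")}
    for k in (1, 2, 3):
        print(k, round(f1_score(min_k_vote(detections, k), truth), 3))
```

On this toy data, k=1 (the union of all detectors) over-flags and k=3 under-flags, while k=2 recovers the ground truth exactly, which illustrates why the choice of combination strategy matters.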

Citations: 25

Visual querying of large multilayer graphs

Proceedings of the 30th International Conference on Scientific and Statistical Database Management Pub Date : 2018-07-09 DOI: 10.1145/3221269.3223027

Erick Cuenca, A. Sallaberry, D. Ienco, P. Poncelet

Citations: 6

PathGraph

Proceedings of the 30th International Conference on Scientific and Statistical Database Management Pub Date : 2018-07-09 DOI: 10.1145/3221269.3222331

Dario Colazzo, Vincenzo Mecca, Maurizio Nolé, C. Sartiani

Citations: 3

Efficient anti-community detection in complex networks

Proceedings of the 30th International Conference on Scientific and Statistical Database Management Pub Date : 2018-07-09 DOI: 10.1145/3221269.3221289

Sebastian Lackner, Andreas Spitz, M. Weidemüller, Michael Gertz

Abstract: Modeling the relations between the components of complex systems as networks of vertices and edges is a commonly used method in many scientific disciplines that serves to obtain a deeper understanding of the systems themselves. In particular, the detection of densely connected communities in these networks is frequently used to identify functionally related components, such as social circles in networks of personal relations or interactions between agents in biological networks. Traditionally, communities are considered to have a high density of internal connections, combined with a low density of external edges between different communities. However, not all naturally occurring communities in complex networks are characterized by this notion of structural equivalence, such as groups of energy states with shared quantum numbers in networks of spectral line transitions. In this paper, we focus on this inverse task of detecting anti-communities that are characterized by an exceptionally low density of internal connections and a high density of external connections. While anti-communities have been discussed in the literature for anecdotal applications or as a modification of traditional community detection, no rigorous investigation of algorithms for the problem has been presented. To this end, we introduce and discuss a broad range of possible approaches and evaluate them with regard to efficiency and effectiveness on a range of real-world and synthetic networks. Furthermore, we show that the presence of community and anti-community structures is not mutually exclusive, and that even networks with a strong traditional community structure may also contain anti-communities.
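
The contrast between communities and anti-communities described in the abstract above can be made concrete by measuring the internal and external edge density of a vertex group. The sketch below is one plausible way to compute these two quantities for an undirected simple graph. The normalization (internal edges over possible internal pairs, boundary edges over possible group-to-outside pairs) is an assumption, and the function is an illustrative measure, not one of the detection algorithms evaluated by the authors.

```python
def group_densities(vertices, edges, group):
    """Return (internal_density, external_density) of `group`.

    Assumes an undirected simple graph; duplicate edges and self-loops are ignored.
    """
    vertices = set(vertices)
    group = set(group) & vertices
    edge_set = {frozenset(edge) for edge in edges if edge[0] != edge[1]}

    # Internal edges have both endpoints in the group; external edges exactly one.
    internal = sum(1 for edge in edge_set if edge <= group)
    external = sum(1 for edge in edge_set if len(edge & group) == 1)

    k, n = len(group), len(vertices)
    internal_pairs = k * (k - 1) // 2
    external_pairs = k * (n - k)
    internal_density = internal / internal_pairs if internal_pairs else 0.0
    external_density = external / external_pairs if external_pairs else 0.0
    return internal_density, external_density


if __name__ == "__main__":
    # Complete bipartite graph K_{2,3}: each side is a perfect anti-community.
    left, right = {0, 1}, {2, 3, 4}
    bipartite_edges = [(u, v) for u in left for v in right]
    print(group_densities(range(5), bipartite_edges, left))   # (0.0, 1.0)

    # Triangle {0, 1, 2} with a pendant vertex 3: a traditional community.
    triangle_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
    print(group_densities(range(4), triangle_edges, {0, 1, 2}))
```

In the bipartite example the group has no internal edges and every possible external edge, the extreme anti-community case; in the triangle example the internal density is 1.0 and the external density is one third.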

Citations: 1

ERMrest: a web service for collaborative data management

Proceedings of the 30th International Conference on Scientific and Statistical Database Management Pub Date : 2018-07-09 DOI: 10.1145/3221269.3222333

K. Czajkowski, C. Kesselman, R. Schuler, H. Tangmunarunkit

Citations: 8

Proceedings of the 30th International Conference on Scientific and Statistical Database Management

Proceedings of the 30th International Conference on Scientific and Statistical Database Management Pub Date : 2018-07-09 DOI: 10.1145/3221269

Citations: 1