Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data最新文献_第2页

Aggregate suppression for enterprise search engines 企业搜索引擎聚合抑制

Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data Pub Date : 2012-05-20 DOI: 10.1145/2213836.2213890

Mingyang Zhang, Nan Zhang, Gautam Das

{"title":"Aggregate suppression for enterprise search engines","authors":"Mingyang Zhang, Nan Zhang, Gautam Das","doi":"10.1145/2213836.2213890","DOIUrl":"https://doi.org/10.1145/2213836.2213890","url":null,"abstract":"Many enterprise websites provide search engines to facilitate customer access to their underlying documents or data. With the web interface of such a search engine, a customer can specify one or a few keywords that he/she is interested in; and the search engine returns a list of documents/tuples matching the user-specified keywords, sorted by an often-proprietary scoring function. It was traditionally believed that, because of its highly-restrictive interface (i.e., keyword search only, no SQL-style queries), such a search engine serves its purpose of answering individual keyword-search queries without disclosing big-picture aggregates over the data which, as we shall show in the paper, may incur significant privacy concerns to the enterprise. Nonetheless, recent work on sampling and aggregate estimation over a search engine's corpus through its keyword-search interface transcends this traditional belief. In this paper, we consider a novel problem of suppressing sensitive aggregates for enterprise search engines while maintaining the quality of answers provided to individual keyword-search queries. We demonstrate the effectiveness and efficiency of our novel techniques through theoretical analysis and extensive experimental studies.","PeriodicalId":212616,"journal":{"name":"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data","volume":"133 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133066423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 4

Differential privacy in data publication and analysis 数据发布和分析中的差异隐私

Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data Pub Date : 2012-05-20 DOI: 10.1145/2213836.2213910

Y. Yang, Zhenjie Zhang, G. Miklau, M. Winslett, Xiaokui Xiao

{"title":"Differential privacy in data publication and analysis","authors":"Y. Yang, Zhenjie Zhang, G. Miklau, M. Winslett, Xiaokui Xiao","doi":"10.1145/2213836.2213910","DOIUrl":"https://doi.org/10.1145/2213836.2213910","url":null,"abstract":"Data privacy has been an important research topic in the security, theory and database communities in the last few decades. However, many existing studies have restrictive assumptions regarding the adversary's prior knowledge, meaning that they preserve individuals' privacy only when the adversary has rather limited background information about the sensitive data, or only uses certain kinds of attacks. Recently, differential privacy has emerged as a new paradigm for privacy protection with very conservative assumptions about the adversary's prior knowledge. Since its proposal, differential privacy had been gaining attention in many fields of computer science, and is considered among the most promising paradigms for privacy-preserving data publication and analysis. In this tutorial, we will motivate its introduction as a replacement for other paradigms, present the basics of the differential privacy model from a database perspective, describe the state of the art in differential privacy research, explain the limitations and shortcomings of differential privacy, and discuss open problems for future research.","PeriodicalId":212616,"journal":{"name":"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114647142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 98

Can we beat the prefix filtering?: an adaptive framework for similarity join and search 我们能打败前缀过滤吗?:一个自适应的相似性连接和搜索框架

Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data Pub Date : 2012-05-20 DOI: 10.1145/2213836.2213847

Jiannan Wang, Guoliang Li, Jianhua Feng

引用次数: 225

JustMyFriends: full SQL, full transactional amenities, and access privacy JustMyFriends:完整的SQL，完整的事务性设施和访问隐私

Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data Pub Date : 2012-05-20 DOI: 10.1145/2213836.2213918

Arthur Meacham, D. Shasha

{"title":"JustMyFriends: full SQL, full transactional amenities, and access privacy","authors":"Arthur Meacham, D. Shasha","doi":"10.1145/2213836.2213918","DOIUrl":"https://doi.org/10.1145/2213836.2213918","url":null,"abstract":"A major obstacle to using Cloud services for many enterprises is the fear that the data will be stolen. Bringing the Cloud in-house is an incomplete solution to the problem because that implies that data center personnel as well as myriad repair personnel must be trusted. An ideal security solution would be to share data among precisely the people who should see it (\"my friends\") and nobody else. Encryption might seem to be an easy answer. Each friend could download the data, update it perhaps, and return it to a shared untrusted repository. But such a solution permits no concurrency and therefore no real sharing. JustMyFriends ensures sharing among friends without revealing unencrypted data to anyone outside of a circle of trust. In fact, non-friends (such as system administrators) see only encrypted blobs being added to a persistent store. JustMyFriends allows data sharing and full transactions. It supports the use of all SQL including stored procedures, updates, and arbitrary queries. Additionally, it provides full access privacy, preventing the host from discovering patterns or correlations in the user's data access behavior. The demonstration will show how friends in an unnamed government agency can coordinate the management of a spy network in a transactional fashion. Demo visitors will be able to play the roles of station chiefs and/or of troublemakers. As station chiefs, they will write their own transactions and queries, logout, login. As troublemakers, visitors will be able to play the role of a curious observer, kill client processes, and in general try to disrupt the system.","PeriodicalId":212616,"journal":{"name":"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114266200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

Sindbad: a location-based social networking system Sindbad:一个基于位置的社交网络系统

Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data Pub Date : 2012-05-20 DOI: 10.1145/2213836.2213923

Mohamed Sarwat, Jie Bao, A. Eldawy, Justin J. Levandoski, A. Magdy, M. Mokbel

引用次数: 34

Prediction-based geometric monitoring over distributed data streams 基于预测的分布式数据流几何监控

Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data Pub Date : 2012-05-20 DOI: 10.1145/2213836.2213867

Nikos Giatrakos, Antonios Deligiannakis, M. Garofalakis, I. Sharfman, A. Schuster

{"title":"Prediction-based geometric monitoring over distributed data streams","authors":"Nikos Giatrakos, Antonios Deligiannakis, M. Garofalakis, I. Sharfman, A. Schuster","doi":"10.1145/2213836.2213867","DOIUrl":"https://doi.org/10.1145/2213836.2213867","url":null,"abstract":"Many modern streaming applications, such as online analysis of financial, network, sensor and other forms of data are inherently distributed in nature. An important query type that is the focal point in such application scenarios regards actuation queries, where proper action is dictated based on a trigger condition placed upon the current value that a monitored function receives. Recent work studies the problem of (non-linear) sophisticated function tracking in a distributed manner. The main concept behind the geometric monitoring approach proposed there, is for each distributed site to perform the function monitoring over an appropriate subset of the input domain. In the current work, we examine whether the distributed monitoring mechanism can become more efficient, in terms of the number of communicated messages, by extending the geometric monitoring framework to utilize prediction models. We initially describe a number of local estimators (predictors) that are useful for the applications that we consider and which have already been shown particularly useful in past work. We then demonstrate the feasibility of incorporating predictors in the geometric monitoring framework and show that prediction-based geometric monitoring in fact generalizes the original geometric monitoring framework. We propose a large variety of different prediction-based monitoring models for the distributed threshold monitoring of complex functions. Our extensive experimentation with a variety of real data sets, functions and parameter settings indicates that our approaches can provide significant communication savings ranging between two times and up to three orders of magnitude, compared to the transmission cost of the original monitoring framework.","PeriodicalId":212616,"journal":{"name":"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126974452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 64

AstroShelf: understanding the universe through scalable navigation of a galaxy of annotations AstroShelf:通过可扩展的星系注释导航来了解宇宙

Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data Pub Date : 2012-05-20 DOI: 10.1145/2213836.2213940

P. Neophytou, Roxana Gheorghiu, Rebecca Hachey, T. Luciani, Di Bao, Alexandros Labrinidis, G. Marai, Panos K. Chrysanthis

{"title":"AstroShelf: understanding the universe through scalable navigation of a galaxy of annotations","authors":"P. Neophytou, Roxana Gheorghiu, Rebecca Hachey, T. Luciani, Di Bao, Alexandros Labrinidis, G. Marai, Panos K. Chrysanthis","doi":"10.1145/2213836.2213940","DOIUrl":"https://doi.org/10.1145/2213836.2213940","url":null,"abstract":"This demo presents AstroShelf, our on-going effort to enable astrophysicists to collaboratively investigate celestial objects using data originating from multiple sky surveys, hosted at different sites. The AstroShelf platform combines database and data stream, workflow and visualization technologies to provide a means for querying and displaying telescope images (in a Google Sky manner), visualizations of spectrum data, and for managing annotations. In addition to the user interface, AstroShelf supports a programmatic interface (available as a web service), which allows astrophysicists to incorporate functionality from AstroShelf in their own programs. A key feature is Live Annotations which is the detection and delivery of events or annotations to users in real-time, based on their profiles. We demonstrate the capabilities of AstroShelf through real end-user exploration scenarios (with participation from \"stargazers\" in the audience), in the presence of simulated annotation workloads executed through web services.","PeriodicalId":212616,"journal":{"name":"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133613106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 21

Declarative error management for robust data-intensive applications 用于健壮的数据密集型应用程序的声明式错误管理

Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data Pub Date : 2012-05-20 DOI: 10.1145/2213836.2213860

C. Kanne, V. Ercegovac

{"title":"Declarative error management for robust data-intensive applications","authors":"C. Kanne, V. Ercegovac","doi":"10.1145/2213836.2213860","DOIUrl":"https://doi.org/10.1145/2213836.2213860","url":null,"abstract":"We present an approach to declaratively manage run-time errors in data-intensive applications. When large volumes of raw data meet complex third-party libraries, deterministic run-time errors become likely, and existing query processors typically stop without returning a result when a run-time error occurs. The ability to degrade gracefully in the presence of run-time errors, and partially execute jobs, is typically limited to specific operators such as bulkloading. We generalize this concept to all operators of a query processing system, introducing a novel data type \"partial result with errors\" and corresponding operators. We show how to extend existing error-unaware operators to support this type, and as an added benefit, eliminate side-effect based error reporting. We use declarative specifications of acceptable results to control the semantics of error-aware operators. We have incorporated our approach into a declarative query processing system, which compiles the language constructs into instrumented execution plans for clusters of machines. We experimentally validate that the instrumentation overhead is below 20% in microbenchmarks, and not detectable when running I/O-intensive workloads.","PeriodicalId":212616,"journal":{"name":"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data","volume":"213 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132339465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

Query preserving graph compression 保持查询的图压缩

Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data Pub Date : 2012-05-20 DOI: 10.1145/2213836.2213855

W. Fan, Jianzhong Li, Xin Wang, Yinghui Wu

{"title":"Query preserving graph compression","authors":"W. Fan, Jianzhong Li, Xin Wang, Yinghui Wu","doi":"10.1145/2213836.2213855","DOIUrl":"https://doi.org/10.1145/2213836.2213855","url":null,"abstract":"It is common to find graphs with millions of nodes and billions of edges in, e.g., social networks. Queries on such graphs are often prohibitively expensive. These motivate us to propose query preserving graph compression, to compress graphs relative to a class Λ of queries of users' choice. We compute a small Gr from a graph G such that (a) for any query Q Ε Λ Q, Q(G) = Q'(Gr), where Q' Ε Λ can be efficiently computed from Q; and (b) any algorithm for computing Q(G) can be directly applied to evaluating Q' on Gr as is. That is, while we cannot lower the complexity of evaluating graph queries, we reduce data graphs while preserving the answers to all the queries in Λ. To verify the effectiveness of this approach, (1) we develop compression strategies for two classes of queries: reachability and graph pattern queries via (bounded) simulation. We show that graphs can be efficiently compressed via a reachability equivalence relation and graph bisimulation, respectively, while reserving query answers. (2) We provide techniques for aintaining compressed graph Gr in response to changes ΔG to the original graph G. We show that the incremental maintenance problems are unbounded for the two lasses of queries, i.e., their costs are not a function of the size of ΔG and changes in Gr. Nevertheless, we develop incremental algorithms that depend only on ΔG and Gr, independent of G, i.e., we do not have to decompress Gr to propagate the changes. (3) Using real-life data, we experimentally verify that our compression techniques could reduce graphs in average by 95% for reachability and 57% for graph pattern matching, and that our incremental maintenance algorithms are efficient.","PeriodicalId":212616,"journal":{"name":"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114518614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 201

SCARAB: scaling reachability computation on large graphs SCARAB:在大图形上的缩放可达性计算

Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data Pub Date : 2012-05-20 DOI: 10.1145/2213836.2213856

R. Jin, Ning Ruan, S. Dey, J. Yu

引用次数: 88