Proceedings of the 2009 ACM SIGMOD International Conference on Management of data: latest articles

Session details: Research session 1: security I
G. Miklau
{"title":"Session details: Research session 1: security I","authors":"G. Miklau","doi":"10.1145/3257449","DOIUrl":"https://doi.org/10.1145/3257449","url":null,"abstract":"","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":" 12","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133020547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Session details: Research session 3: information extraction
Mirek Riedewald
{"title":"Session details: Research session 3: information extraction","authors":"Mirek Riedewald","doi":"10.1145/3257451","DOIUrl":"https://doi.org/10.1145/3257451","url":null,"abstract":"","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134094062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Cost based plan selection for xpath
H. Georgiadis, M. Charalambides, V. Vassalos
{"title":"Cost based plan selection for xpath","authors":"H. Georgiadis, M. Charalambides, V. Vassalos","doi":"10.1145/1559845.1559909","DOIUrl":"https://doi.org/10.1145/1559845.1559909","url":null,"abstract":"We present a complete XPath cost-based optimization and execution framework and demonstrate its effectiveness and efficiency for a variety of queries and datasets. The framework is based on a logical XPath algebra with novel features and operators and a comprehensive set of rewriting rules that together enable us to algebraically capture many existing and novel processing strategies for XPath queries. An important part of the framework is PSA, a very efficient cost-based plan selection algorithm for XPath queries. In the presented experimental evaluation, PSA picked the cheapest estimated query plan in 100% of the cases. Our cost-based query optimizer is independent of the underlying physical data model and storage system and of the available logical operator implementations, depending on a set of well-defined APIs. We also present an implementation of those APIs, including primitive access methods, a large pool of physical operators, statistics estimators and cost models, and experimentally demonstrate the effectiveness of our end-to-end query optimization system.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132315809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
Monitoring path nearest neighbor in road networks
Zaiben Chen, Heng Tao Shen, Xiaofang Zhou, J. Yu
{"title":"Monitoring path nearest neighbor in road networks","authors":"Zaiben Chen, Heng Tao Shen, Xiaofang Zhou, J. Yu","doi":"10.1145/1559845.1559907","DOIUrl":"https://doi.org/10.1145/1559845.1559907","url":null,"abstract":"This paper addresses the problem of monitoring the k nearest neighbors to a dynamically changing path in road networks. Given a destination where a user is going to, this new query returns the k-NN with respect to the shortest path connecting the destination and the user's current location, and thus provides a list of nearest candidates for reference by considering the whole coming journey. We name this query the k-Path Nearest Neighbor query (k-PNN). As the user is moving and may not always follow the shortest path, the query path keeps changing. The challenge of monitoring the k-PNN for an arbitrarily moving user is to dynamically determine the update locations and then refresh the k-PNN efficiently. We propose a three-phase Best-first Network Expansion (BNE) algorithm for monitoring the k-PNN and the corresponding shortest path. In the searching phase, the BNE finds the shortest path to the destination, during which a candidate set that guarantees to include the k-PNN is generated at the same time. Then in the verification phase, a heuristic algorithm runs for examining candidates' exact distances to the query path, and it achieves significant reduction in the number of visited nodes. The monitoring phase deals with computing update locations as well as refreshing the k-PNN in different user movements. Since determining the network distance is a costly process, an expansion tree and the candidate set are carefully maintained by the BNE algorithm, which can provide efficient update on the shortest path and the k-PNN results. 
Finally, we conduct extensive experiments on real road networks and show that our methods achieve satisfactory performance.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"40 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121010066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 111
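The k-PNN query described in the abstract above can be illustrated with a naive baseline. This is a minimal sketch, not the paper's BNE algorithm: it recomputes everything from scratch instead of maintaining an expansion tree or update locations. It assumes an undirected road network stored as an adjacency dict and objects located at network nodes; the names `dijkstra`, `shortest_path`, and `k_pnn` are hypothetical.

```python
import heapq

def dijkstra(graph, sources):
    """Multi-source Dijkstra over graph = {node: [(neighbor, weight), ...]}."""
    dist = {s: 0.0 for s in sources}
    parent = {s: None for s in sources}
    heap = [(0.0, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent

def shortest_path(graph, start, goal):
    """Return the node sequence of a shortest path, or [] if unreachable."""
    dist, parent = dijkstra(graph, [start])
    if goal not in dist:
        return []
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def k_pnn(graph, user, destination, objects, k):
    """Naive k-PNN: the k objects with the smallest network distance
    to the shortest path from the user to the destination."""
    path = shortest_path(graph, user, destination)
    # Assumption: objects sit on nodes, so the distance to the path is the
    # distance to the nearest path node (one multi-source Dijkstra run).
    dist, _ = dijkstra(graph, path)
    candidates = sorted((dist[o], o) for o in objects if o in dist)
    return path, candidates[:k]
```

Calling `k_pnn` again whenever the user's location changes gives correct but expensive monitoring; avoiding that repeated work is the problem the abstract says the BNE algorithm addresses.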
Large-scale uncertainty management systems: learning and exploiting your data
S. Babu, S. Guha, Kamesh Munagala
{"title":"Large-scale uncertainty management systems: learning and exploiting your data","authors":"S. Babu, S. Guha, Kamesh Munagala","doi":"10.1145/1559845.1559964","DOIUrl":"https://doi.org/10.1145/1559845.1559964","url":null,"abstract":"The database community has made rapid strides in capturing, representing, and querying uncertain data. Probabilistic databases capture the inherent uncertainty in derived tuples as probability estimates. Data acquisition and stream systems can produce succinct summaries of very large and time-varying datasets. This tutorial addresses the natural next step in harnessing uncertain data: How can we efficiently and quantifiably determine what, how, and how much to learn in order to make good decisions based on the imprecise information available. The material in this tutorial is drawn from a range of fields including database systems, control and information theory, operations research, convex optimization, and statistical learning. The focus of the tutorial is on the natural constraints that are imposed in a database context and the demands of imprecise information from an optimization point of view. 
We look both into the past as well as into the future; to discuss general tools and techniques that can serve as a guide to database researchers and practitioners, and to enumerate the challenges that lie ahead.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117252757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A decisions query language (DQL): high-level abstraction for mathematical programming over databases
A. Brodsky, Mayur M. Bhot, Manasa Chandrashekar, N. Egge, X. Wang
{"title":"A decisions query language (DQL): high-level abstraction for mathematical programming over databases","authors":"A. Brodsky, Mayur M. Bhot, Manasa Chandrashekar, N. Egge, X. Wang","doi":"10.1145/1559845.1559981","DOIUrl":"https://doi.org/10.1145/1559845.1559981","url":null,"abstract":"The demonstrated, high-level decisions query language DQL combines the decision optimization capability of mathematical programming and the data manipulation capability of traditional database query languages. DQL benefits application developers in two aspects. First, it avoids a conceptual impedance mismatch between mathematical programming and data access and makes decision optimization functionality readily accessible to database programmers with no prior experience in operations research. Second, a tight integration provides unique opportunities for more efficient evaluation as compared to a loosely coupled system. This demonstration uses an emergency response scenario to illustrate the power of the language and its implementation.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123215749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 30
Interactive anonymization of sensitive data
Xiaokui Xiao, Guozhang Wang, J. Gehrke
{"title":"Interactive anonymization of sensitive data","authors":"Xiaokui Xiao, Guozhang Wang, J. Gehrke","doi":"10.1145/1559845.1559979","DOIUrl":"https://doi.org/10.1145/1559845.1559979","url":null,"abstract":"There has been much recent work on algorithms for limiting disclosure in data publishing, however they have not been put to use in any toolkit for practitioners. We will demonstrate CAT, the Cornell Anonymization Toolkit, designed for interactive anonymization. CAT has an interface that is easy to use; it guides users through the process of preparing a dataset for publication while limiting disclosure through the identification of records that have high risk under various attacker models.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121727001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 42
A comparison of flexible schemas for software as a service
Stefan Aulbach, D. Jacobs, A. Kemper, Michael Seibold
{"title":"A comparison of flexible schemas for software as a service","authors":"Stefan Aulbach, D. Jacobs, A. Kemper, Michael Seibold","doi":"10.1145/1559845.1559941","DOIUrl":"https://doi.org/10.1145/1559845.1559941","url":null,"abstract":"A multi-tenant database system for Software as a Service (SaaS) should offer schemas that are flexible in that they can be extended for different versions of the application and dynamically modified while the system is on-line. This paper presents an experimental comparison of five techniques for implementing flexible schemas for SaaS. In three of these techniques, the database \"owns\" the schema in that its structure is explicitly defined in DDL. Included here is the commonly-used mapping where each tenant is given their own private tables, which we take as the baseline, and a mapping that employs Sparse Columns in Microsoft SQL Server. These techniques perform well, however they offer only limited support for schema evolution in the presence of existing data. Moreover they do not scale beyond a certain level. In the other two techniques, the application \"owns\" the schema in that it is mapped into generic structures in the database. Included here are XML in DB2 and Pivot Tables in HBase. These techniques give the application complete control over schema evolution, however they can produce a significant decrease in performance. 
We conclude that the ideal database for SaaS has not yet been developed and offer some suggestions as to how it should be designed.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129237081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 111
GAMPS: compressing multi sensor data by grouping and amplitude scaling
Sorabh Gandhi, Suman Nath, S. Suri, Jie Liu
{"title":"GAMPS: compressing multi sensor data by grouping and amplitude scaling","authors":"Sorabh Gandhi, Suman Nath, S. Suri, Jie Liu","doi":"10.1145/1559845.1559926","DOIUrl":"https://doi.org/10.1145/1559845.1559926","url":null,"abstract":"We consider the problem of collectively approximating a set of sensor signals using the least amount of space so that any individual signal can be efficiently reconstructed within a given maximum (L∞) error ε. The problem arises naturally in applications that need to collect large amounts of data from multiple concurrent sources, such as sensors, servers and network routers, and archive them over a long period of time for offline data mining. We present GAMPS, a general framework that addresses this problem by combining several novel techniques. First, it dynamically groups multiple signals together so that signals within each group are correlated and can be maximally compressed jointly. Second, it appropriately scales the amplitudes of different signals within a group and compresses them within the maximum allowed reconstruction error bound. Our schemes are polynomial time O(α, β) approximation schemes, meaning that the maximum (L∞) error is at most αε and it uses at most β times the optimal memory. Finally, GAMPS maintains an index so that various queries can be issued directly on compressed data. 
Our experiments on several real-world sensor datasets show that GAMPS significantly reduces space without compromising the quality of search and query.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129301510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 68
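The maximum (L∞) error bound ε mentioned in the abstract above can be illustrated for a single signal. The following is a minimal sketch of a standard greedy piecewise-constant approximation, not the GAMPS scheme itself: it does no grouping across signals and no amplitude scaling. The function names `compress_linf` and `decompress` are hypothetical.

```python
def compress_linf(signal, eps):
    """Greedy piecewise-constant approximation with L-infinity error <= eps.

    A segment keeps growing while its value range is at most 2 * eps, so the
    midpoint of the range is within eps of every sample in the segment.
    Returns a list of (start_index, length, value) tuples.
    """
    if not signal:
        return []
    segments = []
    start = 0
    lo = hi = signal[0]
    for i in range(1, len(signal)):
        new_lo, new_hi = min(lo, signal[i]), max(hi, signal[i])
        if new_hi - new_lo > 2 * eps:
            # Sample i does not fit: close the current segment at its midpoint.
            segments.append((start, i - start, (lo + hi) / 2))
            start, lo, hi = i, signal[i], signal[i]
        else:
            lo, hi = new_lo, new_hi
    segments.append((start, len(signal) - start, (lo + hi) / 2))
    return segments

def decompress(segments):
    """Rebuild the approximate signal from (start, length, value) segments."""
    out = []
    for _, length, value in segments:
        out.extend([value] * length)
    return out
```

Storing one value per segment instead of one per sample is where the space saving comes from; a larger ε allows longer segments at the cost of a coarser reconstruction.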
Query processing techniques for solid state drives
Dimitris Tsirogiannis, S. Harizopoulos, Mehul A. Shah, J. Wiener, G. Graefe
{"title":"Query processing techniques for solid state drives","authors":"Dimitris Tsirogiannis, S. Harizopoulos, Mehul A. Shah, J. Wiener, G. Graefe","doi":"10.1145/1559845.1559854","DOIUrl":"https://doi.org/10.1145/1559845.1559854","url":null,"abstract":"Solid state drives perform random reads more than 100x faster than traditional magnetic hard disks, while offering comparable sequential read and write bandwidth. Because of their potential to speed up applications, as well as their reduced power consumption, these new drives are expected to gradually replace hard disks as the primary permanent storage media in large data centers. However, although they may benefit applications that stress random reads immediately, they may not improve database applications, especially those running long data analysis queries. Database query processing engines have been designed around the speed mismatch between random and sequential I/O on hard disks and their algorithms currently emphasize sequential accesses for disk-resident data. In this paper, we investigate data structures and algorithms that leverage fast random reads to speed up selection, projection, and join operations in relational query processing. We first demonstrate how a column-based layout within each page reduces the amount of data read during selections and projections. We then introduce FlashJoin, a general pipelined join algorithm that minimizes accesses to base and intermediate relational data. FlashJoin's binary join kernel accesses only the join attributes, producing partial results in the form of a join index. Subsequently, its fetch kernel retrieves the attributes for later nodes in the query plan as they are needed. FlashJoin significantly reduces memory and I/O requirements for each join in the query. We implemented these techniques inside Postgres and experimented with an enterprise SSD drive. 
Our techniques improved query runtimes by up to 6x for queries ranging from simple relational scans and joins to full TPC-H queries.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130951450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 155
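The two-step structure the abstract above describes for FlashJoin (a join kernel that reads only the join attributes and emits a join index, then a fetch kernel that retrieves other attributes when needed) can be sketched in memory. This is an illustration of the general idea under simplifying assumptions, not the authors' Postgres implementation: tables are plain column dictionaries, row ids are list positions, and the names `join_kernel` and `fetch_kernel` are hypothetical.

```python
from collections import defaultdict

def join_kernel(left_keys, right_keys):
    """Hash join over the join columns only.

    Returns a join index: a list of (left_row_id, right_row_id) pairs.
    No other attribute of either table is touched here.
    """
    table = defaultdict(list)
    for rid, key in enumerate(left_keys):
        table[key].append(rid)
    return [(l, r)
            for r, key in enumerate(right_keys)
            for l in table.get(key, [])]

def fetch_kernel(join_index, left_columns, right_columns,
                 wanted_left, wanted_right):
    """Materialize only the requested attributes for each joined pair."""
    rows = []
    for l, r in join_index:
        row = {name: left_columns[name][l] for name in wanted_left}
        row.update({name: right_columns[name][r] for name in wanted_right})
        rows.append(row)
    return rows
```

Each lookup in `fetch_kernel` is a random access by row id, which is the access pattern the abstract says solid state drives serve much faster than magnetic disks.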