Proceedings of the 2006 ACM SIGMOD international conference on Management of data: latest articles, page 6

Windows and RSS: beyond blogging

Proceedings of the 2006 ACM SIGMOD international conference on Management of data Pub Date : 2006-06-27 DOI: 10.1145/1142473.1142563

Sean Lyndersay

Citations: 2

Using the oracle database as a declarative RSS hub

Proceedings of the 2006 ACM SIGMOD international conference on Management of data Pub Date : 2006-06-27 DOI: 10.1145/1142473.1142562

D. Gawlick, M. Krishnaprasad, Z. Liu

Abstract: The interaction with the Web has historically evolved from static bookmarks to dynamic searches to the current usage of active notification mechanisms based on popular protocols like RSS or Atom. At the same time, a large volume of important source data is still contained in relational databases. The talk will analyze the way the Oracle database participates in activating the data and opening up the state changes in a standard and secure way for easy integration with the rest of the push-based Web protocols. We will study the declarative specification of RSS feeds generated from the state changes detected in the data stored in the Oracle database. Conversely, external RSS feeds can be injected into the database and processed declaratively in conjunction with the rest of the data. Most of the technical pieces required for such a solution are already supported by the database engine (e.g. declarative XML processing, state change notifications, queues, crawlers, continuous queries), effectively turning the database into a declarative XML hub. The advantages of using database solutions for such problems in an enterprise context are security, scalability and reliability.

Citations: 1

GPUTeraSort: high performance graphics co-processor sorting for large database management

Proceedings of the 2006 ACM SIGMOD international conference on Management of data Pub Date : 2006-06-27 DOI: 10.1145/1142473.1142511

N. Govindaraju, J. Gray, Ritesh Kumar, Dinesh Manocha

Abstract: We present a novel external sorting algorithm using graphics processors (GPUs) on large databases composed of billions of records and wide keys. Our algorithm uses the data parallelism within a GPU along with task parallelism by scheduling some of the memory-intensive and compute-intensive threads on the GPU. Our new sorting architecture provides multiple memory interfaces on the same PC -- a fast and dedicated memory interface on the GPU along with the main memory interface for CPU computations. As a result, we achieve higher memory bandwidth as compared to CPU-based algorithms running on commodity PCs. Our approach takes into account the limited communication bandwidth between the CPU and the GPU, and reduces the data communication between the two processors. Our algorithm also improves the performance of disk transfers and achieves close to peak I/O performance. We have tested the performance of our algorithm on the SortBenchmark and applied it to large databases composed of a few hundred gigabytes of data. Our results on a 3 GHz Pentium IV PC with a $300 NVIDIA 7800 GT GPU indicate a significant performance improvement over optimized CPU-based algorithms on high-end PCs with 3.6 GHz Dual Xeon processors. Our implementation is able to outperform the current high-end PennySort benchmark and results in a higher performance-to-price ratio. Overall, our results indicate that using a GPU as a co-processor can significantly improve the performance of sorting algorithms on large databases.

Citations: 496

Constraint chaining: on energy-efficient continuous monitoring in sensor networks

Proceedings of the 2006 ACM SIGMOD international conference on Management of data Pub Date : 2006-06-27 DOI: 10.1145/1142473.1142492

Adam Silberstein, R. Braynard, Jun Yang

Abstract: Wireless sensor networks have created new opportunities for data collection in a variety of scenarios, such as environmental and industrial, where we expect data to be temporally and spatially correlated. Researchers may want to continuously collect all sensor data from the network for later analysis. Suppression, both temporal and spatial, provides opportunities for reducing the energy cost of sensor data collection. We demonstrate how both types can be combined for maximal benefit. We frame the problem as one of monitoring node and edge constraints. A monitored node triggers a report if its value changes. A monitored edge triggers a report if the difference between its nodes' values changes. The set of reports collected at the base station is used to derive all node values. We fully exploit the potential of this global inference in our algorithm, CONCH, short for constraint chaining. Constraint chaining builds a network of constraints that are maintained locally, but allow a global view of values to be maintained with minimal cost. Network failure complicates the use of suppression, since either causes an absence of reports. We add enhancements to CONCH to build in redundant constraints and provide a method to interpret the resulting reports in case of uncertainty. Using simulation we experimentally evaluate CONCH's effectiveness against competing schemes in a number of interesting scenarios.
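
The inference step described in this abstract (deriving all node values from node and edge reports) can be illustrated with a small sketch. It assumes one directly monitored node and a set of monitored edges that each report the difference between their endpoints. The function name `derive_values`, the dictionary layout, and the sample readings are hypothetical and are not taken from the paper; failure handling and redundant constraints are left out.

```python
from collections import deque


def derive_values(anchor, anchor_value, edge_diffs):
    """Recover node values from one known node value plus edge differences.

    edge_diffs maps (u, v) -> value[v] - value[u], i.e. the last difference
    reported for that monitored edge (hypothetical layout).
    """
    # Build an undirected adjacency list that remembers the signed difference.
    neighbors = {}
    for (u, v), diff in edge_diffs.items():
        neighbors.setdefault(u, []).append((v, diff))
        neighbors.setdefault(v, []).append((u, -diff))

    # Breadth-first walk outward from the monitored node, chaining differences.
    values = {anchor: anchor_value}
    queue = deque([anchor])
    while queue:
        u = queue.popleft()
        for v, diff in neighbors.get(u, []):
            if v not in values:
                values[v] = values[u] + diff
                queue.append(v)
    return values


# Hypothetical reports: node "a" reads 20, edges report pairwise differences.
reports = {("a", "b"): 2, ("b", "c"): -1, ("c", "d"): 0}
values = derive_values("a", 20, reports)
```

In this sketch an unchanged difference never needs to be re-sent, which is the intuition behind suppression: only the constraints that change produce traffic.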

Citations: 187

Paper-based mobile access to databases

Proceedings of the 2006 ACM SIGMOD international conference on Management of data Pub Date : 2006-06-27 DOI: 10.1145/1142473.1142581

B. Signer, M. Norrie, Michael Grossniklaus, R. Belotti, C. Decurtins, N. Weibel

Citations: 21

Modeling skew in data streams

Proceedings of the 2006 ACM SIGMOD international conference on Management of data Pub Date : 2006-06-27 DOI: 10.1145/1142473.1142495

Flip Korn, S. Muthukrishnan, Yihua Wu

Abstract: Data stream applications have made use of statistical summaries to reason about the data using nonparametric tools such as histograms, heavy hitters, and join sizes. However, relatively little attention has been paid to modeling stream data parametrically, despite the potential this approach has for mining the data. The challenges of model fitting at streaming speeds are both technical -- how to continually find fast and reliable parameter estimates on high speed streams of skewed data using small space -- and conceptual -- how to validate the goodness-of-fit and stability of the model online. In this paper, we show how to fit hierarchical (binomial multifractal) and non-hierarchical (Pareto) power-law models on a data stream. We address the technical challenges using an approach that maintains a sketch of the data stream and fits least-squares straight lines; it yields algorithms that are fast, space-efficient, and provide approximations of parameter value estimates with a priori quality guarantees relative to those obtained offline. We address the conceptual challenge by designing fast methods for online goodness-of-fit measurements on a data stream; we adapt the statistical testing technique of examining the quantile-quantile (q-q) plot to perform online model validation at streaming speeds. As a concrete application of our techniques, we focus on network traffic data, which has been shown to exhibit skewed distributions. We complement our analytic and algorithmic results with experiments on IP traffic streams in AT&T's Gigascope® data stream management system, to demonstrate the practicality of our methods at line speeds. We measured the stability and robustness of these models over weeks of operational packet data in an IP network. In addition, we study an intrusion detection application, and demonstrate the potential of online parametric modeling.
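
The idea of fitting a least-squares straight line to recover a power-law parameter can be sketched offline as follows. This is only an illustration under stated assumptions: it works on the full sample rather than on a stream sketch, it fits the empirical complementary CDF on log-log axes, and the helper names `fit_line` and `estimate_pareto_alpha` are hypothetical. It is not the paper's streaming algorithm and carries none of its quality guarantees.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_pareto_alpha(samples):
    """Estimate the Pareto tail exponent from the empirical CCDF.

    For a Pareto distribution, log P(X >= x) is linear in log x with
    slope -alpha, so the negated fitted slope is the estimate.
    """
    ordered = sorted(samples)
    n = len(ordered)
    log_x = [math.log(x) for x in ordered]
    # Fraction of samples greater than or equal to the i-th smallest one.
    log_ccdf = [math.log((n - i) / n) for i in range(n)]
    slope, _ = fit_line(log_x, log_ccdf)
    return -slope


# Synthetic data placed exactly on a Pareto CCDF with alpha = 2 (assumed values).
n = 100
samples = [((n - i) / n) ** (-1 / 2.0) for i in range(n)]
alpha = estimate_pareto_alpha(samples)
```

On real, noisy data the estimate would only approximate the true exponent, and a goodness-of-fit check such as the q-q plot mentioned in the abstract would be needed before trusting it.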

Citations: 28

Efficient query processing on unstructured tetrahedral meshes

Proceedings of the 2006 ACM SIGMOD international conference on Management of data Pub Date : 2006-06-27 DOI: 10.1145/1142473.1142535

Stratos Papadomanolakis, A. Ailamaki, Julio C. López, Tiankai Tu, D. O'Hallaron, G. Heber

Abstract: Modern scientific applications such as fluid dynamics and earthquake modeling heavily depend on massive volumes of data produced by computer simulations. Such applications require new data management capabilities in order to scale to terabyte-scale data volumes. The most common way to discretize the application domain is to decompose it into pyramids, forming an unstructured tetrahedral mesh. Modern simulations generate meshes of high resolution and precision, to be queried by a visualization or analysis tool. Tetrahedral meshes are extremely flexible and therefore vital to accurately model complex geometries, but are also difficult to index. To reduce query execution time, applications either use only subsets of the data or rely on different (less flexible) structures, thereby trading accuracy for speed. This paper presents efficient indexing techniques for common spatial (point and range) queries on tetrahedral meshes. Because the prevailing multidimensional indexing techniques attempt to approximate the tetrahedra using simpler shapes (primarily rectangles), the query performance deteriorates significantly as a function of the mesh's geometric complexity. We develop Directed Local Search (DLS), an efficient indexing algorithm based on mesh topology information that is practically insensitive to the geometric properties of meshes. We show how DLS can be easily and efficiently implemented within modern DBMS without requiring new exotic index structures and complex preprocessing. Finally, we present a new data layout approach for tetrahedral mesh datasets that provides better performance for scientific applications compared to the traditional space filling curves. In our PostgreSQL implementation DLS reduces the number of disk page accesses by 26% to 4x, and improves the overall query execution time by 25% to 4.

Citations: 47

Quark: an efficient XQuery full-text implementation

Proceedings of the 2006 ACM SIGMOD international conference on Management of data Pub Date : 2006-06-27 DOI: 10.1145/1142473.1142588

A. Bhaskar, C. Botev, M. Chettiar, Lin Guo, J. Shanmugasundaram, F. Shao, Fan Yang

Citations: 7

Ranking objects based on relationships

Proceedings of the 2006 ACM SIGMOD international conference on Management of data Pub Date : 2006-06-27 DOI: 10.1145/1142473.1142516

K. Chakrabarti, Venkatesh Ganti, Jiawei Han, Dong Xin

Abstract: In many document collections, documents are related to objects such as document authors, products described in the document, or persons referred to in the document. In many applications, the goal is to find these objects that best match a set of keywords. However, the keywords may not necessarily occur in the target objects; they occur only in the documents. For example, in a product review database, a user might search for names of products (say, laptops) using keywords like "lightweight" and "business use" that occur only in the reviews but not in the names of laptops. In order to answer these queries, we need to exploit relationships between documents containing the keywords and the target objects related to those documents. Current keyword query paradigms do not exploit these relationships effectively and hence are inefficient for these queries. In this paper, we consider a class of queries called the "object finder" queries. Our main intuition is to exploit the relationships between searchable documents and related objects and further "aggregate" the document scores from these relationships in order to find the best-ranking target objects. Building upon existing keyword search engines such as full text search, we design efficient algorithms that exploit the requirement of only the best k target objects to terminate early. The main challenge here is to push early termination through blocking operators such as group by and aggregation. Our experiments with real datasets and workloads demonstrate the effectiveness of our techniques. Although we present our techniques in the context of keyword search, our techniques apply to other types of ranked searches (e.g., multimedia search) as well.
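
The "object finder" idea in this abstract, aggregating document scores over document-to-object relationships and keeping the best k objects, can be sketched as a naive baseline. Everything here is an assumption for illustration: the function name `top_k_objects`, the use of a plain sum as the aggregate, and the sample review scores. The sketch scans every document and therefore does not implement the early-termination algorithms the paper contributes.

```python
import heapq
from collections import defaultdict


def top_k_objects(doc_scores, doc_to_objects, k):
    """Sum each document's keyword score into its related objects; return the k best."""
    totals = defaultdict(float)
    for doc, score in doc_scores.items():
        for obj in doc_to_objects.get(doc, []):
            totals[obj] += score
    # nlargest avoids fully sorting all candidates when only k are needed.
    return heapq.nlargest(k, totals.items(), key=lambda item: item[1])


# Hypothetical keyword-match scores for three reviews and the laptops they mention.
doc_scores = {"review1": 0.9, "review2": 0.4, "review3": 0.7}
doc_to_objects = {
    "review1": ["laptopA"],
    "review2": ["laptopB"],
    "review3": ["laptopA", "laptopB"],
}
best = top_k_objects(doc_scores, doc_to_objects, 1)
```

The grouping and summation step is exactly the kind of blocking operator the abstract mentions: in this baseline no object's total is known until every matching document has been read.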

Citations: 63

Declarative networking: language, execution and optimization

Proceedings of the 2006 ACM SIGMOD international conference on Management of data Pub Date : 2006-06-27 DOI: 10.1145/1142473.1142485

B. T. Loo, Tyson Condie, M. Garofalakis, David E. Gay, J. Hellerstein, Petros Maniatis, R. Ramakrishnan, Timothy Roscoe, I. Stoica

Abstract: The networking and distributed systems communities have recently explored a variety of new network architectures, both for application-level overlay networks, and as prototypes for a next-generation Internet architecture. In this context, we have investigated declarative networking: the use of a distributed recursive query engine as a powerful vehicle for accelerating innovation in network architectures [23, 24, 33]. Declarative networking represents a significant new application area for database research on recursive query processing. In this paper, we address fundamental database issues in this domain. First, we motivate and formally define the Network Datalog (NDlog) language for declarative network specifications. Second, we introduce and prove correct relaxed versions of the traditional semi-naïve query evaluation technique, to overcome fundamental problems of the traditional technique in an asynchronous distributed setting. Third, we consider the dynamics of network state, and formalize the "eventual consistency" of our programs even when bursts of updates can arrive in the midst of query execution. Fourth, we present a number of query optimization opportunities that arise in the declarative networking context, including applications of traditional techniques as well as new optimizations. Last, we present evaluation results of the above ideas implemented in our P2 declarative networking system, running on 100 machines over the Emulab network testbed.
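
For readers unfamiliar with the traditional semi-naïve evaluation technique that this abstract builds on, the following sketch shows it on a single machine for a two-rule reachability program. The rules are written in generic Datalog rather than NDlog (no location specifiers), and the function name `reachable_pairs` and the sample links are hypothetical. The relaxed, distributed variants introduced in the paper are not shown.

```python
def reachable_pairs(links):
    """Semi-naive evaluation of a recursive reachability program.

    Rules (generic Datalog, assumed for illustration):
        path(X, Y) :- link(X, Y).
        path(X, Z) :- link(X, Y), path(Y, Z).
    """
    links = set(links)
    paths = set(links)
    delta = set(links)  # facts first derived in the previous round
    while delta:
        # Join the base relation only with the newly derived facts,
        # instead of re-joining with every known path each round.
        derived = {(x, z) for (x, y) in links for (y2, z) in delta if y == y2}
        delta = derived - paths  # keep only genuinely new facts
        paths |= delta
    return paths


# Hypothetical three-hop chain of links.
links = [("a", "b"), ("b", "c"), ("c", "d")]
paths = reachable_pairs(links)
```

The loop stops when a round derives nothing new, which is the fixpoint condition; the synchronous rounds assumed here are what becomes difficult in an asynchronous distributed setting.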

Citations: 312