{"title":"SCADDAR: an efficient randomized technique to reorganize continuous media blocks","authors":"Ashish Goel, C. Shahabi, S. Yao, Roger Zimmermann","doi":"10.1109/ICDE.2002.994760","DOIUrl":"https://doi.org/10.1109/ICDE.2002.994760","url":null,"abstract":"Scalable storage architectures allow for the addition of disks to increase storage capacity and/or bandwidth. In its general form, disk scaling also refers to disk removals when either capacity needs to be conserved or old disk drives are retired. Assuming random placement of blocks on multiple nodes of a continuous media server, our optimization objective is to redistribute a minimum number of media blocks after disk scaling. This objective should be met under two restrictions. First, uniform distribution and hence a balanced load should be ensured after redistribution. Second, the redistributed blocks should be retrieved at the normal mode of operation in one disk access and through low complexity computation. We propose a technique that meets the objective, while we prove that it also satisfies both restrictions. The SCADDAR approach is based on using a series of REMAP functions which can derive the location of a new block using only its original location as a basis.","PeriodicalId":191529,"journal":{"name":"Proceedings 18th International Conference on Data Engineering","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116167608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Processing reporting function views in a data warehouse environment","authors":"Wolfgang Lehner, W. Hümmer, L. Schlesinger","doi":"10.1109/ICDE.2002.994707","DOIUrl":"https://doi.org/10.1109/ICDE.2002.994707","url":null,"abstract":"Reporting functions reflect a novel technique to formulate sequence-oriented queries in SQL. They extend the classical way of grouping and applying aggregation functions by additionally providing a column-based ordering, partitioning, and windowing mechanism. The application area of reporting functions ranges from simple ranking queries (TOP(n) analyses) through cumulative queries (Year-To-Date analyses) to sliding window queries. We discuss the problem of deriving reporting function queries from materialized reporting function views, which is one of the most important issues in efficiently processing queries in a data warehouse environment. Two different derivation algorithms, including their relational mappings, are introduced and compared in a test scenario.","PeriodicalId":191529,"journal":{"name":"Proceedings 18th International Conference on Data Engineering","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131369452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring aggregate effect with weighted transcoding graphs for efficient cache replacement in transcoding proxies","authors":"Cheng-Yue Chang, Ming-Syan Chen","doi":"10.1109/ICDE.2002.994752","DOIUrl":"https://doi.org/10.1109/ICDE.2002.994752","url":null,"abstract":"This paper explores the aggregate effect when caching multiple versions of the same Web object in the transcoding proxy. Explicitly, the aggregate profit from caching multiple versions of an object is not simply the sum of the profits from caching individual versions, but rather, depends on the transcoding relationships among them. Hence, to evaluate the profit from caching each version of an object efficiently, we devise the notion of a weighted transcoding graph and formulate a generalized profit function which explicitly considers the aggregate effect and several new emerging factors in the transcoding proxy. Based on the weighted transcoding graph and the generalized profit function, an innovative cache replacement algorithm for transcoding proxies is proposed in this paper. Experimental results show that the algorithm proposed consistently outperforms companion schemes in terms of the delay saving ratios and cache hit ratios.","PeriodicalId":191529,"journal":{"name":"Proceedings 18th International Conference on Data Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130164063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lossy reduction for very high dimensional data","authors":"C. Jermaine, E. Omiecinski","doi":"10.1109/ICDE.2002.994783","DOIUrl":"https://doi.org/10.1109/ICDE.2002.994783","url":null,"abstract":"We consider the use of data reduction techniques for the problem of approximate query answering. We focus on applications for which accurate answers to selective queries are required, and for which the data are very high dimensional (having hundreds of attributes). We present a new data reduction method for this type of application, called the RS kernel. We demonstrate the effectiveness of this method for answering difficult, highly selective queries over high dimensional data using several real datasets.","PeriodicalId":191529,"journal":{"name":"Proceedings 18th International Conference on Data Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128729706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Condensed cube: an effective approach to reducing data cube size","authors":"Wei Wang, Hongjun Lu, Jianlin Feng, J. Yu","doi":"10.1109/ICDE.2002.994705","DOIUrl":"https://doi.org/10.1109/ICDE.2002.994705","url":null,"abstract":"A pre-computed data cube facilitates OLAP (on-line analytical processing). It is well-known that data cube computation is an expensive operation. While most algorithms have been devoted to optimizing memory management and reducing computation costs, less work has addressed a fundamental issue: the size of a data cube is huge when a large base relation with a large number of attributes is involved. In this paper, we propose a new concept, called a condensed data cube. The condensed cube is of much smaller size than a complete non-condensed cube. More importantly, it is a fully pre-computed cube without compression, and, hence, it requires neither decompression nor further aggregation when answering queries. Several algorithms for computing a condensed cube are proposed. Results of experiments on the effectiveness of the condensed data cube are presented, using both synthetic and real-world data. The results indicate that the proposed condensed cube reduces both the cube size and its computation time.","PeriodicalId":191529,"journal":{"name":"Proceedings 18th International Conference on Data Engineering","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133659751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Demonstration: active asynchronous transaction management in high-autonomy federated environment using data agents: Global Change Master Directory v8.0","authors":"O. Bukhres, Srinivasan Sikkupparbathyam, K. Nagendra, Zina Ben-Miled, M. Areal, L. Olsen, Chris Gokey, David Kendig, Rosy Cordova, G. Major, J. Savage","doi":"10.1109/ICDE.2002.994744","DOIUrl":"https://doi.org/10.1109/ICDE.2002.994744","url":null,"abstract":"The Global Change Master Directory (GCMD) is an Earth science information repository that specifically tracks research data on global climatic change. Building a directory of Earth science metadata that allows the exchange of metadata content among partner organizations is challenging due to the complex issues involved in supporting heterogeneous metadata schema, database schema, database implementation and platforms. This demonstration presents the design of the MD8 (Master Directory v8.0), which allows automated exchange of metadata content among Earth science collaborators through a proposed asynchronous distributed transaction protocol. Specifically, the demonstration focuses on the local data agent that captures local database updates and broadcasts them to other cooperating nodes asynchronously using an announcer.","PeriodicalId":191529,"journal":{"name":"Proceedings 18th International Conference on Data Engineering","volume":"273 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115892962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Structural joins: a primitive for efficient XML query pattern matching","authors":"S. Al-Khalifa, H. Jagadish, Nick Koudas, J. Patel, D. Srivastava, Yuqing Wu","doi":"10.1109/ICDE.2002.994704","DOIUrl":"https://doi.org/10.1109/ICDE.2002.994704","url":null,"abstract":"XML queries typically specify patterns of selection predicates on multiple elements that have some specified tree structured relationships. The primitive tree structured relationships are parent-child and ancestor-descendant, and finding all occurrences of these relationships in an XML database is a core operation for XML query processing. We develop two families of structural join algorithms for this task: tree-merge and stack-tree. The tree-merge algorithms are a natural extension of traditional merge joins and the multi-predicate merge joins, while the stack-tree algorithms have no counterpart in traditional relational join processing. We present experimental results on a range of data and queries using the TIMBER native XML query engine built on top of SHORE. We show that while, in some cases, tree-merge algorithms can have performance comparable to stack-tree algorithms, in many cases they are considerably worse. This behavior is explained by analytical results that demonstrate that, on sorted inputs, the stack-tree algorithms have worst-case I/O and CPU complexities linear in the sum of the sizes of inputs and output, while the tree-merge algorithms do not have the same guarantee.","PeriodicalId":191529,"journal":{"name":"Proceedings 18th International Conference on Data Engineering","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126385261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data reduction by partial preaggregation","authors":"P. Larson","doi":"10.1109/ICDE.2002.994787","DOIUrl":"https://doi.org/10.1109/ICDE.2002.994787","url":null,"abstract":"Partial preaggregation is a simple data reduction operator that can be applied to aggregation queries. Whenever we group and aggregate on a column set G, we can preaggregate on any column set that functionally determines G. Preaggregation can be used, for example, to reduce the input size to a join. Regular aggregation reduces the input to one record per group. Partial preaggregation exploits the fact that preaggregation need not be complete: if multiple records happen to be output for a group, they will be combined into the same group by the final aggregation. This paper describes a straightforward hash-based algorithm for partial preaggregation, discusses where it can be applied, and derives a mathematical model for estimating the output size. The effectiveness of the technique and the accuracy of the model are shown on both artificial and real data. It is also shown how to reduce memory requirements by combining partial preaggregation with the input phase of a subsequent join or sort operator. Partial preaggregation has been implemented, in part, in Microsoft SQL Server.","PeriodicalId":191529,"journal":{"name":"Proceedings 18th International Conference on Data Engineering","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124966629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple query optimization by cache-aware middleware using query teamwork","authors":"K. O'Gorman, D. Agrawal, A. E. Abbadi","doi":"10.1109/ICDE.2002.994728","DOIUrl":"https://doi.org/10.1109/ICDE.2002.994728","url":null,"abstract":"Queries with common sequences of disk accesses can make maximal use of a buffer pool. We developed middleware to promote the necessary conditions in concurrent query streams, and achieved a speedup of 2.99 in executing a workload derived from the TPC-H benchmark.","PeriodicalId":191529,"journal":{"name":"Proceedings 18th International Conference on Data Engineering","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133723870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Specification-based data reduction in dimensional data warehouses","authors":"Janne Skyt, Christian S. Jensen, T. Pedersen","doi":"10.1109/ICDE.2002.994732","DOIUrl":"https://doi.org/10.1109/ICDE.2002.994732","url":null,"abstract":"Presents a powerful and easy-to-use technique for aggregation-based data reduction that enables the gradual change of the data from being detailed to being increasingly aggregated. The technique enables huge storage gains while retaining the data that is essential to the users, and it preserves the ability to query original and reduced data in an integrated manner.","PeriodicalId":191529,"journal":{"name":"Proceedings 18th International Conference on Data Engineering","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132698813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}