Latha S. Colby, R. Cole, E. Haslam, N. Jazayeri, Galt Johnson, William J. McKenna, L. Schumacher, David Wilhite
{"title":"Red Brick Vista/sup TM/: aggregate computation and management","authors":"Latha S. Colby, R. Cole, E. Haslam, N. Jazayeri, Galt Johnson, William J. McKenna, L. Schumacher, David Wilhite","doi":"10.1109/ICDE.1998.655773","DOIUrl":"https://doi.org/10.1109/ICDE.1998.655773","url":null,"abstract":"Aggregate query processing in large data warehouses is computationally intensive. Precomputation is an approach that can be used to speed up aggregate queries. However, in order to make precomputation a truly viable solution to the aggregate query processing problem, it is important to identify the best set of aggregates to precompute and to use these precomputed aggregates effectively. The Red Brick aggregate computation and management system (Red Brick Vista) provides a complete server integrated solution to these problems.","PeriodicalId":264926,"journal":{"name":"Proceedings 14th International Conference on Data Engineering","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132162207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing regular path expressions using graph schemas","authors":"M. Fernández, Dan Suciu","doi":"10.1109/ICDE.1998.655753","DOIUrl":"https://doi.org/10.1109/ICDE.1998.655753","url":null,"abstract":"Query languages for data with irregular structure use regular path expressions for navigation. This feature is useful for querying data where parts of the structure is either unknown, unavailable to the user, or changes frequently. Naive execution of regular path expressions is inefficient however, because it ignores any structure in the data. We describe two optimization techniques for queries with regular path expressions. Both rely on graph schemas for specifying partial knowledge about the data's structure. Query pruning uses this structure to restrict navigation to only a fragment of the data; we give an efficient algorithm for rewriting any regular path expression query into a pruned one. Query rewriting using state extents can eliminate or reduce navigation altogether; it is reminiscent of optimizing relational queries using indices. There may be several ways to optimize a query using state extents; we give a polynomial space algorithm that finds all such optimizations. For restricted forms of regular path expressions, the algorithm is provably efficient. We also give an efficient approximation algorithm that works on all regular path expressions.","PeriodicalId":264926,"journal":{"name":"Proceedings 14th International Conference on Data Engineering","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115350915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Back to the future: dynamic hierarchical clustering","authors":"Chendong Zou, B. Salzberg, R. Ladin","doi":"10.1109/ICDE.1998.655821","DOIUrl":"https://doi.org/10.1109/ICDE.1998.655821","url":null,"abstract":"Describes a new method for dynamically clustering hierarchical data which maintains good clustering within disk pages in the presence of insertions and deletions. This simple but effective method, which we call Enc, encodes the insertion order of children with respect to their parents and concatenates the insertion numbers to form a compact key for the data. This compact key is stored only in the indexing structure and does not affect the logical database schema. Experimental results show that our Enc method is very efficient for hierarchical queries and performs reasonably well for random access queries.","PeriodicalId":264926,"journal":{"name":"Proceedings 14th International Conference on Data Engineering","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115693512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining optimized association rules with categorical and numeric attributes","authors":"R. Rastogi, Kyuseok Shim","doi":"10.1109/ICDE.1998.655813","DOIUrl":"https://doi.org/10.1109/ICDE.1998.655813","url":null,"abstract":"Association rules are useful for determining correlations between attributes of a relation and have applications in marketing, financial and retail sectors. Furthermore, optimized association rules are an effective way to focus on the most interesting characteristics involving certain attributes. Optimized association rules are permitted to contain uninstantiated attributes and the problem is to determine instantiations such that either the support or confidence of the rule is maximized. We generalize the optimized association rules problem in three ways: (1) association rules are allowed to contain disjunctions over uninstantiated attributes; (2) association rules are permitted to contain an arbitrary number of uninstantiated attributes; and (3) uninstantiated attributes can be either categorical or numeric. Our generalized association rules enable us to extract more useful information about seasonal and local patterns involving multiple attributes. We present effective techniques for pruning the search space when computing optimized association rules for both categorical and numeric attributes. Finally, we report the results of our experiments that indicate that our pruning algorithms are efficient for a large number of uninstantiated attributes, disjunctions and values in the domain of the attributes.","PeriodicalId":264926,"journal":{"name":"Proceedings 14th International Conference on Data Engineering","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123935349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Failure handling and coordinated execution of concurrent workflows","authors":"M. Kamath, K. Ramamritham","doi":"10.1109/ICDE.1998.655796","DOIUrl":"https://doi.org/10.1109/ICDE.1998.655796","url":null,"abstract":"Workflow management systems (WFMSs) coordinate the execution of applications distributed over networks. In WFMSs, data inconsistencies can arise due to: the interaction between steps of concurrent threads within a workflow (intra-workflow coordination); the interaction between steps of concurrent workflows (inter-workflow coordination); and the presence of failures. Since these problems have not received adequate attention, this paper focuses on developing the necessary concepts and infrastructure to handle them. First, to deal with inter- and intra-workflow coordination requirements we have identified a set of high level building blocks. Secondly, to handle failures we propose a novel and pragmatic approach called opportunistic compensation and re-execution that allows a workflow designer to customize workflow recovery from correctness as well as performance perspectives. Thirdly based on these concepts we have designed a workflow specification language that expresses new requirements for workflow executions and implemented a run-time system for managing workflow executions while satisfying the new requirements. These ideas are geared towards improving the modeling and correctness properties offered by WFMSs and making them more robust and flexible.","PeriodicalId":264926,"journal":{"name":"Proceedings 14th International Conference on Data Engineering","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124796380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Compressing relations and indexes","authors":"J. Goldstein, R. Ramakrishnan, U. Shaft","doi":"10.1109/ICDE.1998.655800","DOIUrl":"https://doi.org/10.1109/ICDE.1998.655800","url":null,"abstract":"We propose a new compression algorithm that is tailored to database applications. It can be applied to a collection of records, and is especially effective for records with many low to medium cardinality fields and numeric fields. In addition, this new technique supports very fast decompression. Promising application domains include decision support systems (DSS), since fact tables, which are by far the largest tables in these applications, contain many low and medium cardinality fields and typically no text fields. Further, our decompression rates are faster than typical disk throughputs for sequential scans; in contrast, gzip is slower. This is important in DSS applications, which often scan large ranges of records. An important distinguishing characteristic of our algorithm, in contrast to compression algorithms proposed earlier, is that we can decompress individual tuples (even individual fields), rather than a full page (or an entire relation) at a time. Also, all the information needed for tuple decompression resides on the same page with the tuple. This means that a page can be stored in the buffer pool and used in compressed form, simplifying the job of the buffer manager and improving memory utilization. Our compression algorithm also improves index structures such as B-trees and R-trees significantly by reducing the number of leaf pages and compressing index entries, which greatly increases the fan-out. We can also use lossy compression on the internal nodes of an index.","PeriodicalId":264926,"journal":{"name":"Proceedings 14th International Conference on Data Engineering","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130030753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalizing \"search\" in generalized search trees","authors":"Paul M. Aoki","doi":"10.1109/ICDE.1998.655801","DOIUrl":"https://doi.org/10.1109/ICDE.1998.655801","url":null,"abstract":"The generalized search tree, or GiST, defines a framework of basic interfaces required to construct a hierarchical access method for database systems. As originally specified, GiST only supports record selection. We show how a small number of additional interfaces enable GiST to support a much larger class of operations. Members of this class, which includes, nearest-neighbor and ranked search, user-defined aggregation and index-assisted selectivity estimation, are increasingly common in new database applications. The advantages of implementing these operations in the GiST framework include reduction of user development effort and the ability to use industrial strength concurrency and recovery mechanisms provided by expert implementers.","PeriodicalId":264926,"journal":{"name":"Proceedings 14th International Conference on Data Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128897742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph structured views and their incremental maintenance","authors":"Yue Zhuge, H. Garcia-Molina","doi":"10.1109/ICDE.1998.655767","DOIUrl":"https://doi.org/10.1109/ICDE.1998.655767","url":null,"abstract":"Studies the problem of maintaining materialized views of graph structured data. The base data consists of records containing identifiers of other records. The data could represent traditional objects (with methods, attributes and a class hierarchy), but it could also represent a lower-level data structure. We define simple views and materialized views for such graph structured data, analyzing options for representing record identity and references in the view. We develop incremental maintenance algorithms for these views.","PeriodicalId":264926,"journal":{"name":"Proceedings 14th International Conference on Data Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131066536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Flattening an object algebra to provide performance","authors":"P. Boncz, A. N. Wilschut, M. Kersten","doi":"10.1109/ICDE.1998.655820","DOIUrl":"https://doi.org/10.1109/ICDE.1998.655820","url":null,"abstract":"Algebraic transformation and optimization techniques have been the method of choice in relational query execution, but applying them in object-oriented (OO) DBMSs is difficult due to the complexity of OO query languages. This paper demonstrates that the problem can be simplified by mapping an OO data model to the binary relational model implemented by Monet, a state-of-the-art database kernel. We present a generic mapping scheme to flatten data models and study the case of straightforward OO model. We show how flattening enabled us to implement a query algebra, using only a very limited set of simple operations. The required primitives and query execution strategies are discussed, and their performance is evaluated on the 1-GByte TPC-D (Transaction-processing Performance Council's Benchmark D), showing that our divide-and-conquer approach yields excellent results.","PeriodicalId":264926,"journal":{"name":"Proceedings 14th International Conference on Data Engineering","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132023736","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fuzzy triggers: incorporating imprecise reasoning into active databases","authors":"A. Wolski, T. Bouaziz","doi":"10.1109/ICDE.1998.655766","DOIUrl":"https://doi.org/10.1109/ICDE.1998.655766","url":null,"abstract":"Traditional event-condition-action triggers (active database rules) include a Boolean predicate as a trigger condition. We propose fuzzy triggers whereby fuzzy inference is utilized in the condition evaluation. In this way, approximate reasoning may be integrated with a traditional crisp database. The new approach paves the way for intuitive expression of application semantics of imprecise nature, in database-bound applications. Two fuzzy trigger models are proposed. Firstly, a set of fuzzy rules is encapsulated into a Boolean-valued function called a rule set function, leading to the C-fuzzy trigger model. Subsequently, actions are expressed also in fuzzy terms, and the corresponding CA-fuzzy trigger model is proposed. Examples are provided to illustrate how fuzzy triggers can be applied to a real-life drive control system in an industrial installation.","PeriodicalId":264926,"journal":{"name":"Proceedings 14th International Conference on Data Engineering","volume":"445 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116584881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}