{"title":"Differential logging: a commutative and associative logging scheme for highly parallel main memory database","authors":"Juchang Lee, Kihong Kim, S. Cha","doi":"10.1109/ICDE.2001.914826","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914826","url":null,"abstract":"With a GByte of memory priced at less than $2000, main-memory DBMSs (MMDBMSs) are emerging as an economically viable alternative to disk-resident DBMSs (DRDBMSs) in many problem domains. The MMDBMS can show significantly higher performance than the DRDBMS by reducing disk accesses to the sequential form of log writing and occasional checkpointing. Upon a system crash, the recovery process begins by accessing the disk-resident log and checkpoint data to restore a consistent state. With increasing CPU speed, however, such disk access is still the dominant bottleneck in MMDBMSs. To overcome this bottleneck, this paper explores alternatives of parallel logging and recovery. The major contribution of this paper is the so-called differential logging scheme that permits unrestricted parallelism in logging and recovery. Using the bit-wise XOR operation both to compute the differential log between the before and after images and to recover the consistent database state, this scheme offers the room for significant performance improvement in the MMDBMS. First, with logging done on the difference, the log volume is reduced to almost half compared with the conventional physical logging. Second, the commutativity and associativity of XOR enables processing of log records in an arbitrary order. This means that we can freely distribute log records to multiple disks to improve the logging performance. During the recovery time, we can do a parallel restart independently for each log disk. This paper shows the superior performance of the differential logging compared to the physical logging in a shared-memory multiprocessor environment.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122405580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prefetching based on the type-level access pattern in object-relational DBMSs","authors":"Wook-Shin Han, Yang-Sae Moon, K. Whang, I. Song","doi":"10.1109/ICDE.2001.914880","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914880","url":null,"abstract":"Prefetching is an effective method for minimizing the number of round-trips between the client and the server in database management systems. We propose new notions of the type-level access locality and the type-level access pattern. We also formally define the notions of capturing and prefetching to help understand the underlying mechanisms. We then develop an efficient prefetching policy based on these notions and the framework. The type-level access locality is a phenomenon that repetitive patterns exist in the attributes referenced. The type-level access pattern is a pattern of attributes that are referenced in accessing the objects. Existing prefetching methods are based on object-level or page-level access patterns, which consist of object-ids or page-ids of the objects accessed. However the drawback of these methods is that they work only when exactly the same objects or pages are accessed repeatedly. In contrast even though the same objects are not accessed repeatedly our technique effectively prefetches objects if the same attributes are referenced repeatedly, i.e., if there is type-level access locality. Many navigational applications in object-relational database management systems (ORDBMSs) have type-level access locality. Therefore, our technique can be employed in ORDBMSs to effectively reduce the number of round trips, thereby significantly enhancing the performance.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"142 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114906851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cache-aware query routing in a cluster of databases","authors":"Uwe Röhm, Klemens Böhm, H. Schek","doi":"10.1109/ICDE.2001.914879","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914879","url":null,"abstract":"We investigate query routing techniques in a cluster of databases for a query-dominant environment. The objective is to decrease query response time. Each component of the cluster runs an off-the-shelf DBMS and holds a copy of the whole database. The cluster has a coordinator that routes each query to an appropriate component. Considering queries of realistic complexity, e.g., TPC-R, this article addresses the following questions: Can routing benefit from caching effects due to previous queries? Since our components are black-boxes, how can we approximate their cache content? How to route a query, given such cache approximations? To answer these questions, we have developed a cache-aware query router that is based on signature approximations of queries. We report on experimental evaluations with the TPC-R benchmark using our PowerDB database cluster prototype. Our main result is that our approach of cache approximation routing is better than state-of-the-art strategies by a factor of two with regard to mean response time.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128440103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Querying XML documents made easy: nearest concept queries","authors":"A. Schmidt, M. Kersten, Menzo Windhouwer","doi":"10.1109/ICDE.2001.914844","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914844","url":null,"abstract":"Due to the ubiquity and popularity of XML, users often are in the following situation: they want to query XML documents which contain potentially interesting information but they are unaware of the mark-up structure that is used. For example, it is easy to guess the contents of an XML bibliography file whereas the mark-up depends on the methodological, cultural and personal background of the author(s). None the less, it is this hierarchical structure that forms the basis of XML query languages. We exploit the tree structure of XML documents to equip users with a powerful tool, the meet operator that lets them query databases with whose content they are familiar, but without requiring knowledge of tags and hierarchies. Our approach is based on computing the lowest common ancestor of nodes in the XML syntax tree: e.g., given two strings, we are looking for nodes whose offspring contains these two strings. The novelty of this approach is that the result type is unknown at query formulation time and dependent on the database instance. If the two strings are an author's name and a year mainly publications of the author in this year are returned. If the two strings are numbers the result mostly consists of publications that have the numbers as year or page numbers. Because the result type of a query is not specified by the user we refer to the lowest common ancestor as nearest concept. We also present a running example taken from the bibliography domain, and demonstrate that the operator can be implemented efficiently.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131594572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Infrastructure for Web-based application integration","authors":"D. Gawlick","doi":"10.1109/ICDE.2001.914860","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914860","url":null,"abstract":"Over the last couple of years application integration has taken a central position in the business world. Application integration deals with integrating computing environments within and between companies and depends on connectivity provided by the intranet and Internet respectively. Application integration is typically referred to as EAI (e-business application integration). The article sketches first the evolution of business computing and EAI. The major elements of a modern EAI technology, are the focus of the discussion, with special attention to Web based application integration. Finally, the article points to some interesting research topics.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132564057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PrefixSpan,: mining sequential patterns efficiently by prefix-projected pattern growth","authors":"J. Pei, Jiawei Han, B. Mortazavi-Asl, Helen Pinto, Qiming Chen, U. Dayal, M. Hsu","doi":"10.1109/ICDE.2001.914830","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914830","url":null,"abstract":"Sequential pattern mining is an important data mining problem with broad applications. It is challenging since one may need to examine a combinatorially explosive number of possible subsequence patterns. Most of the previously developed sequential pattern mining methods follow the methodology of A priori which may substantially reduce the number of combinations to be examined. Howeve6 Apriori still encounters problems when a sequence database is large andor when sequential patterns to be mined are numerous ano we propose a novel sequential pattern mining method, called Prefixspan (i.e., Prefix-projected - Ettern_ mining), which explores prejxprojection in sequential pattern mining. Prefixspan mines the complete set of patterns but greatly reduces the efforts of candidate subsequence generation. Moreover; prefi-projection substantially reduces the size of projected databases and leads to efJicient processing. Our performance study shows that Prefixspan outperforms both the Apriori-based GSP algorithm and another recently proposed method; Frees pan, in mining large sequence data bases.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128649668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A graph-based approach for extracting terminological properties of elements of XML documents","authors":"L. Palopoli, G. Terracina, D. Ursino","doi":"10.1109/ICDE.2001.914845","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914845","url":null,"abstract":"XML is rapidly becoming a standard for information exchange over the Web. Web providers and applications using XML for representing and exchanging their data make their information available in such a way that interoperability can be easily reached. However in order to guarantee both the exchange of XML documents and the interoperability between information providers, it is often needed to single out semantic similarity properties relating concepts of different XML documents. This paper gives a contribution to this framework by proposing a technique for extracting synonymies and homonymies. The derivation technique is based on a rich conceptual model (called SDR-Network) which is used to represent concepts expressed in XML documents as well as the semantic relationships holding among them.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126086862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Block oriented processing of relational database operations in modern computer architectures","authors":"S. Padmanabhan, Timothy Malkemus, R. Agarwal, A. Jhingran","doi":"10.1109/ICDE.2001.914871","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914871","url":null,"abstract":"Database systems are not well-tuned to take advantage of modern superscalar processor architectures. In particular, the clocks per instruction (CPI) for rather simple database queries are quite poor compared to scientific kernels or SPEC benchmarks. The lack of performance of database systems has been attributed to poor utilization of caches and processor function units as well as higher branching penalties. In this paper, we argue that a block-oriented processing strategy for database operations can lead to better utilization of the processors and caches, generating significantly higher performance. We have implemented the block-oriented processing technique for aggregation expression evaluation and sorting operations as a feature in the DB2 Universal Database (UDB) system. We present results from representative queries on a 30-GB TPC-H (Transaction Processing Council Benchmark H) database to show the value of this technique.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123582319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An automated change-detection algorithm for HTML documents based on semantic hierarchies","authors":"S. Lim, Yiu-Kai Ng","doi":"10.1109/ICDE.2001.914842","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914842","url":null,"abstract":"The data at many Web sites is changing rapidly, and a significant amount of this data is presented in HTML documents that consist of markups and data contents. Although XML is becoming more popular for data exchange, the presentation of data contained in XML documents is given, by and large, in the HTML format using XSL(T). Since HTML was designed to \"display\" data from the human perspective, it is not trivial for a machine to detect (hierarchical) changes of data in an HTML document. In this paper, we propose a heuristic algorithm, called SCD (Semantic Change Detection), to detect semantic changes to the hierarchical data contents in any two HTML documents automatically. Semantic changes differ from syntactic changes since the latter refer to changes of data contents with respect to markup structures according to the HTML grammar. SCD does not require pre-processing, nor any knowledge of the internal structure of the source documents beforehand. The time complexity of SCD is O[(|X|/spl times/|Y|)log(|X|/spl times/|Y|)], where |X| and |Y| are the number of unique branches in the syntactic hierarchies of any two given HTML documents, respectively.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122366092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Selectivity estimation for spatial joins","authors":"N. An, Zhen-Yu Yang, A. Sivasubramaniam","doi":"10.1109/ICDE.2001.914849","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914849","url":null,"abstract":"Spatial joins are important and time consuming operations in spatial database management systems. It is crucial to be able to accurately estimate the performance of these operations so that one can derive efficient query execution plans, and even develop/refine data structures to improve their performance. While estimation techniques for analyzing the performance of other operations, such as range queries, on spatial data has come under scrutiny, the problem of estimating selectivity for spatial joins has been little explored. The limited forays into this area have used parametric techniques, which are largely restrictive on the datasets that they can be used for since they tend to make simplifying assumptions about the nature of the datasets to be joined. Sampling and histogram based techniques, on the other hand, are much less restrictive. However, there has been no prior attempt at understanding the accuracy of sampling techniques, or developing histogram based techniques to estimate the selectivity of spatial joins. Apart from extensively evaluating the accuracy of sampling techniques for the very first time, this paper presents two novel histogram based solutions for spatial join estimation. Using a wide spectrum of both real and synthetic datasets, it is shown that one of our proposed schemes, called Geometric Histograms (GH), can accurately quantify the selectivity of spatial joins.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124646885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}