Proceedings 18th International Conference on Data Engineering: Latest Papers, Page 6

Geometric-similarity retrieval in large image bases

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994757

I. Fudos, Leonidas Palios, E. Pitoura

Abstract: We propose a novel approach to shape-based image retrieval that builds upon a similarity criterion which is based on the average point set distance. Compared to traditional techniques, such as dimensionality reduction, our method exhibits better behavior in that it maintains the average topology of shapes independently of the number of points used to represent them and is more resilient to noise. An efficient algorithm is presented based on an incremental "fattening" of the query shape until the best match is discovered. The algorithm uses simplex range search techniques and fractional cascading to provide an average polylogarithmic time complexity on the total number of shape vertices. The algorithm is extended to perform additional fast approximate matching when there is no image sufficiently similar to the query image. We present techniques for the efficient external storage of the shape base and of the auxiliary geometric data structures used by the algorithm. Finally, we show how our approach can be used for processing queries containing pairwise relations of object boundaries such as contain, tangent, and overlap. Such queries are either extracted from some user-drafted sketch or defined explicitly by the user. Alternative methods are presented for forming query execution plans.
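The abstract names the average point set distance as the similarity criterion but does not define it. As a rough illustration only, the sketch below uses one common formulation: for each point of one shape, take the distance to the nearest point of the other shape, average those distances, and symmetrize. The function names and the restriction to sampled vertices are assumptions made here, not the paper's definitions, and the brute-force scan stands in for the range-search structures the paper describes.

```python
import math


def directed_avg_distance(a, b):
    """Average, over points p in a, of the distance from p to its nearest point in b."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)


def avg_point_set_distance(a, b):
    """Symmetric variant (an assumption): the mean of the two directed averages."""
    return 0.5 * (directed_avg_distance(a, b) + directed_avg_distance(b, a))


# Hypothetical shapes: a unit square and the same square shifted by 0.1 along x.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
shifted = [(x + 0.1, y) for x, y in square]

print(avg_point_set_distance(square, square))
print(avg_point_set_distance(square, shifted))
```

Because every point contributes equally to the average, a single outlier vertex moves the score far less than it would under a maximum-based measure such as the Hausdorff distance, which is consistent with the noise resilience the abstract claims.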

Citations: 5

Approximating a data stream for querying and estimation: algorithms and performance evaluation

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994775

S. Guha, Nick Koudas

Abstract: Obtaining fast and good-quality approximations to data distributions is a problem of central interest to database management. A variety of popular database applications, including approximate querying, similarity searching and data mining in most application domains, rely on such good-quality approximations. Histogram-based approximation is a very popular method in database theory and practice to succinctly represent a data distribution in a space-efficient manner. In this paper, we place the problem of histogram construction into perspective and we generalize it by raising the requirement of a finite data set and/or known data set size. We consider the case of an infinite data set in which data arrive continuously, forming an infinite data stream. In this context, we present single-pass algorithms that are capable of constructing histograms of provably good quality. We present algorithms for the fixed-window variant of the basic histogram construction problem, supporting incremental maintenance of the histograms. The proposed algorithms trade accuracy for speed and allow for a graceful tradeoff between the two, based on application requirements. In the case of approximate queries on infinite data streams, we present a detailed experimental evaluation comparing our algorithms with other applicable techniques using real data sets, demonstrating the superiority of our proposal.
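The abstract does not spell out the algorithms, so the following is only a baseline sketch of the fixed-window problem it mentions. It assumes sum of squared errors as the quality measure (not stated in the abstract) and recomputes an exact optimal histogram from scratch with the classical dynamic program. The paper's single-pass, incrementally maintained algorithms are approximate and are not reproduced here; the function name and the sample stream are hypothetical.

```python
from collections import deque


def v_optimal_histogram(values, num_buckets):
    """Split `values` into exactly `num_buckets` contiguous buckets minimizing
    the total squared error around each bucket's mean.
    Requires 1 <= num_buckets <= len(values).
    Returns (error, [(start, end), ...]) with half-open index ranges."""
    n = len(values)
    # Prefix sums of values and of squares give any bucket's error in O(1).
    ps = [0.0] * (n + 1)
    pss = [0.0] * (n + 1)
    for i, v in enumerate(values):
        ps[i + 1] = ps[i] + v
        pss[i + 1] = pss[i] + v * v

    def sse(i, j):
        s = ps[j] - ps[i]
        return (pss[j] - pss[i]) - s * s / (j - i)

    inf = float("inf")
    # best[k][j]: minimum error covering values[:j] with exactly k buckets.
    best = [[inf] * (n + 1) for _ in range(num_buckets + 1)]
    cut = [[0] * (n + 1) for _ in range(num_buckets + 1)]
    best[0][0] = 0.0
    for k in range(1, num_buckets + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                cand = best[k - 1][i] + sse(i, j)
                if cand < best[k][j]:
                    best[k][j] = cand
                    cut[k][j] = i

    # Walk the recorded cut points backwards to recover the buckets.
    buckets = []
    j = n
    for k in range(num_buckets, 0, -1):
        i = cut[k][j]
        buckets.append((i, j))
        j = i
    return best[num_buckets][n], buckets[::-1]


# Fixed window over a stream: only the most recent 6 items are summarized.
window = deque(maxlen=6)
for x in [9, 9, 1, 1, 1, 5, 5, 5]:
    window.append(x)

error, buckets = v_optimal_histogram(list(window), 2)
print(error, buckets)
```

This recomputation costs O(k n^2) per window position, which illustrates why incremental maintenance and an accuracy-for-speed tradeoff, as described in the abstract, matter for streams.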

Citations: 109

YFilter: efficient and scalable filtering of XML documents

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994748

Y. Diao, Peter M. Fischer, M. Franklin, Raymond To

Citations: 293

Fjording the stream: an architecture for queries over streaming sensor data

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994774

S. Madden, M. Franklin

Citations: 602

Techniques for storing XML

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994740

M. Fernández, S. Amer-Yahia

Abstract: XML is the de facto standard for data exchange between applications on the Web. Applications, such as electronic markets, will produce and consume large volumes of data and therefore will require efficient and reliable storage and retrieval of XML data. Many techniques for XML storage have been proposed, including flat files, relational database management systems, object-oriented database systems, LDAP directories, and native XML database systems. To better understand the requirements of XML storage systems, we first review various classes of XML documents including highly structured data as stored in relational databases, "mixed" content from document-processing applications, and "streams-oriented" data from e-commerce and transactional applications. We also consider the types of queries typically applied to these classes of documents. In the second part, we present features of the XQuery and XPath data model that must be supported by an XML storage system and then we describe in detail a variety of storage alternatives from industry and research. We focus on techniques that use relational storage. Typically, these techniques produce a logical relational schema for the XML data and treat the storage system as a "black box". In the last part of the tutorial, we consider new techniques that open the storage system's "black box" so that we can take advantage of physical-layout features.
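To make the idea of a logical relational schema for XML concrete, here is a minimal sketch of one generic mapping: a single "edge" table with one row per element, recording its parent and sibling position. The table layout, the `shred` function, and the sample document are assumptions for illustration; the abstract does not say which schemas the tutorial covers, and this sketch ignores attributes and mixed-content tail text.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(xml_text, conn):
    """Store every element of the document as one row of a generic edge table."""
    conn.execute(
        "CREATE TABLE edge ("
        "id INTEGER PRIMARY KEY, parent INTEGER, ord INTEGER, tag TEXT, text TEXT)"
    )
    next_id = 0

    def visit(elem, parent_id, ordinal):
        nonlocal next_id
        next_id += 1
        node_id = next_id
        # Keep only non-empty leading text; attributes and tails are skipped.
        text = (elem.text or "").strip() or None
        conn.execute(
            "INSERT INTO edge VALUES (?, ?, ?, ?, ?)",
            (node_id, parent_id, ordinal, elem.tag, text),
        )
        for pos, child in enumerate(elem):
            visit(child, node_id, pos)

    visit(ET.fromstring(xml_text), None, 0)


doc = "<catalog><book><title>A</title></book><book><title>B</title></book></catalog>"
conn = sqlite3.connect(":memory:")
shred(doc, conn)

# The path query /catalog/book/title becomes a self-join on the edge table.
titles = conn.execute(
    "SELECT c.text FROM edge p JOIN edge c ON c.parent = p.id "
    "WHERE p.tag = 'book' AND c.tag = 'title' ORDER BY p.ord"
).fetchall()
print(titles)
```

Each additional path step adds another self-join, and the database engine decides the physical layout on its own. That is the "black box" treatment the abstract refers to.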

Citations: 10

Efficient temporal join processing using indices

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994701

Donghui Zhang, V. Tsotras, B. Seeger

Abstract: We examine the problem of processing temporal joins in the presence of indexing schemes. Previous work on temporal joins has concentrated on non-indexed relations which were fully scanned. Given the large data volumes created by the ever increasing time dimension, sequential scanning is prohibitive. This is especially true when the temporal join involves only parts of the joining relations (e.g., a given time interval instead of the whole timeline). Utilizing an index then becomes beneficial as it directs the join to the data of interest. We consider temporal join algorithms for three representative indexing schemes, namely a B+-tree, an R*-tree and a temporal index, the Multiversion B+-tree (MVBT). Both the B+-tree and R*-tree result in simple but not efficient join algorithms because neither index achieves good temporal data clustering. Better clustering is maintained by the MVBT through record copying. Nevertheless, copies can greatly affect the correctness and effectiveness of the join algorithms. We identify these problems and propose efficient solutions and optimizations. An extensive comparison of all index-based temporal joins, using a variety of datasets and query characteristics, shows that the MVBT-based join algorithms are consistently faster. In particular, the link-based algorithm has the most robust behavior. In our experiments it showed a tenfold improvement over the R*-tree joins, while it was between six and thirty times faster than the B+-tree joins.
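For readers unfamiliar with the operation, the sketch below shows what a temporal join restricted to a query interval computes. It is deliberately the scan-based approach the abstract argues against, not any of the paper's index-based algorithms. It assumes an equijoin on a key combined with overlap of half-open validity intervals, which the abstract does not specify; the tuple layout and sample data are hypothetical.

```python
from collections import defaultdict


def temporal_join(r, s, window):
    """Join tuples (key, start, end, payload) from r and s that share a key and
    whose validity intervals [start, end) overlap inside the query window.
    Both inputs are scanned in full; no index is used."""
    w_start, w_end = window
    # Hash the tuples of s that intersect the query window by their key.
    by_key = defaultdict(list)
    for key, start, end, payload in s:
        if start < w_end and end > w_start:
            by_key[key].append((start, end, payload))

    out = []
    for key, start, end, payload in r:
        for s_start, s_end, s_payload in by_key.get(key, ()):
            # The result is valid on the intersection of both intervals and the window.
            lo = max(start, s_start, w_start)
            hi = min(end, s_end, w_end)
            if lo < hi:
                out.append((key, lo, hi, payload, s_payload))
    return out


departments = [("e1", 0, 10, "dept A"), ("e2", 5, 8, "dept B")]
salaries = [("e1", 4, 20, "salary 50"), ("e2", 9, 12, "salary 60")]

result = temporal_join(departments, salaries, window=(0, 6))
print(result)
```

Even though the query window covers only a small part of the timeline, this version reads every tuple of both relations. An index on time lets the join skip the data outside the window.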

Citations: 70

DBXplorer: a system for keyword-based search over relational databases

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994693

S. Agrawal, S. Chaudhuri, Gautam Das

Citations: 879

Mixing querying and navigation in MIX

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994714

Pratik Mukhopadhyay, Y. Papakonstantinou

Citations: 20

Streaming-data algorithms for high-quality clustering

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994785

Liadan O'Callaghan, A. Meyerson, R. Motwani, Nina Mishra, S. Guha

Citations: 681

Design and implementation of a high-performance distributed Web crawler

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994750

Vladislav Shkapenyuk, Torsten Suel

Citations: 410