Proceedings 18th International Conference on Data Engineering: Latest Articles, Page 2

A framework towards efficient and effective sequence clustering

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994736

Wei Wang, Jiong Yang

Citations: 0

Content-based video indexing for the support of digital library search

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994766

M. Petkovic, R. V. Zwol, H. Blok, W. Jonker, P. Apers, Menzo Windhouwer, M. Kersten

Citations: 14

Exploiting local similarity for indexing paths in graph-structured data

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994703

R. Kaushik, P. Shenoy, P. Bohannon, E. Gudes

Abstract: XML and other semi-structured data may have partially specified or missing schema information, motivating the use of a structural summary which can be automatically computed from the data. These summaries also serve as indices for evaluating the complex path expressions common to XML and semi-structured query languages. However, to answer all path queries accurately, summaries must encode information about long, seldom-queried paths, leading to increased size and complexity with little added value. We introduce the A(k)-indices, a family of approximate structural summaries. They are based on the concept of k-bisimilarity, in which nodes are grouped based on local structure, i.e., the incoming paths of length up to k. The parameter k thus smoothly varies the level of detail (and accuracy) of the A(k)-index. For small values of k, the size of the index is substantially reduced. While smaller, the A(k)-index is approximate, and we describe techniques for efficiently extracting exact answers to regular path queries. Our experiments show that, for moderate values of k, path evaluation using the A(k)-index ranges from being very efficient for simple queries to competitive for most complex queries, while using significantly less space than comparable structures.
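
The grouping step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a toy graph given as a node-to-label dict plus a list of (parent, child) edges, and the name `ak_partition` is hypothetical. The partition starts from node labels (k = 0) and is refined k times using the blocks of each node's parents, so two nodes stay together only if their incoming structure matches up to depth k.

```python
from collections import defaultdict


def ak_partition(labels, edges, k):
    """Group nodes by k-bisimilarity on incoming edges (illustrative sketch).

    labels: dict mapping node -> label
    edges:  iterable of (parent, child) pairs
    k:      number of refinement rounds
    """
    parents = defaultdict(set)
    for src, dst in edges:
        parents[dst].add(src)

    # Round 0: nodes are equivalent when they carry the same label.
    block = dict(labels)

    for _ in range(k):
        # A node's signature is its current block plus the set of its parents' blocks.
        signature = {
            v: (block[v], frozenset(block[p] for p in parents[v]))
            for v in labels
        }
        # Replace each signature by a small integer id so blocks stay compact.
        ids = {}
        block = {v: ids.setdefault(sig, len(ids)) for v, sig in signature.items()}

    groups = defaultdict(list)
    for v, b in block.items():
        groups[b].append(v)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy document: r -> x -> a -> c and r -> y -> a -> c.
labels = {1: "r", 2: "x", 3: "y", 4: "a", 5: "a", 6: "c", 7: "c"}
edges = [(1, 2), (1, 3), (2, 4), (3, 5), (4, 6), (5, 7)]

for k in range(3):
    print(k, ak_partition(labels, edges, k))
# 0 [[1], [2], [3], [4, 5], [6, 7]]
# 1 [[1], [2], [3], [4], [5], [6, 7]]
# 2 [[1], [2], [3], [4], [5], [6], [7]]
```

In this toy graph the two "c" nodes share a block until k = 2, because they only differ in a grandparent label. This mirrors the trade-off stated in the abstract: a smaller k yields fewer blocks (a smaller index) at the cost of merging nodes that longer paths would distinguish.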

Citations: 295

Out from under the trees [linear file template]

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994719

C. Jermaine, E. Omiecinski, Wai Gen Yee

Citations: 0

Predator-Miner: ad hoc mining of associations rules within a database management system

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994741

W. Tok, Twee-Hee Ong, Wai Lup Low, I. Atmosukarto, S. Bressan

Citations: 0

How good are association-rule mining algorithms?

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994730

Vikram Pudi, J. Haritsa

Abstract: This paper addresses the question of how much space remains for performance improvement over current association rule mining algorithms. Our approach is to compare their performance against an "Oracle algorithm" that knows in advance the identities of all frequent item sets in the database and only needs to gather the actual supports of these item sets, in one scan over the database, to complete the mining process. Clearly, any practical algorithm has to do at least this much work in order to generate mining rules. While the notion of the Oracle is conceptually simple, its construction is not equally straightforward. In particular, it is critically dependent on the choice of data structures and database organizations used during the counting process. We present a carefully engineered implementation of Oracle that makes the best choices for these design parameters at each stage of the counting process. We also present a new mining algorithm, called ARMOR (Association Rule Mining based on ORacle), whose structure is derived by making minimal changes to Oracle, and is guaranteed to complete in two passes over the database. This is in marked contrast to the earlier approaches which designed new algorithms by trying to address the limitations of previous online algorithms. Although ARMOR is derived from Oracle, it shares the positive features of a variety of previous algorithms such as PARTITION, CARMA, AS-CPA, VIPER and DELTA. Our empirical study shows that ARMOR consistently performs within a factor of two of Oracle, over both real and synthetic databases.
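
The core idea of the Oracle, counting supports for item sets that are already known to be frequent in a single scan, can be sketched as follows. This is a deliberately naive illustration, not the carefully engineered implementation the abstract refers to: the function name `oracle_supports` and the in-memory list of transactions are assumptions, and the subset test per candidate stands in for whatever data structures the authors actually chose.

```python
def oracle_supports(transactions, frequent_itemsets):
    """Count the support of each known-frequent item set in one pass.

    transactions:      iterable of item collections, consumed exactly once
    frequent_itemsets: iterable of item sets assumed to be known in advance
    """
    counts = {frozenset(s): 0 for s in frequent_itemsets}
    for transaction in transactions:  # the single scan over the database
        items = set(transaction)
        for itemset in counts:
            if itemset <= items:  # every item of the candidate occurs in the transaction
                counts[itemset] += 1
    return counts


# Hypothetical toy database.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
frequent = [{"bread"}, {"milk"}, {"bread", "milk"}]

# Passing an iterator makes the one-scan property explicit: it cannot be rewound.
supports = oracle_supports(iter(transactions), frequent)
print(supports[frozenset({"bread"})])          # 3
print(supports[frozenset({"milk"})])           # 3
print(supports[frozenset({"bread", "milk"})])  # 2
```

The inner loop here costs one subset test per candidate per transaction, which is exactly the kind of overhead the abstract says depends critically on the choice of data structures.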

Citations: 6

BestPeer: a self-configurable peer-to-peer system

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994726

W. Ng, B. Ooi, K. Tan

Citations: 83

Fast mining of massive tabular data via approximate distance computations

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994778

Graham Cormode, P. Indyk, Nick Koudas, S. Muthukrishnan

Abstract: Tabular data abound in many data stores: traditional relational databases store tables, and new applications also generate massive tabular datasets. We present methods for determining similar regions in massive tabular data. Our methods are for computing the "distance" between any two subregions of tabular data: they are approximate, but highly accurate as we prove mathematically, and they are fast, running in time nearly linear in the table size. Our methods are general since these distance computations can be applied to any mining or similarity algorithms that use L_p norms. A novelty of our distance computation procedures is that they work for any L_p norms, not only the traditional p = 2 or p = 1, but for all p ≤ 2; the choice of p, say fractional p, provides an interesting alternative similarity behavior! We use our algorithms in a detailed experimental study of the clustering patterns in real tabular data obtained from one of AT&T's data stores and show that our methods are substantially faster than straightforward methods while remaining highly accurate, and able to detect interesting patterns by varying the value of p.
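
The abstract does not spell out how the approximate distances are computed, so the following is only a sketch of one well-known way to approximate an L_p distance, shown for p = 1: project each vector onto random Cauchy-distributed directions and take the median absolute difference of the projections. Whether the paper uses exactly this construction is an assumption, and all function names below are hypothetical. Other values of p ≤ 2 would need draws from the matching p-stable distribution and a different scaling constant, which this sketch does not cover.

```python
import math
import random
import statistics


def cauchy_matrix(num_projections, dim, seed=0):
    """Random projection matrix with standard Cauchy entries (1-stable)."""
    rng = random.Random(seed)
    return [
        [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(dim)]
        for _ in range(num_projections)
    ]


def sketch(vector, matrix):
    """Short summary of a vector: one dot product per projection row."""
    return [sum(r * x for r, x in zip(row, vector)) for row in matrix]


def approx_l1(sketch_a, sketch_b):
    """Estimate the L_1 distance from two sketches.

    Each difference is Cauchy-distributed with scale equal to the true L_1
    distance, and the median of the absolute value of a standard Cauchy is 1.
    """
    return statistics.median(abs(a - b) for a, b in zip(sketch_a, sketch_b))


# Hypothetical data: two rows of a table, 50 columns each.
data_rng = random.Random(1)
row_a = [data_rng.uniform(0, 10) for _ in range(50)]
row_b = [data_rng.uniform(0, 10) for _ in range(50)]

projections = cauchy_matrix(num_projections=2000, dim=50, seed=42)
exact = sum(abs(a - b) for a, b in zip(row_a, row_b))
estimate = approx_l1(sketch(row_a, projections), sketch(row_b, projections))
print(f"exact={exact:.2f} estimate={estimate:.2f}")
```

Because sketches are linear, the sketch of a region can be combined from sketches of its parts, which is one reason this style of summary suits comparing many subregions of a large table. The estimate is random; more projections tighten it at the cost of larger sketches.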

Citations: 33

A non-blocking parallel spatial join algorithm

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994786

Gang Luo, J. Naughton, Curt J. Ellmann

Citations: 58

Database replication for the mobile era

Proceedings 18th International Conference on Data Engineering Pub Date : 2002-08-07 DOI: 10.1109/ICDE.2002.994761

A. Wolski

Citations: 5