Proceedings 17th International Conference on Data Engineering: latest papers

An index structure for efficient reverse nearest neighbor queries
Proceedings 17th International Conference on Data Engineering Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914862
Congjun Yang, King-Ip Lin
{"title":"An index structure for efficient reverse nearest neighbor queries","authors":"Congjun Yang, King-Ip Lin","doi":"10.1109/ICDE.2001.914862","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914862","url":null,"abstract":"The Reverse Nearest Neighbor (RNN) problem is to find all points in a given data set whose nearest neighbor is a given query point. Just like the Nearest Neighbor (NN) queries, the RNN queries appear in many practical situations such as marketing and resource management. Thus, efficient methods for the RNN queries in databases are required. The paper introduces a new index structure, the Rdnn-tree, that answers both RNN and NN queries efficiently. A single index structure is employed for a dynamic database, in contrast to the use of multiple indexes in previous work. This leads to significant savings in dynamically maintaining the index structure. The Rdnn-tree outperforms existing methods in various aspects. Experiments on both synthetic and real world data show that our index structure outperforms previous methods by a significant margin (more than 90% in terms of number of leaf nodes accessed) in RNN queries. It also shows improvement in NN queries over standard techniques. Furthermore, performance in insertion and deletion is significantly enhanced by the ability to combine multiple queries (NN and RNN) in one traversal of the tree. These facts make our index structure extremely preferable in both static and dynamic cases.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115789428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 186
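To make the RNN definition in the abstract above concrete, here is a minimal brute-force sketch. It is not the Rdnn-tree and uses no index at all; the function names, the Euclidean metric, and the choice to count ties in favour of the query point are assumptions made for illustration only.

```python
import math


def nearest_neighbor_distance(index, points):
    """Distance from points[index] to its closest other point (inf if alone)."""
    p = points[index]
    return min(
        (math.dist(p, other) for j, other in enumerate(points) if j != index),
        default=math.inf,
    )


def reverse_nearest_neighbors(query, points):
    """Return every data point that is at least as close to the query
    as it is to any other data point (ties count for the query)."""
    result = []
    for i, p in enumerate(points):
        if math.dist(p, query) <= nearest_neighbor_distance(i, points):
            result.append(p)
    return result


if __name__ == "__main__":
    data = [(0, 0), (1, 0), (5, 0), (9, 0)]
    print(reverse_nearest_neighbors((7, 0), data))
```

This scan costs one nearest-neighbor computation per data point, which is the quadratic behaviour an index structure is meant to avoid.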
An efficient approximation scheme for data mining tasks
Proceedings 17th International Conference on Data Engineering Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914858
G. Kollios, D. Gunopulos, Nick Koudas, Stefan Berchtold
{"title":"An efficient approximation scheme for data mining tasks","authors":"G. Kollios, D. Gunopulos, Nick Koudas, Stefan Berchtold","doi":"10.1109/ICDE.2001.914858","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914858","url":null,"abstract":"We investigate the use of biased sampling according to the density of the dataset, to speed up the operation of general data mining tasks, such as clustering and outlier detection in large multidimensional datasets. In density biased sampling, the probability that a given point will be included in the sample depends on the local density of the dataset. We propose a general technique for density-biased sampling that can factor in user requirements to sample for properties of interest, and can be tuned for specific data mining tasks. This allows great flexibility and improved accuracy of the results over simple random sampling. We describe our approach in detail, we analytically evaluate it, and show how it can be optimized for approximate clustering and outlier detection. Finally we present a thorough experimental evaluation of the proposed method, applying density-biased sampling on real and synthetic data sets, and employing clustering and outlier detection algorithms, thus highlighting the utility of our approach.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116407752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 29
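The idea of letting a point's inclusion probability depend on local density can be sketched as follows. This is only an illustration under stated assumptions: density is estimated with a fixed-width 1-D histogram, the probability is proportional to the bin count raised to a tunable `exponent`, and all function and parameter names are hypothetical. The abstract does not specify the estimator or the weighting used in the paper.

```python
import random
from collections import Counter


def inclusion_probabilities(values, bin_width, exponent, sample_size):
    """Per-point inclusion probability proportional to (local count) ** exponent.

    exponent = 0 gives uniform sampling; exponent < 0 favours sparse regions
    (useful for outliers); exponent > 0 favours dense regions (clusters).
    Probabilities are clipped at 1, so the expected sample size can fall
    slightly below sample_size.
    """
    bins = [int(v // bin_width) for v in values]
    counts = Counter(bins)
    weights = [counts[b] ** exponent for b in bins]
    scale = sample_size / sum(weights)
    return [min(1.0, w * scale) for w in weights]


def density_biased_sample(values, probabilities, rng):
    """Keep each value independently with its own probability."""
    return [v for v, p in zip(values, probabilities) if rng.random() < p]


if __name__ == "__main__":
    data = [0.1, 0.2, 0.3, 0.4, 5.5]
    probs = inclusion_probabilities(data, bin_width=1.0, exponent=-1, sample_size=2)
    print(probs)
    print(density_biased_sample(data, probs, random.Random(0)))
```

With a negative exponent the isolated value is kept with certainty while each point in the dense bin is kept only a quarter of the time, which is the behaviour one would want for outlier detection.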
Tuning an SQL-based PDM system in a worldwide client/server environment
Proceedings 17th International Conference on Data Engineering Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914818
Erich Müller, P. Dadam, Jost Enderle, M. Feltes
{"title":"Tuning an SQL-based PDM system in a worldwide client/server environment","authors":"Erich Müller, P. Dadam, Jost Enderle, M. Feltes","doi":"10.1109/ICDE.2001.914818","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914818","url":null,"abstract":"The management of product-related data in a uniform and consistent way is a big challenge for many manufacturing enterprises, especially the large ones, like DaimlerChrysler. So-called product data management (PDM) systems are a promising way to achieve this goal. For various reasons, PDM systems often sit on top of a relational DBMS, using it (more or less) as a simple record manager. User interactions with the PDM systems are translated into a series of SQL queries. This does not cause too much harm when the DBMS and PDM system are located in the same local area network, with high bandwidth and short latency times. The picture may change dramatically, however, if the users are working in geographically distributed environments. Response times may rise by orders of magnitude, e.g. from 1-2 minutes in the local context to 30 minutes and even more in the \"inter-continental\" context. This paper shows how a more sophisticated utilization of the (advanced) SQL features coming along with SQL:1999 can help to cut down response times significantly.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122236158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
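The latency problem described in the abstract above comes from issuing many small queries over a slow link. One way to see the effect is to compare a level-by-level expansion of a product structure with a single recursive query. The sketch below uses SQLite from Python; the `part_link` table, its contents, and the function names are hypothetical, and treating recursive queries as the relevant SQL:1999 feature is an assumption, since the abstract does not name the features used.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory parent/child table for a tiny product structure."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE part_link (parent TEXT, child TEXT)")
    conn.executemany(
        "INSERT INTO part_link VALUES (?, ?)",
        [("car", "engine"), ("car", "body"),
         ("engine", "piston"), ("engine", "valve")],
    )
    return conn


def expand_level_by_level(conn, root):
    """One query per visited node: each query is a client/server round trip."""
    found, frontier, queries = [], [root], 0
    while frontier:
        node = frontier.pop()
        rows = conn.execute(
            "SELECT child FROM part_link WHERE parent = ?", (node,)
        ).fetchall()
        queries += 1
        children = [row[0] for row in rows]
        found.extend(children)
        frontier.extend(children)
    return sorted(found), queries


def expand_recursive(conn, root):
    """The same expansion as a single recursive query (one round trip)."""
    rows = conn.execute(
        """
        WITH RECURSIVE sub(part) AS (
            SELECT child FROM part_link WHERE parent = ?
            UNION ALL
            SELECT l.child FROM part_link AS l JOIN sub AS s ON l.parent = s.part
        )
        SELECT part FROM sub
        """,
        (root,),
    ).fetchall()
    return sorted(row[0] for row in rows), 1


if __name__ == "__main__":
    db = build_demo_db()
    print(expand_level_by_level(db, "car"))
    print(expand_recursive(db, "car"))
```

Both functions return the same parts; the difference is the number of statements sent, which is what dominates response time when each round trip crosses a high-latency link.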
Measuring and optimizing a system for persistent database sessions
Proceedings 17th International Conference on Data Engineering Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914810
R. Barga, D. Lomet
{"title":"Measuring and optimizing a system for persistent database sessions","authors":"R. Barga, D. Lomet","doi":"10.1109/ICDE.2001.914810","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914810","url":null,"abstract":"High availability for both data and applications is rapidly becoming a business requirement. While database systems support recovery, providing high database availability, applications may still lose work because of server outages. When a server crashes, any volatile state associated with the application's database session is lost and the application may require an operator-assisted restart. This exposes server failures to end-users and always degrades application availability. Our Phoenix/ODBC system supports persistent database sessions that can survive a database crash without the application being aware of the outage, except for possible timing considerations. This improves application availability and eliminates the application programming needed to cope with database crashes. Phoenix/ODBC requires no changes to the database system, data access routines or applications. Hence, it can be deployed in any application that uses ODBC to access a database. Further, our generic approach can be exploited for a variety of data access protocols. In this paper, we describe the design of Phoenix/ODBC and introduce an extension to optimize the response time and to reduce overhead for OLTP workloads. We present a performance evaluation using the TPC-C and TPC-H benchmarks that demonstrate Phoenix/ODBC's extra overhead is modest.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131508968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Mining frequent itemsets with convertible constraints
Proceedings 17th International Conference on Data Engineering Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914856
J. Pei, Jiawei Han, L. Lakshmanan
{"title":"Mining frequent itemsets with convertible constraints","authors":"J. Pei, Jiawei Han, L. Lakshmanan","doi":"10.1109/ICDE.2001.914856","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914856","url":null,"abstract":"Recent work has highlighted the importance of the constraint based mining paradigm in the context of frequent itemsets, associations, correlations, sequential patterns, and many other interesting patterns in large databases. The authors study constraints which cannot be handled with existing theory and techniques. For example, avg(S) θ ν, median(S) θ ν, sum(S) θ ν (S can contain items of arbitrary values) (θ ∈ {≥, ≤}), are customarily regarded as \"tough\" constraints in that they cannot be pushed inside an algorithm such as Apriori. We develop a notion of convertible constraints and systematically analyze, classify, and characterize this class. We also develop techniques which enable them to be readily pushed deep inside the recently developed FP-growth algorithm for frequent itemset mining. Results from our detailed experiments show the effectiveness of the techniques developed.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128992169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 385
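A constraint such as avg(S) ≥ ν is neither monotone nor anti-monotone over arbitrary itemsets, but it becomes prunable once items are enumerated in a suitable order. The sketch below illustrates that ordering idea only: items are visited in value-descending order, so once a prefix falls below the threshold, appending further (no larger) items cannot bring the average back up. It ignores support counting and FP-growth entirely, and the function and variable names are hypothetical.

```python
def itemsets_with_min_avg(item_values, threshold):
    """Enumerate all non-empty itemsets whose average value is >= threshold.

    item_values maps item name -> numeric value. Items are sorted by value,
    largest first, so every extension of a prefix adds an item that is no
    larger than anything already chosen and can never raise the average.
    """
    items = sorted(item_values.items(), key=lambda kv: kv[1], reverse=True)
    results = []

    def extend(start, chosen, total):
        for i in range(start, len(items)):
            name, value = items[i]
            new_total = total + value
            new_chosen = chosen + [name]
            if new_total / len(new_chosen) < threshold:
                # Later items are no larger, so they (and all their
                # extensions) fail as well: prune the rest of this branch.
                break
            results.append(new_chosen)
            extend(i + 1, new_chosen, new_total)

    extend(0, [], 0.0)
    return results


if __name__ == "__main__":
    print(itemsets_with_min_avg({"a": 10, "b": 6, "c": 2}, threshold=6))
```

In this toy run only five of the seven non-empty itemsets are ever materialised as results, and the branches starting at `b, c` and `c` are cut without exploring their extensions.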
On dual mining: from patterns to circumstances, and back
Proceedings 17th International Conference on Data Engineering Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914828
G. Grahne, L. Lakshmanan, Xiaohong Wang, M. Xie
{"title":"On dual mining: from patterns to circumstances, and back","authors":"G. Grahne, L. Lakshmanan, Xiaohong Wang, M. Xie","doi":"10.1109/ICDE.2001.914828","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914828","url":null,"abstract":"Previous work on frequent item set mining has focused on finding all itemsets that are frequent in a specified part of a database. We motivate the dual question of finding under what circumstances a given item set satisfies a pattern of interest (e.g., frequency) in a database. Circumstances form a lattice that generalizes the instance lattice associated with datacube. Exploiting this, we adapt known cube algorithms and propose our own, minCirc, for mining the strongest (e.g., minimal) circumstances under which an itemset satisfies a pattern. Our experiments show that minCirc is competitive with the adapted algorithms. We motivate mining queries involving migration between item set and circumstance lattices and propose the notion of Armstrong Basis as a structure that provides efficient support for such migration queries, as well as a simple algorithm for computing it.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"5 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121016035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
Exactly-once semantics in a replicated messaging system
Proceedings 17th International Conference on Data Engineering Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914808
Yongqiang Huang, H. Garcia-Molina
{"title":"Exactly-once semantics in a replicated messaging system","authors":"Yongqiang Huang, H. Garcia-Molina","doi":"10.1109/ICDE.2001.914808","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914808","url":null,"abstract":"A wide-area distributed message delivery system can use replication to improve performance and availability. However, without safeguards, replicated messages may be delivered to a mobile device more than once, making the device's user repeat actions (e.g. making unnecessary phone calls, firing weapons repeatedly). Alternatively, they may not be delivered at all, making the user miss important messages. In this paper, we address the problem of exactly-once delivery to mobile clients when messages are replicated globally. We define exactly-once semantics and propose algorithms to guarantee it. We also propose and define a relaxed version of exactly-once semantics which is appropriate for limited-capability mobile devices. We study the relative performance of our algorithms compared to the weaker at-least-once semantics, and find that the performance overhead of exactly-once can be minimized in most cases by careful design of the system.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123736243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
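The duplicate-delivery problem described in the abstract above can be illustrated with a very small sketch: a client that remembers the identifiers of messages it has already accepted and ignores repeats arriving from other replicas. This is a generic deduplication idiom, not the algorithms proposed in the paper; the class, its methods, and the assumption that every message carries a globally unique ID are all hypothetical. It only suppresses duplicates, so it yields exactly-once behaviour only when combined with at-least-once retransmission by the servers.

```python
class DedupClient:
    """Accepts each message ID at most once, regardless of which replica sent it."""

    def __init__(self):
        self.seen_ids = set()   # IDs already handed to the user
        self.delivered = []     # payloads in delivery order

    def receive(self, message_id, payload):
        """Return True if the message was delivered, False if it was a duplicate."""
        if message_id in self.seen_ids:
            return False
        self.seen_ids.add(message_id)
        self.delivered.append(payload)
        return True


if __name__ == "__main__":
    client = DedupClient()
    # The same message arrives from two replicas.
    print(client.receive("m1", "call Alice"))
    print(client.receive("m1", "call Alice"))
    print(client.delivered)
```

The set of seen IDs grows without bound here, which is one reason a limited-capability device may need a relaxed guarantee rather than this naive bookkeeping.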
An index-based approach for similarity search supporting time warping in large sequence databases
Proceedings 17th International Conference on Data Engineering Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914875
Sang-Wook Kim, Sanghyun Park, W. Chu
{"title":"An index-based approach for similarity search supporting time warping in large sequence databases","authors":"Sang-Wook Kim, Sanghyun Park, W. Chu","doi":"10.1109/ICDE.2001.914875","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914875","url":null,"abstract":"This paper proposes a novel method for similarity search that supports time warping in large sequence databases. Time warping enables finding sequences with similar patterns even when they are of different lengths. Previous methods for processing similarity search that supports time warping fail to employ multi-dimensional indexes without false dismissal since the time warping distance does not satisfy the triangular inequality. Our primary goal is to innovate on search performance without permitting any false dismissal. To attain this goal, we devise a new distance function D_tw-lb that consistently underestimates the time warping distance and also satisfies the triangular inequality. D_tw-lb uses a 4-tuple feature vector that is extracted from each sequence and is invariant to time warping. For efficient processing of similarity search, we employ a multi-dimensional index that uses the 4-tuple feature vector as indexing attributes and D_tw-lb as a distance function. The extensive experimental results reveal that our method achieves significant speedup, up to 43 times with real-world S&P 500 stock data and up to 720 times with very large synthetic data.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129251051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 324
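The relationship between a time warping distance and a cheap lower bound computed from a warping-invariant feature vector can be sketched as follows. The abstract does not list the four features or define the bound, so this sketch assumes the features are the first, last, largest, and smallest elements and takes the bound to be the largest absolute difference between corresponding features. The time warping distance here is the common dynamic-programming form with summed absolute differences, which may differ from the paper's definition; all function names are hypothetical.

```python
def dtw_distance(s, q):
    """Time warping distance: minimum summed |s_i - q_j| over warping paths."""
    n, m = len(s), len(q)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(s[i - 1] - q[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def features(seq):
    """Assumed 4-tuple: (first, last, largest, smallest).
    Repeating elements (time warping) leaves all four unchanged."""
    return (seq[0], seq[-1], max(seq), min(seq))


def lower_bound(s, q):
    """L-infinity distance between feature vectors. Every warping path pairs
    first with first and last with last, and must pair the larger maximum
    (smaller minimum) with some element of the other sequence, so this
    value never exceeds dtw_distance(s, q)."""
    return max(abs(a - b) for a, b in zip(features(s), features(q)))


if __name__ == "__main__":
    s, q = [0, 5, 2], [1, 1, 1, 6]
    print(lower_bound(s, q), dtw_distance(s, q))
```

Because the bound is an L∞ distance between fixed-length vectors, it satisfies the triangular inequality, which is what allows a multi-dimensional index to filter candidates before the quadratic-time exact distance is computed.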
Rewriting OLAP queries using materialized views and dimension hierarchies in data warehouses
Proceedings 17th International Conference on Data Engineering Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914865
Chang-Sup Park, Myoung-Ho Kim, Yoon-Joon Lee
{"title":"Rewriting OLAP queries using materialized views and dimension hierarchies in data warehouses","authors":"Chang-Sup Park, Myoung-Ho Kim, Yoon-Joon Lee","doi":"10.1109/ICDE.2001.914865","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914865","url":null,"abstract":"OLAP queries involve a lot of aggregations on a large amount of data in data warehouses. To process expensive OLAP queries efficiently, we propose a new method for rewriting a given OLAP query using the various kinds of materialized aggregate views which already exist in data warehouses. We first define the normal forms of OLAP queries and materialized views based on the lattice of dimension hierarchies and the semantic information in data warehouses. Conditions for the usability of a materialized view in rewriting a given query are specified by relationships between the components of their normal forms. We present a rewriting algorithm for OLAP queries that effectively utilizes existing materialized views. The proposed algorithm can make use of materialized views having different selection granularities, selection regions and aggregation granularities together, to generate an efficient rewritten query.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125981912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 37
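The core idea of answering a coarse-grained aggregate query from a finer-grained materialized view can be shown in a few lines. The sketch below is an illustration under stated assumptions, not the rewriting algorithm of the paper: it uses plain Python dictionaries instead of SQL, a single day → month → year hierarchy encoded in date strings, and SUM as the only aggregate. The data and all names are hypothetical.

```python
from collections import defaultdict


def total_by(rows, key_of, measure="amount"):
    """Group rows by key_of(row) and sum the measure column."""
    totals = defaultdict(float)
    for row in rows:
        totals[key_of(row)] += row[measure]
    return dict(totals)


# Base fact table at day granularity.
sales = [
    {"day": "2001-03-07", "region": "EU", "amount": 10.0},
    {"day": "2001-03-20", "region": "US", "amount": 5.0},
    {"day": "2001-04-02", "region": "EU", "amount": 7.0},
    {"day": "2002-01-15", "region": "EU", "amount": 3.0},
]

# Materialized view: totals per (month, region), stored as rows.
monthly_view = [
    {"month": month, "region": region, "amount": amount}
    for (month, region), amount in total_by(
        sales, lambda r: (r["day"][:7], r["region"])
    ).items()
]

# Query "total per year", rewritten to roll up the view (month -> year)
# instead of scanning the base table.
year_from_view = total_by(monthly_view, lambda r: r["month"][:4])
year_from_base = total_by(sales, lambda r: r["day"][:4])

if __name__ == "__main__":
    print(year_from_view == year_from_base)
```

The roll-up is valid here because month is finer than year in the hierarchy and SUM is distributive; a view grouped by year could not be used to answer a per-month query.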
Bringing the Internet to your database: using SQL server 2000 and XML to build loosely-coupled systems
Proceedings 17th International Conference on Data Engineering Pub Date : 2001-03-07 DOI: 10.1109/ICDE.2001.914859
M. Rys
{"title":"Bringing the Internet to your database: using SQL server 2000 and XML to build loosely-coupled systems","authors":"M. Rys","doi":"10.1109/ICDE.2001.914859","DOIUrl":"https://doi.org/10.1109/ICDE.2001.914859","url":null,"abstract":"Loosely-coupled, distributed system architectures need to be flexible enough to allow individual components to join or leave the heterogeneous conglomerate of services and components and to change their internal design and data models without jeopardizing the whole architecture. A well-established approach is to use XML as the lingua franca for the integration layer that hides the heterogeneity among the components and provides the glue that allows the individual components to take part in the loosely integrated system. The article focuses on how to provide the basic technology to enable a relational database to become a component in such loosely-coupled systems and it provides an overview of the features that are needed to provide access via HTTP and XML.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133294532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 57