21st International Conference on Data Engineering (ICDE'05): Latest Publications

Odysseus: a high-performance ORDBMS tightly-coupled with IR features
21st International Conference on Data Engineering (ICDE'05) Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.95
K. Whang, Min-Jae Lee, Jae-Gil Lee, Min-Soo Kim, Wook-Shin Han
Abstract: We propose the notion of tight-coupling [K. Whang et al., (1999)] to add new data types into the DBMS engine. In this paper, we introduce the Odysseus ORDBMS and present its tightly-coupled IR features (US patented). We demonstrate a Web search engine capable of managing 20 million Web pages in a non-parallel configuration using Odysseus.
Citations: 30
Reverse nearest neighbors in large graphs
21st International Conference on Data Engineering (ICDE'05) Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.124
Man Lung Yiu, D. Papadias, N. Mamoulis, Yufei Tao
Abstract: A reverse nearest neighbor query returns the data objects that have a query point as their nearest neighbor. Although such queries have been studied quite extensively in Euclidean spaces, there is no previous work in the context of large graphs. In this paper, we propose algorithms and optimization techniques for RNN queries by utilizing some characteristics of networks.
Citations: 43
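The query definition in the abstract above (a data object belongs to the answer if the query point is its nearest neighbor) can be illustrated with a brute-force sketch. This is not the paper's algorithm: it simply runs Dijkstra's algorithm from every data node, which the paper's optimizations presumably aim to avoid. The adjacency-list layout, the function names, and the toy graph are all assumptions made for illustration.

```python
import heapq

INF = float("inf")

def shortest_distances(graph, source):
    """Dijkstra over an adjacency list {node: [(neighbor, weight), ...]}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, INF):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def reverse_nearest_neighbors(graph, data_nodes, query):
    """Return the data nodes whose nearest neighbor (by network distance) is the query."""
    result = []
    for p in data_nodes:
        dist = shortest_distances(graph, p)
        d_query = dist.get(query, INF)
        # Distance from p to its closest competing data node (hypothetical tie rule: ties favor the query).
        d_other = min((dist.get(o, INF) for o in data_nodes if o != p), default=INF)
        if d_query < INF and d_query <= d_other:
            result.append(p)
    return result

# Toy undirected graph: a -1- b -1- q -4- c -1- d
graph = {
    "a": [("b", 1)],
    "b": [("a", 1), ("q", 1)],
    "q": [("b", 1), ("c", 4)],
    "c": [("q", 4), ("d", 1)],
    "d": [("c", 1)],
}
print(reverse_nearest_neighbors(graph, ["a", "c", "d"], "q"))  # ['a']
```

In the toy graph, `c` and `d` are each other's nearest neighbors, so only `a` has the query node `q` as its nearest neighbor.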
Data privacy through optimal k-anonymization
21st International Conference on Data Engineering (ICDE'05) Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.42
R. Bayardo, R. Agrawal
Abstract: Data de-identification reconciles the demand for release of data for research purposes and the demand for privacy from individuals. This paper proposes and evaluates an optimization algorithm for the powerful de-identification procedure known as k-anonymization. A k-anonymized dataset has the property that each record is indistinguishable from at least k - 1 others. Even simple restrictions of optimized k-anonymity are NP-hard, leading to significant computational challenges. We present a new approach to exploring the space of possible anonymizations that tames the combinatorics of the problem, and develop data-management strategies to reduce reliance on expensive operations such as sorting. Through experiments on real census data, we show the resulting algorithm can find optimal k-anonymizations under two representative cost measures and a wide range of k. We also show that the algorithm can produce good anonymizations in circumstances where the input data or input parameters preclude finding an optimal solution in reasonable time. Finally, we use the algorithm to explore the effects of different coding approaches and problem variations on anonymization quality and performance. To our knowledge, this is the first result demonstrating optimal k-anonymization of a non-trivial dataset under a general model of the problem.
Citations: 1327
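The k-anonymity property stated in the abstract above (each record is indistinguishable from at least k - 1 others) can be checked with a few lines of code. The sketch below only verifies the property; it is not the optimization algorithm the paper proposes. It assumes that "indistinguishable" means sharing the same values on a chosen set of quasi-identifier columns, a notion the abstract does not spell out, and the column names and records are invented for illustration.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs in at least k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical, already-generalized records (age ranges, truncated ZIP codes).
records = [
    {"age": "30-39", "zip": "940**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "940**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "941**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "941**", "diagnosis": "diabetes"},
]

print(is_k_anonymous(records, ["age", "zip"], 2))  # True
print(is_k_anonymous(records, ["age", "zip"], 3))  # False
```

Each (age, zip) group here contains exactly two records, so the table is 2-anonymous but not 3-anonymous under the assumed quasi-identifiers.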
Filter based directory replication and caching
21st International Conference on Data Engineering (ICDE'05) Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.67
Apurva Kumar
Abstract: This paper describes a novel filter based replication model for lightweight directory access protocol (LDAP) directories. Instead of replicating entire subtrees from the directory information tree (DIT), only entries matching a filter specification are replicated. Advantages of the filter based replication framework over existing subtree based mechanisms have been demonstrated for a real enterprise directory using real workloads.
Citations: 3
On the optimal ordering of maps and selections under factorization
21st International Conference on Data Engineering (ICDE'05) Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.97
Thomas Neumann, S. Helmer, G. Moerkotte
Abstract: The query optimizer of a database system is confronted with two aspects when handling user-defined functions (UDFs) in query predicates: the vast differences in evaluation costs between UDFs (and other functions) and multiple calls of the same (expensive) UDF. The former is dealt with by ordering the evaluation of the predicates optimally, the latter by identifying common subexpressions and thereby avoiding costly recomputation. Current approaches order n predicates optimally (neglecting factorization) in O(n log n). Their result may deviate significantly from the optimal solution under factorization. We formalize the problem of finding optimal orderings under factorization and prove that it is NP-hard. Furthermore, we show how to improve on the run time of the brute-force algorithm (which computes all possible orderings) by presenting different enhanced algorithms. Although in the worst case these algorithms obviously still behave exponentially, our experiments demonstrate that for real-life examples their performance is much better.
Citations: 15
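The abstract above mentions that predicates can be ordered optimally in O(n log n) when factorization is neglected. One well-known way to do this, assumed here because the abstract does not name the technique, is to sort predicates by rank = (selectivity - 1) / cost. The sketch below assumes independent predicates and no shared subexpressions; the predicate names, costs, and selectivities are invented, and the brute-force comparison is only a sanity check on this small example.

```python
from itertools import permutations

def expected_cost(order):
    """Expected per-tuple cost: a predicate is evaluated only if all earlier ones passed."""
    total, pass_prob = 0.0, 1.0
    for _name, cost, selectivity in order:
        total += pass_prob * cost
        pass_prob *= selectivity
    return total

def order_by_rank(predicates):
    """Sort by rank = (selectivity - 1) / cost, ascending; a single O(n log n) sort."""
    return sorted(predicates, key=lambda p: (p[2] - 1.0) / p[1])

# Hypothetical predicates: (name, cost per evaluation, selectivity).
predicates = [
    ("expensive_udf", 100.0, 0.5),
    ("mid_cost_check", 10.0, 0.9),
    ("cheap_filter", 1.0, 0.1),
]

ordered = order_by_rank(predicates)
print([name for name, _, _ in ordered])

# Sanity check against the brute-force search over all orderings.
best = min(permutations(predicates), key=expected_cost)
assert abs(expected_cost(ordered) - expected_cost(best)) < 1e-9
```

Once the same expensive UDF appears in several predicates, the cost of a predicate depends on what was evaluated before it, so this simple sort no longer applies; that is the setting the paper shows to be NP-hard.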
Adaptive overlapped declustering: a highly available data-placement method balancing access load and space utilization
21st International Conference on Data Engineering (ICDE'05) Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.16
Akitsugu Watanabe, H. Yokota
Abstract: This paper proposes a new data-placement method named adaptive overlapped declustering, which can be applied to a parallel storage system using a value range partitioning-based distributed directory and primary-backup data replication, to improve the space utilization by balancing their access loads. The proposed method reduces data skews generated by data migration for balancing access load. While some data-placement methods capable of balancing access load or reducing data skew have been proposed, none satisfies both requirements simultaneously. The proposed method also improves the reliability and availability of the system because it reduces recovery time for damaged backups after a disk failure. The method achieves this acceleration by reducing a large amount of network communications and disk I/O. Mathematical analysis shows the efficiency of space utilization under skewed access workloads. Queuing simulations demonstrated that the proposed method halves backup restoration time, compared with the traditional chained declustering method.
Citations: 18
Adaptive process management with ADEPT2
21st International Conference on Data Engineering (ICDE'05) Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.17
M. Reichert, S. Rinderle-Ma, U. Kreher, P. Dadam
Abstract: In the ADEPT project we have been working on the design and implementation of next generation process management software. Based on a conceptual framework for dynamic process changes, on novel process support functions, and on advanced implementation concepts, the developed system enables the realization of adaptive, process-aware information systems (PAIS). Basically, process changes can take place at the type as well as the instance level: changes of single process instances may have to be carried out in an ad-hoc manner and must not affect system robustness and consistency. Process type changes, in turn, must be quickly accomplished in order to adapt the PAIS to business process changes. ADEPT2 offers powerful concepts for modeling, analyzing, and verifying process schemes. Particularly, it ensures schema correctness, like the absence of deadlock-causing cycles or erroneous data flows. This, in turn, constitutes an important prerequisite for dynamic process changes as well. ADEPT2 supports both ad-hoc changes of single process instances and the propagation of process type changes to running instances.
Citations: 187
Corpus-based schema matching
21st International Conference on Data Engineering (ICDE'05) Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.39
J. Madhavan, P. Bernstein, A. Doan, A. Halevy
Abstract: Schema matching is the problem of identifying corresponding elements in different schemas. Discovering these correspondences or matches is inherently difficult to automate. Past solutions have proposed a principled combination of multiple algorithms. However, these solutions sometimes perform rather poorly due to the lack of sufficient evidence in the schemas being matched. In this paper we show how a corpus of schemas and mappings can be used to augment the evidence about the schemas being matched, so they can be matched better. Such a corpus typically contains multiple schemas that model similar concepts and hence enables us to learn variations in the elements and their properties. We exploit such a corpus in two ways. First, we increase the evidence about each element being matched by including evidence from similar elements in the corpus. Second, we learn statistics about elements and their relationships and use them to infer constraints that we use to prune candidate mappings. We also describe how to use known mappings to learn the importance of domain and generic constraints. We present experimental results that demonstrate corpus-based matching outperforms direct matching (without the benefit of a corpus) in multiple domains.
Citations: 435
An enhanced query model for soccer video retrieval using temporal relationships
21st International Conference on Data Engineering (ICDE'05) Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.20
Shu‐Ching Chen, M. Shyu, Na Zhao
Abstract: The focal goal of our research is to develop a general framework which can automatically analyze the sports video, detect the sports events, and finally offer an efficient and user-friendly system for sports video retrieval. In our earlier work, a novel multimedia data mining technique was proposed for automatic soccer event extraction by adopting multimodal feature analysis. Until now, this framework has been applied to the detection of goal and corner kick events and the results are quite impressive. Correspondingly, in this work, the detected video events are modeled and effectively stored in the database. A temporal query model is designed to satisfy the comprehensive temporal query requirements, and the corresponding graphical query language is developed. The advanced characteristics make our model particularly well suited for searching events in a large scale video database.
Citations: 20
Exploiting correlated attributes in acquisitional query processing
21st International Conference on Data Engineering (ICDE'05) Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.63
A. Deshpande, Carlos Guestrin, W. Hong, S. Madden
Abstract: Sensor networks and other distributed information systems (such as the Web) must frequently access data that has a high per-attribute acquisition cost, in terms of energy, latency, or computational resources. When executing queries that contain several predicates over such expensive attributes, we observe that it can be beneficial to use correlations to automatically introduce low-cost attributes whose observation will allow the query processor to better estimate the selectivity of these expensive predicates. In particular, we show how to build conditional plans that branch into one or more sub-plans, each with a different ordering for the expensive query predicates, based on the runtime observation of low-cost attributes. We frame the problem of constructing the optimal conditional plan for a given user query and set of candidate low-cost attributes as an optimization problem. We describe an exponential time algorithm for finding such optimal plans, and describe a polynomial-time heuristic for identifying conditional plans that perform well in practice. We also show how to compactly model conditional probability distributions needed to identify correlations and build these plans. We evaluate our algorithms against several real-world sensor-network data sets, showing several-times performance increases for a variety of queries versus traditional optimization techniques.
Citations: 138