Proceedings. 20th International Conference on Data Engineering: Latest Publications

Algebraic signatures for scalable distributed data structures
Pub Date: 2004-03-30 | DOI: 10.1109/ICDE.2004.1320015
W. Litwin, T. Schwarz
{"title":"Algebraic signatures for scalable distributed data structures","authors":"W. Litwin, T. Schwarz","doi":"10.1109/ICDE.2004.1320015","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320015","url":null,"abstract":"Signatures detect changes to data objects. Numerous schemes are in use, especially the cryptographically secure standards SHA-1. We propose a novel signature scheme which we call algebraic signatures. The scheme uses the Galois field calculations. Its major property is the sure detection of any changes up to a parameterized size. More precisely, we detect for sure any changes that do not exceed n-symbols for an n-symbol algebraic signature. This property is new for any known signature scheme. For larger changes, the collision probability is typically negligible, as for the other known schemes. We apply the algebraic signatures to the scalable distributed data structures (SDDS). We filter at the SDDS client node the updates that do not actually change the records. We also manage the concurrent updates to data stored in the SDDS RAM buckets at the server nodes. We further use the scheme for the fast disk backup of these buckets. We sign our objects with 4-byte signatures, instead of 20-byte standard SHA-1 signatures. Our algebraic calculus is then also about twice as fast.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"197 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121149513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 53
Continuously maintaining quantile summaries of the most recent N elements over a data stream
Pub Date: 2004-03-30 | DOI: 10.1109/ICDE.2004.1320011
Xuemin Lin, Hongjun Lu, Jian Xu, J. Yu
{"title":"Continuously maintaining quantile summaries of the most recent N elements over a data stream","authors":"Xuemin Lin, Hongjun Lu, Jian Xu, J. Yu","doi":"10.1109/ICDE.2004.1320011","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320011","url":null,"abstract":"Statistics over the most recently observed data elements are often required in applications involving data streams, such as intrusion detection in network monitoring, stock price prediction in financial markets, Web log mining for access prediction, and user click stream mining for personalization. Among various statistics, computing quantile summary is probably most challenging because of its complexity. We study the problem of continuously maintaining quantile summary of the most recently observed N elements over a stream so that quantile queries can be answered with a guaranteed precision of /spl epsiv/N. We developed a space efficient algorithm for predefined N that requires only one scan of the input data stream and O(log(/spl epsiv//sup 2/N)//spl epsiv/+1//spl epsiv//sup 2/) space in the worst cases. We also developed an algorithm that maintains quantile summaries for most recent N elements so that quantile queries on any most recent n elements (n /spl les/ N) can be answered with a guaranteed precision of /spl epsiv/n. The worst case space requirement for this algorithm is only O(log/sup 2/(/spl epsiv/N)//spl epsiv//sup 2/). Our performance study indicated that not only the actual quantile estimation error is far below the guaranteed precision but the space requirement is also much less than the given theoretical bound.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"358 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122746675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 91
Database kernel research: what, if anything, is left to do?
Pub Date: 2004-03-30 | DOI: 10.1109/ICDE.2004.1320095
D. Lomet
{"title":"Database kernel research: what, if anything, is left to do?","authors":"D. Lomet","doi":"10.1109/ICDE.2004.1320095","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320095","url":null,"abstract":"Data Engineering deals with the use of engineering techniques and methodologies in the design, development and assessment of information systems for different computing platforms and application environments. The 20th International Conference on Data Engineering will be held in Boston, Massachusetts, USA-an academic and technological center with a variety of historical and cultural attractions of international prominence within walking distance.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133814423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
BEA Liquid Data for WebLogic: XML-based enterprise information integration
Pub Date: 2004-03-30 | DOI: 10.1109/ICDE.2004.1320051
M. Carey
{"title":"BEA liquid data for WebLogic: XML-based enterprise information integration","authors":"M. Carey","doi":"10.1109/ICDE.2004.1320051","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320051","url":null,"abstract":"This presentation provides a technical overview of BEA Liquid Data for WebLogic, a relatively new product from BEA Systems that provides enterprise information integration capabilities to enterprise applications that are built and deployed using the BEA WebLogic Platform. Liquid Data takes an XML-centric approach to tackling the long-standing problem of integrating data from disparate data sources and making that information easily accessible to applications. In particular, Liquid Data uses the forthcoming XQuery language standard as the basis for defining integrated views of enterprise data and querying over those views. We provide a brief overview of the Liquid Data product architecture and then discuss some of the query processing technology that lies at the heart of the product.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115196792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Using stream semantics for continuous queries in media stream processors
Pub Date: 2004-03-30 | DOI: 10.1109/ICDE.2004.1320083
Amarnath Gupta, B. Liu, Pilho Kim, R. Jain
{"title":"Using stream semantics for continuous queries in media stream processors","authors":"Amarnath Gupta, B. Liu, Pilho Kim, R. Jain","doi":"10.1109/ICDE.2004.1320083","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320083","url":null,"abstract":"In the case of media and feature streams, explicit inter-stream constraints exist and can be exploited in the evaluation of continuous queries in the spirit of semantic query optimization. We express these properties using a media stream declaration language MSDL. In the demonstration, we present IMMERSI-MEET, an application built around an immersive environment. The IMMERSI-MEET system distinguishes between continuous streams, where values of different types come at a specified data rate, and discrete streams where sources push values intermittently. In MSDL, any dependence declaration must have at least one dependency specifying predicate in the body. As stream declarations are registered, the stream constraints are interpreted to construct a set of evaluation directives.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123361791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Meta data management
Pub Date: 2004-03-30 | DOI: 10.1109/ICDE.2004.1320101
Philip A. Bernstein, Sergey Melnik
{"title":"Meta data management","authors":"Philip A. Bernstein, Sergey Melnik","doi":"10.1109/ICDE.2004.1320101","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320101","url":null,"abstract":"By meta data management, we mean techniques for manipulating schemas and schema-like objects (such as interface definitions and web site maps) and mappings between them. Work on meta data problems goes back to at least the early 1970s, when data translation was the hot database research topic, even before relational databases caught on. Many popular research problems in the past five years are primarily meta data problems, such as data warehouse tools (e.g., ETL – to extract, transform and load), data integration, the semantic web, generation of XML or object-oriented wrappers for SQL databases, and generation of wrappers for web sites. Other classical meta data problems are information resource management, design tool support and integration, and schema evolution and data migration. Despite its longevity and continued importance, there is no widely-accepted conceptual framework for the meta data field, as there is for many other database topics, such as access methods, query processing, and transaction management. In this seminar, we propose such a conceptual framework. It consists of three layers: applications, design patterns, and basic operators. Applications are the end-user problems to be solved, like those listed in the previous paragraph. Design patterns are generic problems that need to be solved in support of many different applications, such as meta modeling (for all meta data problems), answering queries using views (for data integration and the semantic web), and change propagation (for data translation, schema evolution, and round-trip engineering). Basic operators are procedures that are needed to support multiple design patterns and applications, such as matching schemas to produce a mapping, merging schemas based on a mapping, and composing mappings. We will describe several meta data management problems, and for each, we will explain the design patterns and operators that are needed to solve it. We will summarize the main approaches to each design pattern and operator – the main choices of language, data structures, and algorithms – and will highlight the relevant papers that address it. This seminar is targeted at both practicing engineers and researchers. The former will learn about the latest solutions to important meta data problems and the many difficult unsolved problems that are best to avoid. Database researchers, especially professors, will benefit from considering the conceptual framework that we propose, since no database textbooks treat meta data management as a separate topic as far as we know.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122740915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 91
Scalable multimedia disk scheduling
Pub Date: 2004-03-30 | DOI: 10.1109/ICDE.2004.1320022
M. Mokbel, Walid G. Aref, Khaled M. Elbassioni, I. Kamel
{"title":"Scalable multimedia disk scheduling","authors":"M. Mokbel, Walid G. Aref, Khaled M. Elbassioni, I. Kamel","doi":"10.1109/ICDE.2004.1320022","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320022","url":null,"abstract":"A new multimedia disk-scheduling algorithm, termed Cascaded-SFC, is presented. The Cascaded-SFC multimedia disk scheduler is applicable in environments where multimedia data requests arrive with different quality of service (QoS) requirements such as real-time deadline and user priority. Previous work on disk scheduling has focused on optimizing the seek times and/or meeting the real-time deadlines. The Cascaded-SFC disk scheduler provides a unified framework for multimedia disk scheduling that scales with the number of scheduling parameters. The general idea is based on modeling the multimedia disk requests as points in multiple multidimensional subspaces, where each of the dimensions represents one of the parameters (e.g., one dimension represents the request deadline, another represents the disk cylinder number, and a third dimension represents the priority of the request, etc.). Each multidimensional subspace represents a subset of the QoS parameters that share some common scheduling characteristics. Then the multimedia disk scheduling problem reduces to the problem of finding a linear order to traverse the multidimensional points in each subspace. Multiple space-filling curves are selected to fit the scheduling needs of the QoS parameters in each subspace. The orders in each subspace are integrated in a cascaded way to provide a total order for the whole space. Comprehensive experiments demonstrate the efficiency and scalability of the Cascaded-SFC disk scheduling algorithm over other disk schedulers.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129703912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
NEXSORT: sorting XML in external memory
Pub Date: 2004-03-30 | DOI: 10.1109/ICDE.2004.1320038
Adam Silberstein, Jun Yang
{"title":"NEXSORT: sorting XML in external memory","authors":"Adam Silberstein, Jun Yang","doi":"10.1109/ICDE.2004.1320038","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320038","url":null,"abstract":"XML plays an important role in delivering data over the Internet, and the need to store and manipulate XML in its native format has become increasingly relevant. This growing need necessitates work on developing native XML operators, especially for one as fundamental as sort. We present NEXSORT, an algorithm that leverages the hierarchical nature of XML to efficiently sort an XML document in external memory. In a fully sorted XML document, children of every nonleaf element are ordered according to a given sorting criterion. Among NEXSORT's uses is in combination with structural merge as the XML version of sort-merge join, which allows us to merge large XML documents using only a single pass once they are sorted. The hierarchical structure of an XML document limits the number of possible legal orderings among its elements, which means that sorting XML is fundamentally \"easier\" than sorting a flat file. We prove that the I/O lower bound for sorting XML in external memory is /spl Theta/(max{n,nlog/sub m/(k/B)}), where n is the number of blocks in the input XML document, m is the number of main memory blocks available for sorting, B is the number of elements that can fit in one block, and k is the maximum fan-out of the input document tree. We show that NEXSORT performs within a constant factor of this theoretical lower bound. In practice we demonstrate, even with a naive implementation, NEXSORT significantly outperforms a regular external merge sort of all elements by their key paths, unless the XML document is nearly flat, in which case NEXSORT degenerates essentially to external merge sort.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116938160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Recursive XML schemas, recursive XML queries, and relational storage: XML-to-SQL query translation
Pub Date: 2004-03-30 | DOI: 10.1109/ICDE.2004.1319983
R. Krishnamurthy, Venkatesan T. Chakaravarthy, R. Kaushik, J. Naughton
{"title":"Recursive XML schemas, recursive XML queries, and relational storage: XML-to-SQL query translation","authors":"R. Krishnamurthy, Venkatesan T. Chakaravarthy, R. Kaushik, J. Naughton","doi":"10.1109/ICDE.2004.1319983","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1319983","url":null,"abstract":"We consider the problem of translating XML queries into SQL when XML documents have been stored in an RDBMS using a schema-based relational decomposition. Surprisingly, there is no published XML-to-SQL query translation algorithm for this scenario that handles recursive XML schemas. We present a generic algorithm to translate path expression queries into SQL in the presence of recursion in the schema and queries. This algorithm handles a general class of XML-to-relational mappings, which includes all techniques proposed in literature. Some of the salient features of this algorithm are: (i) It translates a path expression query into a single SQL query, irrespective of how complex the XML schema is, (ii) It uses the \"with\" clause in SQL99 to handle recursive queries even over nonrecursive schemas, (iii) It reconstructs recursive XML subtrees with a single SQL query and (iv) It shows that the support for linear recursion in SQL99 is sufficient for handling path expression queries over arbitrarily complex recursive XML schema.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134573817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 72
An efficient algorithm for mining frequent sequences by a new strategy without support counting
Pub Date: 2004-03-30 | DOI: 10.1109/ICDE.2004.1320012
D. Chiu, Yi-Hung Wu, Arbee L. P. Chen
{"title":"An efficient algorithm for mining frequent sequences by a new strategy without support counting","authors":"D. Chiu, Yi-Hung Wu, Arbee L. P. Chen","doi":"10.1109/ICDE.2004.1320012","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320012","url":null,"abstract":"Mining sequential patterns in large databases is an important research topic. The main challenge of mining sequential patterns is the high processing cost due to the large amount of data. We propose a new strategy called direct sequence comparison (abbreviated as DISC), which can find frequent sequences without having to compute the support counts of nonfrequent sequences. The main difference between the DISC strategy and the previous works is the way to prune nonfrequent sequences. The previous works are based on the antimonotone property, which prune the nonfrequent sequences according to the frequent sequences with shorter lengths. On the contrary, the DISC strategy prunes the nonfrequent sequences according to the other sequences with the same length. Moreover, we summarize three strategies used in the previous works and design an efficient algorithm called DISC-all to take advantages of all the four strategies. The experimental results show that the DISC-all algorithm outperforms the PrefixSpan algorithm on mining frequent sequences in large databases. In addition, we analyze these strategies to design the dynamic version of our algorithm, which achieves a much better performance.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133138604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 90