{"title":"A type-safe object-oriented solution for the dynamic construction of queries","authors":"Peter Rosenthal","doi":"10.1109/ICDE.2004.1320073","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320073","url":null,"abstract":"Many object-oriented applications use large numbers of structurally different database queries. With current technology, writing applications that generate queries at runtime is difficult and error-prone. FROQUE, a framework for object-oriented queries, provides a secure and purely object-oriented solution to access relational databases. As such, it is easy to use for object-oriented programmers and with the help of object-oriented compilers it guarantees that queries formulated in the object-oriented world at execution time result in correct SQL queries. Thus, FROQUE is an improvement over existing database frameworks such as Apache OJB, the object relational bridge, which are not strongly typed and can lead to runtime errors.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121653131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bitmap-tree indexing for set operations on free text","authors":"Ilias Nitsos, Georgios Evangelidis, D. Dervos","doi":"10.1109/ICDE.2004.1320067","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320067","url":null,"abstract":"Here we report on our implementation of a hybrid-indexing scheme (bitmap-tree) that combines the advantages of bitmap indexing and file inversion. The results we obtained are compared to those of the compressed inverted file index. Both storage overhead and query processing efficiency are taken into consideration. The proposed new method is shown to excel in handling queries involving set operations. For general-purpose user queries, the bitmap-tree is shown to perform as good as the compressed inverted file index.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132218395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approximate selection queries over imprecise data","authors":"Iosif Lazaridis, S. Mehrotra","doi":"10.1109/ICDE.2004.1319991","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1319991","url":null,"abstract":"We examine the problem of evaluating selection queries over imprecisely represented objects. Such objects are used either because they are much smaller in size than the precise ones (e.g., compressed versions of time series), or as imprecise replicas of fast-changing objects across the network (e.g., interval approximations for time-varying sensor readings). It may be impossible to determine whether an imprecise object meets the selection predicate. Additionally, the objects appearing in the output are also imprecise. Retrieving the precise objects themselves (at additional cost) can be used to increase the quality of the reported answer. We allow queries to specify their own answer quality requirements. We show how the query evaluation system may do the minimal amount of work to meet these requirements. Our work presents two important contributions: first, by considering queries with set-based answers, rather than the approximate aggregate queries over numerical data examined in the literature; second, by aiming to minimize the combined cost of both data processing and probe operations in a single framework. Thus, we establish that the answer accuracy/performance tradeoff can be realized in a more general setting than previously seen.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129171172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probe, cluster, and discover: focused extraction of QA-Pagelets from the deep Web","authors":"James Caverlee, Ling Liu, David J. Buttler","doi":"10.1109/ICDE.2004.1319988","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1319988","url":null,"abstract":"We introduce the concept of a QA-Pagelet to refer to the content region in a dynamic page that contains query matches. We present THOR, a scalable and efficient mining system for discovering and extracting QA-Pagelets from the deep Web. A unique feature of THOR is its two-phase extraction framework. In the first phase, pages from a deep Web site are grouped into distinct clusters of structurally-similar pages. In the second phase, pages from each page cluster are examined through a subtree filtering algorithm that exploits the structural and content similarity at subtree level to identify the QA-Pagelets.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131813972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Web-services architecture for efficient XML data exchange","authors":"S. Amer-Yahia, Y. Kotidis","doi":"10.1109/ICDE.2004.1320024","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320024","url":null,"abstract":"Business applications often exchange large amounts of enterprise data stored in legacy systems. The advent of XML as a standard specification format has improved applications interoperability. However, optimizing the performance of XML data exchange, in particular, when data volumes are large, is still in its infancy. Quite often, the target system has to undo some of the work the source did to assemble documents in order to map XML elements into its own data structures. This publish&map process is both resource and time consuming. In this paper, we develop a middle-tier Web services architecture to optimize the exchange of large XML data volumes. The key idea is to allow systems to negotiate the data exchange process using an extension to WSDL. The source (target) can specify document fragments that it is willing to produce (consume). Given these fragmentations, the middleware instruments the data exchange process between the two systems to minimize the number of necessary operations and optimize the distributed processing between the source and the target systems. We show that our new exchange paradigm outperforms publish&map and enables more flexible scenarios without necessitating substantial modifications to the underlying systems.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131584361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Denilson Barbosa, A. Mendelzon, L. Libkin, L. Mignet, M. Arenas
{"title":"Efficient incremental validation of XML documents","authors":"Denilson Barbosa, A. Mendelzon, L. Libkin, L. Mignet, M. Arenas","doi":"10.1109/ICDE.2004.1320036","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320036","url":null,"abstract":"We discuss incremental validation of XML documents with respect to DTDs and XML schema definitions. We consider insertions and deletions of subtrees, as opposed to leaf nodes only, and we also consider the validation of ID and IDREF attributes. For arbitrary schemas, we give a worst-case n log n time and linear space algorithm, and show that it often is far superior to revalidation from scratch. We present two classes of schemas, which capture most real-life DTDs, and show that they admit a logarithmic time incremental validation algorithm that, in many cases, requires only constant auxiliary space. We then discuss an implementation of these algorithms that is independent of, and can be customized for different storage mechanisms for XML. Finally, we present extensive experimental results showing that our approach is highly efficient and scalable.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115739248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Brecheisen, H. Kriegel, Peer Kröger, M. Pfeifle, Maximilian Viermetz, Marco Pötke
{"title":"BOSS: browsing OPTICS-plots for similarity search","authors":"S. Brecheisen, H. Kriegel, Peer Kröger, M. Pfeifle, Maximilian Viermetz, Marco Pötke","doi":"10.1109/ICDE.2004.1320086","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320086","url":null,"abstract":"An increasing number of database applications have emerged for which efficient and effective support for similarity search is substantial. Particularly, the task of finding similar shapes in 2D and 3D becomes more and more important. Examples for new applications that require the retrieval of similar 3D objects include databases for molecular biology, medical imaging and computer aided design. Hierarchical clustering was shown to be effective for evaluating similarity models. Furthermore, visually analyzing cluster hierarchies helps the user, e.g. an engineer, to find and group similar objects. We present an interactive browsing tool called BOSS (browsing OPTICS-plots for similarity search), which utilizes solid automatic cluster recognition and extraction of meaningful cluster representatives in order to provide the user with significant and quick information.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121785110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
I. Bruder, A. Zeitz, Holger Meyer, B. Hänsel, A. Heuer
{"title":"FLYINGDOC: an architecture for distributed, user-friendly, and personalized information systems","authors":"I. Bruder, A. Zeitz, Holger Meyer, B. Hänsel, A. Heuer","doi":"10.1109/ICDE.2004.1320078","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320078","url":null,"abstract":"The need for personal information management using distributed, user-friendly, and personalized document management systems is obvious. State of the art document management systems such as digital libraries provide support for the whole document lifecycle. To enhance such document management systems to get a personalized, distributed and user-friendly information system we present techniques for a simple import of collections, documents, and data, for generic and concrete data modeling, replication, and, personalization. These techniques were employed for the implementation of a personal conference assistant, which was used for the first time at the VLDB conference 2003 in Berlin, Germany. Our client-server architecture provides an information server with different services and different kinds of clients. These services comprise a distribution and replication service, a collection integration service, a data management unit, and, a query processing service.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116933605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SQLCM: a continuous monitoring framework for relational database engines","authors":"S. Chaudhuri, A. König, Vivek R. Narasayya","doi":"10.1109/ICDE.2004.1320020","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320020","url":null,"abstract":"The ability to monitor a database server is crucial for effective database administration. Today's commercial database systems support two basic mechanisms for monitoring: (a) obtaining a snapshot of counters to capture current state, and (b) logging events in the server to a table/file to capture history. We show that for a large class of important database administration tasks the above mechanisms are inadequate in functionality or performance. We present an infrastructure called SQLCM that enables continuous monitoring inside the database server and that has the ability to automatically take actions based on monitoring. We describe the implementation of SQLCM in Microsoft SQL Server and show how several common and important monitoring tasks can be easily specified in SQLCM. Our experimental evaluation indicates that SQLCM imposes low overhead on normal server execution end enables monitoring tasks on a production server that would be too expensive using today's monitoring mechanisms.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117069682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proving ownership over categorical data","authors":"R. Sion","doi":"10.1109/ICDE.2004.1320029","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320029","url":null,"abstract":"This paper introduces a novel method of rights protection for categorical data through watermarking. We discover new watermark embedding channels for relational data with categorical types. We design novel watermark encoding algorithms and analyze important theoretical bounds including mark vulnerability. While fully preserving data quality requirements, our solution survives important attacks, such as subset selection and random alterations. Mark detection is fully \"blind\" in that it doesn't require the original data, an important characteristic especially in the case of massive data. We propose various improvements and alternative encoding methods. We perform validation experiments by watermarking the outsourced Wal-Mart sales data available at our institute. We prove (experimentally and by analysis) our solution to be extremely resilient to both alteration and data loss attacks, for example tolerating up to 80% data loss with a watermark alteration of only 25%.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125917393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}