{"title":"ContextMetrics/sup /spl trade//: semantic and syntactic interoperability in cross-border trading systems","authors":"Chito Jovellanos","doi":"10.1109/ICDE.2004.1320053","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320053","url":null,"abstract":"We describe a method and system for quantifying the variances in the semantics and syntax of electronic transactions exchanged between business counterparties. ContextMetrics/sup /spl trade// enables (a) dynamic transformations of outbound and inbound transactions needed to effect 'straight-through-processing' (STP); (b) unbiased assessments of counterparty systems' capabilities to support STP; and (c) modeling of operational risks and financial exposures stemming from an enterprise's transactional systems.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130026961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stream query processing for healthcare bio-sensor applications","authors":"Chung-Min Chen, H. Agrawal, M. Cochinwala, D. Rosenbluth","doi":"10.1109/ICDE.2004.1320048","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320048","url":null,"abstract":"The need of a data stream management system (DSMS), with the capability of querying continuous data streams, has been well understood by the database research community. We provide an overview on a DSMS prototype called T2. T2 inherits some of the concepts of an early prototype, Tribeca [M. Sullivan et al. (1998)], developed also at Telcordia, but with complete new design and implementation in Java with an SQL-like query language. Our goal is to build a framework that provides a programming infrastructure as well as useful operators to support stream processing in different applications. We set our first targeted application to healthcare biosensor networks, where we applied T2 to monitoring and analyzing electrocardiogram (ECG) data streams, arriving via wireless networks from mobile subjects wearing ECG sensors. Monitoring remote patients via wireless sensors not only provides convenience and safety assurance to the patients, but also saves health care cost in many aspects.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116986137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approximate temporal aggregation","authors":"Yufei Tao, D. Papadias, C. Faloutsos","doi":"10.1109/ICDE.2004.1319996","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1319996","url":null,"abstract":"Temporal aggregate queries retrieve summarized information about records with time-evolving attributes. Existing approaches have at least one of the following shortcomings: (i) they incur large space requirements, (ii) they have high processing cost and (iii) they are based on complex structures, which are not available in commercial systems. We solve these problems by approximation techniques with bounded error. We propose two methods: the first one is based on multiversion B-trees and has logarithmic worst-case query cost, while the second technique uses off-the-shelf B- and R-trees, and achieves the same performance in the expected case. We experimentally demonstrate that the proposed methods consume an order of magnitude less space than their competitors and are significantly faster, even for cases that the permissible error bound is very small.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"4 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123731980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data mining for intrusion detection: techniques, applications and systems","authors":"J. Pei, S. Upadhyaya, F. Farooq, V. Govindaraju","doi":"10.1109/ICDE.2004.1320103","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320103","url":null,"abstract":"An intrusion is defined as any set of actions that compromise the integrity, confidentiality or availability of a resource. Intrusion detection is an important task for information infrastructure security. One major challenge in intrusion detection is that we have to identify the camouflaged intrusions from a huge amount of normal communication activities. Data mining is to identify valid, novel, potentially useful, and ultimately understandable patterns in massive data. It is demanding to apply data mining techniques to detect various intrusions. In the last several years, some exciting and important advances have been made in intrusion detection using data mining techniques. Research results have been published and some prototype systems have been established. Inspired by the huge demands from applications, the interactions and collaborations between the communities of security and data mining have been boosted substantially. This seminar will present an interdisciplinary survey of data mining techniques for intrusion detection so that the researchers from computer security and data mining communities can share the experiences and learn from each other. Some data mining based intrusion detection systems will also be reviewed briefly. Moreover, research challenges and problems will be discussed so that future collaborations may be stimulated. For data mining/database researchers and practitioners, the seminar will provide background knowledge and opportunities for applying data mining techniques to intrusion detection and computer security. For computer security researchers and practitioners, it provides knowledge on how data mining can benefit and enhance computer security. We will try to understand and appreciate the following technical issues.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122773524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RACCOON: a peer-based system for data integration and sharing","authors":"Chen Li, Jia Li, Qi Zhong","doi":"10.1109/ICDE.2004.1320081","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320081","url":null,"abstract":"Recent database applications see the emerging need to support data integration in distributed, peer-to-peer environments, in which autonomous peers (sources) connected by a network are willing to exchange data and services with each other. To address related research challenges, we are developing a system called \"RACCOON\", which allows different sources to integrate and share their data. We use an application to show several important features of the RACCOON system. The system also suggests semantic mappings for the user to choose. We show the two different querying modes, particularly how a query is expanded using the semantic mappings in the extended querying mode to compute as many answers to the query as possible.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122895643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"wmdb.*: rights protection for numeric relational data","authors":"R. Sion, M. Atallah, Sunil Prabhakar","doi":"10.1109/ICDE.2004.1320091","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320091","url":null,"abstract":"We introduce wmdb.*, a solution for numeric relational data rights protection through watermarking. Rights protection for relational data is important in areas where sensitive, valuable content is to be outsourced. A good example is a data mining application, where data is sold in pieces to parties specialized in mining it. We show how various higher level semantic constraints such as classification preservation and maximum absolute change bounds are naturally handled and how random alteration attacks are well survived.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130203708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved file synchronization techniques for maintaining large replicated collections over slow networks","authors":"Torsten Suel, P. Noel, Dimitre Trendafilov","doi":"10.1109/ICDE.2004.1319992","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1319992","url":null,"abstract":"We study the problem of maintaining large replicated collections of files or documents in a distributed environment with limited bandwidth. This problem arises in a number of important applications, such as synchronization of data between accounts or devices, content distribution and Web caching networks, Web site mirroring, storage networks, and large scale Web search and mining. At the core of the problem lies the following challenge, called the file synchronization problem: given two versions of a file on different machines, say an outdated and a current one, how can we update the outdated version with minimum communication cost, by exploiting the significant similarity between the versions? While a popular open source tool for this problem called rsync is used in hundreds of thousands of installations, there have been only very few attempts to improve upon this tool in practice. We propose a framework for remote file synchronization and describe several new techniques that result in significant bandwidth savings. Our focus is on applications where very large collections have to be maintained over slow connections. We show that a prototype implementation of our framework and techniques achieves significant improvements over rsync. As an example application, we focus on the efficient synchronization of very large Web page collections for the purpose of search, mining, and content distribution.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130458580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PRIX: indexing and querying XML using prufer sequences","authors":"P. Rao, Bongki Moon","doi":"10.1109/ICDE.2004.1320005","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320005","url":null,"abstract":"We propose a new way of indexing XML documents and processing twig patterns in an XML database. Every XML document in the database can be transformed into a sequence of labels by Prufer's method that constructs a one-to-one correspondence between trees and sequences. During query processing, a twig pattern is also transformed into its Prufer sequence. By performing subsequence matching on the set of sequences in the database, and performing a series of refinement phases that we have developed, we can find all the occurrences of a twig pattern in the database. Our approach allows holistic processing of a twig pattern without breaking the twig into root-to-leaf paths and processing these paths individually. Furthermore, we show that all correct answers are found without any false dismissals or false alarms. Experimental results demonstrate the performance benefits of our proposed techniques.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"8 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131471604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extending XML database to support open XML","authors":"Jinyu Wang, Kongyi Zhou, K. Karun, Mark Scardina","doi":"10.1109/ICDE.2004.1320054","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320054","url":null,"abstract":"XML is a widely accepted standard for exchanging business data. To optimize the management of XML and help companies build up their business partner networks over the Internet, database servers have introduced new XML storage and query features. However, each enterprise defines its own data elements in XML and modifies the XML documents to handle the evolving business needs. This makes XML data conform to heterogeneous schemas or schemas that evolve over time, which is not suitable for XML database storage. We provide an overview of the current XML database strategies and presents a streaming metadata-processing approach, enabling databases to handle multiple XML formats seamlessly.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"148 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131821855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic extensible query processing in super-peer based P2P systems","authors":"C. Wiesner, A. Kemper, Stefan Brandl","doi":"10.1109/ICDE.2004.1320077","DOIUrl":"https://doi.org/10.1109/ICDE.2004.1320077","url":null,"abstract":"To enable dynamic, extensible, and distributed query processing in super-peer based P2P networks, where standard query operators and user-defined code can be executed nearby the data, we distribute query processing to (super-) peers. Therefore, super-peers provide functionality for the management of the indices, query optimization, and query processing. Additionally, we expect that peers provide query processing capabilities to be full members of the P2P network. To enable this, super-peers have to provide an optimizer for generating efficient query plans from the queries they receive. The distribution process is guided by the routing index which is dynamic and corresponds to the data allocation schema in traditional distributed DBMSs.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131887913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}