{"title":"RelaXML: bidirectional transfer between relational and XML data","authors":"Steffen Ulsø Knudsen, T. Pedersen, Christian Thomsen, K. Torp","doi":"10.1109/IDEAS.2005.48","DOIUrl":"https://doi.org/10.1109/IDEAS.2005.48","url":null,"abstract":"In modern enterprises, almost all data is stored in relational databases. Additionally, most enterprises increasingly collaborate with other enterprises in long-running read-write workflows, primarily through XML-based data exchange technologies such as Web services. However, bidirectional XML data exchange is cumbersome and must often be hand-coded, at considerable expense. This paper remedies the situation by proposing RELAXML, an automatic and effective approach to bidirectional XML-based exchange of relational data. RELAXML supports re-use through multiple inheritance, and handles both export of relational data to XML documents and (re-)import of XML documents with a large degree of flexibility in terms of the SQL statements and XML document structures supported. Import and export are formally defined so as to avoid semantic problems, and algorithms to implement both are given. A performance study shows that the approach has a reasonable overhead compared to hand-coded programs.","PeriodicalId":357591,"journal":{"name":"9th International Database Engineering & Application Symposium (IDEAS'05)","volume":"266 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120883214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pattern-based information integration in dynamic environments","authors":"Jürgen Göres","doi":"10.1109/IDEAS.2005.42","DOIUrl":"https://doi.org/10.1109/IDEAS.2005.42","url":null,"abstract":"The convenient availability of information is an essential factor in science and business. While Internet technology has made large amounts of data available to the general public, the data is largely provided in human-readable format only. New technologies are now making direct access to millions of structured or semi-structured databases possible, but only through integration of these data sources maximum benefit can be gained. Traditional approaches to information integration, which involve human development teams and work in a controlled environment with a stable set of data sources, are not applicable due to the dynamic nature of such an environment. Therefore a higher degree of automation of this process is required. We present the PALADIN project (Pattern-based Architecture for LArge-scale Dynamic INformation integration), that uses machine-understandable patterns to capture and apply expert experience in the integration planning process.","PeriodicalId":357591,"journal":{"name":"9th International Database Engineering & Application Symposium (IDEAS'05)","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127162044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-learning histograms for changing workloads","authors":"Xiaojing Li, Bo Zhou, Jinxiang Dong","doi":"10.1109/IDEAS.2005.50","DOIUrl":"https://doi.org/10.1109/IDEAS.2005.50","url":null,"abstract":"The increasing complexity of DBMSs and their workloads has made it a difficult and time-consuming task to manage their performance manually. Autonomic computing has emerged as a promising approach to deal with this complexity by making DBMSs self-managed. Automatic statistics management, as an important part of autonomic computing, is especially necessary in decision-support systems. In this paper, we introduce a novel technique for automatic statistics management called Self-Learning Histograms (SLH), which can adapt to workload and data distribution changes by automatically building and maintaining itself using query feedback information. Query feedback is encoded as deducible rules and the histogram can be viewed as a set of these rules. Through deducing among rules, more accurate statistics can be inferred and damages to results of former tunings are avoided. Selectivity estimation based on validity of rules greatly lowered estimation errors. Extensive experiments showed the effectiveness of SLH.","PeriodicalId":357591,"journal":{"name":"9th International Database Engineering & Application Symposium (IDEAS'05)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127743965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating and improving integration quality for heterogeneous data sources using statistical analysis","authors":"Evguenia Altareva, Stefan Conrad","doi":"10.1109/IDEAS.2005.25","DOIUrl":"https://doi.org/10.1109/IDEAS.2005.25","url":null,"abstract":"This paper considers the problem of integrating heterogeneous semi-structured data sources with the purpose of estimating integration quality (IQ). Integration of such data sources leads to results with unpredictable trustworthiness and none of the existing methods is capable of accounting for the uncertainty which is accumulated over all of the integration steps and which affects integration quality. To compute the uncertainties we suggest using a well-established statistical method Latent Class Analysis (LCA). This method allows to analyze the influence of the latent factors associated with the real-world entities on the set of data. We show on examples how the proposed approach can be used for evaluating and improving IQ giving an important tool to the users concerned with the data's trustworthiness.","PeriodicalId":357591,"journal":{"name":"9th International Database Engineering & Application Symposium (IDEAS'05)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134546151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient relational joins with arithmetic constraints on multiple attributes","authors":"Chuang Liu, Lingyun Yang, Ian T Foster","doi":"10.1109/IDEAS.2005.24","DOIUrl":"https://doi.org/10.1109/IDEAS.2005.24","url":null,"abstract":"We introduce and study a new class of queries that we refer to as ACMA (arithmetic constraints on multiple attributes) queries. Such combinatorial queries require the simultaneous satisfaction of arithmetic constraints on three or more attributes from different relations, and thus often involve expensive multi-join operations. Building on techniques from constraint programming, we develop preprocessing methods, algorithms, and a new constrained join operator that allow ACMA queries to be evaluated efficiently within a conventional relational database engine. We present the results of a careful performance evaluation of both our new approach and the conventional nested-loop join algorithm. Measurements of tuples read, intermediate tuples generated, and execution time shows that our approach achieves superior performance for ACMA joins.","PeriodicalId":357591,"journal":{"name":"9th International Database Engineering & Application Symposium (IDEAS'05)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133239347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fine-granularity access control in 3-tier laboratory information systems","authors":"Xue-Ping Li, Nomair A. Naeem, Bettina Kemme","doi":"10.1109/IDEAS.2005.30","DOIUrl":"https://doi.org/10.1109/IDEAS.2005.30","url":null,"abstract":"Laboratory information systems (LIMS) are used in life science research to manage complex experiments. Since LIMS systems are often shared by different research groups, powerful access control is needed to allow different access rights to different records of the same table. Traditional access control models that define a permission as the right of a user/role to perform a specific operation on a specific object cannot handle the enormous amount of objects and user/roles. In this paper, we propose an enhancement to role-based access control by introducing conditions that can be added to the traditional concept of permissions in order to keep the number of permissions small. Furthermore, we present an implementation of our access control model at the application programming level. Although access control is performed for every single database access, our solution completely separates access control from the application logic by using aspect-oriented programming. With this, access control can be integrated into a legacy 3-tier information system without changing the application programs.","PeriodicalId":357591,"journal":{"name":"9th International Database Engineering & Application Symposium (IDEAS'05)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121564867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building information systems by orchestrating open services","authors":"Khalid Belhajjame, Genoveva Vargas-Solar, C. Collet","doi":"10.1109/IDEAS.2005.14","DOIUrl":"https://doi.org/10.1109/IDEAS.2005.14","url":null,"abstract":"Service oriented computing has gained a considerable momentum as a new paradigm for building enterprise information systems. Notable efforts have been made recently from both researchers and industrials to support the construction of service-based applications, nevertheless several issues still need to be tackled including service definition and adaptation, and services orchestration. This work proposes an approach for building and finely orchestrating open and adaptable services. An open service is represented by a workflow that coordinates calls to service provider methods. Thereby component activities and the way they are synchronized are rendered visible. Service adaptability refers to the possibility to modify an open service. Through adaptation operations a service can be customized according to given user (application) requirements. In order to finely orchestrate services, they are associated with entry points. An entry point acts as a gateway for inserting and getting information about the progress of service execution. Defined services and orchestration are verified to ensure a correct behaviour of the resulting application. The paper details our approach for building and orchestrating services, and presents associated architectural choices.","PeriodicalId":357591,"journal":{"name":"9th International Database Engineering & Application Symposium (IDEAS'05)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122143752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rewriting-based optimization for XQuery transformational queries","authors":"Maxim N. Grinev, Peter Pleshachkov","doi":"10.1109/IDEAS.2005.49","DOIUrl":"https://doi.org/10.1109/IDEAS.2005.49","url":null,"abstract":"The modern XML query language called XQuery includes advanced facilities both to query and to transform XML data. An XQuery query optimizer should be able to optimize any query. For \"querying\" queries almost all techniques inherited from SQL-oriented DBMS may be applied. The XQuery transformation facilities are XML-specific and have no counterparts in other query languages. That is why XQuery transformational queries need to be optimized with novel techniques. In this paper two kinds of such techniques (namely push predicates down XML element constructors and projection of transformation) are considered. A subset of XQuery for which these techniques can be fully implemented is identified. This subset seems to be the most interesting from the practical viewpoint. Rewriting rules for this subset are proposed and the correctness of these rules is formally justified. For the rest of the language we propose solutions that work for the most of common cases or consider the problems we have encountered.","PeriodicalId":357591,"journal":{"name":"9th International Database Engineering & Application Symposium (IDEAS'05)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134025973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Categorizing and extracting information from multilingual HTML documents","authors":"S. Lim, Yiu-Kai Ng","doi":"10.1109/IDEAS.2005.15","DOIUrl":"https://doi.org/10.1109/IDEAS.2005.15","url":null,"abstract":"The amount of online information written in different natural languages and the number of non-English speaking Internet users have been increasing tremendously during the past decade. In order to provide high-performance access of multilingual information on the Internet, we have developed a data analysis and querying system (DatAQs) that: (i) analyzes, identifies, and categorizes languages used in HTML documents; (ii) extracts information from HTML documents of interest written in different languages; (iii) allows the user to submit queries for retrieving extracted information in the same natural language provided by the query engine of DatAQs using a menu-driven user interface; and (iv) processes the user's queries (as Boolean expressions) to generate the results. DatAQs extracts information from HTML documents that belong to various data-rich, narrow-in-breadth application domains, such as car ads, house rentals, job ads, stocks, university catalogs, etc. The average F-measure on identifying HTML documents written in a particular natural language correctly is 89%, whereas the F-measure on categorizing HTML documents belonged to the car-ads application domain is 94%.","PeriodicalId":357591,"journal":{"name":"9th International Database Engineering & Application Symposium (IDEAS'05)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133860111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A method of security improvement for privacy preserving association rule mining over vertically partitioned data","authors":"Yiqun Huang, Zhengding Lu, Heping Hu","doi":"10.1109/IDEAS.2005.6","DOIUrl":"https://doi.org/10.1109/IDEAS.2005.6","url":null,"abstract":"There have been growing interests in privacy preserving data mining. Secure multiparty computation (SMC) is often used to give a solution. When data is vertically partitioned scalar product is a feasible tool to securely discover frequent itemsets of association rule mining. However, there may be disparity among the securities of different parties. To obtain equal privacy, the security of some parties may be lowered. This paper discusses the disharmony between the securities of two parties. The scalar product of two parties from the point of view of matrix computation is described. We present one algorithm for completely two-party computation of scalar product. Then we give a method of security improvement for both parties.","PeriodicalId":357591,"journal":{"name":"9th International Database Engineering & Application Symposium (IDEAS'05)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130486375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}