{"title":"Solving summarizability problems in fact-dimension relationships for multidimensional models","authors":"J. Mazón, Jens Lechtenbörger, J. Trujillo","doi":"10.1145/1458432.1458443","DOIUrl":"https://doi.org/10.1145/1458432.1458443","url":null,"abstract":"Multidimensional analysis allows decision makers to efficiently and effectively use data analysis tools, which mainly depend on multidimensional (MD) structures of a data warehouse such as facts and dimension hierarchies to explore the information and aggregate it at different levels of detail in an accurate way. A conceptual model of such MD structures serves as an abstract basis for the subsequent implementation according to one specific technology. However, there is a semantic gap between a conceptual model and its implementation, which complicates an adequate treatment of summarizability issues; this in turn may lead to erroneous results from data analysis tools and cause the failure of the whole data warehouse project. To bridge this gap for relationships between facts and dimensions, we present an approach at the conceptual level for (i) identifying problematic situations in fact-dimension relationships, (ii) defining these relationships in a conceptual MD model, and (iii) applying a normalization process to transform this conceptual MD model into a summarizability-compliant model that avoids erroneous analysis of data. 
Furthermore, we describe our Eclipse-based implementation of this normalization process.","PeriodicalId":335396,"journal":{"name":"International Workshop on Data Warehousing and OLAP","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122958959","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A framework for recommending OLAP queries","authors":"A. Giacometti, Patrick Marcel, E. Negre","doi":"10.1145/1458432.1458446","DOIUrl":"https://doi.org/10.1145/1458432.1458446","url":null,"abstract":"An OLAP analysis session can be defined as an interactive session during which a user launches queries to navigate within a cube. Very often, choosing which part of the cube to navigate further, and thus designing the forthcoming query, is a difficult task. In this paper, we propose to use what the OLAP users did during their former exploration of the cube as a basis for recommending OLAP queries to the user. We present a generic framework that makes it possible to recommend OLAP queries based on the OLAP server query log. This framework is generic in the sense that changing its parameters changes the way the recommendations are computed. We show how to use this framework for recommending simple MDX queries and we provide some experimental results to validate our approach.","PeriodicalId":335396,"journal":{"name":"International Workshop on Data Warehousing and OLAP","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130489803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transaction reordering with application to synchronized scans","authors":"Gang Luo, J. Naughton, Curt J. Ellmann, M. Watzke","doi":"10.1145/1458432.1458436","DOIUrl":"https://doi.org/10.1145/1458432.1458436","url":null,"abstract":"Traditional workload management methods mainly focus on the current system status while information about the interaction between queued and running transactions is largely ignored. An exception to this is the transaction reordering method, which reorders the transaction sequence submitted to the RDBMS and improves the transaction throughput by considering both the current system status and information about the interaction between queued and running transactions. The existing transaction reordering method only considers the reordering opportunities provided by analyzing the lock conflict information among multiple transactions. This significantly limits the applicability of the transaction reordering method. In this paper, we extend the existing transaction reordering method into a general transaction reordering framework that can incorporate various factors as the reordering criteria. We show that by analyzing the resource utilization information of transactions, the transaction reordering method can also improve the system throughput by increasing the resource sharing opportunities among multiple transactions. 
We provide a concrete example on synchronized scans and demonstrate the advantages of our method through experiments with a commercial parallel RDBMS.","PeriodicalId":335396,"journal":{"name":"International Workshop on Data Warehousing and OLAP","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134394456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating and bounding aggregations in databases with referential integrity errors","authors":"Javier García-García, C. Ordonez","doi":"10.1145/1458432.1458442","DOIUrl":"https://doi.org/10.1145/1458432.1458442","url":null,"abstract":"Database integration builds on tables coming from multiple databases by creating a single view of all these data. Each database has different tables, columns with similar content across databases, and different referential integrity constraints. Thus, a query in an integrated database is likely to involve tables and columns with referential integrity errors. In a data warehouse environment, even though the ETL processes take care of the referential integrity errors, in many scenarios this is done by including 'dummy' records in the dimension tables used to relate to the fact tables with referential errors. When two tables are joined and aggregations are computed, the tuples with an undefined foreign key value are aggregated in a group marked as undefined, effectively discarding potentially valuable information. With that motivation in mind, we extend aggregate functions computed over tables with referential integrity errors on OLAP databases to return complete answer sets in the sense that no tuple is excluded. We associate with each valid reference the probability that an invalid reference is actually that correct reference. The main idea of our work is that, in certain contexts, it is possible to use tuples with invalid references by taking these probabilities into account. 
This way, improved answer sets are obtained from aggregate queries in settings where a database violates referential integrity constraints.","PeriodicalId":335396,"journal":{"name":"International Workshop on Data Warehousing and OLAP","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126348675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic enrichment of strategic datacubes","authors":"C. Diamantini, D. Potena","doi":"10.1145/1458432.1458447","DOIUrl":"https://doi.org/10.1145/1458432.1458447","url":null,"abstract":"In the information system view, the reference architecture for strategic and decision support is based on the Data Warehouse architecture, which enables flexible and multidimensional analysis of strategic indexes by means of OLAP tools and reports. In this paper we propose a novel model for semantic annotation of a Data Warehouse schema that takes into account domain ontologies as well as a mathematical ontology. Such an ontology describes mathematical formulas underlying elements of the datacube schema, including the semantics of operands and operators. In particular, we discuss and apply the proposed model for the semantic annotation of the schema of a datacube, which is the basis for OLAP analysis and contains information derived from the Data Warehouse schema. The paper provides an illustrative case study together with some examples of analysis based on this kind of annotation.","PeriodicalId":335396,"journal":{"name":"International Workshop on Data Warehousing and OLAP","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129242710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Natural language reporting for ETL processes","authors":"A. Simitsis, Dimitrios Skoutas, M. Castellanos","doi":"10.1145/1458432.1458444","DOIUrl":"https://doi.org/10.1145/1458432.1458444","url":null,"abstract":"The conceptual design of the Extract-Transform-Load (ETL) processes is a crucial, burdensome, and challenging procedure that takes place in the early phases of a Data Warehouse project. Several models have been proposed for the conceptual design and representation of ETL processes, but all share two inconveniences: they require intensive human effort from the designers to create them, as well as technical knowledge from the business people to understand them. In previous work, we eased the former difficulty by working on the automation of the conceptual design leveraging Semantic Web technology. In this paper, we build upon our previous results and tackle the latter issue by investigating the application of natural language generation techniques to the ETL environment. In particular, we provide a method for the representation of a conceptual ETL design as a narrative, which is the most natural means of communication and does not require knowledge of any specific model. We discuss how linguistic techniques can be used for the establishment of a common application vocabulary. 
Finally, we present a flexible and customizable template-based mechanism for generating natural language representations for the ETL process requirements and operations.","PeriodicalId":335396,"journal":{"name":"International Workshop on Data Warehousing and OLAP","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125525396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bridging the semantic gap in OLAP models: platform-independent queries","authors":"Jesús Pardillo, J. Mazón, J. Trujillo","doi":"10.1145/1458432.1458448","DOIUrl":"https://doi.org/10.1145/1458432.1458448","url":null,"abstract":"The development of data warehouses is based on a three-stage process that starts by specifying both the static and dynamic properties of on-line analytical processing (OLAP) applications by means of an intuitive, semantically rich abstraction, namely the conceptual model. Then, developers design its logical counterpart, where platform-specific details such as performance or storage are also considered. Nevertheless, there is a well-known semantic gap between the conceptual and logical levels that decreases the feasibility of their mapping. In order to bridge this gap, we propose the use of conceptual, i.e., platform-independent, OLAP queries that can be automatically traced to their logical implementation in a coherent and integrated way. To this end, in this paper, we focus on describing the specification of an OLAP algebra at the conceptual level by using the Object Constraint Language (OCL). Its operations are then translated into a particular OLAP system by using a model-driven architecture (MDA). 
The great advantage of our approach is that we allow analysts to query data warehouses without being aware of logical details.","PeriodicalId":335396,"journal":{"name":"International Workshop on Data Warehousing and OLAP","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114204289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deciding the physical implementation of ETL workflows","authors":"Vasiliki Tziovara, Panos Vassiliadis, A. Simitsis","doi":"10.1145/1317331.1317341","DOIUrl":"https://doi.org/10.1145/1317331.1317341","url":null,"abstract":"In this paper, we deal with the problem of determining the best possible physical implementation of an ETL workflow, given its logical-level description and an appropriate cost model as inputs. We formulate the problem as a state-space problem and provide a suitable solution for this task. We further extend this technique by intentionally introducing sorter activities in the workflow in order to search for alternative physical implementations with lower cost. We experimentally assess our method based on a principled organization of test suites.","PeriodicalId":335396,"journal":{"name":"International Workshop on Data Warehousing and OLAP","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114366258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving business intelligence speed and quality through the OODA concept","authors":"Morten Middelfart","doi":"10.1145/1317331.1317349","DOIUrl":"https://doi.org/10.1145/1317331.1317349","url":null,"abstract":"This article introduces the Observation-Orientation-Decision-Action (OODA) concept as a means to identify three new desired technologies in business intelligence applications that improve the speed and quality of decision-making processes.","PeriodicalId":335396,"journal":{"name":"International Workshop on Data Warehousing and OLAP","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116316782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining unexpected multidimensional rules","authors":"M. Plantevit, S. Goutier, F. Guisnel, Anne Laurent, M. Teisseire","doi":"10.1145/1317331.1317347","DOIUrl":"https://doi.org/10.1145/1317331.1317347","url":null,"abstract":"Discovering unexpected rules is essential, particularly for industrial applications with marketing stakes. In this context, much work has been done on association rules. However, none of it addresses sequences. In this paper, we thus propose to discover unexpected multidimensional sequential rules in data cubes. We define the concept of a multidimensional sequential rule, and then unexpectedness. We formalize these concepts and define an algorithm for mining this kind of rule. Experiments on a real data cube are reported and highlight the interest of our approach.","PeriodicalId":335396,"journal":{"name":"International Workshop on Data Warehousing and OLAP","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134264491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}