International Workshop on Data Warehousing and OLAP: Latest Publications

Enhanced clustering of complex database objects in the ClustCube framework
International Workshop on Data Warehousing and OLAP Pub Date : 2012-11-02 DOI: 10.1145/2390045.2390066
A. Cuzzocrea, Paolo Serafino
Abstract: This paper significantly extends our previous research contribution [1], where we introduced the OLAP-based ClustCube framework for clustering and mining complex database objects extracted from distributed database settings. In particular, in this research we provide the following two novel contributions over [1]. First, we provide an innovative tree-based distance function over complex objects that takes into account the typical tree-like nature of these objects in distributed database settings. This novel distance is a relevant contribution over the simpler low-level-field-based distance presented in [1]. Second, we provide a comprehensive experimental campaign of ClustCube algorithms for computing ClustCube cubes, according to both performance metrics and accuracy metrics, against a well-known benchmark data set, and in comparison with a state-of-the-art subspace clustering algorithm for high-dimensional data. Retrieved results clearly demonstrate the superiority of our approach.
Citations: 3
Discovering OLAP dimensions in semi-structured data
International Workshop on Data Warehousing and OLAP Pub Date : 2012-11-02 DOI: 10.1145/2390045.2390048
Svetlana Mansmann, N. Rehman, Andreas Weiler, M. Scholl
Abstract: With the standard OLAP technology, cubes are constructed from the input data based on the available data fields and known relationships between them. Structuring the data into a set of numeric measures distributed along a set of uniformly structured dimensions may be unrealistic for applications dealing with semi-structured data. We propose to extend the capabilities of OLAP via content-driven discovery of measures and dimensional characteristics in the original dataset. New structural elements are discovered by means of data mining and other techniques and are therefore prone to changes as the underlying dataset evolves. In this work we focus on the challenge of generating, maintaining, and querying such discovered elements of the cube. We demonstrate the benefits of our approach by providing OLAP to the public stream of user-generated content of the popular microblogging service Twitter. We were able to enrich the original set by discovering dynamic characteristics such as user activity, popularity, messaging behavior, as well as classifying messages by topic, impact, origin, method of generation, etc. Application of knowledge discovery techniques coupled with human expertise enables structural enrichment of the original data beyond the scope of the existing methods for generating multidimensional models from relational or semi-structured data.
Citations: 51
Towards ontology-based OLAP: datalog-based reasoning over multidimensional ontologies
International Workshop on Data Warehousing and OLAP Pub Date : 2012-11-02 DOI: 10.1145/2390045.2390053
B. Neumayr, Stefan Anderlik, M. Schrefl
Abstract: Understandability, reuse, and maintainability of analytical queries are among the key challenges of Data Warehousing, especially in settings where a large number of business analysts work together and need to share knowledge. To tackle these challenges we propose Ontology-based OLAP, where an ontology acts as a superimposed conceptual layer between business analysts and multidimensional data. In Ontology-based OLAP, dimensions and facts are enriched by concept definitions capturing the semantics of relevant business terms used to define measures and to formulate analytical queries. Using traditional ontology languages, it is, however, very difficult to capture the hierarchical and multidimensional conceptualizations of business analysts. In this paper, we propose hierarchical and multidimensional ontologies to better capture these structural specificities. We define and implement the abstract structure and semantics of multidimensional ontologies as rules and constraints in Datalog with negation and represent multidimensional ontologies as Datalog facts. In addition to reasoning over multidimensional ontologies (open-world) we discuss their grounding in Data Warehouses (closed-world) as the foundation of Ontology-based OLAP.
Citations: 24
Managing a fragmented XML data cube with Oracle and TimesTen
International Workshop on Data Warehousing and OLAP Pub Date : 2012-11-02 DOI: 10.1145/2390045.2390061
Doulkifli Boukraâ, Omar Boussaïd, F. Bentayeb, D. Zegour
Abstract: In this paper, we cross two techniques for performance tuning of an XML cube. We analyze six configurations for managing the cube. The configurations result from storing two variants of the cube (unfragmented and fragmented) in different ways. First, we consider a disk-resident database. Then, we consider caching the frequent properties of the unfragmented cube and the frequent fragments of the fragmented cube. Finally, we load the entire cube into main memory and manage it there. We show the benefits of vertical fragmentation and in-memory management of the XML cube through a set of experiments.
Citations: 2
High-performance online spatial and temporal aggregations on multi-core CPUs and many-core GPUs
International Workshop on Data Warehousing and OLAP Pub Date : 2012-11-02 DOI: 10.1145/2390045.2390060
Jianting Zhang, Simin You, L. Gruenwald
Abstract: Motivated by the practical needs for efficiently processing large-scale taxi trip data, we have developed techniques for high-performance online spatial, temporal and spatiotemporal aggregations. These techniques include timestamp compression to reduce memory footprint, simple linear data structures for efficient in-memory scans, and utilization of massively data-parallel GPU accelerations for spatial joins. Our experiments have shown that the combined performance-boosting techniques are able to perform various spatial, temporal and spatiotemporal aggregations on hundreds of millions of taxi trips in the order of a few seconds using commodity personal computers equipped with multi-core CPUs and many-core GPUs. The high throughputs in a personal computing environment are encouraging in the sense that high-performance OLAP queries on large-scale data are feasible when the parallel processing power of modern commodity hardware is fully utilized, which is important for interactive OLAP applications.
Citations: 19
A rule-based tool for gradual granular data aggregation
International Workshop on Data Warehousing and OLAP Pub Date : 2011-10-28 DOI: 10.1145/2064676.2064678
N. Iftikhar, T. Pedersen
Abstract: In order to keep more detailed data available for longer periods, old data has to be reduced gradually to save space and improve query performance, especially on resource-constrained systems with limited storage and query processing capabilities. In this regard, some hand-coded data aggregation solutions have been developed; however, their actual usage has been limited because hand-coded data aggregation solutions have proven too complex to maintain. Maintenance is needed because requirements change frequently, and the existing data aggregation techniques lack flexibility with regard to efficient requirements change management. This paper presents an effective rule-based tool for data reduction based on gradual granular data aggregation. With the proposed solution, data can be maintained at different levels of granularity. The solution is based on high-level data aggregation rules. Based on these rules, data aggregation code can be auto-generated. The solution is effective, easy to use and easy to maintain. In addition, the paper also demonstrates the use of the proposed tool based on a farming case study using standard database technologies. The results show the productivity of the proposed tool-based solution in terms of initial development time, maintenance time and alteration time as compared to a hand-coded solution.
Citations: 11
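The idea of gradual granular aggregation in the entry above can be illustrated with a small sketch. This is not the tool from the paper: the rule format, the age thresholds (7 and 30 days), the use of an average, and the names `RULES` and `reduce_gradually` are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical rules: (minimum row age, function truncating a timestamp).
# Ordered from coarsest to finest; the first rule whose age threshold is met wins.
RULES = [
    (timedelta(days=30), lambda t: t.replace(hour=0, minute=0, second=0, microsecond=0)),  # daily
    (timedelta(days=7), lambda t: t.replace(minute=0, second=0, microsecond=0)),  # hourly
]

def reduce_gradually(rows, now):
    """Aggregate (timestamp, value) rows; older rows land in coarser buckets.

    Returns a sorted list of (bucket_start, average_value).
    """
    buckets = defaultdict(list)
    for ts, value in rows:
        bucket = ts  # recent rows keep their full detail
        for min_age, truncate in RULES:
            if now - ts >= min_age:
                bucket = truncate(ts)
                break
        buckets[bucket].append(value)
    return sorted((b, sum(v) / len(v)) for b, v in buckets.items())

# Made-up sample data
now = datetime(2011, 10, 28, 12, 0)
rows = [
    (datetime(2011, 9, 1, 8, 15), 10.0),    # older than 30 days: daily bucket
    (datetime(2011, 9, 1, 17, 45), 20.0),   # same daily bucket
    (datetime(2011, 10, 20, 9, 10), 4.0),   # older than 7 days: hourly bucket
    (datetime(2011, 10, 20, 9, 50), 6.0),   # same hourly bucket
    (datetime(2011, 10, 28, 11, 30), 7.0),  # recent: kept as-is
]
reduced = reduce_gradually(rows, now)
```

Keeping the rules as data rather than code is the point of the sketch: changing a threshold or adding a coarser level only touches `RULES`.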
Optimization of operator partitions in stream data warehouse
International Workshop on Data Warehousing and OLAP Pub Date : 2011-10-28 DOI: 10.1145/2064676.2064687
M. Gorawski, Aleksander Chrószcz
Abstract: Memory and time optimization is a key task of Stream Data Warehouses (SDWs). StrETL processes in those systems are similar to queries in Data Stream Management Systems (DSMSs). This fact allows us to migrate some methods from DSMS to SDW. We have observed that schedulers and algorithms introduced to create operator partitions are analyzed separately either in StrETL processes or in stream queries. In fact, those two mechanisms affect each other, and it is justified to study the potential benefits of combining them. In the paper we introduce a solution which cooperates with a scheduler in order to create more efficient operator partitions. Another noteworthy issue is that this algorithm is able to optimize a wider range of operator topologies. Finally, experimental evaluation shows that our solution achieves a smaller memory consumption or a shorter response time in comparison with the competing strategies.
Citations: 9
Column-oriented query processing for row stores
International Workshop on Data Warehousing and OLAP Pub Date : 2011-10-28 DOI: 10.1145/2064676.2064689
Amr El-Helw, K. A. Ross, Bishwaranjan Bhattacharjee, Christian A. Lang, G. Mihaila
Abstract: Column-oriented DBMSs have gained increasing interest due to their superior performance for analytical workloads. Prior efforts tried to determine the possibility of simulating the query processing techniques of column-oriented systems in row-oriented databases, in the hope of improving their performance, especially for OLAP and data warehousing applications. In this paper, we show that column-oriented query processing can significantly improve the performance of row-oriented DBMSs. We introduce new operators that take into account the unique characteristics of data obtained from indexes, and exploit new technologies such as flash SSDs and multi-core processors to boost the performance. We demonstrate our approach with an experimental study using a prototype built on a commercial row-oriented DBMS.
Citations: 12
Enforcing strictness in integration of dimensions: beyond instance matching
International Workshop on Data Warehousing and OLAP Pub Date : 2011-10-28 DOI: 10.1145/2064676.2064679
D. Riazati, J. Thom, Xiuzhen Zhang
Abstract: Maintaining strictness in dimensions is important in integration of data warehouses. A dimension that satisfies all of its roll-up constraints is said to be strict, a property that is required for correct aggregation. Existing work on instance matching does not address the problem of enforcing the strictness of roll-up constraints. In this paper, we use a graph matching-based approach to dimension instance matching and propose an algorithm that enforces strictness and reduces false positives. Making use of similarity flooding, the graph matching algorithm can be greedy in identifying matching members; we propose heuristics to further reduce false positive matches and reduce false strictness. Experiments on real-world data demonstrate the effectiveness of our proposed approach.
Citations: 5
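The strictness property mentioned in the entry above can be checked mechanically. The sketch below is not the paper's matching algorithm; it only tests one roll-up relation, under the common reading that a strict roll-up maps each member to at most one parent. The function name `non_strict_members` and the sample data are hypothetical.

```python
from collections import defaultdict

def non_strict_members(rollup_pairs):
    """Return members that roll up to more than one parent.

    rollup_pairs: iterable of (child, parent) tuples for a single roll-up
    relation, for example City -> Country. The relation is strict when the
    returned dict is empty.
    """
    parents = defaultdict(set)
    for child, parent in rollup_pairs:
        parents[child].add(parent)
    return {c: sorted(p) for c, p in parents.items() if len(p) > 1}

# Made-up data: two distinct cities named "Melbourne" were wrongly matched
# as one member during integration, so the merged member has two parents.
pairs = [
    ("Melbourne", "Australia"),
    ("Sydney", "Australia"),
    ("Melbourne", "USA"),
]
violations = non_strict_members(pairs)
```

A non-empty result signals that aggregating along this roll-up would double count the affected members.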
Building cubes with MapReduce
International Workshop on Data Warehousing and OLAP Pub Date : 2011-10-28 DOI: 10.1145/2064676.2064680
A. Abelló, J. Ferrarons, Oscar Romero
Abstract: In recent years, the problems of using generic storage techniques for very specific applications have been detected and outlined. Thus, some alternatives to relational DBMSs (e.g., BigTable) are blooming. On the other hand, cloud computing is already a reality that helps to save money by eliminating hardware as well as software fixed costs and paying only per use. Indeed, specific software tools to exploit a cloud are also here. The trend in this case is toward using tools based on the MapReduce paradigm developed by Google. In this paper, we explore the possibility of having data in a cloud by using BigTable to store the corporate historical data and MapReduce as an agile mechanism to deploy cubes in ad-hoc Data Marts. Our main contribution is the comparison of three different approaches to retrieve data cubes from BigTable by means of MapReduce and the definition of criteria to choose among them.
Citations: 39
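As a rough illustration of the map/reduce pattern applied to cube construction, the sketch below simulates both phases in a single Python process. It is not one of the three approaches compared in the paper and does not involve BigTable; the dimension names, the `sales` measure, and the functions `map_phase` and `reduce_phase` are assumptions.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("year", "region")  # hypothetical dimensions

def map_phase(record):
    """Emit one (group-by key, measure) pair per cuboid of the cube."""
    for k in range(len(DIMENSIONS) + 1):
        for dims in combinations(DIMENSIONS, k):
            key = tuple((d, record[d]) for d in dims)
            yield key, record["sales"]

def reduce_phase(pairs):
    """Sum the measure for every distinct group-by key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Made-up fact records
records = [
    {"year": 2010, "region": "EU", "sales": 5},
    {"year": 2010, "region": "US", "sales": 7},
    {"year": 2011, "region": "EU", "sales": 3},
]
cube = reduce_phase(pair for r in records for pair in map_phase(r))
```

The empty key `()` holds the grand total, and each non-empty key identifies one cell of one cuboid, for example `(("year", 2010),)` for all 2010 sales. A real MapReduce job would shuffle the emitted pairs by key across workers instead of summing them in one dictionary.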