Query's optimization in data warehouse on the cloud using fragmentation

2014 International Conference on Next Generation Networks and Services (NGNS) Pub Date : 2014-05-28 DOI:10.1109/NGNS.2014.6990243

Abdelaziz Ettaoufik, M. Ouzzif


Citations: 5

Abstract

Cloud computing now occupies a prominent place among service-oriented technologies. The cloud provides a flexible environment in which customers host and process their information on an outsourced infrastructure; this information was traditionally kept on local servers. Many applications that deal with massive data are being moved to the cloud. Data warehouses (DW) also benefit from this new paradigm, providing analytical data online and in real time. A DW in the cloud benefits from advantages such as flexibility, availability, adaptability, scalability, and virtualization. Improving DW performance in the cloud requires optimizing data processing time. The classical optimization techniques (indexing, materialized views, and fragmentation) remain essential for a DW in the cloud. The DW is partitioned before being distributed across multiple servers (nodes) in the cloud. When queries contain multiple joins or request large volumes of data stored on multiple nodes, inter-node communication increases and DW performance consequently degrades. In this paper we propose an approach for improving the performance of a DW in the cloud. Our approach is based on a map placed on the nodes leased by the client. The map stores: (i) the requests received by the node; (ii) information about the DW; and (iii) a query processing algorithm. We use the data stored in the map to fragment the DW so as to minimize inter-node communication.
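The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, under stated assumptions: a per-node map logs the selection predicates of incoming queries, and the most frequently used attribute is then chosen as the horizontal fragmentation key. The names `NodeMap`, `fragment`, and `fragments_touched`, the toy `sales` rows, and the "most frequent attribute" heuristic are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class NodeMap:
    """Hypothetical per-node map: warehouse metadata plus a log of received queries."""
    tables: dict                                  # table name -> list of attribute names
    query_log: list = field(default_factory=list)

    def record(self, predicates):
        """Log one query as a dict {attribute: value} of its equality predicates."""
        self.query_log.append(dict(predicates))

    def most_used_attribute(self):
        """Return the attribute that appears in the largest number of logged queries."""
        counts = Counter(attr for query in self.query_log for attr in query)
        return counts.most_common(1)[0][0]


def fragment(rows, attribute):
    """Horizontally fragment rows: one fragment per distinct value of `attribute`."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[attribute], []).append(row)
    return fragments


def fragments_touched(fragments, predicates):
    """Count fragments holding at least one row that satisfies every predicate."""
    return sum(
        any(all(row[a] == v for a, v in predicates.items()) for row in frag)
        for frag in fragments.values()
    )


# Toy fact table (assumed data, for illustration only).
rows = [
    {"region": "EU", "year": 2013, "amount": 10},
    {"region": "EU", "year": 2014, "amount": 20},
    {"region": "US", "year": 2013, "amount": 30},
    {"region": "US", "year": 2014, "amount": 40},
]

node_map = NodeMap(tables={"sales": ["region", "year", "amount"]})
node_map.record({"region": "EU"})
node_map.record({"region": "US", "year": 2014})
node_map.record({"region": "EU", "year": 2013})

key = node_map.most_used_attribute()   # attribute used by the most logged queries
by_key = fragment(rows, key)           # fragmentation guided by the query log
by_year = fragment(rows, "year")       # an alternative that ignores the log
```

If each fragment is placed on its own node, `fragments_touched` serves as a rough proxy for how many nodes a query must contact. In this toy setup a query filtering on `region` touches a single fragment under the log-guided scheme but two under the `year` scheme, which illustrates why workload-aware fragmentation can reduce inter-node communication.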




Source journal

2014 International Conference on Next Generation Networks and Services (NGNS)

Self-citation rate

0.00%

Publication count