Compression Methodologies for Columnar Database Optimization

Journal of Computational Science and Intelligent Technologies Pub Date : 1900-01-01 DOI:10.53409/mnaa/jcsit/e202203012432

Praveen Kumar Sadineni



Abstract

Modern life depends heavily on data. Conventional relational databases are built for row-wise storage and retrieval, so they take longer to answer queries that need only a few columns. Columnar databases read and write only the required columns to and from hard disks, which shortens the time queries take to produce results, and they have recently gained ground over traditional row-oriented databases. Business Intelligence and decision-making systems rely on data warehouses that hold vast amounts of data gathered from various sources, and this is where columnar databases are primarily deployed. Because the values of a column are stored contiguously, seek time is reduced and queries run more quickly. Columnar storage also suits aggregation queries, which skip the data they do not need, and it supports several compression techniques that make data access faster. This work discusses compression approaches for optimising the efficiency of columnar databases, including NULL Suppression, Dictionary Encoding, Run Length Encoding, Bit Vector Encoding, and Lempel-Ziv Encoding. Database operations are then performed on the compressed data to demonstrate the reduction in memory requirements and the improvement in speed.
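As a rough illustration of three of the techniques named above, the following Python sketch encodes a small in-memory column with run-length, dictionary, and bit-vector encoding, and answers a count query directly on each compressed form. The function names and the sample column are hypothetical and chosen only for illustration; this is not the paper's implementation, and real columnar engines operate on packed binary pages rather than Python lists.

```python
from itertools import groupby


def rle_encode(column):
    """Run-length encoding: collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_count(runs, target):
    """Count rows equal to target by summing run lengths, without decompressing."""
    return sum(length for value, length in runs if value == target)


def dict_encode(column):
    """Dictionary encoding: map each distinct value to a small integer code."""
    dictionary = {}
    codes = []
    for value in column:
        # Assign the next unused code the first time a value is seen.
        codes.append(dictionary.setdefault(value, len(dictionary)))
    return dictionary, codes


def bitvector_encode(column):
    """Bit-vector encoding: one 0/1 vector per distinct value, marking the rows that hold it."""
    return {
        value: [1 if item == value else 0 for item in column]
        for value in dict.fromkeys(column)  # distinct values in first-seen order
    }


# Hypothetical sample column (for example, a country code per row).
country = ["IN", "IN", "IN", "US", "US", "IN", "UK", "UK"]

runs = rle_encode(country)
dictionary, codes = dict_encode(country)
vectors = bitvector_encode(country)

# The same count query, answered on each compressed representation.
count_rle = rle_count(runs, "IN")
count_dict = codes.count(dictionary["IN"])
count_bits = sum(vectors["IN"])
```

Run-length encoding pays off when a column is sorted or has long runs of repeated values, while dictionary and bit-vector encoding suit columns with few distinct values; in each case the count is computed without rebuilding the original column.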

