Column-based RLE in row-oriented database
Mingyuan An
2009 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, 2009-12-01
DOI: 10.1109/CYBERC.2009.5342213 (https://doi.org/10.1109/CYBERC.2009.5342213)
Citation count: 0
Abstract
In database systems, disk I/O is usually the bottleneck of query processing. Compression is one of the most important techniques for reducing disk accesses and thereby improving system performance. RLE (Run-Length Encoding) is a lightweight compression algorithm that incurs negligible CPU cost. Much prior work shows that, although RLE is one of the most effective compression techniques in column-oriented systems, it is very hard to use in row-oriented systems, where values from multiple attributes are stored in the same page and value locality is therefore poor. We propose CRLE (Column-based RLE), a compression algorithm that applies RLE to row-oriented data storage. On a row-oriented storage page, CRLE exploits value locality within each individual column and encodes values from the same column in run-length format. Experiments show that CRLE can achieve a very good compression ratio and good performance despite the row-oriented data storage.
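The idea of encoding each column's values separately within a row-oriented page can be illustrated with a minimal Python sketch. This is not the paper's on-disk format or algorithm, which the abstract does not specify. The function names (`rle_encode`, `encode_page`, `decode_page`), the in-memory representation of a page as a list of tuples, and the sample data are all assumptions made for illustration.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into a flat list of values."""
    out = []
    for value, length in runs:
        out.extend([value] * length)
    return out


def encode_page(rows):
    """Hypothetical page encoder: run-length encode each column on its own.

    `rows` is a list of equal-length tuples, standing in for the rows
    stored on one row-oriented page.
    """
    if not rows:
        return []
    columns = zip(*rows)  # transpose: one sequence of values per column
    return [rle_encode(column) for column in columns]


def decode_page(encoded):
    """Rebuild the original rows from the per-column run lists."""
    if not encoded:
        return []
    columns = [rle_decode(runs) for runs in encoded]
    return [tuple(row) for row in zip(*columns)]


# Illustrative data only (not from the paper).
page = [
    ("CN", "open", 1),
    ("CN", "open", 2),
    ("CN", "closed", 3),
    ("US", "closed", 4),
]

# Per-column encoding finds runs in the first two columns.
encoded = encode_page(page)

# Applying RLE to the values in row-major order finds no runs at all,
# because neighbouring values come from different attributes.
row_major_runs = rle_encode([value for row in page for value in row])
```

In this toy page the first two columns compress to two runs each, while the row-major stream yields twelve runs of length one, which is the poor value locality the abstract refers to. A column of unique values, such as the third one here, gains nothing from RLE in either layout.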