Fast and Efficient Entropy Coding Architectures for Massive Data Compression

Technologies (Basel), Pub Date: 2023-09-26, DOI: 10.3390/technologies11050132

Francesc Auli-Llinas


Citations: 0


Fast and Efficient Entropy Coding Architectures for Massive Data Compression

Data compression is fundamental to alleviating the costs of transmitting and storing the massive datasets employed in myriad fields of our society. Most compression systems employ an entropy coder in their coding pipeline to remove the redundancy of coded symbols. The entropy-coding stage needs to be efficient, to yield high compression ratios, and fast, to process large amounts of data rapidly. Despite their widespread use, entropy coders are commonly assessed only for a particular scenario or coding system. This work provides a general framework to assess and optimize different entropy coders. First, the paper describes three main families of entropy coders, namely those based on variable-to-variable length codes (V2VLC), arithmetic coding (AC), and tabled asymmetric numeral systems (tANS). Then, a low-complexity architecture for the most representative coder(s) of each family is presented: a general version of V2VLC; the MQ, M, and fixed-length versions of AC; and two different implementations of tANS. These coders are evaluated under different coding conditions in terms of compression efficiency and computational throughput. The results suggest that V2VLC and tANS achieve the highest compression ratios for most coding rates and that the AC coder that uses fixed-length codewords attains the highest throughput. The experimental evaluation discloses the advantages and shortcomings of each entropy-coding scheme, providing insights that may help in selecting this stage for forthcoming compression systems.
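As a rough illustration of what "compression efficiency" can mean for an entropy coder, the sketch below compares a coder's output size against the zeroth-order empirical entropy of its input, which bounds the rate of any coder that treats symbols as independent. This is not the paper's evaluation procedure; the function names `empirical_entropy` and `redundancy` and the sample data are assumptions made only for this example.

```python
import math
from collections import Counter


def empirical_entropy(symbols):
    """Zeroth-order empirical entropy of a sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(coded_bits, symbols):
    """Bits per symbol spent above the empirical entropy (hypothetical metric)."""
    return coded_bits / len(symbols) - empirical_entropy(symbols)


# Hypothetical input: 1000 binary symbols, a quarter of them equal to 1.
data = [1] * 250 + [0] * 750
h = empirical_entropy(data)

print(f"entropy: {h:.4f} bits/symbol")
# A trivial code that spends 1 bit per symbol, used here as a baseline.
print(f"redundancy at 1 bit/symbol: {redundancy(1000, data):.4f} bits/symbol")
```

An entropy coder from any of the three families would be judged efficient, in this simplified sense, when its redundancy is close to zero; throughput would need to be measured separately, for example as symbols coded per second.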
