A fast symbol coding scheme with specific application in bulk compression

Proceedings DCC '98 Data Compression Conference (Cat. No.98TB100225) Pub Date : 1998-03-30 DOI:10.1109/DCC.1998.672309

A. Palau, G. Mirchandani


Citations: 1

Abstract

Summary form only given. Given a source alphabet of M symbols and their associated probabilities (or estimates of them), we find a sub-optimal set of codewords by using a simple iterative algorithm, based on the prefix property, to generate the codeword lengths, and a look-up-table-based mapping algorithm to assign the codewords. The expected codeword length L_f is slightly longer than that obtained for a Huffman code, but may also be equal to it. When the two are equal, the algorithm generates a larger set of applicable codewords. Generating the lengths and the associated codewords takes less time than with the Huffman code: these tasks have complexity O(M log M) for Huffman coding, while they are of order O(M) in the new algorithm. For bulk compression, where a large number of small files must be compressed, the algorithm typically shows greater compression efficiency than that obtained with other standard UNIX-based compressors.
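The abstract does not spell out the length-generation or mapping steps, so the following is only a rough sketch of the general idea, not the authors' algorithm. It swaps in two well-known techniques as stand-ins: Shannon lengths, ceil(-log2 p), for a single-pass length assignment, and canonical code assignment through a small per-length table for the codeword mapping. All function names are hypothetical. Both steps run in time linear in M (plus the maximum codeword length), and the resulting expected length can be compared against a Huffman code by hand.

```python
import math


def shannon_lengths(probs):
    """Stand-in length rule (an assumption, not the paper's method):
    l_i = ceil(-log2 p_i), computed in one pass over the M symbols."""
    return [max(1, math.ceil(-math.log2(p))) for p in probs]


def kraft_sum(lengths):
    """Kraft sum; a prefix code with these lengths exists iff it is <= 1."""
    return sum(2.0 ** -l for l in lengths)


def canonical_codewords(lengths):
    """Assign canonical prefix codewords from lengths using a per-length
    table of next free codes. Assumes the lengths satisfy Kraft."""
    max_len = max(lengths)
    # count[l] = number of symbols whose codeword has length l
    count = [0] * (max_len + 1)
    for l in lengths:
        count[l] += 1
    # next_code[l] = smallest codeword value available at length l
    next_code = [0] * (max_len + 1)
    code = 0
    for l in range(1, max_len + 1):
        code = (code + count[l - 1]) << 1
        next_code[l] = code
    codes = []
    for l in lengths:
        codes.append(format(next_code[l], "0{}b".format(l)))
        next_code[l] += 1
    return codes


def expected_length(probs, lengths):
    """Expected codeword length: sum of p_i * l_i."""
    return sum(p * l for p, l in zip(probs, lengths))


def is_prefix_free(codes):
    """In sorted order, any prefix relation shows up between neighbours."""
    s = sorted(codes)
    return all(not b.startswith(a) for a, b in zip(s, s[1:]))


probs = [0.4, 0.3, 0.2, 0.1]  # illustrative probabilities, not from the paper
lengths = shannon_lengths(probs)
codes = canonical_codewords(lengths)
```

For these illustrative probabilities the sketch yields lengths 2, 2, 3, 4 and an expected length of 2.4 bits per symbol, whereas a Huffman code for the same source (lengths 1, 2, 3, 3) achieves 1.9. The gap here comes from the stand-in length rule and says nothing about how close the paper's scheme gets to Huffman.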

