A fast and space-economical algorithm for calculating minimum redundancy prefix codes

R. Milidiú, A. Pessoa, E. Laber
DOI: 10.1109/SPIRE.1999.796587
Published in: 6th International Symposium on String Processing and Information Retrieval / 5th International Workshop on Groupware (Cat. No.PR00268)
Publication date: 1999-09-21
Citations: 1

Abstract

The minimum redundancy prefix code problem is to determine, for a given list W = [w_1, ..., w_n] of n positive symbol weights, a list L = [l_1, ..., l_n] of n corresponding integer codeword lengths such that Σ_{i=1}^{n} 2^(-l_i) ≤ 1 and Σ_{i=1}^{n} w_i·l_i is minimized. With the optimal list of codeword lengths, an optimal canonical code can be easily obtained. If W is already sorted, then this optimal code can also be represented by the list M = [m_1, ..., m_H], where m_l, for l = 1, ..., H, denotes the number of codewords with length l and H is the length of the longest codeword. Fortunately, H is proved to be O(min{log(1/p_1), n}), where p_1 is the smallest symbol probability, given by w_1 / Σ_{i=1}^{n} w_i. The E-LazyHuff algorithm uses a lazy approach to calculate optimal codes in O(n log(n/H)) time, requiring only O(H) additional space. In addition, the input weights are not destroyed during the code calculation. We propose a new technique, which we call homogenization, that can be used to improve the time efficiency of algorithms for constructing optimal prefix codes. Next, we introduce the Best LazyHuff algorithm (B-LazyHuff) as an application of this technique. B-LazyHuff is an O(n)-time variation of the E-LazyHuff algorithm. It also requires O(H) additional space and does not destroy the input data.
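The two stages the abstract separates, computing optimal codeword lengths and then deriving a canonical code from them, can be sketched with the classic textbook constructions. This is not the paper's E-LazyHuff or B-LazyHuff: it is an ordinary heap-based Huffman build that uses O(n) extra space rather than O(H), but it yields the same optimal lengths, and the canonical assignment that follows is the standard one the abstract alludes to.

```python
import heapq

def codeword_lengths(weights):
    """Optimal codeword lengths for positive symbol weights via a
    standard heap-based Huffman construction (O(n log n) time,
    O(n) extra space; not the paper's O(n)-time, O(H)-space method)."""
    n = len(weights)
    if n == 1:
        return [1]
    # Heap entries are (weight, node id); leaves get ids 0..n-1.
    heap = [(w, i) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    parent = {}
    next_id = n
    while len(heap) > 1:
        w1, a = heapq.heappop(heap)
        w2, b = heapq.heappop(heap)
        parent[a] = parent[b] = next_id      # merge the two lightest trees
        heapq.heappush(heap, (w1 + w2, next_id))
        next_id += 1
    root = heap[0][1]

    def depth(i):
        d = 0
        while i != root:
            i = parent[i]
            d += 1
        return d

    # l_i = depth of leaf i; these lengths satisfy Kraft's inequality.
    return [depth(i) for i in range(n)]

def canonical_code(lengths):
    """Assign canonical codewords (bit strings) from codeword lengths:
    symbols are visited in order of increasing length, each codeword is
    the previous one plus one, left-shifted when the length grows."""
    order = sorted(range(len(lengths)), key=lambda i: (lengths[i], i))
    codes = [None] * len(lengths)
    code, prev_len = 0, 0
    for i in order:
        code <<= lengths[i] - prev_len
        codes[i] = format(code, "0{}b".format(lengths[i]))
        code += 1
        prev_len = lengths[i]
    return codes
```

For example, weights [1, 1, 2, 4] yield lengths [3, 3, 2, 1], and the canonical assignment gives codewords ["110", "111", "10", "0"], which satisfy Kraft's inequality with equality.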