Code and parse tree for lossless source encoding
J. Abrahams
Commun. Inf. Syst.
DOI: 10.4310/CIS.2001.V1.N2.A1 (https://doi.org/10.4310/CIS.2001.V1.N2.A1)
Citation count: 54
Abstract
This paper surveys the theoretical literature on fixed-to-variable-length lossless source code trees, called code trees, and on variable-length-to-fixed lossless source code trees, called parse trees. Huffman coding [1] is the best-known code tree problem, but a number of interesting variants of the problem formulation lead to other combinatorial optimization problems. Huffman coding as an instance of combinatorial search has been highlighted in the books by Ahlswede and Wegener [2] and Aigner [3]. See also the papers of Hinderer and Stieglitz [4] and Hassin and Henig [5] for overviews of the combinatorial search literature. Tunstall parsing [6] is the best-known parse tree problem for a probability-based source model, although parsing based directly on source data is very familiar as Lempel-Ziv parsing [7-8], a family of techniques which is outside the scope of this survey. Similarly, adaptive, data-based variants of Huffman coding, e.g. [9-12], will not be treated here. Rather, the assumption here is that the source model is given as a sequence of independent and identically distributed (iid) random variables for some known discrete distribution, although on occasion only partial information about the source may be available. These lossless source encoding techniques form a subset of data compression techniques, and broader surveys of the data compression literature are available [13-21].
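The distinction between a code tree and a parse tree can be illustrated with a minimal sketch, which is not taken from the survey itself. It assumes a binary code alphabet, a known iid source, and single-character symbol names; the function names `huffman_code` and `tunstall_dictionary` are hypothetical and chosen only for this example. The first builds a fixed-to-variable-length code tree by repeatedly merging the two least probable subtrees; the second builds a variable-length-to-fixed parse tree by repeatedly expanding the most probable leaf.

```python
import heapq
from itertools import count


def huffman_code(probs):
    """Fixed-to-variable: map each source symbol to a binary codeword.

    Assumes at least two symbols; a one-symbol source would receive
    the empty codeword.
    """
    tie = count()  # tie-breaker so heapq never has to compare dicts
    heap = [(p, next(tie), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees under a new internal node.
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]


def tunstall_dictionary(probs, num_words):
    """Variable-to-fixed: map source strings (leaves) to probabilities.

    The dictionary holds at most num_words strings, so each one can be
    indexed by a fixed-length codeword.
    """
    k = len(probs)
    if num_words < k:
        raise ValueError("num_words must be at least the alphabet size")
    tie = count()
    # Max-heap on leaf probability, stored negated for heapq's min-heap.
    leaves = [(-p, next(tie), s) for s, p in probs.items()]
    heapq.heapify(leaves)
    # Expanding one leaf replaces it with k children: a net gain of k - 1.
    while len(leaves) + k - 1 <= num_words:
        neg_p, _, word = heapq.heappop(leaves)
        for s, p in probs.items():
            heapq.heappush(leaves, (neg_p * p, next(tie), word + s))
    return {word: -neg_p for neg_p, _, word in leaves}


if __name__ == "__main__":
    source = {"a": 0.5, "b": 0.25, "c": 0.25}  # illustrative iid source
    print(huffman_code(source))
    print(tunstall_dictionary(source, num_words=8))
```

In both cases the leaves of the tree form a prefix-free set: codewords in the Huffman case, source strings in the Tunstall case. Which leaf is expanded when probabilities tie depends on the tie-breaking order used here, so the exact Tunstall dictionary may differ between implementations.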