Some entropic bounds for Lempel-Ziv algorithms

S. Rao Kosaraju, G. Manzini
Published in: Proceedings DCC '97. Data Compression Conference, 1997-03-25
DOI: 10.1109/DCC.1997.582106 (https://doi.org/10.1109/DCC.1997.582106)
Citations: 14

Abstract

Summary form only given, as follows. We initiate a study of parsing-based compression algorithms such as LZ77 and LZ78 by considering the empirical entropy of the input string. For any string s, we define the k-th order entropy H_k(s) by looking at the number of occurrences of each symbol following each k-length substring inside s. The value H_k(s) is a lower bound on the compression ratio of a statistical modeling algorithm which predicts the probability of the next symbol by looking at the k most recently seen characters. Therefore, our analysis provides a means for comparing Lempel-Ziv methods with the more powerful, but slower, PPM algorithms. Our main contribution is a comparison of the compression ratio of Lempel-Ziv algorithms with the zeroth order entropy H_0. First, we show that for low-entropy strings the LZ78 compression ratio can be much higher than H_0. Then, we present a modified algorithm which combines LZ78 with run-length encoding and is also able to compress low-entropy strings efficiently.
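The abstract describes H_k(s) only in words, so the following Python sketch is an illustration rather than the authors' own formulation. It assumes the commonly used definition: for each k-length context, take the zeroth-order entropy of the symbols that follow it, weight it by the number of followers, and divide by the length of s. The function names (h0, hk, lz78_phrases) and the demo string are hypothetical. The LZ78 parser is the plain greedy textbook version, not the modified algorithm with run-length encoding that the abstract mentions.

```python
import math
from collections import Counter, defaultdict


def h0(s):
    """Zeroth-order empirical entropy of s, in bits per symbol."""
    n = len(s)
    if n == 0:
        return 0.0
    # sum over symbols of (count/n) * log2(n/count)
    return sum((c / n) * math.log2(n / c) for c in Counter(s).values())


def hk(s, k):
    """k-th order empirical entropy of s, in bits per symbol.

    Assumed definition: group the symbols of s by the k characters
    that precede them, then average the per-context H_0 values,
    weighted by context frequency and normalised by len(s).
    """
    if k == 0 or not s:
        return h0(s)
    followers = defaultdict(list)
    for i in range(len(s) - k):
        followers[s[i:i + k]].append(s[i + k])
    return sum(len(f) * h0("".join(f)) for f in followers.values()) / len(s)


def lz78_phrases(s):
    """Greedy LZ78 parsing: each phrase is a previously seen phrase
    extended by one new symbol."""
    seen = set()
    phrases = []
    current = ""
    for ch in s:
        current += ch
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:
        phrases.append(current)  # trailing phrase may repeat an earlier one
    return phrases


if __name__ == "__main__":
    # A low-entropy input: one long run followed by a single different symbol.
    s = "a" * 9999 + "b"
    phrases = lz78_phrases(s)
    print("n * H_0(s) in bits:", len(s) * h0(s))
    print("number of LZ78 phrases:", len(phrases))
```

On a run of n identical symbols the greedy parse produces phrases of length 1, 2, 3, and so on, so the phrase count grows roughly like the square root of 2n, while n * H_0(s) stays very small for such a string. This is the kind of gap the abstract refers to when it says the LZ78 compression ratio can be much higher than H_0 on low-entropy strings.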