LZ77-Like Compression with Fast Random Access

Sebastian Kreft, G. Navarro
{"title":"LZ77-Like Compression with Fast Random Access","authors":"Sebastian Kreft, G. Navarro","doi":"10.1109/DCC.2010.29","DOIUrl":null,"url":null,"abstract":"We introduce an alternative Lempel-Ziv text parsing, LZ-End, that converges to the entropy and in practice gets very close to LZ77. LZ-End forces sources to finish at the end of a previous phrase. Most Lempel-Ziv parsings can decompress the text only from the beginning. LZ-End is the only parsing we know of able of decompressing arbitrary phrases in optimal time, while staying closely competitive with LZ77, especially on highly repetitive collections, where LZ77 excells. Thus LZ-End is ideal as a compression format for highly repetitive sequence databases, where access to individual sequences is required, and it also opens the door to compressed indexing schemes for such collections.","PeriodicalId":299459,"journal":{"name":"2010 Data Compression Conference","volume":"83 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"82","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 Data Compression Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DCC.2010.29","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 82

Abstract

We introduce an alternative Lempel-Ziv text parsing, LZ-End, that converges to the entropy and in practice gets very close to LZ77. LZ-End forces sources to finish at the end of a previous phrase. Most Lempel-Ziv parsings can decompress the text only from the beginning. LZ-End is the only parsing we know of able of decompressing arbitrary phrases in optimal time, while staying closely competitive with LZ77, especially on highly repetitive collections, where LZ77 excells. Thus LZ-End is ideal as a compression format for highly repetitive sequence databases, where access to individual sequences is required, and it also opens the door to compressed indexing schemes for such collections.
类lz77压缩与快速随机访问
我们引入了另一种Lempel-Ziv文本解析,LZ-End,它收敛于熵,在实践中非常接近LZ77。LZ-End强制源在前一个短语的末尾结束。大多数Lempel-Ziv解析只能从头开始解压缩文本。LZ-End是我们所知道的唯一能够在最佳时间内解压缩任意短语的解析,同时与LZ77保持密切竞争,特别是在高度重复的集合上,这是LZ77所擅长的。因此,对于需要访问单个序列的高度重复序列数据库,LZ-End是理想的压缩格式,并且它还为此类集合的压缩索引方案打开了大门。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信