Improved antidictionary based compression

M. Crochemore, G. Navarro
12th International Conference of the Chilean Computer Science Society, 2002. Proceedings.
Publication date: 2002-11-06
DOI: 10.1109/SCCC.2002.1173168 (https://doi.org/10.1109/SCCC.2002.1173168)
Citations: 24

Abstract

The compression of binary texts using antidictionaries is a novel technique based on the fact that some substrings (called "antifactors") never appear in the text. Let sb be an antifactor, where b is its last bit. Every time s appears in the text, we know that the next bit is the complement of b, and hence its representation can be omitted. Since building the set of all antifactors is space-consuming at compression time, it is customary to limit the maximum length of the antifactors considered to a constant k. Larger k yields better compression of the text but requires more space at compression time. In this paper we introduce the notion of almost antifactors, which are strings that rarely appear in the text. More formally, almost antifactors are strings such that, if we treat them as antifactors and separately code their occurrences as exceptions, the compression ratio improves. We show that almost antifactors improve compression when only a limited amount of main memory is available to the compressor. Our experiments show that they obtain the same compression as the classical algorithm using only 30%-55% of its memory space.
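The classical scheme summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`build_antidictionary`, `predicted_bit`, `compress`, `decompress`) are hypothetical, the antidictionary is built by brute-force substring enumeration rather than with the space-efficient structures a real compressor would use, and almost antifactors with exception coding are not covered. The sketch keeps only minimal antifactors of length at most k and assumes the decoder is given the antidictionary and the original text length.

```python
def factors_up_to(text, k):
    """All substrings of `text` of length 0..k (empty string included)."""
    return {text[i:i + m] for m in range(k + 1)
            for i in range(len(text) - m + 1)}


def build_antidictionary(text, k):
    """Minimal antifactors of length <= k (brute force, illustration only).

    A word w is kept if it never occurs in `text` while both w[:-1]
    and w[1:] do occur.
    """
    facs = factors_up_to(text, k)
    return {u + b
            for u in facs if len(u) < k
            for b in "01"
            if u + b not in facs and (u + b)[1:] in facs}


def predicted_bit(history, antidict, k):
    """Return the bit forced by an antifactor, or None if none applies.

    If a suffix s of `history` satisfies s + b in antidict, the next
    bit must be the complement of b.
    """
    for m in range(min(k - 1, len(history)) + 1):
        s = history[len(history) - m:]
        for b in "01":
            if s + b in antidict:
                return "1" if b == "0" else "0"
    return None


def compress(text, antidict, k):
    """Emit only the bits that the antidictionary cannot predict."""
    out = []
    for i, bit in enumerate(text):
        window = text[max(0, i - (k - 1)):i]
        if predicted_bit(window, antidict, k) is None:
            out.append(bit)
    return "".join(out)


def decompress(code, antidict, k, n):
    """Rebuild the n-bit text from the emitted bits and the antidictionary."""
    out, pos = "", 0
    while len(out) < n:
        window = out[max(0, len(out) - (k - 1)):]
        bit = predicted_bit(window, antidict, k)
        if bit is None:          # not predictable: read it from the code
            bit = code[pos]
            pos += 1
        out += bit
    return out


text = "0101010101"
k = 3
ad = build_antidictionary(text, k)   # {"00", "11"}
code = compress(text, ad, k)         # only the first bit is emitted
assert decompress(code, ad, k, len(text)) == text
```

In this toy input, "00" and "11" never occur, so after the first bit every following bit is forced and can be omitted. Raising k lets the sketch find longer antifactors at the cost of a larger factor set, which mirrors the memory trade-off the abstract describes.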