Dv2v: A Dynamic Variable-to-Variable Compressor
N. Brisaboa, A. Fariña, Adrián Gómez-Brandón, G. Navarro, Tirso V. Rodeiro
2019 Data Compression Conference (DCC), March 2019
DOI: 10.1109/DCC.2019.00016 (https://doi.org/10.1109/DCC.2019.00016)
Abstract
We present D-v2v, a new dynamic (one-pass) variable-to-variable compressor. Variable-to-variable compression combines a modeler that gathers variable-length input symbols with a variable-length statistical coder that assigns shorter codewords to the more frequent symbols. In D-v2v, we process the input text word-wise to gather variable-length symbols that can be either terminals (new words) or non-terminals (subsequences of words seen before in the input text). These symbols are stored in a vocabulary that is kept sorted by frequency, so they can be easily encoded with dense codes. D-v2v permits real-time transmission of data, i.e., compression and transmission can begin as soon as data become available. Our experiments show that D-v2v improves on the compression ratios of v2vDC, the state-of-the-art semi-static variable-to-variable compressor, and almost reaches those of p7zip. It also delivers competitive performance at both compression and decompression.
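
To illustrate why a frequency-sorted vocabulary makes dense coding straightforward, the following is a minimal sketch, not the authors' implementation. It assumes End-Tagged Dense Code (one well-known member of the dense-code family; the abstract does not say which variant D-v2v uses), handles only terminals (single words), and uses a hypothetical escape convention in which the first unused rank announces a new word sent as NUL-terminated UTF-8. The names `etdc_encode`, `etdc_decode`, `Vocabulary`, `compress`, and `decompress` are illustrative only.

```python
def etdc_encode(rank: int) -> bytes:
    """End-Tagged Dense Code: the last byte has its high bit set."""
    out = [128 + rank % 128]
    rank = rank // 128 - 1
    while rank >= 0:
        out.append(rank % 128)
        rank = rank // 128 - 1
    return bytes(reversed(out))


def etdc_decode(code: bytes) -> int:
    """Inverse of etdc_encode for one complete codeword."""
    rank = 0
    for b in code[:-1]:
        rank = (rank + b + 1) * 128
    return rank + code[-1] - 128


class Vocabulary:
    """Toy vocabulary kept ordered by decreasing frequency."""

    def __init__(self):
        self.symbols = []  # rank -> symbol
        self.freq = []     # rank -> occurrence count
        self.rank = {}     # symbol -> rank

    def bump(self, symbol: str) -> None:
        """Count one occurrence and restore the frequency order."""
        if symbol not in self.rank:
            self.rank[symbol] = len(self.symbols)
            self.symbols.append(symbol)
            self.freq.append(0)
        i = self.rank[symbol]
        self.freq[i] += 1
        # Move the symbol up while it is more frequent than its neighbour.
        while i > 0 and self.freq[i - 1] < self.freq[i]:
            j = i - 1
            self.symbols[i], self.symbols[j] = self.symbols[j], self.symbols[i]
            self.freq[i], self.freq[j] = self.freq[j], self.freq[i]
            self.rank[self.symbols[i]] = i
            self.rank[self.symbols[j]] = j
            i = j


def compress(words):
    """One pass: emit the current rank of each word, then update the model."""
    vocab, out = Vocabulary(), bytearray()
    for w in words:
        if w in vocab.rank:
            out += etdc_encode(vocab.rank[w])
        else:
            # Assumed escape: first unused rank, then the literal word.
            out += etdc_encode(len(vocab.symbols))
            out += w.encode("utf-8") + b"\x00"
        vocab.bump(w)
    return bytes(out)


def decompress(data: bytes):
    """Mirror of compress: rebuilds the same vocabulary while decoding."""
    vocab, words, pos = Vocabulary(), [], 0
    while pos < len(data):
        end = pos
        while data[end] < 128:  # scan to the tagged (last) byte
            end += 1
        r = etdc_decode(data[pos:end + 1])
        pos = end + 1
        if r == len(vocab.symbols):  # escape: a new word follows
            stop = data.index(0, pos)
            w = data[pos:stop].decode("utf-8")
            pos = stop + 1
        else:
            w = vocab.symbols[r]
        words.append(w)
        vocab.bump(w)
    return words
```

Because the decoder applies the same vocabulary updates as the encoder after every symbol, no model needs to be transmitted in advance, which is what allows output to start as soon as input is available. A call such as `decompress(compress("to be or not to be".split()))` returns the original word list. Gathering non-terminals (repeated word sequences), which is central to D-v2v, is deliberately left out of this sketch.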