{"title":"通过子字符串枚举进行无损数据压缩","authors":"Danny Dubé, V. Beaudoin","doi":"10.1109/DCC.2010.28","DOIUrl":null,"url":null,"abstract":"We present a technique that compresses a string $w$ by enumerating all the substrings of $w$. The substrings are enumerated from the shortest to the longest and in lexicographic order. Compression is obtained from the fact that the set of the substrings of a particular length gives a lot of information about the substrings that are one bit longer. A linear-time, linear-space algorithm is presented. Experimental results show that the compression efficiency comes close to that of the best PPM variants. Other compression techniques are compared to ours.","PeriodicalId":299459,"journal":{"name":"2010 Data Compression Conference","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"19","resultStr":"{\"title\":\"Lossless Data Compression via Substring Enumeration\",\"authors\":\"Danny Dubé, V. Beaudoin\",\"doi\":\"10.1109/DCC.2010.28\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present a technique that compresses a string $w$ by enumerating all the substrings of $w$. The substrings are enumerated from the shortest to the longest and in lexicographic order. Compression is obtained from the fact that the set of the substrings of a particular length gives a lot of information about the substrings that are one bit longer. A linear-time, linear-space algorithm is presented. Experimental results show that the compression efficiency comes close to that of the best PPM variants. Other compression techniques are compared to ours.\",\"PeriodicalId\":299459,\"journal\":{\"name\":\"2010 Data Compression Conference\",\"volume\":\"27 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-03-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"19\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 Data Compression Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DCC.2010.28\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 Data Compression Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DCC.2010.28","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Lossless Data Compression via Substring Enumeration
We present a technique that compresses a string $w$ by enumerating all the substrings of $w$. The substrings are enumerated from the shortest to the longest and in lexicographic order. Compression is obtained from the fact that the set of the substrings of a particular length gives a lot of information about the substrings that are one bit longer. A linear-time, linear-space algorithm is presented. Experimental results show that the compression efficiency comes close to that of the best PPM variants. Other compression techniques are compared to ours.