Compressed unordered integer sequences with fast direct access
I. Zavadskyi
2023 Data Compression Conference (DCC), published 2023-03-01. DOI: 10.1109/DCC55655.2023.00053
A compressed representation of integer sequences is a key element of many data compression techniques. The variable-length Reverse Multi-Delimiter (RMD) codes [1] provide a simple and space-efficient solution to this problem, combining a good compression ratio with fast decoding. In this research, we investigate another property of RMD-codes: the ability to directly access codewords in the encoded bitstream. If the integers are sorted and the deltas between them are small enough, direct access reduces to performing the select operation on a bitmap. However, RMD-codes allow us to address the more general problem of direct access to the elements of an unordered integer sequence given in compressed form. We develop a method for extracting and decoding a codeword from an RMD-bitstream in almost constant time. In text compression, the solution is highly space-saving, as the RMD-code size is close to the entropy and the extra data structures are tiny.
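To make the sorted special case concrete, the following is a minimal Python sketch of how direct access reduces to select on a bitmap. It is not the RMD method described in the abstract, and it covers only strictly increasing sequences. The class name `SelectBitmap`, the `block_bits` parameter, and the per-block popcount directory are illustrative assumptions. This simple version locates the block by binary search, so a lookup is logarithmic rather than constant time.

```python
from bisect import bisect_right


class SelectBitmap:
    """Bitmap for a strictly increasing integer sequence (hypothetical helper).

    Bit x is set iff x belongs to the sequence, so the i-th element of the
    sequence is the position of the i-th set bit, i.e. select(i).
    """

    def __init__(self, sorted_values, block_bits=64):
        values = list(sorted_values)
        if values and values[0] < 0:
            raise ValueError("values must be non-negative")
        if any(b <= a for a, b in zip(values, values[1:])):
            raise ValueError("values must be strictly increasing")
        self.block_bits = block_bits
        self.count = len(values)
        size = values[-1] + 1 if values else 0
        n_blocks = (size + block_bits - 1) // block_bits
        self.blocks = [0] * n_blocks
        for v in values:
            self.blocks[v // block_bits] |= 1 << (v % block_bits)
        # cumulative[b] = number of set bits in blocks[0 .. b-1]
        self.cumulative = [0]
        for word in self.blocks:
            self.cumulative.append(self.cumulative[-1] + bin(word).count("1"))

    def select(self, i):
        """Return the position of the i-th set bit (0-based)."""
        if not 0 <= i < self.count:
            raise IndexError(i)
        # Block b satisfies cumulative[b] <= i < cumulative[b + 1].
        b = bisect_right(self.cumulative, i) - 1
        word = self.blocks[b]
        # Drop the set bits that precede the target inside this block.
        for _ in range(i - self.cumulative[b]):
            word &= word - 1
        lowest = word & -word  # isolate the lowest remaining set bit
        return b * self.block_bits + lowest.bit_length() - 1


values = [3, 4, 7, 70, 71, 200]
bitmap = SelectBitmap(values)
third = bitmap.select(3)  # element with index 3 of the sorted sequence
```

When the deltas are small the bitmap is dense, so it costs only a few bits per element. An unordered sequence has no such monotone mapping to bit positions, which is the more general case the abstract targets.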