SPDP: An Automatically Synthesized Lossless Compression Algorithm for Floating-Point Data

S. Claggett, S. Azimi, Martin Burtscher
{"title":"SPDP: An Automatically Synthesized Lossless Compression Algorithm for Floating-Point Data","authors":"S. Claggett, S. Azimi, Martin Burtscher","doi":"10.1109/DCC.2018.00042","DOIUrl":null,"url":null,"abstract":"Scientific computing produces, transfers, and stores massive amounts of single- and double-precision floating-point data, making this a domain that can greatly benefit from data compression. To gain insight into what makes an effective lossless compression algorithm for such data, we generated over nine million algorithms and selected the one that yields the highest compression ratio on 26 datasets. The resulting algorithm, called SPDP, comprises four data transformations that operate exclusively at word or byte granularity. Nevertheless, SPDP delivers the highest compression ratio on eleven datasets and, on average, outperforms all but one of the seven compared compressors. An analysis of SPDP's internals reveals how to build effective compression algorithms for scientific data.","PeriodicalId":137206,"journal":{"name":"2018 Data Compression Conference","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"21","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 Data Compression Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DCC.2018.00042","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 21

Abstract

Scientific computing produces, transfers, and stores massive amounts of single- and double-precision floating-point data, making this a domain that can greatly benefit from data compression. To gain insight into what makes an effective lossless compression algorithm for such data, we generated over nine million algorithms and selected the one that yields the highest compression ratio on 26 datasets. The resulting algorithm, called SPDP, comprises four data transformations that operate exclusively at word or byte granularity. Nevertheless, SPDP delivers the highest compression ratio on eleven datasets and, on average, outperforms all but one of the seven compared compressors. An analysis of SPDP's internals reveals how to build effective compression algorithms for scientific data.
SPDP:一种自动合成的浮点数据无损压缩算法
科学计算产生、传输和存储大量的单精度和双精度浮点数据,这使得这个领域可以从数据压缩中受益匪浅。为了深入了解是什么让这些数据成为有效的无损压缩算法,我们生成了超过900万个算法,并选择了在26个数据集上产生最高压缩比的算法。生成的算法称为SPDP,包含四个数据转换,它们只在字或字节粒度上操作。尽管如此,SPDP在11个数据集上提供了最高的压缩比,并且平均优于7个被比较的压缩器之一。对SPDP内部的分析揭示了如何为科学数据构建有效的压缩算法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信