A Versatile Compression Method for Floating-Point Data Stream

2013 Fourth International Conference on Networking and Distributed Computing. Pub Date: 2013-12-21. DOI: 10.1109/ICNDC.2013.32

Songbin Liu, Xiaomeng Huang, Yufang Ni, H. Fu, Guangwen Yang


Citations: 3

Abstract


With rapid advances in supercomputing and numerical simulation, the output of scientific computing is growing quickly, which poses serious challenges for data sharing and archiving. Data compression can mitigate these challenges by reducing the size of the data to be stored or transferred. However, a compressor must strike a good balance between compression ratio and throughput before it can be deployed in high-end computing environments. In this paper, we propose and evaluate a versatile compression method for floating-point data. First, it achieves much better compression ratios than existing general-purpose compression methods while maintaining promising throughput. Second, it supports asymmetric decompression: losslessly compressed data can be decompressed lossily, which facilitates data analysis under different precision requirements. Third, it can leverage different existing general-purpose compressors (zlib and lz4, for instance) to provide more flexible trade-offs between compression ratio and throughput. Evaluations demonstrate that our compressor achieves compression ratios comparable to those of the best compressors, while its compression and decompression throughputs can be 10 times higher than theirs. Single-thread compression throughput can reach 135 MB/s, and decompression throughput can reach 194 MB/s.
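The abstract does not describe the algorithm itself, so the following is only an illustrative sketch of what asymmetric decompression on top of a general-purpose compressor could look like, not the authors' method. It assumes a simple byte-plane layout: each float32 value is split into its four bytes, each plane is compressed independently with zlib, and a lossy reader decodes only the most significant planes. The function names `compress_planes` and `decompress_planes` and the `keep` parameter are hypothetical and chosen for illustration.

```python
import struct
import zlib


def compress_planes(values):
    """Split float32 values into 4 byte planes and zlib-compress each plane.

    Assumed layout (not taken from the paper): plane 0 holds the most
    significant byte of every value, plane 3 the least significant.
    """
    raw = struct.pack(f">{len(values)}f", *values)  # big-endian float32
    return [zlib.compress(raw[k::4]) for k in range(4)]


def decompress_planes(planes, keep=4):
    """Rebuild floats from the first `keep` planes; dropped planes read as zero."""
    n = len(zlib.decompress(planes[0]))
    cols = [
        zlib.decompress(planes[k]) if k < keep else bytes(n)
        for k in range(4)
    ]
    # Interleave the planes back into 4-byte big-endian values.
    raw = bytes(b for quad in zip(*cols) for b in quad)
    return list(struct.unpack(f">{n}f", raw))


data = [1.5, -2.25, 3.141592, 1000.001]
planes = compress_planes(data)
exact = decompress_planes(planes)            # all planes: bit-exact float32
coarse = decompress_planes(planes, keep=2)   # top two planes: 7 mantissa bits
```

In this sketch the same compressed planes serve both readers: decoding all four planes restores the float32 values exactly, while decoding only the first two keeps the sign, the exponent, and the top 7 mantissa bits. Swapping zlib for another general-purpose compressor would change the ratio and throughput trade-off without changing the layout.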

