FPGA-Based Parallel Multi-Core GZIP Compressor in HDFS
Authors: Haoxin Luo, Ye Cai, Qiuming Luo, Rui Mao
Venue: 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)
Published: 2019-12-01
DOI: 10.1109/PDCAT46702.2019.00017
Citations: 1
Abstract
The growth of Big Data poses new challenges for data storage. Data compression, which saves both storage space and network bandwidth, is an important technology for meeting these challenges. In this paper, we present an end-to-end, complete, high-throughput parallel multi-core GZIP compressor on FPGA for HDFS. The compressor has a scalable design that raises throughput by adding compression cores based on a systolic array architecture. We implemented and evaluated the hardware compressor on an Alpha Data Adm-Pcie-KU3 FPGA board, using RIFFA for data transfers over PCI Express. According to the evaluation results, a compressor with up to 16 cores can be implemented, and its peak compression throughput exceeds 1.1 GB/s, a 70x speedup over the software compression solution. With the hardware compressor loaded into HDFS, HDFS performance is twice that of HDFS without the compressor.
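The abstract does not say how work is divided among the compression cores. As a rough software analogy only, and not the authors' hardware design, the sketch below assumes the input is split into fixed-size chunks that are compressed independently and concatenated. This works because a GZIP stream may consist of several members back to back. The names `parallel_gzip`, `cores`, and `chunk_size` are hypothetical and chosen for illustration.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def parallel_gzip(data: bytes, cores: int = 16, chunk_size: int = 64 * 1024) -> bytes:
    """Compress chunks independently and concatenate the resulting gzip members.

    Assumption: chunk-level parallelism stands in for the multi-core design;
    the paper's actual partitioning scheme is not described in the abstract.
    """
    # Split the input into fixed-size chunks; keep one empty chunk for empty input
    # so the output is still a valid gzip stream.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    # Each worker plays the role of one compression core.
    # pool.map returns results in input order, so the members stay in sequence.
    with ThreadPoolExecutor(max_workers=cores) as pool:
        members = list(pool.map(gzip.compress, chunks))
    return b"".join(members)


if __name__ == "__main__":
    payload = b"hdfs block payload " * 100_000
    compressed = parallel_gzip(payload)
    # A standard gzip decoder reads the concatenated members as one stream.
    assert gzip.decompress(compressed) == payload
    print(len(payload), "->", len(compressed), "bytes")
```

Independent chunks trade some compression ratio for parallelism, since each member starts with an empty match history. Whether the hardware design makes the same trade-off is not stated in the abstract.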