FPGA-Based Parallel Multi-Core GZIP Compressor in HDFS
Haoxin Luo, Ye Cai, Qiuming Luo, Rui Mao
2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), 2019-12-01
DOI: 10.1109/PDCAT46702.2019.00017 (https://doi.org/10.1109/PDCAT46702.2019.00017)
Abstract
As Big Data grows, data storage faces increasing challenges. Data compression, which saves both storage space and network bandwidth, is an important technology for addressing them. In this paper, we present a complete, end-to-end, high-throughput parallel multi-core GZIP compressor implemented on an FPGA for HDFS. The compressor uses a scalable architecture in which throughput is increased by adding compression cores based on a systolic array architecture. We implemented and evaluated the hardware compressor on an Alpha Data ADM-PCIE-KU3 FPGA board, using RIFFA for data transfers over PCI Express. In our evaluation, a compressor with up to 16 cores can be implemented, and peak compression throughput exceeds 1.1 GB/s, a 70x speedup over the software compression solution. With the hardware compressor loaded into HDFS, HDFS performance is twice that of HDFS without the compressor.
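As a rough software analogy for the multi-core idea (not the paper's hardware design), the sketch below splits the input into fixed-size chunks, compresses each chunk independently, and concatenates the results. It relies on the fact that a GZIP stream may consist of several members back to back. The function name `parallel_gzip`, the 1 MiB chunk size, and the use of a thread pool are assumptions made for illustration; the abstract does not say how the FPGA cores divide the input.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def parallel_gzip(data: bytes, cores: int = 16, chunk_size: int = 1 << 20) -> bytes:
    """Compress `data` as a multi-member GZIP stream using `cores` workers.

    Hypothetical helper: each chunk becomes one independent GZIP member,
    loosely mirroring independent compression cores working side by side.
    """
    # Split the input into fixed-size chunks; keep one empty chunk for empty input
    # so the output is still a valid GZIP stream.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=cores) as pool:
        # pool.map returns results in input order, so members stay in sequence.
        members = list(pool.map(gzip.compress, chunks))
    # Concatenated GZIP members form a valid GZIP stream.
    return b"".join(members)


if __name__ == "__main__":
    payload = b"hdfs block payload " * 100_000
    packed = parallel_gzip(payload, cores=4, chunk_size=64 * 1024)
    # Standard decompression reads all members and restores the original bytes.
    assert gzip.decompress(packed) == payload
```

Because every chunk is compressed without reference to the others, matches cannot span chunk boundaries, so the output is usually somewhat larger than a single-stream compression of the same data. Whether the hardware design makes the same trade-off is not stated in the abstract.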