An FPGA-Based BWT Accelerator for Bzip2 Data Compression

2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). Pub Date: 2019-04-01. DOI: 10.1109/FCCM.2019.00023

W. Qiao, Zhenman Fang, Mau-Chung Frank Chang, J. Cong

Citations: 13

Abstract

The Burrows-Wheeler Transform (BWT) plays an important role in lossless data compression algorithms. To achieve a good compression ratio, the BWT block size needs to be several hundred kilobytes, which requires a large amount of on-chip memory and limits effective hardware implementations. In this paper, we analyze the bottleneck of BWT acceleration and present a novel design that maps the anti-sequential suffix sorting algorithm to FPGAs. Our design can perform BWT with a block size of up to 500KB (i.e., bzip2 level 5 compression) on the Xilinx Virtex UltraScale+ VCU1525 board, whereas the state-of-the-art FPGA implementation supports only a 4KB block size. Experiments show that our FPGA design achieves a ~2x speedup over the best CPU implementation on standard large Corpus benchmarks.
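For readers unfamiliar with the transform the abstract refers to, the following is a minimal reference sketch of a block-wise BWT and its inverse in Python. It is a naive rotation sort written for illustration only; it is not the anti-sequential suffix sorting algorithm or the FPGA design described in the paper. The function names `bwt` and `inverse_bwt` are hypothetical.

```python
from typing import Tuple


def bwt(block: bytes) -> Tuple[bytes, int]:
    """Naive BWT of one block (illustrative, not the paper's algorithm).

    Sorts all cyclic rotations of the block and returns the last column
    together with the row index at which the original block appears.
    """
    n = len(block)
    if n == 0:
        return b"", 0
    doubled = block + block  # rotation i is doubled[i:i + n]
    order = sorted(range(n), key=lambda i: doubled[i:i + n])
    last = bytes(doubled[i + n - 1] for i in order)
    return last, order.index(0)


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Recover the original block from the last column and the primary index."""
    n = len(last)
    # A stable sort of positions by byte value maps each row of the sorted
    # matrix to the row holding the same text rotated left by one byte.
    order = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = primary
    for _ in range(n):
        row = order[row]
        out.append(last[row])
    return bytes(out)
```

Because this sketch materializes every rotation as a sort key, its memory use grows quadratically with the block size, so it is only practical for small inputs. It is meant to show what the transform computes, not how to compute it efficiently at the block sizes discussed in the abstract.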



