FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations

Yichi Zhang, Junhao Pan, Xinheng Liu, Hongzheng Chen, Deming Chen, Zhiru Zhang
{"title":"FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations","authors":"Yichi Zhang, Junhao Pan, Xinheng Liu, Hongzheng Chen, Deming Chen, Zhiru Zhang","doi":"10.1145/3431920.3439296","DOIUrl":null,"url":null,"abstract":"Binary neural networks (BNNs) have 1-bit weights and activations. Such networks are well suited for FPGAs, as their dominant computations are bitwise arithmetic and the memory requirement is also significantly reduced. However, compared to start-of-the-art compact convolutional neural network (CNN) models, BNNs tend to produce a much lower accuracy on realistic datasets such as ImageNet. In addition, the input layer of BNNs has gradually become a major compute bottleneck, because it is conventionally excluded from binarization to avoid a large accuracy loss. This work proposes FracBNN, which exploits fractional activations to substantially improve the accuracy of BNNs. Specifically, our approach employs a dual-precision activation scheme to compute features with up to two bits, using an additional sparse binary convolution. We further binarize the input layer using a novel thermometer encoding. Overall, FracBNN preserves the key benefits of conventional BNNs, where all convolutional layers are computed in pure binary MAC operations (BMACs). We design an efficient FPGA-based accelerator for our novel BNN model that supports the fractional activations. To evaluate the performance of FracBNN under a resource-constrained scenario, we implement the entire optimized network architecture on an embedded FPGA (Xilinx Ultra96 v2). Our experiments on ImageNet show that FracBNN achieves an accuracy comparable to MobileNetV2, surpassing the best-known BNN design on FPGAs with an increase of 28.9% in top-1 accuracy and a 2.5x reduction in model size. FracBNN also outperforms a recently introduced BNN model with an increase of 2.4% in top-1 accuracy while using the same model size. On the embedded FPGA device, FracBNN demonstrates the ability of real-time image classification.","PeriodicalId":386071,"journal":{"name":"The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"51","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3431920.3439296","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 51

Abstract

Binary neural networks (BNNs) have 1-bit weights and activations. Such networks are well suited for FPGAs, as their dominant computations are bitwise arithmetic and their memory requirement is significantly reduced. However, compared to state-of-the-art compact convolutional neural network (CNN) models, BNNs tend to produce much lower accuracy on realistic datasets such as ImageNet. In addition, the input layer of BNNs has gradually become a major compute bottleneck, because it is conventionally excluded from binarization to avoid a large accuracy loss. This work proposes FracBNN, which exploits fractional activations to substantially improve the accuracy of BNNs. Specifically, our approach employs a dual-precision activation scheme to compute features with up to two bits, using an additional sparse binary convolution. We further binarize the input layer using a novel thermometer encoding. Overall, FracBNN preserves the key benefits of conventional BNNs, where all convolutional layers are computed in pure binary MAC operations (BMACs). We design an efficient FPGA-based accelerator for our novel BNN model that supports the fractional activations. To evaluate the performance of FracBNN under a resource-constrained scenario, we implement the entire optimized network architecture on an embedded FPGA (Xilinx Ultra96 v2). Our experiments on ImageNet show that FracBNN achieves an accuracy comparable to MobileNetV2, surpassing the best-known BNN design on FPGAs with an increase of 28.9% in top-1 accuracy and a 2.5x reduction in model size. FracBNN also outperforms a recently introduced BNN model with an increase of 2.4% in top-1 accuracy while using the same model size. On the embedded FPGA device, FracBNN demonstrates real-time image classification capability.
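
For readers unfamiliar with binary MACs: the sketch below illustrates the standard XNOR-popcount trick that BNN layers build on, where a dot product of two {-1, +1} vectors reduces to bitwise operations. The bit-packing convention here is illustrative and not taken from the paper.

```python
def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two length-n {-1, +1} vectors packed into integers.

    Bit 1 encodes +1 and bit 0 encodes -1. XNOR marks positions where
    the two vectors agree, popcount counts them, and the signed result
    is (#agreements) - (#disagreements) = 2 * popcount(xnor) - n.
    """
    xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)  # keep only the low n bits
    return 2 * bin(xnor).count("1") - n

# a = (+1, -1, +1, +1) -> 0b1011, w = (+1, +1, -1, +1) -> 0b1101
assert binary_dot(0b1011, 0b1101, 4) == 0  # 1 - 1 - 1 + 1 = 0
```

The dual-precision activation scheme is described only at a high level in the abstract. The sketch below shows one plausible reading: a dense 1-bit base pass plus a gated residual 1-bit pass, so both passes remain pure binary products. The quantization step, the 0.5 scaling factor, and the gating rule are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def sign_binarize(x: np.ndarray) -> np.ndarray:
    """Binarize to {-1, +1}; zero maps to +1."""
    return np.where(x >= 0, 1.0, -1.0)

def dual_precision_matmul(x: np.ndarray, w_bin: np.ndarray,
                          gate: bool) -> np.ndarray:
    """Hypothetical dual-precision (fractional) binary product.

    x     : real-valued activations, quantized here to at most two bits
    w_bin : weights already binarized to {-1, +1}
    gate  : when False, only the 1-bit base pass runs; when True, a
            second binary pass refines the result (the fractional bit)

    Both passes are plain binary products, so hardware can reuse the
    same XNOR-popcount datapath for each.
    """
    base = sign_binarize(x)                       # dense 1-bit plane
    y = base @ w_bin                              # first BMAC pass
    if gate:
        residual = sign_binarize(x - 0.5 * base)  # refinement plane
        y = y + 0.5 * (residual @ w_bin)          # sparse second BMAC pass
    return y

# Example: 4 input features, 2 output features, binary weights.
x = np.array([0.8, -0.2, 0.1, -0.9])
w = sign_binarize(np.random.randn(4, 2))
print(dual_precision_matmul(x, w, gate=True))
```

Finally, a minimal sketch of a generic thermometer encoding for the input layer: each 8-bit pixel becomes a vector of threshold comparisons, so the first convolution can also be computed with BMACs. The number of levels and the uniform threshold spacing are assumptions; the paper's exact encoding may differ.

```python
import numpy as np

def thermometer_encode(pixels: np.ndarray, levels: int = 8) -> np.ndarray:
    """Encode 8-bit intensities as binary thermometer codes.

    Each pixel p maps to a `levels`-bit vector whose i-th bit is 1
    iff p exceeds the i-th threshold, so the number of set bits rises
    monotonically with intensity.
    """
    # Uniformly spaced thresholds over [0, 255]; uniform spacing is an
    # assumption for illustration.
    thresholds = np.linspace(0, 255, levels, endpoint=False)
    return (pixels[..., None] > thresholds).astype(np.uint8)

patch = np.array([[0, 64], [128, 255]], dtype=np.uint8)
print(thermometer_encode(patch))  # shape (2, 2, 8), monotone bit patterns
```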