Towards Accurate Binary Neural Networks via Modeling Contextual Dependencies

Computer vision - ECCV ... : ... European Conference on Computer Vision : proceedings. European Conference on Computer Vision Pub Date : 2022-09-03 DOI:10.48550/arXiv.2209.01404

Xingrun Xing, Yangguang Li, Wei Li, Wenrui Ding, Yalong Jiang, Yufeng Wang, Jinghua Shao, Chunlei Liu, Xianglong Liu

Pages : 536-552

Citations: 1

Abstract

Existing Binary Neural Networks (BNNs) mainly operate on local convolutions with binarization functions. However, such simple bit operations lack the ability to model contextual dependencies, which is critical for learning discriminative deep representations in vision models. In this work, we tackle this issue by presenting new designs of binary neural modules, which enable BNNs to learn effective contextual dependencies. First, we propose a binary multi-layer perceptron (MLP) block as an alternative to binary convolution blocks to directly model contextual dependencies. Both short-range and long-range feature dependencies are modeled by binary MLPs, where the former provide a local inductive bias and the latter break the limited receptive field of binary convolutions. Second, to improve the robustness of binary models with contextual dependencies, we compute contextual dynamic embeddings to determine the binarization thresholds in general binary convolutional blocks. Armed with our binary MLP blocks and improved binary convolution, we build BNNs with explicit Contextual Dependency modeling, termed BCDNet. On the standard ImageNet-1K classification benchmark, BCDNet achieves 72.3% Top-1 accuracy and outperforms leading binary methods by a large margin. In particular, the proposed BCDNet exceeds the state-of-the-art ReActNet-A by 2.9% Top-1 accuracy with a similar number of operations. Our code is available at https://github.com/Sense-GVT/BCDNet.
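
To make the idea of context-dependent binarization thresholds concrete, here is a minimal NumPy sketch. It is not the BCDNet implementation: the abstract does not specify how the contextual dynamic embeddings are computed, so this sketch assumes a global-average-pooled context vector passed through one linear layer. The function names (`binarize`, `contextual_thresholds`, `dynamic_binarize`) and the `weight` and `bias` parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def binarize(x, threshold=0.0):
    """Map activations to {-1, +1} by comparing them against a threshold."""
    return np.where(x >= threshold, 1.0, -1.0)


def contextual_thresholds(x, weight, bias):
    """Hypothetical per-channel thresholds derived from global context.

    x:      feature map of shape (C, H, W)
    weight: (C, C) linear map applied to the pooled context (assumption)
    bias:   (C,) offset (assumption)
    """
    context = x.mean(axis=(1, 2))   # global average pooling -> shape (C,)
    return weight @ context + bias  # one threshold per channel -> shape (C,)


def dynamic_binarize(x, weight, bias):
    """Binarize each channel against its own input-dependent threshold."""
    t = contextual_thresholds(x, weight, bias)
    return binarize(x, t[:, None, None])  # broadcast thresholds over H and W


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))            # toy feature map: 4 channels, 8x8
weight = 0.1 * rng.normal(size=(4, 4))
bias = np.zeros(4)
out = dynamic_binarize(x, weight, bias)
```

With `weight` and `bias` set to zero, every threshold is zero and `dynamic_binarize` reduces to the fixed sign function, which is the static binarization that the dynamic variant is meant to improve on.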

