Sparse Periodic Systolic Dataflow for Lowering Latency and Power Dissipation of Convolutional Neural Network Accelerators

Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design. Pub Date: 2022-06-30. DOI: 10.1145/3531437.3539715

J. Heo, A. Fayyazi, Amirhossein Esmaili, M. Pedram


Citations: 2

Abstract

This paper introduces the sparse periodic systolic (SPS) dataflow, which advances the state of the art in hardware accelerators for lightweight neural networks. Specifically, the SPS dataflow enables a novel hardware design approach unlocked by an emergent pruning scheme, periodic pattern-based sparsity (PPS). By exploiting the regularity of PPS, our sparsity-aware compiler optimally reorders the weights and uses a simple indexing unit in hardware to create matches between the weights and activations. Through this compiler-hardware codesign, the SPS dataflow achieves a higher degree of parallelism without the high indexing overhead and without loss of model accuracy. Evaluated on popular benchmarks such as VGG and ResNet, the SPS dataflow and accompanying neural network compiler outperform prior work in convolutional neural network (CNN) accelerator designs targeting FPGA devices. Against other sparsity-supporting weight storage formats, SPS achieves a 4.49× gain in energy efficiency while lowering storage requirements by 3.67× for total weight storage (non-pruned weights plus indexing) and 22,044× for indexing memory.
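To make the indexing argument concrete, the following is a minimal sketch of why a periodically repeating pruning pattern can need far less index storage than a per-weight sparse format. It is not the authors' compiler or hardware: the function names, the random masks, the 3×3 kernel size, the period of 4, and the per-nonzero baseline are all assumptions chosen for illustration, and the resulting numbers are not the paper's reported figures.

```python
import math
import numpy as np

def periodic_masks(period, kernel_size=3, keep=4, seed=0):
    """Build `period` hypothetical binary masks; each keeps `keep` of k*k weights."""
    rng = np.random.default_rng(seed)
    positions = kernel_size * kernel_size
    masks = np.zeros((period, positions), dtype=bool)
    for p in range(period):
        kept = rng.choice(positions, size=keep, replace=False)
        masks[p, kept] = True
    return masks

def prune_periodic(weights, masks):
    """Apply mask (i mod period) to flattened kernel i; weights has shape (N, k*k)."""
    period = masks.shape[0]
    pattern_id = np.arange(weights.shape[0]) % period
    return weights * masks[pattern_id]

def index_bits_per_weight(num_kernels, keep, positions):
    """Assumed baseline: one position index stored for every surviving weight."""
    bits_per_index = math.ceil(math.log2(positions))
    return num_kernels * keep * bits_per_index

def index_bits_periodic(period, positions):
    """Periodic case: only the `period` masks are stored, one bit per position."""
    return period * positions

if __name__ == "__main__":
    masks = periodic_masks(period=4)
    pruned = prune_periodic(np.ones((8, 9)), masks)
    print((pruned != 0).sum(axis=1))          # surviving weights per kernel
    print(index_bits_per_weight(4096, 4, 9))  # baseline index bits
    print(index_bits_periodic(4, 9))          # periodic index bits
```

In this toy setting the index cost of the periodic scheme depends only on the period and the kernel size, not on the number of kernels, which is the kind of regularity a simple hardware indexing unit could exploit.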

