Deep Learning Acceleration with Neuron-to-Memory Transformation
M. Imani, Mohammad Samragh Razlighi, Yeseong Kim, Saransh Gupta, F. Koushanfar, T. Simunic
2020 IEEE International Symposium on High Performance Computer Architecture (HPCA), February 2020. DOI: 10.1109/HPCA47549.2020.00011
Deep neural networks (DNNs) have demonstrated effectiveness for various applications such as image processing, video segmentation, and speech recognition. Running state-of-the-art DNNs on current systems mostly relies on general-purpose processors, ASIC designs, or FPGA accelerators, all of which suffer from data movement overheads due to limited on-chip memory and data transfer bandwidth. In this work, we propose a novel framework, called RAPIDNN, which performs neuron-to-memory transformation in order to accelerate DNNs in a highly parallel architecture. RAPIDNN reinterprets a DNN model and maps it into a specialized accelerator built from non-volatile memory blocks that model four fundamental DNN operations: multiplication, addition, activation functions, and pooling. The framework extracts representative operands of a DNN model, e.g., weights and input values, using clustering methods to optimize the model for in-memory processing. It then maps the extracted operands and their pre-computed results into the accelerator's memory blocks. At runtime, the accelerator identifies computation results through an efficient in-memory search capability, which also provides tunable approximation to further improve computation efficiency. Our evaluation shows that RAPIDNN achieves 68.4× and 49.5× energy efficiency improvements and 48.1× and 10.9× speedups over ISAAC and PipeLayer, respectively, two state-of-the-art DNN accelerators, while ensuring less than 0.5% quality loss.
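To illustrate the core idea, below is a minimal software sketch, not the authors' implementation, of the clustering-and-lookup scheme the abstract describes: weights and representative inputs are clustered, all pairwise products are pre-computed into a table (standing in for the non-volatile memory blocks), and runtime multiplication is replaced by index lookup. The cluster counts, toy data, and function names (build_lookup, nearest, approx_dot) are illustrative assumptions; the actual accelerator performs the nearest-centroid search and accumulation in memory hardware.

```python
# Sketch of RAPIDNN-style neuron-to-memory transformation (assumed toy model).
import numpy as np
from sklearn.cluster import KMeans

def build_lookup(weights, inputs_sample, n_w=16, n_x=16):
    """Cluster weights and representative inputs, then pre-compute all
    pairwise products. Fewer clusters = smaller table, coarser results."""
    w_centroids = KMeans(n_clusters=n_w, n_init=10).fit(
        weights.reshape(-1, 1)).cluster_centers_.ravel()
    x_centroids = KMeans(n_clusters=n_x, n_init=10).fit(
        inputs_sample.reshape(-1, 1)).cluster_centers_.ravel()
    # Pre-computed product table: stands in for the NVM blocks that hold
    # multiplication results searched at runtime.
    product_table = np.outer(w_centroids, x_centroids)
    return w_centroids, x_centroids, product_table

def nearest(centroids, values):
    """Model the in-memory search: map each value to its closest centroid."""
    return np.abs(values[:, None] - centroids[None, :]).argmin(axis=1)

def approx_dot(w, x, w_centroids, x_centroids, product_table):
    """Multiplication-free dot product: encode operands as cluster indices,
    then accumulate pre-computed products fetched from the table."""
    wi = nearest(w_centroids, w)
    xi = nearest(x_centroids, x)
    return product_table[wi, xi].sum()

rng = np.random.default_rng(0)
w = rng.normal(size=256)   # toy neuron weights
x = rng.normal(size=256)   # toy input vector
wc, xc, table = build_lookup(w, rng.normal(size=4096))
print("exact :", float(w @ x))
print("approx:", approx_dot(w, x, wc, xc, table))
```

In this sketch, shrinking n_w and n_x mirrors the tunable approximation the abstract mentions: a smaller table is cheaper to search and store but introduces more quantization error, which the framework's clustering step is designed to keep within the reported sub-0.5% quality loss.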