Efficient Storage Coding of Pruning MLP for Embedded System

Ming Xiaoman, Huang Letian
{"title":"嵌入式系统剪枝MLP的高效存储编码","authors":"Ming Xiaoman, Huang Letian","doi":"10.1109/ICET51757.2021.9451109","DOIUrl":null,"url":null,"abstract":"Deep neural network is widely used in many intelligent computing scenarios, but the storage overhead caused by its huge number of parameters limits its application in embedded systems. The neural network pruning technology can remove a large number of redundant parameters in the network, theoretically, which reduces the amount of parameter storage and calculation. However, there are two problems in deploying sparse neural network algorithms on existing embedded hardware platforms: one is that using the existing coding to store sparse neural network parameters will increase additional memory access overhead; the other is the decoding process of sparse model is highly complex, which increases the amount of calculation. To solve the two problems, this paper proposes Dynamic ELL encoding format to represent the sparse weight matrix for pruning neural network. This new coding combines the advantages of the ELL encoding and relative indexing, which compromises the storage overhead and decoding overhead. The sparse MLP that uses dynamic ELL encoding storage is deployed on a SoC built on a RISC-V processor. Experiments show that compared to dense networks that use Dynamic ELL encoding storage, the sparse network storage space is reduced by 43%, and the system running time is reduced by 37%.","PeriodicalId":316980,"journal":{"name":"2021 IEEE 4th International Conference on Electronics Technology (ICET)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Efficient Storage Coding of Pruning MLP for Embedded System\",\"authors\":\"Ming Xiaoman, Huang Letian\",\"doi\":\"10.1109/ICET51757.2021.9451109\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep neural network is widely used in many intelligent computing scenarios, but the storage overhead caused by its huge number of parameters limits its application in embedded systems. The neural network pruning technology can remove a large number of redundant parameters in the network, theoretically, which reduces the amount of parameter storage and calculation. However, there are two problems in deploying sparse neural network algorithms on existing embedded hardware platforms: one is that using the existing coding to store sparse neural network parameters will increase additional memory access overhead; the other is the decoding process of sparse model is highly complex, which increases the amount of calculation. To solve the two problems, this paper proposes Dynamic ELL encoding format to represent the sparse weight matrix for pruning neural network. This new coding combines the advantages of the ELL encoding and relative indexing, which compromises the storage overhead and decoding overhead. The sparse MLP that uses dynamic ELL encoding storage is deployed on a SoC built on a RISC-V processor. 
Experiments show that compared to dense networks that use Dynamic ELL encoding storage, the sparse network storage space is reduced by 43%, and the system running time is reduced by 37%.\",\"PeriodicalId\":316980,\"journal\":{\"name\":\"2021 IEEE 4th International Conference on Electronics Technology (ICET)\",\"volume\":\"31 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-05-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE 4th International Conference on Electronics Technology (ICET)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICET51757.2021.9451109\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 4th International Conference on Electronics Technology (ICET)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICET51757.2021.9451109","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Deep neural networks are widely used in many intelligent computing scenarios, but the storage overhead caused by their huge number of parameters limits their application in embedded systems. Neural network pruning can remove a large number of redundant parameters from the network and thus, in theory, reduce both parameter storage and computation. However, deploying sparse neural network algorithms on existing embedded hardware platforms faces two problems: first, storing sparse neural network parameters with existing encodings adds extra memory access overhead; second, the decoding process of the sparse model is highly complex, which increases the amount of computation. To solve these two problems, this paper proposes the Dynamic ELL encoding format to represent the sparse weight matrices of a pruned neural network. The new encoding combines the advantages of ELL encoding and relative indexing, trading off storage overhead against decoding overhead. A sparse MLP stored with Dynamic ELL encoding is deployed on an SoC built around a RISC-V processor. Experiments show that, compared with dense networks that use Dynamic ELL encoding storage, the sparse network reduces storage space by 43% and system running time by 37%.
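The paper does not spell out the exact Dynamic ELL layout in this abstract, but the two ingredients it names are standard: ELL storage pads every row of the sparse weight matrix to a fixed number of nonzeros, and relative indexing stores each nonzero's column as a small delta from the previous nonzero instead of an absolute index. The C sketch below illustrates how those two ideas combine for a sparse matrix-vector product; all names, sizes, and field choices are illustrative assumptions, not the authors' implementation.

```c
#include <stdint.h>
#include <stdio.h>

/* Minimal sketch, NOT the paper's exact Dynamic ELL format:
 * - classic ELL padding: every row holds exactly WIDTH entries
 *   (unused slots carry a zero value and a zero delta);
 * - relative indexing: column positions are 8-bit deltas from the
 *   previous nonzero in the same row, decoded with a running sum.
 * In a real format, gaps larger than 255 would need extra padding
 * entries; this toy example ignores that case. */

#define ROWS  3
#define COLS  8
#define WIDTH 2   /* assumed max nonzeros per row after pruning */

typedef struct {
    float   values[ROWS][WIDTH];     /* nonzero values, zero-padded       */
    uint8_t col_delta[ROWS][WIDTH];  /* delta from previous column in row */
} ell_rel_t;

/* y = W * x, rebuilding absolute column indices on the fly. */
static void ell_rel_spmv(const ell_rel_t *w, const float x[COLS], float y[ROWS])
{
    for (int r = 0; r < ROWS; r++) {
        float acc = 0.0f;
        int col = 0;                          /* running absolute column  */
        for (int k = 0; k < WIDTH; k++) {
            col += w->col_delta[r][k];        /* relative -> absolute     */
            acc += w->values[r][k] * x[col];  /* padded slots add 0       */
        }
        y[r] = acc;
    }
}

int main(void)
{
    /* Row 0: nonzeros at columns 1 and 5; row 1: column 3 only (padded);
     * row 2: columns 0 and 7. */
    ell_rel_t w = {
        .values    = { {2.0f, -1.0f}, {0.5f, 0.0f}, {3.0f, 4.0f} },
        .col_delta = { {1, 4},        {3, 0},       {0, 7}       },
    };
    float x[COLS] = {1, 1, 1, 1, 1, 1, 1, 1};
    float y[ROWS];

    ell_rel_spmv(&w, x, y);
    for (int r = 0; r < ROWS; r++)
        printf("y[%d] = %.2f\n", r, y[r]);
    return 0;
}
```

The storage/decoding trade-off the abstract mentions is visible even in this toy version: 8-bit deltas are much smaller than 32-bit absolute column indices, while decoding costs only one running addition per entry with fully regular, per-row memory accesses (no CSR-style row-pointer indirection), which suits a simple RISC-V embedded core.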