Efficient Unaligned Memory Access of Tightly Packed Weights for Deep Neural Network Inference on Edge Devices
Ciprian Seiculescu
2021 IEEE 27th International Symposium for Design and Technology in Electronic Packaging (SIITME), published 2021-10-27
DOI: 10.1109/SIITME53254.2021.9663723 (https://doi.org/10.1109/SIITME53254.2021.9663723)
Abstract
Increases in computational power have enabled complex problems to be solved with new techniques from the field of Artificial Intelligence (AI) based on Deep Neural Networks (DNN) and Deep Learning (DL). A recent trend is to apply these techniques, which have proven to generate excellent results, to Edge computing. However, Edge computing relies on simple low-power devices, which are severely restricted in computational power and especially in available memory size. Packing the Neural Network parameters efficiently into the available memory is therefore a must. Normally, memory systems expect transactions to be aligned to the bus size for maximum performance. This can result in inefficient memory utilization, because each group of parameters that must be read in parallel has to be stored aligned in memory. In this paper, I present a memory controller that provides unaligned memory transfers at full bus width. With this controller, memory efficiency can be increased by 25% while preserving the memory access time.
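The trade-off described in the abstract can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the controller presented in the paper: the 4-byte bus width, the 5-byte parameter group, and the names `aligned_footprint`, `packed_footprint`, and `read_unaligned` are all hypothetical and chosen only for illustration. The sketch compares the storage needed when every group starts on a bus-aligned address with the storage needed when groups are packed back to back, and models a full-width read at an arbitrary byte offset as two aligned word fetches that are shifted and merged.

```python
BUS_BYTES = 4  # assumed bus width in bytes (hypothetical, not from the paper)


def aligned_footprint(group_bytes, n_groups, bus_bytes=BUS_BYTES):
    """Bytes used when every group starts on a bus-aligned address."""
    words_per_group = -(-group_bytes // bus_bytes)  # ceiling division
    return words_per_group * bus_bytes * n_groups


def packed_footprint(group_bytes, n_groups):
    """Bytes used when groups are stored back to back with no padding."""
    return group_bytes * n_groups


def read_unaligned(memory, offset, bus_bytes=BUS_BYTES):
    """Return bus_bytes bytes of `memory` starting at any byte offset.

    Software model only: fetch the two aligned words that cover the
    requested range, then shift and merge them. Near the end of
    `memory` the result may be shorter than bus_bytes.
    """
    base = (offset // bus_bytes) * bus_bytes  # address of the first aligned word
    shift = offset - base                     # byte position inside that word
    low = memory[base:base + bus_bytes]
    high = memory[base + bus_bytes:base + 2 * bus_bytes]
    return (low + high)[shift:shift + bus_bytes]


if __name__ == "__main__":
    # Illustrative numbers only: four groups of 5 bytes each.
    print(aligned_footprint(5, 4), "bytes when aligned")
    print(packed_footprint(5, 4), "bytes when packed")
    memory = bytes(range(16))
    print(list(read_unaligned(memory, 3)))
```

In this model an unaligned read costs two aligned fetches. The abstract states that the proposed controller preserves the memory access time, so the hardware does not pay that cost in the same way; the sketch shows only the addressing and merging logic.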