{"title":"Optimizing a FPGA-based neural accelerator for small IoT devices","authors":"Seongmin Hong, Inho Lee, Yongjun Park","doi":"10.1109/isocc.2017.8368903","DOIUrl":null,"url":null,"abstract":"As neural networks have been widely used for machine-learning algorithms such as image recognition, to design efficient neural accelerators has recently become more important. However, designing neural accelerators is generally difficult because of their high memory storage requirement. In this paper, we propose an area-and-power efficient neural accelerator for small IoT devices, using 4-bit fixed-point weights through quantization technique. The proposed neural accelerator is trained through the TensorFlow infrastructure and the weight data is optimized in order to reduce the overhead of high weight memory requirement. Our FPGA-based design achieves 97.44% accuracy with MNIST 10,000 test images.","PeriodicalId":413646,"journal":{"name":"2018 International Conference on Electronics, Information, and Communication (ICEIC)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 International Conference on Electronics, Information, and Communication (ICEIC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/isocc.2017.8368903","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5
Abstract
As neural networks have been widely used for machine-learning algorithms such as image recognition, to design efficient neural accelerators has recently become more important. However, designing neural accelerators is generally difficult because of their high memory storage requirement. In this paper, we propose an area-and-power efficient neural accelerator for small IoT devices, using 4-bit fixed-point weights through quantization technique. The proposed neural accelerator is trained through the TensorFlow infrastructure and the weight data is optimized in order to reduce the overhead of high weight memory requirement. Our FPGA-based design achieves 97.44% accuracy with MNIST 10,000 test images.