MuDBN
Yuming Cheng, Chao Wang, Yangyang Zhao, Xianglan Chen, Xuehai Zhou, Xi Li
Proceedings of the 2018 on Great Lakes Symposium on VLSI, 2018-05-30
DOI: 10.1145/3194554.3194630 (https://doi.org/10.1145/3194554.3194630)
Citations: 0
Abstract
With the increasing size of neural networks, state-of-the-art deep neural networks (DNNs) have hundreds of millions of parameters. Because of their multiple fully-connected layers, DNNs are both compute- and memory-intensive, making them hard to deploy on embedded devices with limited power budgets and hardware resources. This paper therefore presents a deep belief network (DBN) accelerator built on multiple FPGAs. Two mapping schemes, division between layers (DBL) and division inside layers (DIL), are used to map the DBN onto the multi-FPGA system. Experimental results show that the accelerator achieves a 4.24x (DBL) to 6.20x (DIL) speedup compared to an Intel Core i7 CPU, and reduces power consumption by 119x (DBL) to 90x (DIL) compared to a Tesla K40C GPU.
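To make the two mapping schemes concrete, here is a minimal software sketch, not the paper's implementation: it models a DBN's fully-connected forward pass and contrasts DBL (each whole layer assigned to a different device) with DIL (each layer's output neurons split column-wise across devices). The function names, the toy network, and the two-device split are all illustrative assumptions; on real hardware the "devices" would be FPGAs exchanging partial results, which is simulated here sequentially.

```python
# Hypothetical sketch of the two multi-FPGA mapping schemes from the abstract.
# Pure-Python toy model; device parallelism is simulated sequentially.

def fc(x, W, b):
    """Fully-connected layer: y[j] = sum_i x[i] * W[i][j] + b[j] (no activation)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]

def dbl_forward(x, layers):
    """Division Between Layers (DBL): layer k is mapped to device k,
    so the devices form a pipeline over the layers."""
    for W, b in layers:          # each (W, b) would live on its own FPGA
        x = fc(x, W, b)
    return x

def dil_forward(x, layers, n_devices=2):
    """Division Inside Layers (DIL): every layer's output neurons are
    partitioned across devices; partial outputs are concatenated."""
    for W, b in layers:
        cols = len(b)
        chunk = (cols + n_devices - 1) // n_devices
        parts = []
        for d in range(n_devices):                    # device d computes its slice
            lo, hi = d * chunk, min((d + 1) * chunk, cols)
            Wd = [row[lo:hi] for row in W]            # column slice of the weights
            parts.extend(fc(x, Wd, b[lo:hi]))
        x = parts
    return x

# Tiny 2-layer network: both schemes must produce identical outputs,
# since they only change *where* the arithmetic runs.
layers = [
    ([[1.0, 2.0], [3.0, 4.0]], [0.5, -0.5]),   # layer 1: 2 -> 2
    ([[1.0], [-1.0]], [0.0]),                  # layer 2: 2 -> 1
]
x = [1.0, 1.0]
print(dbl_forward(x, layers))   # [-1.0]
print(dil_forward(x, layers))   # [-1.0]
```

The equality of the two outputs reflects the key design point: DBL and DIL are numerically equivalent partitionings and differ only in their hardware trade-offs, which is why the paper can report different speedup and power figures for each.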