S. Venkataramanaiah, Shihui Yin, Yu Cao, Jae-sun Seo
{"title":"基于ASIC和FPGA的深度神经网络训练加速器设计","authors":"S. Venkataramanaiah, Shihui Yin, Yu Cao, Jae-sun Seo","doi":"10.1109/ISOCC50952.2020.9333063","DOIUrl":null,"url":null,"abstract":"In this invited paper, we present deep neural network (DNN) training accelerator designs in both ASIC and FPGA. The accelerators implements stochastic gradient descent based training algorithm in 16-bit fixed-point precision. A new cyclic weight storage and access scheme enables using the same off-the-shelf SRAMs for non-transpose and transpose operations during feed-forward and feed-backward phases, respectively, of the DNN training process. Including the cyclic weight scheme, the overall DNN training processor is implemented in both 65nm CMOS ASIC and Intel Stratix-10 FPGA hardware. We collectively report the ASIC and FPGA training accelerator results.","PeriodicalId":270577,"journal":{"name":"2020 International SoC Design Conference (ISOCC)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":"{\"title\":\"Deep Neural Network Training Accelerator Designs in ASIC and FPGA\",\"authors\":\"S. Venkataramanaiah, Shihui Yin, Yu Cao, Jae-sun Seo\",\"doi\":\"10.1109/ISOCC50952.2020.9333063\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this invited paper, we present deep neural network (DNN) training accelerator designs in both ASIC and FPGA. The accelerators implements stochastic gradient descent based training algorithm in 16-bit fixed-point precision. A new cyclic weight storage and access scheme enables using the same off-the-shelf SRAMs for non-transpose and transpose operations during feed-forward and feed-backward phases, respectively, of the DNN training process. Including the cyclic weight scheme, the overall DNN training processor is implemented in both 65nm CMOS ASIC and Intel Stratix-10 FPGA hardware. We collectively report the ASIC and FPGA training accelerator results.\",\"PeriodicalId\":270577,\"journal\":{\"name\":\"2020 International SoC Design Conference (ISOCC)\",\"volume\":\"25 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-10-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"11\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 International SoC Design Conference (ISOCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISOCC50952.2020.9333063\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International SoC Design Conference (ISOCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISOCC50952.2020.9333063","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Deep Neural Network Training Accelerator Designs in ASIC and FPGA
In this invited paper, we present deep neural network (DNN) training accelerator designs in both ASIC and FPGA. The accelerators implement a stochastic gradient descent (SGD)-based training algorithm with 16-bit fixed-point precision. A new cyclic weight storage and access scheme enables using the same off-the-shelf SRAMs for non-transpose and transpose operations during the feed-forward and feed-backward phases, respectively, of the DNN training process. Incorporating the cyclic weight scheme, the overall DNN training processor is implemented in both 65nm CMOS ASIC and Intel Stratix-10 FPGA hardware. We collectively report the ASIC and FPGA training accelerator results.
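The abstract does not spell out the storage mapping, but a common way to get conflict-free transpose and non-transpose reads out of ordinary single-port SRAM banks is cyclic (diagonal) banking: element W[i][j] of an N×N weight tile is stored in bank (i + j) mod N, so a row read (feed-forward) and a column read (feed-backward) each touch every bank exactly once. The Python sketch below is an illustrative model of that idea under this assumed mapping, not the paper's actual RTL.

```python
import numpy as np

def store_cyclic(W):
    """Pack an N x N weight tile into N single-port SRAM banks.

    Assumed mapping (illustrative, not the paper's RTL): W[i][j] is
    placed in bank (i + j) % N at address i, so both a row read
    (non-transpose) and a column read (transpose) hit each bank once.
    """
    N = W.shape[0]
    banks = np.zeros((N, N), dtype=W.dtype)  # banks[bank][addr]
    for i in range(N):
        for j in range(N):
            banks[(i + j) % N, i] = W[i, j]
    return banks

def read_row(banks, i):
    """Non-transpose access (feed-forward): row i, one element per bank."""
    N = banks.shape[0]
    return np.array([banks[(i + j) % N, i] for j in range(N)])

def read_col(banks, j):
    """Transpose access (feed-backward): column j, one element per bank."""
    N = banks.shape[0]
    return np.array([banks[(i + j) % N, i] for i in range(N)])

# Quick check with a 16-bit fixed-point-style tile.
W = np.arange(16, dtype=np.int16).reshape(4, 4)
banks = store_cyclic(W)
assert (read_row(banks, 2) == W[2, :]).all()
assert (read_col(banks, 1) == W[:, 1]).all()
```

With a layout like this, the feed-forward pass streams rows and the feed-backward pass streams columns from the same banks, which is what allows a training accelerator to reuse off-the-shelf SRAMs instead of adding a dedicated transpose buffer.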