Hardware-Software Codesign of a CNN Accelerator
Changjae Yi, Donghyun Kang, S. Ha
2022 25th Euromicro Conference on Digital System Design (DSD), published 2022-08-01
DOI: 10.1109/DSD57027.2022.00054 (https://doi.org/10.1109/DSD57027.2022.00054)
The explosive growth of deep learning applications based on convolutional neural networks (CNNs) in embedded systems is spurring the development of hardware CNN accelerators, called neural processing units (NPUs). In this work, we present how the hardware-software codesign methodology can be applied to the design of a novel adder-type NPU. After devising a baseline datapath that enables fully pipelined execution of layers, we define a high-level behavior model, on which a high-level compiler and a virtual prototyping system are built concurrently. Because the microarchitecture of an NPU can be changed simply by modifying the simulation models of the hardware modules, we could explore the design space of the NPU microarchitecture easily. We could also evaluate the effect of hardware extensions that support the various non-convolutional operations widely used in recent CNN models. Once the final datapath is determined, we design the control structure and the low-level compiler and implement the NPU prototype. Implementation results on an FPGA prototype show the viability of the proposed methodology and its outcome.
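To make the idea of exploring the design space through a high-level model more concrete, the following is a minimal sketch in Python. It is not the authors' behavior model or simulator: the names `ConvLayer`, `NpuConfig`, `estimate_cycles`, and `explore`, the two parallelism knobs, and the idealized cycle formula (one multiply-accumulate per processing element per cycle, no memory stalls) are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from itertools import product
from math import ceil


@dataclass(frozen=True)
class ConvLayer:
    # Shape of one convolution layer (hypothetical workload description).
    out_h: int
    out_w: int
    in_ch: int
    out_ch: int
    kernel: int


@dataclass(frozen=True)
class NpuConfig:
    # Hypothetical microarchitecture knobs, not taken from the paper.
    pe_rows: int  # output channels computed in parallel
    pe_cols: int  # input channels reduced in parallel


def estimate_cycles(layer: ConvLayer, cfg: NpuConfig) -> int:
    """Idealized cycle count: one MAC per PE per cycle, no memory stalls."""
    out_tiles = ceil(layer.out_ch / cfg.pe_rows)
    in_tiles = ceil(layer.in_ch / cfg.pe_cols)
    return layer.out_h * layer.out_w * layer.kernel ** 2 * out_tiles * in_tiles


def explore(layers, row_options, col_options):
    """Evaluate every candidate configuration and rank by total cycles."""
    results = []
    for rows, cols in product(row_options, col_options):
        cfg = NpuConfig(pe_rows=rows, pe_cols=cols)
        total = sum(estimate_cycles(layer, cfg) for layer in layers)
        results.append((cfg, total))
    return sorted(results, key=lambda item: item[1])


if __name__ == "__main__":
    workload = [ConvLayer(out_h=8, out_w=8, in_ch=16, out_ch=32, kernel=3)]
    for cfg, cycles in explore(workload, [16, 32], [8, 16]):
        print(cfg, cycles)
```

In a sketch like this, changing the candidate microarchitecture means editing only the model (`NpuConfig` and `estimate_cycles`), which mirrors, in a very simplified way, why modifying simulation models is cheaper than modifying hardware. A realistic model would also need to account for memory bandwidth, area, and the non-convolutional operations mentioned in the abstract.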