NetPU: Prototyping a Generic Reconfigurable Neural Network Accelerator Architecture
Authors: Yuhao Liu, Shubham Rai, Salim Ullah, Akash Kumar
Published in: 2022 International Conference on Field-Programmable Technology (ICFPT), 2022-12-05
DOI: https://doi.org/10.1109/ICFPT56656.2022.9974206
FPGA-based neural network (NN) accelerators are a rapidly advancing research topic. Related work falls into two hardware architectures: i) the Heterogeneous Streaming Dataflow (HSD) architecture and ii) the Processing Element Matrix (PEM) architecture. The HSD architecture exploits the reconfigurability of FPGAs to customize and optimize the hardware design so that a complete network is implemented on the FPGA for one given trained model. The PEM architecture offers relatively generic support for different network models: it implements neuron processing modules on the FPGA, and a runtime software environment schedules them. In summary, the HSD architecture requires more resources but needs only simplified runtime software control, whereas the PEM architecture consumes fewer resources. However, its runtime software environment can be a heavy burden for lightweight systems, such as the low-power microcontrollers of IoT or edge devices.
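
The trade-off between the two architecture classes can be illustrated with a small software analogy. The sketch below is only a toy model in Python, not the NetPU design or any published accelerator: the names `dense_relu`, `build_hsd_pipeline`, `run_hsd`, `PEArray`, and `run_pem` are hypothetical, and the layers are assumed to be plain fully connected layers with ReLU. The HSD-style pipeline allocates one dedicated stage per layer of a given model, while the PEM-style array reuses a fixed number of generic processing elements and relies on a runtime loop to tile and schedule every layer.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Weights = List[List[float]]  # one row of weights per output neuron


def dense_relu(weights: Weights, x: Sequence[float]) -> Vector:
    """One fully connected layer followed by ReLU (assumed layer type)."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def build_hsd_pipeline(model: List[Weights]) -> List[Callable[[Sequence[float]], Vector]]:
    """HSD style: one dedicated stage per layer, fixed for this trained model."""
    return [lambda x, w=w: dense_relu(w, x) for w in model]


def run_hsd(pipeline: List[Callable[[Sequence[float]], Vector]],
            x: Sequence[float]) -> Vector:
    """Data streams through the stages; no per-layer scheduling is needed."""
    for stage in pipeline:
        x = stage(x)
    return list(x)


class PEArray:
    """PEM style: a fixed number of generic PEs shared by all layers."""

    def __init__(self, num_pes: int) -> None:
        self.num_pes = num_pes
        self.dispatches = 0  # how many times the runtime scheduled a tile

    def compute_tile(self, rows: Weights, x: Sequence[float]) -> Vector:
        # Each PE handles one output neuron of the tile.
        assert len(rows) <= self.num_pes
        self.dispatches += 1
        return dense_relu(rows, x)


def run_pem(array: PEArray, model: List[Weights], x: Sequence[float]) -> Vector:
    """Runtime software: split each layer into tiles that fit the PE array."""
    for weights in model:
        out: Vector = []
        for start in range(0, len(weights), array.num_pes):
            out.extend(array.compute_tile(weights[start:start + array.num_pes], x))
        x = out
    return list(x)
```

In this analogy, the HSD pipeline grows with the number of layers (more "hardware" per model), whereas the PEM array stays the same size for any model, and the `dispatches` counter stands in for the scheduling work that the runtime software has to perform.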