Re-configurable parallel Feed-Forward Neural Network implementation using FPGA

IF 2.2 | Zone 3 (Engineering & Technology) | JCR Q3, Computer Science, Hardware & Architecture
Mohamed El-Sharkawy, Miran Wael, Maggie Mashaly, Eman Azab
DOI: 10.1016/j.vlsi.2024.102176
Published: 2024-03-04 (Journal Article)
Source: https://www.sciencedirect.com/science/article/pii/S0167926024000397
Citations: 0

Abstract

This paper proposes a novel hardware architecture for a Feed-Forward Neural Network (FFNN) that aims to minimize the number of clock cycles needed for the network's computation. The architecture relies on two physical layers that are multiplexed and reused during the FFNN computation to achieve an efficient parallel design. The two physical layers are designed to handle Neural Networks (NN) of different sizes. The hardware resources of the proposed architecture are independent of the NN's number of layers; they depend only on the number of neurons in the largest layer. This versatile architecture serves as an accelerator for Deep Neural Network (DNN) computations, exploiting parallelism by having the two physical layers work in parallel throughout the computation. The design was implemented with an 18-bit fixed-point representation and reaches a 200 MHz clock speed on a Spartan-7 FPGA. Furthermore, the proposed architecture achieves a lower neuron computation factor than previous works in the literature.
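
The resource claim in the abstract (hardware sized by the widest layer, not by the layer count) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' design: the abstract does not give the fixed-point split, the activation function, or how the two physical layers overlap in time. Here the 18-bit word is assumed to carry 10 fractional bits, the activation is assumed to be ReLU, and the two physical layers simply alternate, so the model shows the reuse pattern and the sizing rule but not the cycle-level parallelism. All names (`PhysicalLayer`, `run_ffnn`, `FRAC_BITS`) are hypothetical.

```python
from typing import List, Tuple

WORD_BITS = 18   # total word width stated in the abstract
FRAC_BITS = 10   # ASSUMPTION: fractional bits; the abstract does not specify the split
Q_MIN = -(1 << (WORD_BITS - 1))
Q_MAX = (1 << (WORD_BITS - 1)) - 1

Layer = Tuple[List[List[int]], List[int]]  # (weights[neuron][input], biases[neuron])


def saturate(value: int) -> int:
    """Clamp an integer to the signed 18-bit range."""
    return max(Q_MIN, min(Q_MAX, value))


def to_fixed(x: float) -> int:
    """Quantize a float to signed 18-bit fixed point."""
    return saturate(round(x * (1 << FRAC_BITS)))


def to_float(q: int) -> float:
    """Convert a fixed-point word back to a float."""
    return q / (1 << FRAC_BITS)


class PhysicalLayer:
    """A bank of `width` multiply-accumulate neurons, reloaded for each logical layer."""

    def __init__(self, width: int) -> None:
        self.width = width

    def compute(self, weights: List[List[int]], biases: List[int],
                inputs: List[int]) -> List[int]:
        # A logical layer must fit into the physical neuron bank.
        assert len(weights) <= self.width
        outputs = []
        for row, bias in zip(weights, biases):
            acc = bias << FRAC_BITS          # align the bias with the products
            for w, x in zip(row, inputs):
                acc += w * x                 # wide accumulator, as in a MAC unit
            word = saturate(acc >> FRAC_BITS)
            outputs.append(word if word > 0 else 0)  # ASSUMPTION: ReLU activation
        return outputs


def run_ffnn(layers: List[Layer], inputs: List[int]) -> Tuple[List[int], int]:
    """Run all logical layers on two physical layers used alternately."""
    width = max(len(weights) for weights, _ in layers)  # sized by the widest layer
    physical = (PhysicalLayer(width), PhysicalLayer(width))
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        # Even logical layers use physical layer 0, odd ones physical layer 1.
        activations = physical[index % 2].compute(weights, biases, activations)
    return activations, width
```

Adding more logical layers to `layers` leaves `width`, and therefore the modelled neuron count, unchanged as long as no new layer is wider than the current maximum.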

Source journal
Integration-The Vlsi Journal (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 3.80
Self-citation rate: 5.30%
Articles published: 107
Review time: 6 months
Journal description: Integration's aim is to cover every aspect of the VLSI area, with an emphasis on cross-fertilization between various fields of science, and the design, verification, test and applications of integrated circuits and systems, as well as closely related topics in process and device technologies. Individual issues will feature peer-reviewed tutorials and articles as well as reviews of recent publications. The intended coverage of the journal can be assessed by examining the following (non-exclusive) list of topics: Specification methods and languages; Analog/Digital Integrated Circuits and Systems; VLSI architectures; Algorithms, methods and tools for modeling, simulation, synthesis and verification of integrated circuits and systems of any complexity; Embedded systems; High-level synthesis for VLSI systems; Logic synthesis and finite automata; Testing, design-for-test and test generation algorithms; Physical design; Formal verification; Algorithms implemented in VLSI systems; Systems engineering; Heterogeneous systems.