Multiprocessor SoC Implementation of Neural Network Training on FPGA

R. Aliaga, R. Gadea, R. Colom, J. Monzó, C. Lerche, J. Martinez, A. Sebastiá, F. Mateo
Published in: 2008 International Conference on Advances in Electronics and Micro-electronics
Publication date: 2008-09-29
DOI: 10.1109/ENICS.2008.22 (https://doi.org/10.1109/ENICS.2008.22)
Citations: 6

Abstract

Software implementations of artificial neural networks (ANNs) and their training on a sequential processor are inefficient because they do not take advantage of parallelism. ASIC and FPGA implementations employ specific hardware structures to exploit parallelism and so improve processing speed; however, optimizing resource usage requires fixed-point arithmetic, which loses precision, and the final system is restricted to a particular network topology. This paper presents a mixed approach based on a multiprocessor system-on-chip (SoC) on an FPGA. Software-driven embedded microprocessors with custom floating-point extensions for ANN-related functions allow greater precision and more flexibility in the structure of the networks to be trained, with no need for device reconfiguration, and parallelism is achieved by using a large number of processing units. Design limitations are discussed, and preliminary results are presented on the performance of the system on an Altera DE2-70 development board.
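The precision argument in the abstract can be illustrated with a small sketch. It is not taken from the paper: the Q8.8 format, the truncating conversion, and the function names train_fixed and train_float are all assumptions chosen for illustration. The sketch applies the same small weight update repeatedly in fixed point and in single-precision floating point, to show how an update smaller than the fixed-point resolution can vanish entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical Q8.8 fixed-point format: 8 integer bits, 8 fractional bits.
   The abstract does not specify any format; this is an assumption. */
#define FRAC_BITS 8
#define FIXED_ONE (1 << FRAC_BITS)

typedef int16_t q8_8;

/* Convert by truncation toward zero (an assumed, simple conversion). */
static q8_8 to_fixed(float x) { return (q8_8)(x * FIXED_ONE); }
static float to_float(q8_8 q) { return (float)q / FIXED_ONE; }

/* Apply the same update `steps` times using Q8.8 arithmetic. */
float train_fixed(float w0, float delta, int steps) {
    assert(steps >= 0);
    q8_8 w = to_fixed(w0);
    q8_8 d = to_fixed(delta); /* e.g. 0.001 * 256 = 0.256, truncated to 0 */
    for (int i = 0; i < steps; ++i) {
        w = (q8_8)(w + d);
    }
    return to_float(w);
}

/* Apply the same update `steps` times using single-precision floats. */
float train_float(float w0, float delta, int steps) {
    assert(steps >= 0);
    float w = w0;
    for (int i = 0; i < steps; ++i) {
        w += delta;
    }
    return w;
}
```

With a starting weight of 0.5 and an update of 0.001, train_fixed returns exactly 0.5 after 100 steps because the update quantizes to zero, while train_float ends close to 0.6. Wider fixed-point formats reduce this effect at the cost of more FPGA resources, which is the trade-off the abstract describes.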