R. Aliaga, R. Gadea, R. Colom, J. Monzó, C. Lerche, J. Martinez, A. Sebastiá, F. Mateo
2008 International Conference on Advances in Electronics and Micro-electronics, 2008-09-29
DOI: 10.1109/ENICS.2008.22 (https://doi.org/10.1109/ENICS.2008.22)
Multiprocessor SoC Implementation of Neural Network Training on FPGA
Software implementations of artificial neural networks (ANNs) and their training on a sequential processor are inefficient because they do not take advantage of parallelism. ASIC and FPGA implementations employ specific hardware structures to exploit parallelism and improve processing speed; however, optimizing resource usage requires fixed-point arithmetic, which loses precision, and the final system is restricted to a particular network topology. This paper presents a mixed approach based on a multiprocessor system-on-chip (SoC) on an FPGA. The use of software-driven embedded microprocessors with custom floating-point extensions for ANN-related functions allows for greater precision and flexibility in the structure of the networks to be trained, with no need for device reconfiguration, and parallelism is achieved by using a large number of processing units. Design limitations are discussed, and preliminary results are presented on the performance of the system on an Altera DE2-70 development board.
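The precision trade-off mentioned in the abstract can be illustrated with a minimal sketch that evaluates a single neuron once in floating point and once in fixed point. This is not the paper's design: the function names, the `tanh` activation, the sample weights, and the choice of 8 and 16 fractional bits are all assumptions made for illustration.

```python
import math

def to_fixed(x: float, frac_bits: int) -> int:
    """Quantize x to a signed integer with frac_bits fractional bits."""
    return round(x * (1 << frac_bits))

def from_fixed(q: int, frac_bits: int) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)

def neuron_float(weights, inputs):
    """Reference neuron: weighted sum followed by tanh, in floating point."""
    return math.tanh(sum(w * x for w, x in zip(weights, inputs)))

def neuron_fixed(weights, inputs, frac_bits):
    """Same neuron with weights, inputs and output held in fixed point.

    The activation itself is computed in floating point and then quantized,
    a simplification (hardware would typically use a lookup table).
    """
    qw = [to_fixed(w, frac_bits) for w in weights]
    qx = [to_fixed(x, frac_bits) for x in inputs]
    # Each product of two fixed-point values carries 2 * frac_bits fractional bits.
    acc = sum(w * x for w, x in zip(qw, qx))
    s = from_fixed(acc, 2 * frac_bits)
    return from_fixed(to_fixed(math.tanh(s), frac_bits), frac_bits)

# Hypothetical sample values, all within [-1, 1].
weights = [0.731, -0.412, 0.258]
inputs = [0.333, 0.917, -0.604]

reference = neuron_float(weights, inputs)
for bits in (8, 16):
    err = abs(neuron_fixed(weights, inputs, bits) - reference)
    print(f"{bits} fractional bits: absolute error {err:.2e}")
```

The rounding error of each quantized value is at most half of the least significant bit, so for three inputs bounded by 1 the output error stays within a few multiples of `2**-frac_bits`. In training, such errors affect every weight update, which is one reason a floating-point datapath can be attractive despite its higher resource cost.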