Authors: Shafiqul Hai; Tella Rajashekhar Reddy
Journal: IEEE Embedded Systems Letters, vol. 17, no. 3, pp. 188-191
Publication date: 2024-11-18
DOI: 10.1109/LES.2024.3500020
URL: https://ieeexplore.ieee.org/document/10755115/
FPGA Implementation of an Image Classifier Using Pipelined FFT Architecture
Deep neural networks (DNNs) belong to an important class of machine learning algorithms generally used to classify digital data in tasks such as image and speech recognition. The computational complexity of a DNN-based image classifier is higher than that of traditional fully connected (FC) feed-forward NNs. Therefore, dedicated cloud servers and graphics processing units (GPUs) are used to carry out high-speed, large-capacity computation tasks in machine vision systems. However, there is a growing demand for real-time processing of complex machine-learning tasks on embedded systems. As FC layers account for the largest fraction of computational power and memory footprint, developing power-efficient, low-footprint NN architectures for embedded systems is crucial. In this letter, a pipelined and parallel fast Fourier transform (FFT)-based FC-DNN architecture is implemented on a Stratix-10 FPGA using VHDL. The footprint of the DNN is further reduced using a folded FFT network. The proposed algorithm is tested on two benchmark datasets, the MNIST database of handwritten digits and the CIFAR-10 database. In both cases, we achieve >90% accuracy, while the power consumption of the 2-parallel folded FFT-based network is around 45% lower than that of traditional series FFT-based architectures.
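The abstract does not spell out how the FFT replaces the FC-layer arithmetic. One common approach, assumed here purely for illustration and not confirmed by the letter, is to constrain the weight matrix to be circulant, so that the matrix-vector product becomes an element-wise multiplication in the frequency domain. The sketch below is plain software Python rather than VHDL, and the function names (`fft`, `circulant_matvec_fft`, `circulant_matvec_direct`) are hypothetical. It only shows that the FFT route gives the same result as the direct O(n^2) product.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def circulant_matvec_fft(first_col, x):
    """Compute y = C x via FFT, where C is circulant with first column first_col.

    Assumption for illustration: the FC weight matrix is circulant.
    """
    n = len(x)
    if len(first_col) != n or n == 0 or n & (n - 1) != 0:
        raise ValueError("inputs must have equal power-of-two length")
    w_freq = fft(first_col)
    x_freq = fft(x)
    y_freq = [w * v for w, v in zip(w_freq, x_freq)]
    # The inverse transform is unnormalized, so divide by n here.
    return [value.real / n for value in fft(y_freq, inverse=True)]


def circulant_matvec_direct(first_col, x):
    """Reference O(n^2) product using C[i][j] = first_col[(i - j) % n]."""
    n = len(x)
    return [sum(first_col[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]


if __name__ == "__main__":
    weights = [0.5, -1.0, 2.0, 0.25]   # hypothetical first column of C
    inputs = [1.0, 2.0, 3.0, 4.0]      # hypothetical activations
    print(circulant_matvec_fft(weights, inputs))
    print(circulant_matvec_direct(weights, inputs))
```

Under this assumption, a layer of size n stores n weights instead of n^2 and needs O(n log n) operations, which is the kind of saving that motivates FFT-based FC layers on resource-limited hardware.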
About the journal:
The IEEE Embedded Systems Letters (ESL) provides a forum for rapid dissemination of the latest technical advances in embedded systems and related areas in embedded software. The emphasis is on models, methods, and tools that ensure secure, correct, efficient, and robust design of embedded systems and their applications.