FPGA-based Deep-Learning Accelerators for Energy Efficient Motor Imagery EEG classification

Daniel Flood, Neethu Robinson, Shanker Shreejith
2022 IEEE International Conference on Omni-layer Intelligent Systems (COINS), August 2022
DOI: 10.1109/COINS54846.2022.9854985
In recent years, deep learning has emerged as a powerful framework for analysing and decoding bio-signals such as electroencephalography (EEG), with applications in brain-computer interfaces (BCI) and motor control. Deep convolutional neural networks have been shown to be highly effective at decoding BCI signals for applications such as two-class motor imagery decoding. Deploying them in real-time applications, however, requires highly parallel computing platforms such as GPUs to achieve high-speed inference, at the cost of substantial energy consumption. In this paper, we explore a custom deep learning accelerator on an off-the-shelf hybrid FPGA device that achieves similar inference performance at a fraction of the energy consumption. Using a state-of-the-art deep convolutional neural network as our baseline model, we evaluate optimisations at the bit level, in the data path and during training to arrive at a custom-precision quantised deep learning model, which is implemented using the FINN compiler from Xilinx. Deployed on a Xilinx Zynq UltraScale+ FPGA, the accelerator achieves a significant reduction in power consumption (≈ 17×), sub-2 ms decoding latency and near-identical decoding accuracy (a statistically insignificant average reduction of 2.5%) relative to the reported baseline subject-specific classification accuracy of the deep CNN model on a GPU, evaluated on an N = 54 subject motor imagery EEG (MI-EEG) dataset, making our approach appealing for low-power real-time BCI applications. Furthermore, this design approach is transferable to other deep learning models reported in BCI research, paving the way for novel applications of real-time portable BCI systems.
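The bit-level optimisation central to this line of work is weight quantisation: mapping floating-point parameters to low bit-width integers so that FPGA arithmetic stays cheap. The following is a minimal sketch of uniform symmetric quantisation in plain Python; the function names and the max-abs scaling scheme are illustrative assumptions for exposition, not the authors' exact FINN/Brevitas configuration.

```python
def quantise_symmetric(weights, bits):
    """Map a list of floats to integers in [-(2**(bits-1)-1), 2**(bits-1)-1].

    Illustrative assumption: a single per-tensor scale derived from the
    largest absolute weight (max-abs scaling). Real quantisation-aware
    training pipelines typically learn or calibrate this scale instead.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantise(q, scale):
    """Recover approximate float weights from integer levels."""
    return [v * scale for v in q]


weights = [0.81, -0.33, 0.05, -0.92, 0.47]
q, scale = quantise_symmetric(weights, bits=4)  # 4-bit weights: levels in [-7, 7]
recovered = dequantise(q, scale)
```

At 4 bits, each weight is stored as one of 15 integer levels plus a shared scale, which is what lets the FPGA datapath replace floating-point multipliers with narrow integer ones; the quantisation error per weight is bounded by half the scale for unclamped values.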