{"title":"Implementation of convolutional-LSTM network based on CPU, GPU and pynq-zl board","authors":"Amel Ben Mahjoub, Mohamed Atri","doi":"10.1109/DTSS.2019.8915287","DOIUrl":null,"url":null,"abstract":"Deep learning is among the most commonly investigated approach in computer vision area. Quite recently, considerable attention has been paid to develop an end-to-end deep learning approach for action recognition. According to the developments of these time and resource consuming deep learning models, there is now a growing interest in implementing an accelerator low-power hardware architecture. The main objective of this paper is to implement an optimized convolutional-Long Short Term Memory (LSTM) architecture based a low-cost pynq-zl design tool for human action recognition applications. Firstly, the pre-trained Convolutional Neural Network (CNN) model is applied to extract relevant features from videos. Secondly, the classification of these sequences is done using the LSTM with optimized parameters. Finally, the model testing step is performed on the ARM of the pynq-zl FPGA platform and compared with the performances obtained by the central processing unit and graphics processing tools. The experimental results, performed in UTD-MHAD dataset, prove the efficiency of our proposed approach.","PeriodicalId":342516,"journal":{"name":"2019 IEEE International Conference on Design & Test of Integrated Micro & Nano-Systems (DTS)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE International Conference on Design & Test of Integrated Micro & Nano-Systems (DTS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DTSS.2019.8915287","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5
Abstract
Deep learning is among the most commonly investigated approaches in the computer vision field. Quite recently, considerable attention has been paid to developing end-to-end deep learning approaches for action recognition. Because these deep learning models are time- and resource-consuming, there is now growing interest in implementing them on low-power hardware accelerator architectures. The main objective of this paper is to implement an optimized convolutional-Long Short Term Memory (LSTM) architecture on the low-cost pynq-zl board for human action recognition applications. First, a pre-trained Convolutional Neural Network (CNN) model is applied to extract relevant features from videos. Second, these feature sequences are classified by an LSTM with optimized parameters. Finally, the model testing step is performed on the ARM processor of the pynq-zl FPGA platform and compared with the performance obtained on a central processing unit (CPU) and a graphics processing unit (GPU). The experimental results, obtained on the UTD-MHAD dataset, demonstrate the efficiency of the proposed approach.
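To make the two-stage pipeline described in the abstract more concrete, below is a minimal Keras/TensorFlow sketch of a pre-trained-CNN feature extractor followed by an LSTM classifier. The backbone (VGG16), the number of sampled frames, the LSTM hidden size, and the training configuration are illustrative assumptions; the abstract does not specify the "optimized parameters" used in the paper. Only the class count (27 actions in UTD-MHAD) follows from the dataset.

```python
# Sketch of the CNN feature extraction + LSTM classification pipeline.
# Backbone, frame count, and LSTM size are assumptions for illustration.
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

NUM_FRAMES = 30    # frames sampled per video clip (assumed)
NUM_CLASSES = 27   # UTD-MHAD contains 27 action classes

# Step 1: pre-trained CNN used as a frozen per-frame feature extractor.
cnn = VGG16(weights="imagenet", include_top=False, pooling="avg",
            input_shape=(224, 224, 3))
cnn.trainable = False

def extract_features(frames):
    """frames: (NUM_FRAMES, 224, 224, 3) array -> (NUM_FRAMES, 512) features."""
    return cnn.predict(frames, verbose=0)

# Step 2: LSTM classifier over the sequence of per-frame features.
classifier = models.Sequential([
    layers.Input(shape=(NUM_FRAMES, 512)),
    layers.LSTM(256),                                  # hidden size assumed
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
classifier.compile(optimizer="adam",
                   loss="categorical_crossentropy",
                   metrics=["accuracy"])

# Usage with dummy data standing in for a preprocessed UTD-MHAD clip.
dummy_clip = np.random.rand(NUM_FRAMES, 224, 224, 3).astype("float32")
features = extract_features(dummy_clip)[np.newaxis, ...]  # (1, NUM_FRAMES, 512)
print(classifier.predict(features, verbose=0).shape)       # (1, NUM_CLASSES)
```

In such a split, only the lightweight LSTM classifier needs to run repeatedly at test time, which is consistent with the paper's goal of evaluating inference on the ARM core of the pynq-zl platform against CPU and GPU baselines.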